Jiffy WF LCRQ CC MS
Absolute Absolute Relative Absolute Relative Absolute Relative Absolute Relative
Total Heap Usage 38.70 MB 611.63 MB x15.80 610.95 MB x15.78 305.18 MB x7.88 1.192 GB x31.54
Number of Allocs 3,095 9,793 x3.16 1,230 x0.40 5,000,015 x1,615 5,000,010 x1,615
Peak Heap Size 44.81 MB 200.8 MB x4.48 612.6 MB x13.67 420.0 MB x9.37 1.229 GB x28.08
# of Instructions Executed 550,416,453 5,612,941,764 x10.20 1,630,746,827 x2.96 3,500,543,753 x6.36 1,821,777,428 x3.31
I1 Misses 2,162 1,714 x0.79 1,601 x0.74 1,577 x0.73 1,636 x0.76
L3i Misses 2,084 1,707 x0.82 1,591 x0.76 1,572 x0.75 1,630 x0.78
Data Cache Tries (R+W) 281,852,749 2,075,332,377 x7.36 650,257,906 x2.3 1,238,761,631 x4.40 646,304,575 x2.29
D1 Misses 1,320,401 25,586,262 x19.37 20,037,956 x15.17 15,000,507 x11.36 11,605,064 x8.79
L3d Misses 652,194 10,148,090 x15.56 5,028,204 x7.7 14,971,182 x22.96 10,973,055 x16.82
Table 1: Valgrind memory usage statistics run with one enqueuer and one dequeuer. I1/D1 is the L1
instruction/data cache respectively while L3i/L3d is the L3 instruction/data cache respectively.
The only thing to worry about is the legality of dequeue operations that returned empty. For this,
we can apply the same arguments as in Claim 5.5 and Sub-claim 5.6. In summary, σ
0
is an equivalent
legal sequential execution to σ that preserves σ’s real-time order.
5.2 Wait-Freedom
We show that each invocation of enqueue and dequeue returns in a finite number of steps.
Lemma 5.8. Each enqueue operation in Algorithm 2 completes in a finite number of steps.
Proof. An enqueue operation, as listed in Algorithm 2, consists of two while loops (line 3 and line 10),
each involving a finite number of operations, a single FAA operation at the beginning (line 2), and then
a short finite sequence of operations from line 15 onward. Hence, we only need to show that the two
while loops terminate in a finite number of iterations.
The goal of the first while loop is to ensure that the enqueuer only accesses an allocated buffer. That
is, in each iteration, if the location index it obtained in line 2 is beyond the last allocated buffer, the
enqueuer allocates a new buffer and tries to add it with a CAS to the end of the queue (line 3). Even if
there are concurrent such attempts by multiple enqueuers, in each iteration at least one of them succeeds,
so this loop terminates in a finite number of steps at each such enqueuer.
The next step for the enqueuer is to obtain the buffer corresponding to location. As mentioned before,
it is possible that by this time the queue has grown due to concurrent faster enqueue operations. Hence,
the enqueuer starts scanning from tailOfQueue backwards until reaching the correct buffer (line 10).
Since new buffers are only added at the end of the queue, this while loop also terminates in a finite
number of steps regardless of concurrency.
Lemma 5.9. Each dequeue operation in Algorithm 3 completes in a finite number of steps.
Proof. We prove the lemma by analyzing Algorithm 3. Consider the while loop at line 4. Here, we
advance head to point to the first non-handled element, in case it is not already pointing to one. As
indicated before, the latter could occur if a previous dequeue deq
1
removed an element not pointed by
head. As we perform this scan only once and the queue is finite, the loop terminates within a finite
number of iterations, each consisting of at most a constant number of operations.
If at this point the queue is identified as empty, which is detectable by comparing head and tail
pointers and indices, then the operation returns immediately (line 10). The following for loop is at
line 15. Again, since the queue is finite, the next set element is within a finite number of entries away.
Hence, we terminate the for loop after a finite number of iterations.
The last for loop iteration at line 26 scans the queue from its head to the first set element, which
as mentioned before, is a finite number of entries away. Yet, the for loop could be restarted in line 28,
so we need to show that such a restart can occur only a finite number of times. This is true because
each time we restart, we shorten the “distance” that the for loop at line 26 needs to cover, and as just
mentioned, this “distance” is finite to begin with.
The rest of the code from line 32 onward is a short list of simple instructions. Hence, each dequeue
operation terminates in a finite number of steps.
Theorem 5.10. The Jiffy queue implementation is wait-free.
11