Concurrent data structures that have fast and predictable performance are of critical importance for harnessing the power of multicore processors, which are now ubiquitous. Although wait-free objects, whose operations complete in a bounded number of steps, were devised more than two decades ago, wait-free objects that can deliver scalable high performance are still rare.
In this paper, we present the first wait-free FIFO queue based on fetch-and-add (FAA). While compare-and-swap (CAS) based non-blocking algorithms may perform poorly due to work wasted by CAS failures, algorithms that coordinate using FAA, which is guaranteed to succeed, can in principle perform better under high contention. Along with FAA, our queue uses a custom epoch-based scheme to reclaim memory; on x86 architectures, it requires no extra memory fences on our algorithm’s typical execution path. An empirical study of our new FAA-based wait-free FIFO queue under high contention on four different architectures with many hardware threads shows that it outperforms prior queue designs that lack a wait-free progress guarantee. Surprisingly, at the highest level of contention, the throughput of our queue is often as high as that of a microbenchmark that only performs FAA. As a result, our fast wait-free queue implementation is useful in practice on most multi-core systems today. We believe that our design can serve as an example of how to construct other fast wait-free objects.
Tue 15 MarDisplayed time zone: Belfast change
10:00 - 11:15 | |||
10:00 25mTalk | Adding Approximate Counters Main conference Link to publication DOI | ||
10:25 25mTalk | A Wait-Free Queue as Fast as Fetch-and-Add Main conference Link to publication DOI | ||
10:50 25mTalk | Lease/Release: Architectural Support for Scaling Contended Data Structures Main conference Syed Kamran Haider University of Connecticut, William Hasenplaugh MIT, Dan Alistarh Microsoft Research Link to publication DOI |