To achieve good multi-core performance, modern microprocessors have weak memory models rather than enforcing sequential consistency. This gives the programmer wide scope for choosing exactly how to implement various aspects of inter-thread communication through the system’s shared memory. However, these choices come with both semantic and performance consequences, often in tension with each other. In this paper, we focus on the performance side, and define techniques for evaluating the impact of various choices in using weak memory models, such as where to put fences and which fences to use. We make no attempt to judge certain strategies as best or most efficient, and instead provide the techniques that will allow the programmer to understand the performance implications when identifying and resolving any semantic/performance trade-offs. In particular, our technique supports the reasoned selection of macrobenchmarks to use in investigating trade-offs in using weak memory models. We demonstrate our technique on both synthetic benchmarks and real-world applications for the Linux kernel and the OpenJDK HotSpot Virtual Machine on the ARMv8 and POWERv7 architectures.
Wed 16 Mar (GMT+01:00) Greenwich Mean Time: Belfast
|10:00 - 10:25|
|10:25 - 10:50|
Ganesh Narayanswamy (Department of Computer Science, University of Oxford), Saurabh Joshi (Department of Computer Science and Engineering, IIT Guwahati), Daniel Kroening (University of Oxford)
|10:50 - 11:15|
Matthieu Perrin (University of Nantes), Achour Mostefaoui (University of Nantes), Claude Jard (University of Nantes)