Average programmers struggle to solve performance problems in OpenMP programs with tasks and parallel for-loops. Existing performance analysis tools visualize OpenMP task performance from the runtime system's perspective, where task execution is interleaved with other tasks in an unpredictable order. Problems with OpenMP parallel for-loops are similarly difficult to resolve, since tools only visualize aggregate thread-level statistics such as load imbalance without zooming in to per-chunk granularity. This runtime-system- and thread-oriented visualization provides poor support for understanding problems with task and chunk execution time, parallelism, and memory hierarchy utilization, forcing average programmers to rely on experts or to use tedious trial-and-error performance tuning. We present grain graphs, a new OpenMP performance analysis method that visualizes grains – computation performed by a task or a parallel for-loop chunk instance – and highlights problems such as low parallelism, work inflation, and poor parallelization benefit at the grain level. We demonstrate that grain graphs can quickly reveal performance problems in standard OpenMP programs that are difficult to detect and characterize in fine detail using existing visualizations, simplifying OpenMP performance analysis. This enables average programmers to make portable optimizations for poorly performing OpenMP programs, reducing pressure on experts and removing the need for tedious trial-and-error tuning.
Wed 16 Mar (displayed time zone: Belfast)
11:35 - 12:50 | Performance analysis and debugging | Main conference at Mallorca+Menorca | Chair(s): Martin Schulz (Lawrence Livermore National Laboratory)
11:35 | 25m | Talk | ESTIMA: Extrapolating ScalabiliTy of In-Memory Applications | Main conference | Georgios Chatzopoulos (Ecole Polytechnique Federale de Lausanne (EPFL), Switzerland), Aleksandar Dragojević (Microsoft Research), Rachid Guerraoui (EPFL, Switzerland)
12:00 | 25m | Talk | Grain Graphs: OpenMP Performance Analysis Made Easy | Main conference | Ananya Muddukrishna, Peter A. Jonsson (SICS Swedish ICT AB), Artur Podobas (KTH Royal Institute of Technology), Mats Brorsson (KTH Royal Institute of Technology)
12:25 | 25m | Talk | Production-guided Concurrency Debugging | Main conference | Nuno Machado (INESC-ID / Instituto Superior Técnico, Universidade de Lisboa), Brandon Lucia (Carnegie Mellon University), Luís Rodrigues (Universidade de Lisboa, Instituto Superior Técnico, INESC-ID)