PPoPP 2016
Sat 12 - Wed 16 March 2016 Barcelona, Spain



PPoPP 2016 is the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming.

PPoPP is the forum for leading work on all aspects of parallel programming, including foundational and theoretical aspects, techniques, languages, compilers, runtime systems, tools, and practical experiences. In the context of the symposium, “parallel programming” encompasses work on concurrent and parallel systems (multicore, multithreaded, heterogeneous, clustered systems, distributed systems, grids, clouds, and large-scale machines). Given the rise of parallel architectures in the consumer market (desktops, laptops, and mobile devices), PPoPP is particularly interested in work that addresses new parallel workloads, techniques, and tools that attempt to improve the productivity of parallel programming, and work towards improved synergy with such emerging architectures.

PPoPP 2016 will be held in Barcelona, Spain, on March 12-16, 2016. It will be co-located with HPCA 2016, CGO 2016, CC 2016, and LLVM 2016. RoMoL 2016 will also take place in Barcelona, on March 17-18.


The PPoPP community is mourning the early departure of one of its most distinguished members, Prof. Nacho Navarro from UPC Barcelona, who passed away suddenly on February 28th at the age of 58. He will be greatly missed and remembered as a very warm and joyful friend.

The conference papers are freely accessible in the ACM DL for one month from the conference start date.

SESSION 1: Monday 10:00 - 11:15

Coarse grain parallelization of deep neural networks
  • Marc Gonzalez Tallada
High performance model based image reconstruction
  • Xiao Wang, Amit Sabne, Sherman Kisner, Anand Raghunathan, Charles Bouman, Samuel Midkiff
Exploiting accelerators for efficient high dimensional similarity search
  • Sandeep R. Agrawal, Christopher M. Dee, Alvin R. Lebeck

SESSION 2: Monday 11:35 - 12:50

Language implementation and domain specific languages.
[Accepted Artifacts] Declarative coordination of graph-based parallel programs
  • Flavio Cruz, Ricardo Rocha, Seth Copen Goldstein
Distributed Halide
  • Tyler Denniston, Shoaib Kamil, Saman Amarasinghe
Parallel type-checking with Haskell using saturating LVars and stream generators
  • Ryan R. Newton, Ömer S. Ağacan, Peter Fogg, Sam Tobin-Hochstadt

SESSION 3: Monday 14:20 - 16:00

Articulation points guided redundancy elimination for betweenness centrality
  • Lei Wang, Fan Yang, Liangji Zhuang, Huimin Cui, Fang Lv, Xiaobing Feng
[Accepted Artifacts] Multi-core on-the-fly SCC decomposition
  • Vincent Bloemen, Alfons Laarman, Jaco van de Pol
A high-performance parallel algorithm for nonnegative matrix factorization
  • Ramakrishnan Kannan, Grey Ballard, Haesun Park
[Accepted Artifacts] AUTOGEN: automatic discovery of cache-oblivious parallel recursive algorithms for solving dynamic programs
  • Rezaul Chowdhury, Pramod Ganapathi, Jesmin Jahan Tithi, Charles Bachmeier, Bradley C. Kuszmaul, Charles E. Leiserson, Armando Solar-Lezama, Yuan Tang

SESSION 4: Monday 16:20 - 18:00

GPUs and scheduling.
[Accepted Artifacts] [Distinguished paper] Gunrock: a high-performance graph processing library on the GPU
  • Yangzihao Wang, Andrew Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, John D. Owens
GPU multisplit
  • Saman Ashkiani, Andrew Davidson, Ulrich Meyer, John D. Owens
[Accepted Artifacts] Keep calm and react with foresight: strategies for low-latency and energy-efficient elastic data stream processing
  • Tiziano De Matteis, Gabriele Mencagli
Work stealing for interactive services to meet target latency
  • Jing Li, Kunal Agrawal, Sameh Elnikety, Yuxiong He, I-Ting Angelina Lee, Chenyang Lu, Kathryn S. McKinley

SESSION 5: Tuesday 10:00 - 11:15

Shared-memory data structures.
Adding approximate counters
  • Guy L. Steele, Jr., Jean-Baptiste Tristan
[Accepted Artifacts] A wait-free queue as fast as fetch-and-add
  • Chaoran Yang, John Mellor-Crummey
Lease/release: architectural support for scaling contended data structures
  • Syed Kamran Haider, William Hasenplaugh, Dan Alistarh

SESSION 6: Tuesday 11:35 - 12:50

Optimistic concurrency.
[Accepted Artifacts] Optimistic concurrency with OPTIK
  • Rachid Guerraoui, Vasileios Trigonakis
Refined transactional lock elision
  • Dave Dice, Alex Kogan, Yossi Lev
[Accepted Artifacts] Drinking from both glasses: combining pessimistic and optimistic tracking of cross-thread dependences
  • Man Cao, Minjia Zhang, Aritra Sengupta, Michael D. Bond

SESSION 7: Tuesday 14:20 - 15:35

[Accepted Artifacts] Be my guest: MCS lock now welcomes guests
  • Tianzheng Wang, Milind Chabbi, Hideaki Kimura
Contention-conscious, locality-preserving locks
  • Milind Chabbi, John Mellor-Crummey
[Accepted Artifacts] [Distinguished paper] DomLock: a new multi-granularity locking technique for hierarchies
  • Saurabh Kalikar, Rupesh Nasre

SESSION 8: Wednesday 10:00 - 11:15

Consistency models.
Benchmarking weak memory models
  • Carl G. Ritson, Scott Owens
[Accepted Artifacts] The virtues of conflict: analysing modern concurrency
  • Ganesh Narayanaswamy, Saurabh Joshi, Daniel Kroening
Causal consistency: beyond memory
  • Matthieu Perrin, Achour Mostefaoui, Claude Jard

SESSION 9: Wednesday 11:35 - 12:50

Performance analysis and debugging.
ESTIMA: extrapolating scalability of in-memory applications
  • Georgios Chatzopoulos, Aleksandar Dragojević, Rachid Guerraoui
Grain graphs: OpenMP performance analysis made easy
  • Ananya Muddukrishna, Peter A. Jonsson, Artur Podobas, Mats Brorsson
[Accepted Artifacts] Production-guided concurrency debugging
  • Nuno Machado, Brandon Lucia, Luís Rodrigues

POSTER SESSION: Poster abstracts

Affinity-aware work-stealing for integrated CPU-GPU processors
  • Naila Farooqui, Rajkishore Barik, Brian T. Lewis, Tatiana Shpeisman, Karsten Schwan
An interval constrained memory allocator for the Givy GAS runtime
  • François Gindraud, Fabrice Rastello, Albert Cohen, François Broquedis
A programming system for future proofing performance critical libraries
  • Li-Wen Chang, Izzat El Hajj, Hee-Seok Kim, Juan Gómez-Luna, Abdul Dakkak, Wen-mei Hwu
A scalable lock-free hash table with open addressing
  • Jesper Puge Nielsen, Sven Karlsson
Concurrent hash tables: fast and general?(!)
  • Tobias Maier, Peter Sanders, Roman Dementiev
CUDA acceleration for Xen virtual machines in InfiniBand clusters with rCUDA
  • Javier Prades, Carlos Reaño, Federico Silla
Effect of portable fine-grained locality on energy efficiency and performance in concurrent search trees
  • Ibrahim Umar, Otto J. Anshus, Phuong H. Ha
Efficient distributed workstealing via matchmaking
  • Hrushit Parikh, Vinit Deodhar, Ada Gavrilovska, Santosh Pande
Data-centric combinatorial optimization of parallel code
  • Hao Luo, Guoyang Chen, Pengcheng Li, Chen Ding, Xipeng Shen
DSMR: a shared and distributed memory algorithm for single-source shortest path problem
  • Saeed Maleki, Donald Nguyen, Andrew Lenharth, María Garzarán, David Padua, Keshav Pingali
Generic messages: capability-based shared memory parallelism for event-loop systems
  • Luca Salucci, Daniele Bonetta, Stefan Marr, Walter Binder
Hybrid CPU-GPU scheduling and execution of tree traversals
  • Jianqiao Liu, Nikhil Hegde, Milind Kulkarni
Improving efficacy of internal binary search trees using local recovery
  • Arunmoezhi Ramachandran, Neeraj Mittal
Merge-based sparse matrix-vector multiplication (SpMV) using the CSR storage format
  • Duane Merrill, Michael Garland
NUMA-aware scheduling and memory allocation for data-flow task-parallel applications
  • Andi Drebes, Antoniu Pop, Karine Heydemann, Nathalie Drach, Albert Cohen
On designing NUMA-aware concurrency control for scalable transactional memory
  • Mohamed Mohamedin, Roberto Palmieri, Sebastiano Peluso, Binoy Ravindran
On ordering transaction commit
  • Mohamed M. Saad, Roberto Palmieri, Binoy Ravindran
OPR: deterministic group replay for one-sided communication
  • Xuehai Qian, Koushik Sen, Paul Hargrove, Costin Iancu
Preemption-aware planning on big-data systems
  • Marco Rabozzi, Matteo Mazzucchelli, Roberto Cordone, Giovanni Matteo Fumarola, Marco D. Santambrogio
Samsara parallel: a non-BSP parallel-in-time model
  • Yifeng Chen, Kun Huang, Bei Wang, Guohui Li, Xiang Cui
Scalable adaptive NUMA-aware lock: combining local locking and remote locking for efficient concurrency
  • Mingzhe Zhang, Francis C. M. Lau, Cho-Li Wang, Luwei Cheng, Haibo Chen
SPIRIT: a runtime system for distributed irregular tree applications
  • Nikhil Hegde, Jianqiao Liu, Milind Kulkarni
Tidex: a mutual exclusion lock
  • Pedro Ramalhete, Andreia Correia
Unifying fixed code and fixed data mapping of load-imbalanced pipelined loops
  • Aristeidis Mastoras, Thomas R. Gross
User-assisted storage reuse determination for dynamic task graphs
  • Mehmet Can Kurt, Bin Ren, Sriram Krishnamoorthy, Gagan Agrawal
Verification of MPI Java programs using software model checking
  • Waqas Ur Rehman, Muhammad Sohaib Ayub, Junaid Haroon Siddiqui