PPoPP 2016
Sat 12 - Wed 16 March 2016 Barcelona, Spain
Mon 14 Mar 2016 17:35 - 18:00 at Mallorca+Menorca - GPUs and Scheduling Chair(s): Christophe Dubach

Interactive web services increasingly drive critical business workloads such as search, advertising, games, shopping, and finance. Whereas the optimization of parallel programs and distributed server systems has historically focused on average latency and throughput, the primary metric for interactive applications is instead consistent responsiveness, i.e., minimizing the number of requests that miss a target latency. This paper is the first to show how to generalize work stealing, which is traditionally used to minimize the makespan of a single parallel job, to optimize for a target latency in interactive services with multiple parallel requests.
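To make the metric concrete, here is a minimal sketch contrasting average latency with the number of requests that miss a target latency. The two latency traces, the 100 ms target, and the helper `count_misses` are all made up for illustration and do not come from the paper.

```python
from statistics import mean

def count_misses(latencies_ms, target_ms):
    """Return how many requests took longer than the target latency."""
    return sum(1 for t in latencies_ms if t > target_ms)

# Two made-up latency traces (milliseconds) with the same mean.
steady = [90, 90, 90, 90, 90]
bursty = [40, 40, 40, 40, 290]

TARGET_MS = 100  # assumed target, chosen only for illustration

for name, trace in (("steady", steady), ("bursty", bursty)):
    print(name, "mean:", mean(trace), "misses:", count_misses(trace, TARGET_MS))
```

Both traces have the same mean latency, yet only the second contains a request that misses the target, which is why an average-based objective can hide the behavior that matters for interactive services.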

We design a new adaptive work-stealing policy, called tail-control, that reduces the number of requests that miss a target latency. It uses instantaneous request progress, system load, and a target latency to choose when to parallelize requests with stealing, when to admit new requests, and when to limit parallelism of large requests. We implement this approach in the Intel Threading Building Blocks (TBB) library and evaluate it on real-world and synthetic workloads. The tail-control policy substantially reduces the number of requests exceeding the desired target latency and delivers up to 58% relative improvement over various baseline policies. This generalization of work stealing for multiple requests effectively optimizes the number of requests that complete within a target latency, a key metric for interactive services.
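The kind of decision described above can be pictured with a toy rule. This is only a sketch under stated assumptions, not the tail-control algorithm from the paper and not part of the TBB API: the `Request` fields, the `choose_action` function, and the `large_threshold_ms` parameter are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass

@dataclass
class Request:
    elapsed_ms: float    # time since the request arrived (hypothetical field)
    work_done_ms: float  # processing time consumed so far (hypothetical field)

def choose_action(req, active_requests, num_workers, target_ms, large_threshold_ms):
    """Toy decision for an idle worker looking at one running request.

    Returns "steal" (add parallelism to this request) or "admit"
    (leave it alone and start a waiting request instead).
    All thresholds are assumptions, not values from the paper.
    """
    lightly_loaded = active_requests < num_workers
    looks_large = req.work_done_ms > large_threshold_ms
    if lightly_loaded:
        return "steal"  # spare workers exist, so parallelize freely
    if looks_large or req.elapsed_ms > target_ms:
        return "admit"  # under load, limit parallelism of large or already-late requests
    return "steal"
```

For example, with 8 workers and 8 active requests, `choose_action(Request(10, 80), 8, 8, 100, 50)` returns `"admit"`, while the same call with only 2 active requests returns `"steal"`. The point of the sketch is that the same request can be treated differently depending on load and progress.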

Mon 14 Mar

16:20 - 18:00
GPUs and Scheduling
Main conference at Mallorca+Menorca
Chair(s): Christophe Dubach University of Edinburgh
16:20
25m
Talk
Gunrock: A High-Performance Graph Processing Library on the GPU (Distinguished Paper Award, Artifact Evaluation)
Main conference
Yangzihao Wang, Andrew Davidson University of California, Davis, Yuechao Pan University of California, Davis, Yuduo Wu University of California, Davis, Andy Riffel University of California, Davis, John D. Owens University of California, Davis
16:45
25m
Talk
GPU Multisplit
Main conference
Saman Ashkiani University of California, Davis, Andrew Davidson University of California, Davis, Ulrich Meyer Goethe-Universität Frankfurt am Main, John D. Owens University of California, Davis
17:10
25m
Talk
Keep Calm and React with Foresight: Strategies for Low-Latency and Energy-Efficient Elastic Data Stream Processing (Artifact Evaluation)
Main conference
Tiziano De Matteis University of Pisa, Gabriele Mencagli University of Pisa
17:35
25m
Talk
Work Stealing for Interactive Services to Meet Target Latency
Main conference
Jing Li Washington University in St. Louis, Kunal Agrawal Washington University in St. Louis, Sameh Elnikety Microsoft Research, Yuxiong He Microsoft Research, I-Ting Angelina Lee Washington University in St. Louis, Chenyang Lu Washington University in St. Louis, Kathryn S McKinley Microsoft Research