GPU Multisplit (PPoPP 2016 - Main conference)

Who

Saman Ashkiani, Andrew Davidson, Ulrich Meyer, John D. Owens

Track

PPoPP 2016 Main conference


When

Mon 14 Mar 2016, 16:45 - 17:10, at Mallorca+Menorca, in the session "GPUs and Scheduling". Chair(s): Christophe Dubach

Abstract

Multisplit is a broadly useful parallel primitive that permutes its input data into contiguous buckets or bins, where the function that categorizes an element into a bucket is provided by the programmer. Due to the lack of an efficient multisplit on GPUs, programmers often choose to implement multisplit with a sort. However, sort does more work than necessary to implement multisplit, and is thus inefficient. In this work, we provide a parallel model and multiple implementations for the multisplit problem. Our principal focus is multisplit for a small number of buckets. In our implementations, we exploit the computational hierarchy of the GPU to perform most of the work locally, with minimal usage of global operations. We also use warp-synchronous programming models to avoid branch divergence and reduce memory usage, as well as hierarchical reordering of input elements to achieve better coalescing of global memory accesses. On an NVIDIA K40c GPU, for key-only (key-value) multisplit, we demonstrate a 3.0-6.7x (4.4-8.0x) speedup over radix sort, and achieve a peak throughput of 10.0 G keys/s.

Link to Publication

http://dl.acm.org/citation.cfm?id=2851169&CFID=752402131&CFTOKEN=70128886

DOI

https://doi.org/10.1145/2851141.2851169

Saman Ashkiani

University of California, Davis

Andrew Davidson

University of California, Davis

Ulrich Meyer

Goethe-Universität Frankfurt am Main

John D. Owens

University of California, Davis


Session Program

Mon 14 Mar

16:20 - 18:00	GPUs and Scheduling (Main conference) at Mallorca+Menorca. Chair(s): Christophe Dubach (University of Edinburgh)

16:20 25m Talk	Gunrock: A High-Performance Graph Processing Library on the GPU. Yangzihao Wang; Andrew Davidson (University of California, Davis); Yuechao Pan (University of California, Davis); Yuduo Wu (University of California, Davis); Andy Riffel (University of California, Davis); John D. Owens (University of California, Davis)
16:45 25m Talk	GPU Multisplit. Saman Ashkiani (University of California, Davis); Andrew Davidson (University of California, Davis); Ulrich Meyer (Goethe-Universität Frankfurt am Main); John D. Owens (University of California, Davis)
17:10 25m Talk	Keep Calm and React with Foresight: Strategies for Low-Latency and Energy-Efficient Elastic Data Stream Processing. Tiziano De Matteis (University of Pisa); Gabriele Mencagli (University of Pisa)
17:35 25m Talk	Work Stealing for Interactive Services to Meet Target Latency. Jing Li (Washington University in St. Louis); Kunal Agrawal (Washington University in St. Louis); Sameh Elnikety (Microsoft Research); Yuxiong He (Microsoft Research); I-Ting Angelina Lee (Washington University in St. Louis); Chenyang Lu (Washington University in St. Louis); Kathryn S McKinley (Microsoft Research)
