ICFP 2017
Sun 3 - Sat 9 September 2017 Oxford, United Kingdom
Thu 7 Sep 2017 14:30 - 15:00 at L4 - Parallel Programming Chair(s): Geoffrey Mainland

We present and evaluate an implementation technique for regular segmented reductions on GPUs. Existing techniques tend to be either consistent in performance but relatively inefficient in absolute terms, or optimised for specific workloads and therefore prone to poor performance on certain inputs. We propose three different strategies for segmented reduction of regular arrays, each optimised for a particular workload. We demonstrate an implementation in the Futhark compiler that is able to employ all three strategies and automatically select the appropriate one at runtime. While our evaluation is in the context of the Futhark compiler, the implementation technique is applicable to any library or language that has a need for segmented reductions.

We evaluate the technique on four microbenchmarks, two of which we also compare to implementations in the CUB library for GPU programming, as well as on two application benchmarks from the Rodinia suite. On the latter, we obtain speedups ranging from 1.3x to 1.7x over a previous implementation based on scans.
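
For readers unfamiliar with the term, the following is a minimal sequential sketch of what a regular segmented reduction computes: a flat array is split into equally sized contiguous segments, and each segment is reduced independently with an associative operator. It is a reference for the semantics only, not the GPU strategies from the talk, and the names `segmented_reduce`, `neutral`, and `segment_size` are assumptions chosen for illustration.

```python
from functools import reduce
from operator import add


def segmented_reduce(op, neutral, flat, segment_size):
    """Reduce each contiguous run of `segment_size` elements of `flat` with `op`.

    "Regular" means every segment has the same length, so the input
    length must be a multiple of `segment_size`.
    """
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    if len(flat) % segment_size != 0:
        raise ValueError("len(flat) must be a multiple of segment_size")
    return [
        # Fold one segment, starting from the operator's neutral element.
        reduce(op, flat[start:start + segment_size], neutral)
        for start in range(0, len(flat), segment_size)
    ]


# Two segments of three elements each, summed independently.
sums = segmented_reduce(add, 0, [1, 2, 3, 4, 5, 6], 3)
```

The workload shape (many short segments versus few long ones) is what makes a single GPU strategy hard to get right, which is why the abstract describes selecting among strategies at runtime.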

Thu 7 Sep

14:00 - 15:00
Parallel Programming (FHPC) at L4
Chair(s): Geoffrey Mainland Drexel University, USA
14:00
30m
Talk
In Search of a Map: using Program Slicing to Discover Potential Parallelism in Recursive Functions
FHPC
A: Adam Barwell, A: Kevin Hammond University of St. Andrews, UK
14:30
30m
Talk
Strategies for Regular Segmented Reductions on GPU
FHPC
A: Rasmus Wriedt Larsen DIKU, University of Copenhagen, A: Troels Henriksen DIKU, University of Copenhagen