High Performance and Scalable GPU Graph Traversal

Report
Authors: Merrill, Duane, Department of Computer Science, University of Virginia; Garland, Michael, Department of Computer Science, University of Virginia; Grimshaw, Andrew, Department of Computer Science, University of Virginia
Abstract:

Breadth-first search (BFS) is a core primitive for graph traversal and a basis for many higher-level graph analysis algorithms. It is also representative of a class of parallel computations whose memory accesses and work distribution are both irregular and data-dependent. Recent work has demonstrated the plausibility of GPU sparse graph traversal, but has tended to focus on asymptotically inefficient algorithms that perform poorly on graphs with non-trivial diameter. We present a BFS parallelization focused on fine-grained task management that achieves an asymptotically optimal O(|V|+|E|) work complexity. Our implementation delivers excellent performance on diverse graphs, achieving traversal rates in excess of 3.3 billion and 8.3 billion traversed edges per second using single- and quad-GPU configurations, respectively. This level of performance is several times faster than state-of-the-art implementations on both CPU and GPU platforms.
Note: Abstract extracted from PDF text
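
Illustrative sketch (not taken from the report): the O(|V|+|E|) work bound mentioned in the abstract can be seen in a sequential, level-synchronous BFS that keeps an explicit frontier, so that each vertex is enqueued at most once and each edge is inspected at most once. The function name bfs_levels and the compressed sparse row (CSR) arrays row_offsets and column_indices are assumptions chosen for illustration. This is plain Python and does not represent the authors' GPU implementation or its task-management scheme.

```python
from typing import List


def bfs_levels(row_offsets: List[int], column_indices: List[int], source: int) -> List[int]:
    """Return the hop distance from `source` to every vertex (-1 if unreachable).

    The graph is assumed to be stored in CSR form: the neighbors of vertex u
    are column_indices[row_offsets[u] : row_offsets[u + 1]].
    """
    num_vertices = len(row_offsets) - 1
    distance = [-1] * num_vertices
    distance[source] = 0

    frontier = [source]  # vertices discovered at the current depth
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        for u in frontier:
            # Each vertex appears in a frontier at most once, so each
            # adjacency list is scanned at most once: O(|V| + |E|) total work.
            for e in range(row_offsets[u], row_offsets[u + 1]):
                v = column_indices[e]
                if distance[v] == -1:  # first visit
                    distance[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier
    return distance


# Hypothetical example graph: edges 0->1, 0->2, 1->3, 2->3; vertex 4 is isolated.
example_distances = bfs_levels([0, 2, 3, 4, 4, 4], [1, 2, 3, 3], 0)
```

In this sketch the number of loop iterations depends only on the vertices and edges actually reached, not on the number of BFS levels, which is why a frontier-based formulation stays efficient on graphs with large diameter.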

Rights:
All rights reserved (no additional license for public reuse)
Language:
English
Source Citation:

Merrill, Duane, Michael Garland, and Andrew Grimshaw. "High Performance and Scalable GPU Graph Traversal." University of Virginia Dept. of Computer Science Tech Report (2011).

Publisher:
University of Virginia, Department of Computer Science
Published Date:
2011