Contention-Aware Scheduling of Parallel Code for Heterogeneous Systems
Report
A typical consumer desktop computer has a multi-core CPU with at least two and possibly up to eight processing elements over two processors, and a multi-core GPU with up to 512 processing elements. Both the CPU and the GPU are capable of running parallel code, yet it is not obvious when to utilize one processor or the other because of workload considerations and, just as importantly, contention on each device. This paper demonstrates a method for dynamically deciding whether to run a given parallel workload on the CPU or the GPU depending on the state of the system when the code is launched. To achieve this, we tested a selection of parallel OpenCL code on a multi-core CPU and a multi-core GPU, as part of a larger program that runs on the CPU. When the parallel code is launched, the runtime makes a dynamic decision about which processor to run the code on, given system state and historical data. We demonstrate a method for using meta-data available to the runtime and historical data from code profiling to make the dynamic decision. We also outline the runtime information necessary for making effective dynamic decisions, and suggest hardware, operating system, and driver support.
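The launch-time decision described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes, purely for illustration, that the runtime knows a profiled runtime for each kernel on each device and an estimate of the work already queued on each device, and that it picks the device with the lowest estimated completion time. The names `DeviceState`, `pending_seconds`, `choose_device`, and all timings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    """Snapshot of one device at launch time (hypothetical model)."""
    name: str
    pending_seconds: float  # assumed estimate of work already queued on the device


def choose_device(kernel, history, devices):
    """Return the name of the device with the lowest estimated completion time.

    history maps (kernel, device name) to a profiled runtime in seconds.
    The estimate is simply queued work plus profiled runtime, which is an
    assumption of this sketch and not a formula taken from the report.
    """
    def estimate(dev):
        return dev.pending_seconds + history[(kernel, dev.name)]

    return min(devices, key=estimate).name


# Illustrative profiling data: the kernel runs faster on the GPU when it is idle.
history = {("matmul", "cpu"): 4.0, ("matmul", "gpu"): 1.0}

idle = [DeviceState("cpu", 0.0), DeviceState("gpu", 0.0)]
busy_gpu = [DeviceState("cpu", 0.0), DeviceState("gpu", 5.0)]

idle_choice = choose_device("matmul", history, idle)
contended_choice = choose_device("matmul", history, busy_gpu)
```

With these made-up numbers, the kernel goes to the GPU when both devices are idle, and to the CPU once the GPU already has five seconds of queued work, which is the kind of contention-driven switch the abstract refers to.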
All rights reserved (no additional license for public reuse)
Gregg, Chris, Jeff Brantley, and Kim Hazelwood. "Contention-Aware Scheduling of Parallel Code for Heterogeneous Systems." University of Virginia Dept. of Computer Science Tech Report (2010).
University of Virginia, Department of Computer Science