ParaWeaver: Performance Evaluation on Programming Models for Fine Grained Threads
Report
There is a trend towards multicore and manycore processors in computer architecture design. In addition, several parallel programming models have been introduced. Some extract concurrent threads implicitly whenever possible, resulting in fine grained threads. Others construct threads from explicit user specifications in the program, resulting in coarse grained threads. How these two mechanisms impact performance remains an open question. Implicitly constructed fine grained threads incur more overhead due to additional thread scheduling, thread communication, and thread context switches. However, they also increase flexibility in scheduling. As a result, computation resources can be utilized more fully and workloads can be better balanced among cores. Moreover, if scheduled properly, concurrent fine grained threads may exhibit more data affinity than coarse grained threads. In most parallel architectures, the last-level cache is shared among all the cores, so it is exposed to contention and pollution from concurrent threads. Data sharing therefore becomes important: a greater degree of data sharing among threads results in fewer last-level cache misses, which are one of the main sources of latency for a multithreaded process. The data sharing behavior among threads depends on how the application is parallelized and how the threads are scheduled. The complex nature of many applications leads to nested structures in the call graph, so concurrency can be found at every level, from coarse grained to fine grained. In this project, we compare the data sharing behavior of coarse grained threads and fine grained threads, and evaluate their performance on a CMP cache simulator.
All rights reserved (no additional license for public reuse)
Meng, Jiayuan, Dee Weikle, and Kim Hazelwood. "ParaWeaver: Performance Evaluation on Programming Models for Fine Grained Threads." University of Virginia Dept. of Computer Science Tech Report (2007).
University of Virginia, Department of Computer Science