Application-Specific Pipelines for Exploiting Instruction-Level Parallelism
Report
Application-specific processor design is a promising approach for meeting the performance and cost goals of a system. Application-specific processors are especially promising for embedded systems (e.g., automobile control systems, avionics, and cellular phones), where a small increase in performance and decrease in cost can have a large impact on a product's viability. Sutherland, Sproull, and Molnar have proposed a new pipeline organization called the Counterflow Pipeline (CFP). This paper shows that the CFP is an ideal architecture for fast, low-cost design of high-performance processors customized for computation-intensive embedded applications. First, we describe why CFPs are particularly well suited to realizing application-specific processors. Second, we describe how a CFP tailored to an application can be constructed automatically. Third, we present measurements that show CFPs elegantly and simply provide speculative execution, out-of-order execution, and register renaming that is matched to the application. These measurements show that the CFP's speculative and out-of-order execution allow it to tolerate frequent control dependences and high-latency operations such as memory accesses. Finally, we show that asynchronous counterflow pipelines may achieve very high performance by reducing the average execution latency of instructions relative to synchronous implementations. Application speedups of up to 7.8 are achieved using custom counterflow pipelines for several well-known kernel loops.
All rights reserved (no additional license for public reuse)
Childers, Bruce, and Jack Davidson. "Application-Specific Pipelines for Exploiting Instruction-Level Parallelism." University of Virginia Dept. of Computer Science Tech Report (1998).
University of Virginia, Department of Computer Science