Evaluating the Energy Efficiency of Trace Caches

Authors: Co, Michelle, Department of Computer Science, University of Virginia; Skadron, Kevin, Department of Computer Science, University of Virginia

Sequential trace caches are highly energy- and power-efficient. Fetch engines that include a sequential trace cache provide higher performance for approximately equal area at a significant energy and power savings. The results of our preliminary experiments show that sequential trace caches are a power-efficient design. Previous work has evaluated the trace cache design space with respect to performance. In addition, some previous work has evaluated power-efficiency techniques for trace caches. This work evaluates the trace cache design space considering not only performance but also energy and power. In addition, we compare fetch engine designs that include trace caches with fetch engine designs that have instruction caches only. We perform a set of fetch engine area and associativity experiments as well as a next trace predictor design space exploration. We find that when examining performance and average fetch power, fetch engines with trace caches may not seem appealing, but when examining energy-delay and energy-delay-squared, the benefits of a trace cache become clear. Even if average fetch power increases because of the larger fetch engine area, energy efficiency still improves with a trace cache because of faster execution and more opportunities for clock gating, making the trace cache superior in terms of energy-delay and energy-delay-squared products. Results of current experiments show that sequential trace cache designs compare very favorably to instruction-cache-only designs with respect to power and energy consumption. Our preliminary results show that, overall, sequential trace caches clearly outperform instruction-cache-only designs with better energy efficiency. Among the best design points of the fetch engines examined, a 343KB, 4-way set-associative trace cache fetch engine outperforms a 292KB instruction-cache-only fetch engine by 5% for integer benchmarks and 1% for floating-point benchmarks.
In addition, it does so using 68.3% less average fetch power, 70.3% less energy, 67.7% less energy-delay, and 69.1% less energy-delay-squared than a 292KB,
Note: Abstract extracted from PDF text
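
The abstract's argument turns on the difference between average power and the energy, energy-delay, and energy-delay-squared metrics. The sketch below is an illustration only, using the standard definitions (energy = average power x delay, energy-delay = energy x delay, energy-delay-squared = energy x delay^2). The function names and the two design points are hypothetical and are not taken from the report; they only show how a design that draws somewhat more average power can still come out ahead on the other three metrics if it finishes sooner.

```python
def energy(avg_power_watts: float, delay_seconds: float) -> float:
    """Energy in joules: average power multiplied by execution time."""
    return avg_power_watts * delay_seconds


def energy_delay(avg_power_watts: float, delay_seconds: float) -> float:
    """Energy-delay product: energy multiplied by execution time."""
    return energy(avg_power_watts, delay_seconds) * delay_seconds


def energy_delay_squared(avg_power_watts: float, delay_seconds: float) -> float:
    """Energy-delay-squared product: energy multiplied by execution time squared."""
    return energy(avg_power_watts, delay_seconds) * delay_seconds ** 2


def relative_change(new: float, old: float) -> float:
    """Fractional change of `new` relative to `old` (negative means a reduction)."""
    return (new - old) / old


# Hypothetical design points (NOT measurements from the report):
# the candidate draws 5% more average power but runs 10% faster.
baseline = {"power": 10.0, "delay": 1.0}
candidate = {"power": 10.5, "delay": 0.9}

if __name__ == "__main__":
    for name, metric in [
        ("energy", energy),
        ("energy-delay", energy_delay),
        ("energy-delay-squared", energy_delay_squared),
    ]:
        old = metric(baseline["power"], baseline["delay"])
        new = metric(candidate["power"], candidate["delay"])
        print(f"{name}: {relative_change(new, old):+.1%}")
```

Because delay enters the energy-delay product twice and the energy-delay-squared product three times, these metrics weight execution time progressively more heavily than average power alone does.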

All rights reserved (no additional license for public reuse)
Source Citation:

Co, Michelle, and Kevin Skadron. "Evaluating the Energy Efficiency of Trace Caches." University of Virginia Dept. of Computer Science Tech Report (2003).

University of Virginia, Department of Computer Science
Published Date: