Improving Instruction-level Parallelism by Loop Unrolling and Dynamic Memory Disambiguation
Report
Exploitation of instruction-level parallelism is an effective mechanism for improving the performance of modern super-scalar/VLIW processors. Various software techniques can be applied to increase instruction-level parallelism. This paper describes and evaluates a software technique, dynamic memory disambiguation, that permits loops containing loads and stores to be scheduled more aggressively, thereby exposing more instruction-level parallelism. The results of our evaluation show that when dynamic memory disambiguation is applied in conjunction with loop unrolling, register renaming, and static memory disambiguation, the ILP of memory-intensive benchmarks can be increased by as much as 300 percent over loops where dynamic memory disambiguation is not performed. Our measurements also indicate that for the programs that benefit the most from these optimizations, the register usage does not exceed the number of registers on most high-performance processors.
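The idea behind dynamic memory disambiguation can be illustrated with a small hand-written sketch in C. This is not the authors' compiler implementation, only an assumed illustration of the general technique: when the compiler cannot prove at compile time that a loop's loads and stores touch different memory, a run-time address check selects between an aggressively scheduled, unrolled loop and the original conservative loop. The names `ranges_disjoint` and `add_offset` and the unroll factor of 4 are hypothetical choices made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when the n-element regions starting at
 * dst and src do not overlap. Comparing addresses through uintptr_t is
 * implementation-defined in ISO C but behaves as expected on common
 * flat-address-space targets (an assumption of this sketch). */
static int ranges_disjoint(const int *dst, const int *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    uintptr_t bytes = (uintptr_t)(n * sizeof(int));
    return d + bytes <= s || s + bytes <= d;
}

/* Computes dst[i] = src[i] + k for i = 0 .. n-1, in ascending order. */
void add_offset(int *dst, const int *src, size_t n, int k)
{
    size_t i = 0;

    if (ranges_disjoint(dst, src, n)) {
        /* Run-time check passed: no store can feed a later load, so the
         * loop is unrolled by 4 and all four loads are issued before any
         * store, giving the scheduler independent work to overlap. */
        for (; i + 4 <= n; i += 4) {
            int t0 = src[i];
            int t1 = src[i + 1];
            int t2 = src[i + 2];
            int t3 = src[i + 3];
            dst[i]     = t0 + k;
            dst[i + 1] = t1 + k;
            dst[i + 2] = t2 + k;
            dst[i + 3] = t3 + k;
        }
    }

    /* Conservative loop: handles the leftover iterations of the unrolled
     * version, or every iteration when the regions may overlap. Each
     * load stays after the previous store, preserving the original
     * sequential semantics. */
    for (; i < n; i++) {
        dst[i] = src[i] + k;
    }
}
```

In this sketch the check costs a few comparisons per loop entry, while the reordering it enables applies to every unrolled iteration. Holding four loaded values live at once also shows why unrolling raises register usage, the concern the abstract addresses in its final sentence.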
All rights reserved (no additional license for public reuse)
Davidson, Jack, and Sanjay Jinturkar. "Improving Instruction-level Parallelism by Loop Unrolling and Dynamic Memory Disambiguation." University of Virginia Dept. of Computer Science Tech Report (1995).
University of Virginia, Department of Computer Science