Wei Li, Intel Corporation, Santa Clara, CA
In this paper, we evaluate the benefits achievable from software data-prefetching techniques for OpenMP* C/C++ and Fortran benchmark programs, using the framework of the Intel production compiler for the Intel® Itanium® 2 processor. Prior studies of software data prefetching have primarily focused on benchmark performance in the context of a few prefetching schemes developed in research compilers. In contrast, our study examines the impact of an extensive set of software data-prefetching schemes on the performance of multi-threaded execution, using the full set of SPEC OMPM2001 applications with a product compiler on a commercial multiprocessor system. The performance results show that compiler-based software data prefetching, as supported in the Intel compiler, yields significant performance gains: 11.88% to 99.85% for 6 of the 11 applications and 3.83% to 6.96% for 4 of the 11, with only one application gaining less than 1%, on an Intel® Itanium® 2 processor-based SGI Altix* 32-way shared-memory multiprocessor system.
Index Terms:
Thread-level parallelism, prefetching, OpenMP, compiler optimization, performance evaluation
Citation:
Xinmin Tian, Rakesh Krishnaiyer, Hideki Saito, Milind Girkar, Wei Li, "Impact of Compiler-based Data-Prefetching Techniques on SPEC OMP Application Performance," 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers, vol. 1, p. 53a, 2005