Recent work has shown that multithreaded workloads running in execution-driven, full-system simulation environments cannot use instructions per cycle (IPC) as a valid performance metric due to non-deterministic program behavior. Unfortunately, invalidating IPC as a performance metric introduces its own host of difficulties: special workload setup, consideration of cold-start and end-effects, statistical methodologies leading to increased simulation bandwidth, and workload-specific, higher-level metrics to measure performance. This paper explores the non-determinism problem in multithreaded programs, describes a method to eliminate non-determinism across simulations of different experimental machine models, and demonstrates the suitability of this methodology for performing architectural performance analysis, thus redeeming IPC as a performance metric for multithreaded programs.
Citation:
Kevin M. Lepak, Harold W. Cain, Mikko H. Lipasti, "Redeeming IPC as a Performance Metric for Multithreaded Programs," pact, pp.232, 12th International Conference on Parallel Architectures and Compilation Techniques (PACT'03), 2003