Modeling and Single-Pass Simulation of CMP Cache Capacity and Accessibility
San Jose, CA, April 25-April 27
Xudong Shi, Dept. of Comput. & Inf. Sci. & Eng., Florida Univ., Gainesville, FL
Feiqi Su, Dept. of Comput. & Inf. Sci. & Eng., Florida Univ., Gainesville, FL
Ye Xia, Dept. of Comput. & Inf. Sci. & Eng., Florida Univ., Gainesville, FL
Zhen Yang, Dept. of Comput. & Inf. Sci. & Eng., Florida Univ., Gainesville, FL
Future chip multiprocessors (CMPs) with a large number of cores face difficult issues in efficiently utilizing on-chip storage space. Tradeoffs between data accessibility and effective on-chip capacity have been studied extensively, but understanding a wide spectrum of the design space requires costly simulations. In this paper, we first develop an abstract model for understanding the performance impact of the degree of data replication. To overcome the lack of real-time interactions among multiple cores in the abstract model, we propose an efficient single-pass stack simulation method to study the performance of a variety of cache organizations on CMPs. The proposed global stack logically incorporates a shared stack and per-core private stacks to collect shared/private reuse (stack) distances for every memory reference in a single simulation pass. With the collected reuse distances, performance in terms of hits/misses and average memory access times can be calculated for multiple cache organizations. The basic stack simulation results can further be used to derive the performance of other CMP cache organizations with various degrees of data replication. We verify both the modeling and the stack results against individual execution-driven simulations that consider realistic cache parameters and delays, using a set of commercial multithreaded workloads. We also measure the simulation time saved by the stack simulation. The results show that stack simulation can accurately model the performance of the studied cache organizations with 2-9% error margins using only about 8% of the simulation time. The results also show that the effectiveness of various techniques for optimizing CMP on-chip storage is closely related to the working sets of the workloads as well as the total cache sizes.
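As a rough illustration of the single-pass idea, the sketch below applies classical LRU stack simulation to a toy trace, keeping one shared stack and one private stack per core and recording both reuse distances for each reference. It is not the paper's global stack: the function names, the (core, block) trace format, and the fully associative LRU assumption are all hypothetical, and coherence effects such as invalidations and replication control are ignored.

```python
from collections import Counter

def stack_distance(stack, block):
    """Return the LRU stack depth of `block` (1 = most recently used),
    or None on a first touch, then move `block` to the top of the stack."""
    try:
        depth = stack.index(block) + 1
        stack.pop(depth - 1)
    except ValueError:
        depth = None  # cold reference: block has never been seen in this stack
    stack.insert(0, block)
    return depth

def single_pass(trace, num_cores):
    """Collect shared and private reuse-distance histograms in one pass.

    `trace` is an iterable of (core_id, block_address) pairs (assumed format).
    """
    shared = []                                # one stack seen by all cores
    private = [[] for _ in range(num_cores)]   # one stack per core
    shared_hist, private_hist = Counter(), Counter()
    for core, block in trace:
        shared_hist[stack_distance(shared, block)] += 1
        private_hist[stack_distance(private[core], block)] += 1
    return shared_hist, private_hist

def hits(hist, capacity_blocks):
    """Hits for a fully associative LRU cache of the given capacity:
    every reference whose reuse distance fits within the capacity."""
    return sum(n for d, n in hist.items()
               if d is not None and d <= capacity_blocks)

if __name__ == "__main__":
    toy_trace = [(0, "A"), (1, "B"), (0, "A"), (1, "A"), (0, "B"), (1, "B")]
    shared_hist, private_hist = single_pass(toy_trace, num_cores=2)
    for size in (1, 2):
        print(size, hits(shared_hist, size), hits(private_hist, size))
```

Because the histograms are built once, hit counts for any number of cache sizes come from `hits` without replaying the trace, which is the property that makes a single simulation pass sufficient.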
Index Terms:
multiple cache organization, single-pass simulation, chip-multiprocessor, on-chip storage space, data accessibility, on-chip cache capacity, abstract model, data replication, single-pass stack simulation, global stack, shared stack, per-core private stack, single simulation pass, reuse distances, average memory access time
Citation:
Xudong Shi, Feiqi Su, Jih-Kwon Peir, Ye Xia, Zhen Yang, "Modeling and Single-Pass Simulation of CMP Cache Capacity and Accessibility," ispass, pp.126-135, 2007 IEEE International Symposium on Performance Analysis of Systems & Software, 2007