Store Memory-Level Parallelism Optimizations for Commercial Applications
Barcelona, Spain November 12-November 16
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MICRO.2005.31
38th Annual IEEE/ACM International Sy ...
Yuan Chou, Sun Microsystems
Lawrence Spracklen, Sun Microsystems
Santosh G. Abraham, Sun Microsystems

This paper studies the impact of off-chip store misses on processor performance for modern commercial applications. The performance impact of off-chip store misses is largely determined by the extent to which they overlap with other off-chip cache misses. The epoch MLP model is used to explain and quantify how these overlaps are affected by various store handling optimizations and by the memory consistency model implemented by the processor. The extent of these overlaps is then translated into off-chip CPI. Experimental results show that store handling optimizations are crucial for mitigating the substantial performance impact of stores in commercial applications. While some previously proposed optimizations, such as store prefetching, are highly effective, they cannot fully mitigate the performance impact of off-chip store misses, and they leave a performance gap between the stronger and weaker memory consistency models. New optimizations, namely the Store Miss Accelerator, an optimization of Hardware Scout, and a new application of Speculative Lock Elision, are demonstrated to virtually eliminate the impact of off-chip store misses.
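
To make the link between miss overlap and off-chip CPI concrete, the following is a minimal sketch and not the paper's model. It assumes that misses issued while an earlier miss is still outstanding share one "epoch", and that each epoch costs one full memory latency. The `Miss` record, the grouping rule, the `serialize_stores` flag, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Miss:
    issue_cycle: int  # hypothetical: cycle at which the off-chip miss is issued
    is_store: bool


def count_epochs(misses, latency, serialize_stores=False):
    """Count epochs under a simplified grouping rule (an assumption).

    A miss joins the current epoch if it issues before the miss that
    opened the epoch has returned. With serialize_stores=True, every
    store miss is forced to open a new epoch, a crude stand-in for a
    design in which store misses cannot overlap with earlier misses.
    """
    epochs = 0
    epoch_end = None
    for m in sorted(misses, key=lambda m: m.issue_cycle):
        forced = serialize_stores and m.is_store
        if epoch_end is None or m.issue_cycle >= epoch_end or forced:
            epochs += 1
            epoch_end = m.issue_cycle + latency
    return epochs


def offchip_cpi(misses, latency, instructions, serialize_stores=False):
    """Approximate off-chip CPI as epochs * latency / instructions."""
    return count_epochs(misses, latency, serialize_stores) * latency / instructions


def mlp(misses, latency, serialize_stores=False):
    """Average number of off-chip misses per epoch."""
    n = count_epochs(misses, latency, serialize_stores)
    return len(misses) / n if n else 0.0


if __name__ == "__main__":
    trace = [Miss(0, False), Miss(100, True), Miss(600, True),
             Miss(650, False), Miss(2000, True)]
    for flag in (False, True):
        print(flag, count_epochs(trace, 500, flag),
              offchip_cpi(trace, 500, 10_000, flag))
```

In this toy trace, forcing store misses to start their own epochs raises the epoch count and therefore the estimated off-chip CPI, which is the kind of effect the abstract attributes to store handling and the memory consistency model.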

Citation:
Yuan Chou, Lawrence Spracklen, Santosh G. Abraham, "Store Memory-Level Parallelism Optimizations for Commercial Applications," 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05), pp. 183-196, 2005.