Physical Experimentation with Prefetching Helper Threads on Intel's Hyper-Threaded Processors
San Jose, California March 20-March 24
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CGO.2004.1281661
International Symposium on Code Gener ...
Dongkeun Kim, Intel Corporation; University of Maryland at College Park
Steve Shih-wei Liao, Intel Corporation
Perry H. Wang, Intel Corporation
Juan del Cuvillo, Intel Corporation
Xinmin Tian, Intel Corporation
Xiang Zou, Intel Corporation
Hong Wang, Intel Corporation
Donald Yeung, University of Maryland at College Park
Milind Girkar, Intel Corporation
John P. Shen, Intel Corporation
Pre-execution techniques have received much attention as an effective way of prefetching cache blocks to tolerate ever-increasing memory latency. A number of pre-execution techniques based on hardware, compiler, or both have been proposed and studied extensively by researchers, who report promising results on simulators that model a Simultaneous Multithreading (SMT) processor. In this paper, we apply the helper threading idea on a real multithreaded machine, i.e., an Intel Pentium 4 processor with Hyper-Threading Technology, and show that it can indeed provide wall-clock speedup on real silicon. To achieve further performance improvements via helper threads, we investigate three helper threading scenarios that are driven by automated compiler infrastructure, and identify several key challenges and opportunities for novel hardware and software optimizations. Our study shows that program behavior changes dynamically during execution. In addition, certain critical hardware structures in the hyper-threaded processors are either shared or partitioned in the multi-threading mode, and thus the tradeoffs regarding resource contention can be intricate. Therefore, it is essential to invoke helper threads judiciously, adapting to the dynamic program behavior, so that potential performance degradation due to resource contention is alleviated. Moreover, since adapting to the dynamic behavior requires frequent thread synchronization, light-weight thread synchronization mechanisms are important.
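To make the helper threading idea concrete, the following is a minimal sketch in C and not the authors' compiler-generated code. It assumes GCC or Clang (for `__builtin_prefetch`) and POSIX threads. The names `node_t`, `helper_args_t`, `build_list`, `prefetch_helper`, and `sum_list` are hypothetical, and the `use_helper` flag is only a stand-in for whatever policy decides when invoking a helper is worthwhile. A helper thread walks the same linked list as the main thread and does nothing but touch and prefetch nodes, so that on a processor whose logical threads share a cache the main thread may find the data already resident.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node used only for this illustration. */
typedef struct node {
    struct node *next;
    long value;
} node_t;

/* Arguments shared between the main thread and the helper. */
typedef struct {
    node_t *head;      /* list the main thread is about to traverse */
    atomic_bool stop;  /* set by the main thread to retire the helper */
} helper_args_t;

/* Build a list of n nodes holding 1..n in one allocation; release with free(). */
node_t *build_list(size_t n) {
    assert(n > 0);
    node_t *nodes = malloc(n * sizeof *nodes);
    if (nodes == NULL) return NULL;
    for (size_t i = 0; i < n; i++) {
        nodes[i].value = (long)(i + 1);
        nodes[i].next = (i + 1 < n) ? &nodes[i + 1] : NULL;
    }
    return nodes;
}

/* Helper thread: chases the pointers and issues prefetches, computes nothing. */
static void *prefetch_helper(void *p) {
    helper_args_t *a = p;
    for (node_t *n = a->head;
         n != NULL && !atomic_load_explicit(&a->stop, memory_order_relaxed);
         n = n->next) {
        __builtin_prefetch(n->next, 0, 3); /* read access, keep in cache */
    }
    return NULL;
}

/* Main computation: sums the list, optionally with a prefetching helper. */
long sum_list(node_t *head, int use_helper) {
    helper_args_t args;
    args.head = head;
    atomic_init(&args.stop, false);

    pthread_t tid;
    /* Assumption: pthread_create stands in for a much lighter-weight wake-up. */
    bool started = use_helper &&
                   pthread_create(&tid, NULL, prefetch_helper, &args) == 0;

    long sum = 0;
    for (node_t *n = head; n != NULL; n = n->next) sum += n->value;

    if (started) {
        atomic_store(&args.stop, true); /* helper is no longer useful */
        pthread_join(tid, NULL);
    }
    return sum;
}
```

Calling `sum_list(list, 1)` returns the same sum as `sum_list(list, 0)`; only the timing can differ. Creating and joining a thread per call is far heavier than the synchronization the abstract calls for, and on a small or cache-resident list the helper is likely to cost more than it saves, which is the kind of contention tradeoff the abstract describes.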
Citation:
Dongkeun Kim, Steve Shih-wei Liao, Perry H. Wang, Juan del Cuvillo, Xinmin Tian, Xiang Zou, Hong Wang, Donald Yeung, Milind Girkar, John P. Shen, "Physical Experimentation with Prefetching Helper Threads on Intel's Hyper-Threaded Processors," cgo, pp. 27, International Symposium on Code Generation and Optimization (CGO'04), 2004