Checkpointing and Recovery of Shared Memory Parallel Applications in a Cluster
Tokyo, Japan May 12-May 15
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CCGRID.2003.1199403
Third IEEE International Symposium on ...
Ramamurthy Badrinath, IRISA/INRIA
Christine Morin, IRISA/INRIA
This paper describes issues in the design and implementation of checkpointing and recovery modules for the Kerrighed DSM cluster system. Our design targets a DSM that supports the sequential consistency model. The mechanisms are general enough to be used in a number of different checkpointing and recovery protocols. They are designed to support common performance optimizations suggested in the literature while remaining lightweight during fault-free execution. We also present preliminary performance results for the current implementation.
Citation:
Ramamurthy Badrinath, Christine Morin, Geoffroy Vallée, "Checkpointing and Recovery of Shared Memory Parallel Applications in a Cluster," ccgrid, pp.471, Third IEEE International Symposium on Cluster Computing and the Grid (CCGrid'03), 2003