Combining FT-MPI with H2O: Fault-Tolerant MPI Across Administrative Boundaries
Denver, Colorado April 04-April 08
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/IPDPS.2005.141
19th IEEE International Parallel and ...
Dawid Kurzyniec, Emory University, Atlanta, GA
Vaidy Sunderam, Emory University, Atlanta, GA
We observe increasing interest in aggregating geographically distributed, heterogeneous resources to perform large-scale computations. MPI remains the most popular programming paradigm for such applications; however, as the size of computing environments increases, fault tolerance becomes critically important. We argue that the fault tolerance model proposed by FT-MPI fits well in geographically distributed environments, even though its current implementation is confined to a single administrative domain. We propose to overcome this limitation by combining FT-MPI with the H2O resource sharing framework. Our approach allows users to run fault-tolerant MPI programs on heterogeneous, geographically distributed shared machines, without sacrificing performance and with minimal involvement of resource providers.
Citation:
Dawid Kurzyniec, Vaidy Sunderam, "Combining FT-MPI with H2O: Fault-Tolerant MPI Across Administrative Boundaries," ipdps, vol. 2, pp.120a, 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 1, 2005