Managing MPICH-G2 Jobs with WebCom-G
University of Lille 1, France, July 04-July 06
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ISPDC.2005.34
The 4th International Symposium on Pa ...
Padraig J. O'Dowd, University College Cork, Ireland
Adarsh Patil, University College Cork, Ireland
John P. Morrison, University College Cork, Ireland
This paper discusses the use of WebCom-G to handle the management and scheduling of MPICH-G2 (MPI) jobs. Users submit their MPI applications to a WebCom-G portal via a web interface. WebCom-G then selects the machines on which to execute the application, based on the machines available to it and the number of machines requested by the user. WebCom-G automatically and dynamically constructs an RSL script with the selected machines and schedules the job for execution on them. Once the MPI application has finished executing, the results are stored on the portal server, where the user can collect them. A main advantage of this system is fault survival: if any of the machines fail during the execution of a job, WebCom-G can handle the failure automatically. Following a machine failure, WebCom-G can create a new RSL script with the failed machines removed, incorporate new machines (if any are available) to replace the failed ones, and re-launch the job without any intervention from the user. Because the probability of failures in a Grid environment is high, fault survival is an important issue.
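The select, construct, and re-launch cycle described in the abstract can be illustrated with a minimal Python sketch. It is not WebCom-G's implementation, which the abstract does not show. The function names (select_machines, build_rsl, replan), the host names, and the executable path are hypothetical, and the RSL attributes are assumptions modeled on the multi-request layout commonly used for MPICH-G2 jobs.

```python
from typing import Iterable, List, Set


def select_machines(available: Iterable[str], requested: int) -> List[str]:
    """Pick up to `requested` machines, keeping the order they were offered in."""
    chosen = list(available)[:requested]
    if not chosen:
        raise RuntimeError("no machines available for the job")
    return chosen


def build_rsl(machines: List[str], executable: str) -> str:
    """Build a multi-request RSL string with one subjob per machine.

    The attribute set below is an assumed, simplified layout; a real
    deployment would likely need further attributes (directory, arguments).
    """
    subjobs = []
    for index, host in enumerate(machines):
        subjobs.append(
            f'( &(resourceManagerContact="{host}")\n'
            f'   (count=1)\n'
            f'   (label="subjob {index}")\n'
            f'   (environment=(GLOBUS_DUROC_SUBJOB_INDEX {index}))\n'
            f'   (executable="{executable}")\n'
            f')'
        )
    return "+\n" + "\n".join(subjobs)


def replan(current: List[str], failed: Set[str],
           spare: Iterable[str], requested: int) -> List[str]:
    """Drop failed machines and top up from spares, if any are available."""
    survivors = [m for m in current if m not in failed]
    replacements = [m for m in spare
                    if m not in failed and m not in survivors]
    return select_machines(survivors + replacements, requested)


if __name__ == "__main__":
    # Hypothetical hosts and executable path, for illustration only.
    machines = select_machines(["node1.example.org", "node2.example.org"], 2)
    print(build_rsl(machines, "/home/user/mpi_app"))

    # After node2 fails, rebuild the script with a replacement machine.
    machines = replan(machines, {"node2.example.org"},
                      ["node3.example.org"], 2)
    print(build_rsl(machines, "/home/user/mpi_app"))
```

If no spare machine is available, replan simply returns the surviving machines, so in this sketch the job would be re-launched on fewer machines than were requested.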
Index Terms:
WebCom-G, Globus, MPICH-G2, MPI, Grid Portals, Scheduling and Fault Survival.
Citation:
Padraig J. O'Dowd, Adarsh Patil, John P. Morrison, "Managing MPICH-G2 Jobs with WebCom-G," ispdc, pp.258-266, The 4th International Symposium on Parallel and Distributed Computing (ISPDC'05), 2005