Dynamic Fault Tolerance with Misrouting in Fat Trees
Columbus, Ohio August 14-August 18
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICPP.2006.362006 International Conference on Para ...
Frank Olaf Sem-Jacobsen, University of Oslo, Norway
Tor Skeie, Simula Research Laboratory, Norway
Olav Lysne, Simula Research Laboratory, Norway
Jose Duato, Universidad Politecnica de Valencia, Spain
Fault tolerance is critical for efficient utilisation of large computer systems. Dynamic fault tolerance allows the network to remain available through the occurrence of faults, as opposed to static fault tolerance, which requires the network to be halted while it is reconfigured. Although dynamic fault tolerance may lead to less efficient solutions than static fault tolerance, it allows for much higher availability of the system. In this paper we devise a dynamic fault-tolerant adaptive routing algorithm for the fat tree, a widely used interconnect topology; the algorithm relies on misrouting around link faults. We show that we are guaranteed to tolerate any combination of fewer than (number of switch ports)/2 link faults without the need for additional network resources for deadlock freedom. There is also a high probability of tolerating an even larger number of link faults. Simulation results show that network performance degrades very little when faults are dynamically tolerated.
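As a small worked illustration of the bound stated in the abstract, the sketch below computes the largest number of link faults that is strictly less than half the number of switch ports. The function name and the example port counts are assumptions made here for illustration; they do not come from the paper, and the sketch only evaluates the stated arithmetic, not the routing algorithm itself.

```python
def guaranteed_tolerable_link_faults(num_switch_ports: int) -> int:
    """Largest integer f with f < num_switch_ports / 2.

    Hypothetical helper: it only evaluates the bound quoted in the
    abstract and says nothing about how the routing itself works.
    """
    if num_switch_ports < 1:
        raise ValueError("a switch needs at least one port")
    # (p - 1) // 2 is the largest integer strictly below p / 2,
    # for both even and odd p.
    return (num_switch_ports - 1) // 2


if __name__ == "__main__":
    # Example port counts chosen for illustration only.
    for ports in (4, 8, 16, 24):
        faults = guaranteed_tolerable_link_faults(ports)
        print(f"{ports}-port switches: up to {faults} link faults guaranteed")
```

Under this reading, a fat tree built from 8-port switches would be guaranteed to tolerate any combination of up to 3 link faults; the abstract notes that larger numbers are also tolerated with high probability.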
Citation:
Frank Olaf Sem-Jacobsen, Tor Skeie, Olav Lysne, Jose Duato, "Dynamic Fault Tolerance with Misrouting in Fat Trees," 2006 International Conference on Parallel Processing (ICPP'06), pp. 33-44, 2006