Combining Source Routing and Dynamic Fault Tolerance
Ouro Preto, MG, Brazil October 17-October 20
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/SBAC-PAD.2006.12
18th International Symposium on Compu ...
Frank Olaf Sem-Jacobsen, Simula Research Laboratory, Norway
Olav Lysne, Simula Research Laboratory, Norway
Tor Skeie, Simula Research Laboratory, Norway
An increasing number of current and emerging interconnect technologies rely on source routing to forward packets through the network. It is therefore important to develop fault-tolerance methods that are well suited to source-routed networks. Dynamic fault tolerance allows the network to remain available while faults occur, as opposed to static fault tolerance, which requires the network to be halted for reconfiguration. Source routing readily supports the source node choosing a different path when a fault occurs, but with this approach packets already in the network are lost. Local dynamic fault tolerance, where the packet is routed around the fault locally, would prevent much of the traffic from being lost during failures, but it is cumbersome to achieve in source-routed networks because packets encountering a fault must follow a path different from the one encoded in the packet header. In this paper we present a mechanism to achieve local dynamic fault tolerance in source-routed fat trees, a topology in widespread use in supercomputer systems, and compare it with endpoint dynamic fault tolerance. We also show that combining the two approaches achieves performance superior to that of either approach individually.
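The contrast the abstract draws between endpoint and local rerouting can be illustrated with a minimal Python sketch. It is not the mechanism proposed in the paper, which the abstract does not describe. The toy two-spine topology, the names `LINKS`, `shortest_path`, and `deliver`, and the use of breadth-first search to pick a detour are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical toy topology (not from the paper): hosts A and B attach to
# two leaf switches, which are joined by two spine switches.
LINKS = {
    "A": ["leaf1"],
    "leaf1": ["A", "spine1", "spine2"],
    "spine1": ["leaf1", "leaf2"],
    "spine2": ["leaf1", "leaf2"],
    "leaf2": ["spine1", "spine2", "B"],
    "B": ["leaf2"],
}


def shortest_path(src, dst, failed):
    """Breadth-first search that skips failed links; returns a node list or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in LINKS[node]:
            if nxt not in prev and frozenset((node, nxt)) not in failed:
                prev[nxt] = node
                queue.append(nxt)
    return None


def deliver(route, failed, local_reroute):
    """Forward a packet hop by hop along the route carried in its header.

    Returns the list of nodes visited, or None if the packet is dropped.
    With local_reroute=True (an assumed policy), the node that detects the
    fault replaces the rest of the header with a detour to the destination.
    """
    visited = [route[0]]
    remaining = list(route[1:])
    while remaining:
        here, nxt = visited[-1], remaining[0]
        if frozenset((here, nxt)) in failed:
            if not local_reroute:
                return None  # endpoint-only scheme: the in-flight packet is lost
            detour = shortest_path(here, route[-1], failed)
            if detour is None:
                return None  # destination unreachable from this node
            remaining = detour[1:]  # rewrite the remainder of the source route
            continue
        visited.append(remaining.pop(0))
    return visited


route = ["A", "leaf1", "spine1", "leaf2", "B"]   # path encoded in the header
failed = {frozenset(("spine1", "leaf2"))}        # link fails after injection

print(deliver(route, failed, local_reroute=False))  # in-flight packet, endpoint only
print(deliver(route, failed, local_reroute=True))   # in-flight packet, local detour
print(shortest_path("A", "B", failed))              # new route chosen by the source
```

In this sketch the in-flight packet is dropped when only the endpoint reacts, while local rerouting sends it back through `leaf1` and up `spine2`. The source, once it learns of the fault, picks the shorter route through `spine2` for later packets, which is one way to see why the two approaches can complement each other.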
Citation:
Frank Olaf Sem-Jacobsen, Olav Lysne, Tor Skeie, "Combining Source Routing and Dynamic Fault Tolerance," sbac-pad, pp.151-158, 18th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'06), 2006