Recovery Schemes for High Availability and High Performance Distributed Real-Time Computing
Nice, France April 22-April 26
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/IPDPS.2003.1213241
International Parallel and Distribute ...
Lars Lundberg, Blekinge Institute of Technology
Daniel Häggander, Blekinge Institute of Technology
Kamilla Klonowska, Blekinge Institute of Technology
Charlie Svahnberg, Blekinge Institute of Technology
Clusters and distributed systems offer fault tolerance and high performance through load sharing, and are thus attractive in real-time applications. When all computers are up and running, we would like the load to be evenly distributed among the computers. When one or more computers fail, the load must be redistributed. The redistribution is determined by the recovery scheme. The recovery scheme should keep the load as evenly distributed as possible even when the most unfavorable combinations of computers break down, i.e., we want to optimize the worst-case behavior. In this paper we define recovery schemes that are optimal for a number of important cases. We also show that the problem of finding optimal recovery schemes corresponds to the mathematical problem of finding sequences of integers with minimal sum and for which all sums of subsequences are unique.
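
The integer-sequence problem mentioned in the abstract can be illustrated with a small brute-force sketch. It is not the authors' method. It assumes that "subsequences" means contiguous runs of the sequence, which the abstract does not specify, and that the integers are positive. The function names `contiguous_sums`, `has_unique_sums`, and `minimal_sum_sequence` are hypothetical and chosen only for this illustration.

```python
def contiguous_sums(seq):
    """Return the sums of all contiguous runs of seq (assumed reading of 'subsequences')."""
    sums = []
    for start in range(len(seq)):
        total = 0
        for value in seq[start:]:
            total += value
            sums.append(total)
    return sums


def has_unique_sums(seq):
    """True if no two contiguous runs of seq have the same sum."""
    sums = contiguous_sums(seq)
    return len(sums) == len(set(sums))


def compositions(total, parts):
    """Yield every tuple of `parts` positive integers that adds up to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def minimal_sum_sequence(n):
    """Brute-force search: try totals in increasing order and return the first
    length-n sequence of positive integers whose contiguous sums are all distinct."""
    total = n  # smallest possible sum of n positive integers
    while True:
        for seq in compositions(total, n):
            if has_unique_sums(seq):
                return seq
        total += 1


if __name__ == "__main__":
    for n in range(1, 5):
        seq = minimal_sum_sequence(n)
        print(n, seq, sum(seq))
```

The search is exponential in the sequence length, so it is only practical for very short sequences. It is meant to make the problem statement concrete, not to reproduce the recovery schemes defined in the paper.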
Citation:
Lars Lundberg, Daniel Häggander, Kamilla Klonowska, Charlie Svahnberg, "Recovery Schemes for High Availability and High Performance Distributed Real-Time Computing," ipdps, pp.122a, International Parallel and Distributed Processing Symposium (IPDPS'03), 2003