Kee-Wook Rim, Dept. of Comput. Sci., Inchon Univ., South Korea
Abstract: Hardware approaches to fault tolerance in the design of a scalable multiprocessor system are discussed. Each node is designed to support multi-level fault tolerance, enabling a user to choose the level of fault tolerance at a possible cost in resources or performance. Various tree-type interconnection networks using switches are compared in terms of reliability, latency, and implementation complexity. A practical duplicate interconnection network with increased reliability is proposed, taking into account implementation issues under physical constraints.
Index Terms:
multiprocessing systems; fault tolerant computing; multiprocessor interconnection networks; computational complexity; multiprocessor system; extended fault tolerance; scalable multiprocessor system; performance penalty; tree-type interconnection networks; reliability; latency; implementation complexity
Citation:
Byoung-Joon Min, Sang-Seok Shin, Kee-Wook Rim, "Design and analysis of a multiprocessor system with extended fault tolerance," 5th IEEE Workshop on Future Trends of Distributed Computing Systems (FTDCS), p. 301, 1995