One way to improve the reliability of parallel computers is to add spare processors and interconnections to the functional structure, so that faulty processors can be replaced while the network structure is preserved. This approach is called Structural Fault Tolerance (SFT). Highly integrated parallel computers are one way to implement a parallel structure. The hardware structure is then composed of many Elementary Blocks (EBs), such as ASICs or MCMs, each containing many processors. We show that earlier SFT methods fail to combine the different features, constraints, and requirements of such structures. This paper then introduces a new reconfiguration approach dedicated to highly integrated parallel computers.
Citation:
Fabien Clermidy, Thierry Collette, Michael Nicolaïdis, "A New Placement Algorithm Dedicated to Parallel Computers: Bases and Application," Sixth Pacific Rim International Symposium on Dependable Computing (PRDC'99), p. 242, 1999.