As technology continues to scale, future multicore processors will become more susceptible to a variety of hardware failures. In particular, intermittent faults are expected to become especially problematic [1, 2]. A circuit is susceptible to intermittent faults when manufacturing process variation or in-progress wear-out causes the parameters of devices within the circuit (e.g., resistance or threshold voltage) to vary beyond design expectations [2]. This susceptibility, combined with certain operating conditions such as thermal hot-spots and voltage fluctuations, can result in timing errors, even when the temperatures and voltages involved are well within the specified "acceptable" margins.
Citation:
Philip M. Wells, Koushik Chakraborty, Gurindar S. Sohi, "Adapting to Intermittent Faults in Future Multicore Systems," 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), p. 431, 2007.