A checkpoint pattern is an abstraction of the computation performed by a distributed application. A progressive view of this abstraction is formed by a sequence of consistent global checkpoints that may have occurred in this order during the execution of the application. Considering pairs of checkpoints, we have determined that a checkpoint must be observed before another in a progressive view if the former Z-precedes the latter. Based on Z-precedence and on characteristics of the checkpoint pattern, we propose original algorithms for the progressive construction of consistent global checkpoints. We demonstrate that Z-precedence between a pair of checkpoints is a much simpler way to express the existence of a zigzag path connecting them, and we discuss other advantages of our relation.
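The notion of a zigzag path mentioned above can be illustrated with a small sketch. It is not one of the algorithms proposed in the paper, and it does not reproduce the paper's definition of Z-precedence, which the abstract does not state. It assumes the commonly cited zigzag-path definition of Netzer and Xu: a path from checkpoint i of process p to checkpoint j of process q is a chain of messages in which the first is sent by p after checkpoint i, each next message is sent by the receiver of the previous one in the same or a later checkpoint interval (possibly before that receipt), and the last is received by q before checkpoint j. The `Message` type, the function name `has_zigzag_path`, and the convention that interval k lies between checkpoints k and k+1 of a process are all assumptions made for this illustration.

```python
from collections import deque
from typing import NamedTuple


class Message(NamedTuple):
    # Hypothetical record: interval k of a process is assumed to lie
    # between its checkpoints k and k+1.
    sender: int
    send_interval: int
    receiver: int
    recv_interval: int


def has_zigzag_path(messages, src, dst):
    """Return True if a zigzag path (assumed Netzer-Xu definition) links
    checkpoint src = (p, i) to checkpoint dst = (q, j)."""
    p, i = src
    q, j = dst
    # First message: sent by p after checkpoint i.
    queue = deque(m for m in messages
                  if m.sender == p and m.send_interval >= i)
    seen = set(queue)
    while queue:
        m = queue.popleft()
        # Last message: received by q before checkpoint j.
        if m.receiver == q and m.recv_interval < j:
            return True
        # Next message: sent by m's receiver in the same or a later
        # interval, possibly before m is actually received.
        for nxt in messages:
            if (nxt not in seen
                    and nxt.sender == m.receiver
                    and nxt.send_interval >= m.recv_interval):
                seen.add(nxt)
                queue.append(nxt)
    return False


# Illustrative pattern with three processes (0, 1, 2): process 1 sends m2
# before it receives m1, but within the same checkpoint interval.
m1 = Message(sender=0, send_interval=1, receiver=1, recv_interval=1)
m2 = Message(sender=1, send_interval=1, receiver=2, recv_interval=0)
messages = [m1, m2]
```

In this made-up pattern, `has_zigzag_path(messages, (0, 1), (2, 1))` holds even though no causal chain of messages connects the two checkpoints, whereas the reverse query `has_zigzag_path(messages, (2, 1), (0, 1))` does not, since process 2 sends nothing.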
Index Terms:
distributed checkpointing, consistent global states, causality, zigzag paths, monitoring systems.
Citation:
Islene Calciolari Garcia, Luiz Eduardo Buzato, "Progressive Construction of Consistent Global Checkpoints," 19th IEEE International Conference on Distributed Computing Systems (ICDCS'99), p. 55, 1999.