105/2004 : An efficient failure detector for sparsely connected networks

RR Number
105/2004
Conference
Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Networks (PDCN 2004), Innsbruck, Austria
Author(s)
Martin Hutle
Abstract
We present an implementation of an eventually perfect failure detector for sparsely connected, partitionable networks in which each process has only a bounded number of neighbors. Processes and links may fail by crashing. Regarding synchrony, our algorithm only needs to know an upper bound on the jitter of the communication between direct neighbors. No a priori knowledge of the number of processes in the system is required. The algorithm uses heartbeats to determine whether a process is in the same partition. Heartbeats are forwarded less frequently the farther they have traveled, so information about nearer processes is more accurate than information about farther ones, and the message size becomes constant. Since this property can be guaranteed independently of the number of processes in the system, our failure detector is very efficient in terms of communication complexity.
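The idea of distance-dependent heartbeat forwarding can be illustrated with a minimal sketch. It is not the algorithm from the paper: the exponential schedule in `forward_period`, the `Monitor` class, and the `slack` parameter (a stand-in for the jitter bound) are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass, field


def forward_period(distance: int) -> int:
    """Assumed schedule: information about a process `distance` hops away
    is forwarded once every 2**(distance - 1) rounds."""
    if distance < 1:
        raise ValueError("distance must be >= 1")
    return 2 ** (distance - 1)


def should_forward(distance: int, round_no: int) -> bool:
    """True if a heartbeat from `distance` hops away is forwarded this round."""
    return round_no % forward_period(distance) == 0


@dataclass
class Monitor:
    """Hypothetical local view of one process."""
    slack: int = 1  # assumed extra rounds tolerated beyond the forwarding period
    last_heard: dict = field(default_factory=dict)  # pid -> (round, distance)

    def on_heartbeat(self, pid: str, distance: int, round_no: int) -> None:
        # Record the most recent round in which news about `pid` arrived.
        self.last_heard[pid] = (round_no, distance)

    def suspects(self, round_no: int) -> set:
        # Farther processes get a longer timeout, because news about them
        # is forwarded less often under the assumed schedule.
        return {
            pid
            for pid, (heard, dist) in self.last_heard.items()
            if round_no - heard > forward_period(dist) + self.slack
        }
```

In this sketch a direct neighbor that falls silent is suspected after two rounds, while a process four hops away is suspected only after nine, which mirrors the abstract's point that information about nearer processes is more accurate than information about farther ones.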
Bibtex
@inproceedings{ hutle:2004-105,
  author =       "Martin Hutle",
  title =        "An efficient failure detector for sparsely connected networks",
  booktitle =    "Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Networks (PDCN 2004)",
  address =      "Innsbruck, Austria",
  year =         "2004",
  month =        "Feb."
}