Failure Prediction in Computational Grids

Authors: Kang, Woochul, Department of Computer Science, University of Virginia; Grimshaw, Andrew, Department of Computer Science, University of Virginia

Accurate failure prediction in Grids is critical for reasoning about QoS guarantees such as job completion time and availability. Statistical methods can be used, but they rest on assumptions, such as time-homogeneity, that often do not hold. In particular, they model periodic failures poorly. In this paper, we present an alternative mechanism for failure prediction in which periodic failures are first identified and then filtered from the failure list. The remaining failures are then fed to a traditional statistical method. We show that pre-filtering leads to predictions that are an order of magnitude better.
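
The two-stage idea in the abstract (filter periodic failures, then fit a statistical model to the rest) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the fixed 24-hour period, the phase-binning rule, the `min_repeats` threshold, the exponential model, the function names, and the failure timestamps are all assumptions made up for illustration.

```python
import math
from collections import Counter


def split_periodic(failure_times, period=24.0, bin_width=1.0, min_repeats=3):
    """Split failures into (periodic, residual) lists.

    Hypothetical rule: a failure counts as periodic when at least
    `min_repeats` failures fall into the same phase bin of `period`.
    Times are in hours since the start of the observation window.
    """
    def phase_bin(t):
        return int((t % period) // bin_width)

    counts = Counter(phase_bin(t) for t in failure_times)
    periodic = [t for t in failure_times if counts[phase_bin(t)] >= min_repeats]
    residual = [t for t in failure_times if counts[phase_bin(t)] < min_repeats]
    return periodic, residual


def survival_probability(residual, window, job_length):
    """Probability of no residual failure during a job of `job_length` hours.

    Assumes a time-homogeneous exponential model whose rate is estimated
    as (number of residual failures) / (observation window).
    """
    rate = len(residual) / window
    return math.exp(-rate * job_length)


# Synthetic failure log (hours): four failures near 02:00 on consecutive
# days, plus two failures with no repeating pattern.
failures = [2.0, 26.1, 50.2, 74.0, 13.5, 61.7]
periodic, residual = split_periodic(failures)
p = survival_probability(residual, window=96.0, job_length=4.0)
print(periodic, residual, round(p, 3))
```

In this sketch the filtered periodic failures would be handled separately, for example by scheduling jobs around the recurring time slot, while only the residual failures feed the statistical estimate.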

All rights reserved (no additional license for public reuse)
Source Citation:

Kang, Woochul, and Andrew Grimshaw. "Failure Prediction in Computational Grids." University of Virginia Dept. of Computer Science Tech Report (2006).

University of Virginia, Department of Computer Science
Published Date: