Process Introspection: A Checkpoint Mechanism for High Performance Heterogeneous Distributed Systems

Report
Author: Ferrari, Adam, Department of Computer Science, University of Virginia
Abstract:

The Process Introspection project is a design and implementation effort, the main goal of which is to construct a general-purpose, flexible, efficient checkpoint/restart mechanism appropriate for use in high performance heterogeneous distributed systems. This checkpoint/restart mechanism has the primary constraint that it must be platform independent; that is, checkpoints produced on one architecture or operating system platform must be restartable on a different architecture or operating system platform. The Process Introspection mechanism is based on a design pattern for constructing interoperable checkpointable modules. Application of the design pattern is automated by two levels of software tools: a library of support routines that facilitate the use of the design pattern, and a source code translator that automatically applies the pattern to platform-independent modules. A prototype implementation of the library has been constructed and used to demonstrate that the design pattern can be applied effectively to construct platform-independent checkpointable programs that operate efficiently.

Rights:
All rights reserved (no additional license for public reuse)
Language:
English
Source Citation:

Ferrari, Adam. "Process Introspection: A Checkpoint Mechanism for High Performance Heterogeneous Distributed Systems." University of Virginia Dept. of Computer Science Tech Report (1996).

Publisher:
University of Virginia, Department of Computer Science
Published Date:
1996