Volume 20 / Issue 9

DOI: 10.3217/jucs-020-09-1351


Extending an Application-Level Checkpointing Tool to Provide Fault Tolerance Support to OpenMP Applications

Nuria Losada (University of A Coruña, Spain)

María J. Martín (University of A Coruña, Spain)

Gabriel Rodríguez (University of A Coruña, Spain)

Patricia González (University of A Coruña, Spain)

Abstract: Despite the increasing popularity of shared-memory systems, there is a lack of tools for providing fault tolerance support to shared-memory applications. CPPC (ComPiler for Portable Checkpointing) is an application-level checkpointing tool focused on the insertion of fault tolerance into long-running MPI applications. This paper presents an extension to CPPC that allows the checkpointing of OpenMP applications. The proposed solution maintains the main characteristics of CPPC: portability and reduced checkpoint file size. The performance of the proposal is evaluated using the OpenMP NAS Parallel Benchmarks, showing that most of the applications incur small checkpoint overheads.

Keywords: OpenMP, checkpointing, fault tolerance, parallel programming

Categories: D.1.3, D.4.5