TR-2014-16

Log-Structured Global Array for Efficient Multi-version Snapshots

Hajime Fujita; Nan Dun; Zachary Rubenstein; Andrew Chien. 4 November, 2014.
Communicated by Andrew Chien.

Abstract

In exascale systems, increasing error rates, particularly silent data corruption, are a major concern. The Global View Resilience (GVR) system builds a new model of application resilience on versioned arrays. These arrays can be exploited for flexible, application-specific error checking and recovery. We explore a fundamental challenge to the GVR model: the cost of versioning. We propose a novel log-structured implementation that appends new data to an update log, simultaneously tracking modified regions and versioning incrementally. We compare the performance of log-structured arrays to traditional flat arrays using micro-benchmarks and several full applications, and show that versioning can be more than 10x faster and can reduce memory size significantly. Further, in future systems with NVRAM, a log-structured approach is more tolerant of NVRAM limitations such as write bandwidth and wear-out.
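The idea of appending updates to a log and versioning incrementally can be illustrated with a small, single-process sketch. This is not the GVR implementation or its API: the class name `LogStructuredArray` and the methods `put`, `version_inc`, and `get` are hypothetical names chosen for illustration, and the sketch ignores distribution, concurrency, and log compaction. It only shows why creating a version can be cheap: a version boundary records the current log length instead of copying the whole array.

```python
class LogStructuredArray:
    """Toy log-structured versioned array (illustrative only; not the GVR API)."""

    def __init__(self, size, fill=0):
        self.size = size
        self.fill = fill
        self.log = []           # append-only list of (offset, values) records
        self.version_ends = []  # log length recorded at each version boundary

    def put(self, offset, values):
        """Append an update record; the log itself tracks the modified region."""
        values = list(values)
        if offset < 0 or offset + len(values) > self.size:
            raise IndexError("write outside array bounds")
        self.log.append((offset, values))

    def version_inc(self):
        """Create a version by recording the log position; no data is copied."""
        self.version_ends.append(len(self.log))
        return len(self.version_ends) - 1

    def get(self, version=None):
        """Rebuild the array contents as of a version (or the current state)."""
        end = len(self.log) if version is None else self.version_ends[version]
        data = [self.fill] * self.size
        for offset, values in self.log[:end]:
            data[offset:offset + len(values)] = values
        return data


arr = LogStructuredArray(8)
arr.put(0, [1, 2, 3])
v0 = arr.version_inc()   # snapshot after the first write
arr.put(2, [9, 9])
v1 = arr.version_inc()   # snapshot after the second write
arr.put(7, [5])          # not yet part of any version
```

In this sketch, taking a snapshot costs a single append to `version_ends`, while a flat-array approach would copy all `size` elements per version. The trade-off is that reads of old versions replay the log, which a practical design would need to bound.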

Original Document

The original document is available in PDF (uploaded 4 November, 2014 by Andrew Chien).