TR-2018-02

GraphZ: Improving the Performance of Large-Scale Graph Analytics on Small-Scale Machines

Zhixuan Zhou; Henry Hoffmann. 21 January, 2018.
Communicated by Henry Hoffmann.

Abstract

Recent programming frameworks enable small computing systems to achieve high performance on large-scale graph analytics by supporting out-of-core graph analytics (i.e., processing graphs that exceed main memory capacity). These frameworks make out-of-core programming easy by automating the tedious process of scheduling data transfer between memory and disk. This paper presents two innovations that improve the performance of software frameworks for out-of-core graph analytics. The first is degree-ordered storage, a new storage format that dramatically lowers book-keeping overhead when graphs are larger than memory. The second innovation replaces existing static messages with novel ordered dynamic messages, which update their destination immediately, reducing both the memory required for intermediate storage and IO pressure. We implement these innovations in a framework called GraphZ (which we release as open source) and compare its performance to two state-of-the-art out-of-core graph frameworks. For graphs that exceed memory size, GraphZ's harmonic mean performance improvements are 1.8–8.3× over existing state-of-the-art solutions. In addition, GraphZ's reduced IO greatly reduces power consumption, resulting in tremendous energy savings.

Original Document

The original document is available in PDF (uploaded 21 January, 2018 by Henry Hoffmann).