TR-2018-02
GraphZ: Improving the Performance of Large-Scale Graph Analytics on Small-Scale Machines
Zhixuan Zhou; Henry Hoffmann. 21 January, 2018.
Communicated by Henry Hoffmann.
Abstract
Recent programming frameworks enable small computing
systems to achieve high performance on large-scale
graph analytics by supporting out-of-core graph analytics (i.e.,
processing graphs that exceed main memory capacity). These
frameworks make out-of-core programming easy by automating
the tedious process of scheduling data transfer between memory
and disk. This paper presents two innovations that improve
the performance of software frameworks for out-of-core graph
analytics. The first is degree-ordered storage, a new storage format
that dramatically lowers book-keeping overhead when graphs
are larger than memory. The second innovation replaces existing
static messages with novel ordered dynamic messages which
update their destination immediately, reducing both the memory
required for intermediate storage and IO pressure. We implement
these innovations in a framework called GraphZ—which we
release as open source—and we compare its performance to
two state-of-the-art out-of-core graph frameworks. For graphs
that exceed memory size, GraphZ’s harmonic mean performance
improvements are 1.8–8.3× over existing state-of-the-art solutions.
In addition, GraphZ’s reduced IO greatly reduces power
consumption, resulting in tremendous energy savings.
Original Document
The original document is available in PDF (uploaded 21 January, 2018 by
Henry Hoffmann).