TR-2020-13

Bayesian Learning for Hardware and Software Configuration Co-Optimization

Yi Ding; Ahsan Pervaiz; Sanjay Krishnan; Henry Hoffmann. 20 December, 2020.
Communicated by Henry Hoffmann.

Abstract

Both hardware and software systems are increasingly configurable, and the resulting search space is so large that finding the highest-performance configuration is difficult. Prior scheduling and resource management work uses machine learning (ML) to find high-performance configurations for either the hardware or the software alone, assuming the other is fixed. Such separate optimization is problematic because a software configuration that is fast on one hardware architecture can be up to 50% slower on another. The main difficulty of co-optimization is the massive joint search space over hardware and software configurations: accurate learning models require large amounts of labeled training data, and thus long periods of data collection and configuration search. To achieve co-optimization with significantly smaller training sets, we present Paprika, a scheduler that simultaneously selects hardware and software configurations using a Bayesian learning approach. Paprika augments software configuration parameters with hardware features seamlessly and efficiently via one-hot encoding and parameter selection. To reduce the impact of the search space, Paprika actively queries configurations based on a novel ensemble optimization objective. The intuition behind this ensemble is to combine three prior optimizers to implicitly negotiate the exploration–exploitation tradeoff. We evaluate Paprika with ten Spark workloads on three hardware architectures and find that, compared to prior work, Paprika produces runtimes that are 12–38% closer to the optimal.
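
As a rough illustration of the one-hot encoding step mentioned in the abstract, the sketch below appends a one-hot hardware indicator to a vector of software parameters so that a single model can be trained over both. It is a minimal sketch under stated assumptions, not Paprika's implementation: the function names, the architecture labels (`arch_a`, `arch_b`, `arch_c`), and the Spark-like parameter names are all hypothetical, and the parameter selection step is not shown.

```python
from typing import Dict, List, Sequence


def one_hot(value: str, categories: Sequence[str]) -> List[float]:
    """Return a one-hot vector marking the position of `value` in `categories`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


def encode_configuration(
    software_params: Dict[str, float],
    hardware: str,
    hardware_categories: Sequence[str],
) -> List[float]:
    """Concatenate software parameters with a one-hot hardware indicator.

    Software parameters are emitted in sorted-name order so that every
    configuration maps to the same feature layout.
    """
    software_part = [float(software_params[name]) for name in sorted(software_params)]
    return software_part + one_hot(hardware, hardware_categories)


# Hypothetical labels standing in for the three hardware architectures.
HARDWARE = ["arch_a", "arch_b", "arch_c"]

# Hypothetical software parameters, chosen only for illustration.
config = {"executor_cores": 4, "executor_memory_gb": 8, "shuffle_partitions": 200}

features = encode_configuration(config, "arch_b", HARDWARE)
print(features)
```

Each combined vector describes one (hardware, software) pair, so a learner that consumes such vectors can compare configurations across architectures rather than treating the hardware as fixed.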

Original Document

The original document is available in PDF (uploaded 20 December, 2020 by Henry Hoffmann).