Bayesian Learning for Hardware and Software Configuration Co-Optimization

Yi Ding; Ahsan Pervaiz; Sanjay Krishnan; Henry Hoffmann. 20 December, 2020.
Communicated by Henry Hoffmann.


Both hardware and software systems are increasingly configurable, which makes finding the highest-performance configuration challenging due to the enormous search space. Prior scheduling and resource management work uses machine learning (ML) to find high-performance hardware or software configurations alone, assuming the other is fixed. Such separate optimization is problematic because a software configuration that is fast on one hardware architecture can be up to 50% slower on another. The main difficulty of co-optimization is the massive search space over both hardware and software configurations: accurate learning models require large amounts of labeled training data, and thus long periods of data collection and configuration search. To achieve co-optimization with significantly smaller training sets, we present Paprika, a scheduler that simultaneously selects hardware and software configurations using a Bayesian learning approach. Paprika seamlessly and efficiently augments software configuration parameters with hardware features via one-hot encoding and parameter selection. To reduce the impact of the search space, Paprika actively queries configurations based on a novel ensemble optimization objective. The intuition behind this ensemble is to combine three prior optimizers so that they implicitly negotiate the exploration–exploitation tradeoff. We evaluate Paprika with ten Spark workloads on three hardware architectures and find that, compared to prior work, Paprika produces runtimes that are 12–38% closer to optimal.
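To make the encoding step concrete, the following is a minimal sketch (not the paper's implementation) of how a software configuration vector could be augmented with a one-hot indicator of the hardware architecture before training; the architecture labels and configuration values are hypothetical.

```python
# Hypothetical architecture labels; Paprika evaluates on three
# hardware architectures, so the one-hot vector has three entries.
ARCHITECTURES = ["arch_a", "arch_b", "arch_c"]

def encode(software_config, arch):
    """Append a one-hot hardware indicator to the software parameters.

    software_config: numeric software configuration parameters
                     (e.g., Spark-style settings; values are illustrative).
    arch: one of ARCHITECTURES.
    """
    one_hot = [1.0 if a == arch else 0.0 for a in ARCHITECTURES]
    return list(software_config) + one_hot

# Illustrative software configuration on the second architecture:
x = encode([8, 0.6, 2048], "arch_b")
# x == [8, 0.6, 2048, 0.0, 1.0, 0.0]
```

The resulting joint feature vector lets a single learned model score hardware and software choices together, rather than fixing one side and searching the other.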

Original Document

The original document is available in PDF (uploaded 20 December, 2020 by Henry Hoffmann).