IONet: Towards an Open Machine Learning Training Ground for I/O Performance Prediction

Daniar H. Kurniawan; Levent Toksoz; Anirudh Badam; Tim Emami; Sandeep Madireddy; Robert B. Ross; Henry Hoffmann; Haryadi S. Gunawi. 19 January, 2021.
Communicated by Haryadi Gunawi.


Low and stable latency is critical to the success of many services, but variable load and resource sharing in a modern cloud environment introduce resource contention, which in turn increases the unpredictability of the system and often causes a “tail latency problem.” Because I/O requests are one of the main building blocks of a complex request chain, understanding them becomes important for helping parallel storage applications achieve performance predictability and reduce tail latency. This paper presents IONET, an ML-based per-I/O latency predictor capable of achieving 80-97% inference accuracy with sub-10μs inference overhead for each I/O. IONET’s lightweight NN models demonstrate that this line of research is practical and that incorporating the models inside operating systems for real-time decision-making is a feasible way to achieve latency-stable systems.

Original Document

The original document is available in PDF (uploaded 19 January, 2021 by Haryadi Gunawi).