Storage Benchmarking with Deep Learning Workloads

Peng Cheng; Haryadi S. Gunawi. 8 January, 2021.
Communicated by Haryadi Gunawi.


In this AI-driven era, compute infrastructure is often the focus, but storage is equally important. A common deep learning (DL) workload loads many batches of samples in random order. These iterative small random reads impose nontrivial I/O pressure on storage systems. We are therefore interested in exploring the optimal storage system and data format for storing DL data, along with the possible trade-offs. Meanwhile, object storage is usually preferred because of its high scalability, rich metadata, competitive cost, and ability to store unstructured data. This motivates us to benchmark two object storage systems: MinIO and Ceph. For comparison, we also benchmark three popular key-value databases: MongoDB, Redis, and Cassandra. We explore the impact of different parameters, including storage location, storage disaggregation granularity, access pattern, and data format. For each parameter, we summarize the benchmark results and offer suggestions. Overall, although the optimal storage system is workload-specific, our benchmarks offer insights into how to find it.
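
To make the access pattern described above concrete, the following is a minimal sketch of one epoch of batched reads, in either sequential or shuffled order. It is not the authors' benchmark harness: the function names (make_store, batch_keys, run_epoch) and the key format are hypothetical, and a plain in-memory dictionary stands in for a real backend such as MinIO, Ceph, MongoDB, Redis, or Cassandra.

```python
import random
import time


def make_store(num_samples, sample_bytes):
    """Build a toy in-memory key-value store (assumption: stands in for a real backend)."""
    return {f"sample-{i}": bytes(sample_bytes) for i in range(num_samples)}


def batch_keys(num_samples, batch_size, shuffle, seed=0):
    """Yield one list of keys per batch, covering every sample exactly once."""
    order = list(range(num_samples))
    if shuffle:
        # Random order mimics the per-epoch shuffling common in DL training.
        random.Random(seed).shuffle(order)
    for start in range(0, num_samples, batch_size):
        yield [f"sample-{i}" for i in order[start:start + batch_size]]


def run_epoch(store, num_samples, batch_size, shuffle):
    """Read one epoch of batches; return (bytes_read, elapsed_seconds)."""
    total = 0
    t0 = time.perf_counter()
    for keys in batch_keys(num_samples, batch_size, shuffle):
        for key in keys:
            # One small read per sample; a real backend would issue a GET here.
            total += len(store[key])
    return total, time.perf_counter() - t0


if __name__ == "__main__":
    store = make_store(num_samples=10_000, sample_bytes=4096)
    for shuffle in (False, True):
        nbytes, secs = run_epoch(store, 10_000, batch_size=32, shuffle=shuffle)
        label = "random" if shuffle else "sequential"
        print(f"{label}: {nbytes} bytes in {secs:.4f} s")
```

With an in-memory dictionary the two orders should perform about the same; the sketch only illustrates the shape of the workload. Replacing the dictionary lookup with a call to a real storage client is where differences between systems and access patterns would be expected to appear.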

Original Document

The original document is available in PDF (uploaded 8 January, 2021 by Haryadi Gunawi).