Why Does the Cloud Stop Computing? Lessons from Hundreds of Service Outages

Haryadi S. Gunawi; Agung Laksono; Riza O. Suminto; Mingzhe Hao; Jeffry Adityatama; Kurnia J. Eliazar; Anang D. Satria. 2 February, 2016.
Communicated by Haryadi Gunawi.


We conducted a cloud outage study (COS) of 32 popular Internet services, including chat, e-commerce, email, game, IaaS/PaaS, SaaS, social, storage, and video sharing and streaming services. We analyzed 1111 headline news articles and public post-mortem reports that detail 516 unplanned outages that occurred within a 6-year span from 1/1/2009 to 12/31/2014. We examined outage duration, root causes, impacts, and fix procedures. This study reveals the broader availability landscape of modern cloud services and provides answers to why outages still take place even with pervasive redundancies.

Original Document

The original document is available as a PDF (uploaded 2 February, 2016 by Haryadi Gunawi).