Recently, there has been some talk about a concept called cloud diversity. Cloud diversity is an old concept with a new name. Before the cloud, when servers sat in datacenters managed by the business, a lot of resources were focused on the redundancy of the infrastructure. Applications were built on a strong foundation, leveraging redundant servers, highly available storage and architectures that removed any single point of failure. Systems were often clustered for high availability, and the disaster recovery site was planned in a different region so that it was isolated from any natural or man-made disaster that would impact the primary site.
This changed when enterprises started to move to the cloud. Many moved IaaS workloads based solely on price and assumed that the cloud provider would supply all the redundancy and change control processes that had been best practice. This is when things started to get difficult.
Due to human error, dumb luck and other factors, the large cloud providers have started to experience outages. Amazon Web Services (AWS) experienced a huge outage in February 2017, taking many services offline globally. This outage was blamed on an error made by a single AWS employee. A second outage took AWS East Coast services offline in June 2017. This is not unique to AWS. Azure has also had its share of outages, including a global storage outage in March 2017. Other, smaller cloud providers have also experienced major outages.
Unfortunately, outages will happen, and high levels of automation amplify their impact. Before the cloud, when an admin made a mistake or a hardware component failed, only a few applications were impacted, often just a single application. Now a single admin’s mistake can potentially impact tens of thousands of virtual machines within seconds of the change.
To protect against this, the enterprise needs to consider how diverse its cloud architecture is when moving to the cloud. Having two cloud vendors is now an important design consideration when architecting a cloud solution. If you’re moving to the cloud, consider a design that leverages at least two different cloud providers, or a cloud provider that has complete isolation between datacenters.
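To make the two-provider idea a little more concrete, here is a minimal Python sketch of how an application might choose between providers at run time. Everything in it is an assumption for illustration: the `Provider` class, the `pick_provider` function and the provider names are hypothetical, and the health checks are stand-ins for whatever probe you would actually run against each vendor.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Provider:
    """Hypothetical description of one cloud provider deployment."""
    name: str
    # Stand-in for a real health probe (e.g. an HTTP check against your app).
    is_healthy: Callable[[], bool]


def pick_provider(providers: Sequence[Provider]) -> Optional[Provider]:
    """Return the first healthy provider in priority order, or None."""
    for provider in providers:
        try:
            if provider.is_healthy():
                return provider
        except Exception:
            # A failing probe is treated the same as an unhealthy provider.
            continue
    return None


if __name__ == "__main__":
    # Simulated outage at the primary provider; the secondary is still up.
    primary = Provider("provider-a", lambda: False)
    secondary = Provider("provider-b", lambda: True)
    chosen = pick_provider([primary, secondary])
    print(chosen.name if chosen else "no provider available")
```

The point of the sketch is only that the failover decision lives outside any single vendor. If both deployments depend on one provider's control plane, a single mistake can still take down both.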