Some Lessons From AWS Outage

Yesterday’s AWS outage is still buzzing around the tech blogosphere more than 24 hours later. As usual, cloud naysayers are up in arms, determined not to miss a golden opportunity to create FUD, and Amazon’s competitors are tapping into its customers’ misery to push their own services. Well, people are tuned to accept this as a legitimate strategy in a free market system. Without ranting any further on this or spending time on how badly Amazon botched this, I want to talk about some of the lessons we can learn from this outage.

Before we talk about the lessons learned from this AWS debacle, I want to emphasize one difference between the cloud world and the traditional IT world. In the FUD and noise surrounding the outage, many miss this important advantage of the cloud. In traditional IT, there are significant costs associated with any DR plan because you have to provision the additional servers (or datacenters) needed for recovery well in advance. This not only adds significantly to capital expense, it also adds heavily to operating expenses. Even if your IT is with a managed provider, you spend a lot of money reserving capacity for possible DR needs. The advantage of a cloud based environment is that if you manage to keep a current backup of your data in another location, the processing power can be switched on by just swiping your credit card, without any need to provision ahead of time or to wait a long time after the disaster. When disaster strikes, you can recover with a minimal monetary pinch (provided the DR plan is solid).

Yesterday’s EC2 outage exposed how many startups are running without a proper DR strategy. It is a shame that some well-funded startups didn’t bother to plan for such eventualities. I guess this outage will teach startups (and their investors) a good lesson and prepare them for the next disaster. There are many lessons we can learn from yesterday’s outage, but I want to highlight some key ones in this post. After all, CloudAve is one of the well-respected blogs on cloud computing, and we cannot shy away from a topic that reached even the consumer media.

The following are the key lessons we should learn from the episode:

Even though I don’t like the idea of coding for failure, just do it. When we shop at Walmart, we clearly understand that we are compromising on quality to get goods at low prices. If we want to take advantage of public clouds built on commodity servers, there is no option but to code for failure.
Now imagine me jumping up and down the stage like Steve Ballmer, shouting “DR, DR, DR, DR, ...”. A proper DR strategy is key to any cloud plan. As I pointed out above, cloud computing offers some cost advantages when planning for disaster recovery. In spite of that advantage, we have seen many businesses get hit by the AWS outage. There are many reasons why this happened. The picture painted by cloud evangelists (including myself in the past) gave the impression that the cloud is fail-proof. The higher emphasis on devs over ops made people complacent. They started believing religiously that the cloud removes ops from the picture entirely and that everything works automagically. All this evangelism-driven dogma led people to not worry about DR at all. I am glad that this failure woke people up from their complacency.
SLAs are important, but what matters is how you have negotiated the compensation. This is one of the reasons I promote federated clouds over consolidation. When you have only a handful of infrastructure players, they will not care about compensating for losses during an outage unless the customers are Fortune 500 companies. We need providers who differentiate their offerings on the basis of how they compensate. For this to happen we need large scale competition, not consolidation. Only federated clouds can ensure a marketplace where customers are not screwed because of cloud downtime.
Keep geographical redundancy and proximity to another cloud provider as the key mantra while planning your DR strategy.
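
To make "code for failure" and geographical redundancy a little more concrete, here is a minimal sketch in Python. It is not how any particular site or AWS itself handles failover. The helper `call_with_failover`, the exception `AllBackendsFailed`, and the two stand-in backends are hypothetical names invented for illustration, and the "outage" is simulated with a raised `ConnectionError` rather than a real network call.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


class AllBackendsFailed(Exception):
    """Raised when every configured backend has failed."""


def call_with_failover(
    backends: Sequence[Tuple[str, Callable[[], T]]],
    attempts_per_backend: int = 2,
) -> Tuple[str, T]:
    """Try each backend in order and return (backend name, result).

    Each backend is retried a few times before moving on to the next
    one, so a dead primary region does not take the whole service down.
    """
    errors: List[str] = []
    for name, operation in backends:
        for attempt in range(1, attempts_per_backend + 1):
            try:
                return name, operation()
            except Exception as exc:  # treat any error as a backend failure
                errors.append(f"{name} attempt {attempt}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


# Hypothetical stand-ins for real service calls.
def primary_region() -> str:
    raise ConnectionError("primary region unavailable")  # simulated outage


def standby_provider() -> str:
    return "served from standby"


served_by, result = call_with_failover([
    ("primary", primary_region),
    ("standby", standby_provider),
])
print(served_by, "->", result)
```

The ordering of the backend list is the DR plan in miniature: the standby only earns its keep if its data is kept current and it lives in a different region or with a different provider than the primary.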

Whether we like it or not, customers are as responsible for outages as the cloud providers. Cloud is not a magic pill that solves the ~~erection~~, sorry, scaling problems without any other worries. Just like the pills that help with erection issues, the cloud that helps with rapid infrastructure scaling comes with some side effects. It is important that customers understand the compromises they have to make while taking advantage of the benefits offered by cloud computing. Yesterday’s AWS outage is a good opportunity to take a step back and be realistic about our approach to the cloud.

