I’d like to issue an apology.
I’m sorry Cloud is so confusing.
Yes. There. I said it.
The problem we have is twofold: Cloud covers an enormous part of the computing stack, and because it is the movement du jour, almost everything either is Cloud or is being rebranded as Cloud. We call this Cloud-washing.
So: a very diverse set of technologies and lots of marketing spin, a recipe for disaster in anyone’s book.
It’s not surprising, therefore, that over the past days and weeks lots of people have reached out to me about the recent downtime at Amazon and the hacking of PlayStation Network (PSN), asking: Is Cloud really ready for primetime? This sort of fear and uncertainty is exacerbated by Tim Weber’s inaccurate reporting today on the BBC:
Cloud computing may be the hottest thing in corporate computing right now, but two IT disasters – at Amazon and Sony – beg the question: Is cloud computing ready for primetime business?
This demonstrates the real problem: Cloud is so confusing that even some IT professionals don’t understand it, so how are the media to know any different? So much so that they’re struggling to tell the difference between “On-line” and “Cloud”. And because they don’t understand it, they’re misinterpreting it… I couldn’t resist retweeting David Pogue from the NY Times and CNBC the other day. Why does it matter?
Because everything Online is not Cloud.
And calling everything Cloud when it’s not muddies the water and makes things worse, not better.
So how do you tell the difference? Start by reverting to the position that a website or web service isn’t Cloud by default, but might be given the right set of circumstances. When does a web service become Cloud? Typically when it is delivered as SaaS (Software as a Service). JP Morgenthal had a good go at answering this on Quora:
SaaS implies you are acquiring your software as a service delivery model, and with that, most likely paying for that service. How you interact with that software is undefined, but most likely Web-based. A consumer-oriented Web Service implies nothing more than a publicly published interface based on REST or SOAP approaches for interacting with a Web-based application.
For consumers, if it’s not SaaS then it isn’t Cloud. Sony’s PSN is a case in point: it isn’t Cloud, it’s a web service. Look how they describe it on their website:
PlayStation Network is a free-to-access* interactive environment where you can play online games, chat to friends and family around the world and surf the web – and all for free. It doesn’t cost a thing to get connected, so sign up to PlayStation Network and get stuck in today.
That isn’t SaaS, so it isn’t Cloud. Sony’s “Cloud” issues aren’t Cloud issues at all; they’re network security issues with their web service and should be described as such. Plus, I don’t see how a consumer gaming network is going to affect this scenario quoted in the BBC article:
It’s a nightmare moment. You are under pressure – to meet customer orders, finish a project, execute a deal – and nothing. Your computers, servers or network are down. If you are lucky, a few nail biting hours and a reboot or three later, you and your IT team have restored services.
Amazon’s issues are entirely different: they relate to Infrastructure as a Service (IaaS), a different part of the Cloud stack.
For those of you who don’t know what IaaS is: it’s disk, network and compute bought on demand, normally in the form of virtualisation.
The BBC gleefully showed this image of Amazon’s status page as evidence of its unreliability, failing to notice that all the other zones were up.
The reality behind the Amazon EC2 outage is that their Elastic Block Store (EBS) failed in two availability zones in one location due to human error. The failure revealed a weakness in EBS that wasn’t widely known previously, although the underlying issue was familiar to Cloud experts.
The websites that went down did so because they only had their data stored in that one location, rather than paying for the more expensive option of spreading it across locations. In essence, they failed because they trusted Amazon and wanted to do it as cheaply as possible. Had they installed their services locally, this would be akin to installing on a single machine with a single point of failure. This highlights an important lesson for those going to IaaS. Applications built for IaaS are completely different to those built for on-premise, because they conform to a highly distributed “design for failure” architecture, which gives them the ability to scale up and down, as well as the ability to recover from infrastructure failure.
There were lots of companies that took this approach and, despite using Amazon, didn’t have any downtime at all. Most notable of these was Netflix, which now generates some 29.4% of peak time internet traffic in the US. They have a true “design for failure” approach and have built a “chaos monkey” that randomly causes havoc by shutting down systems to see how the application responds. They’ve found their weaknesses and fixed them, and as a result survived catastrophic location failures despite being reliant on Amazon.
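To make the “design for failure” idea concrete, here is a rough Python sketch. It is a toy illustration only, not Netflix’s actual tooling or anything from Amazon’s API: the Replica class, the zone names and the chaos_monkey function are all made up for the example. It compares a site that keeps its data in one location with one that spreads copies across three, and then shuts down one copy at random in each.

```python
import random


class Replica:
    """One copy of the application's data in a single (hypothetical) zone."""

    def __init__(self, zone, data):
        self.zone = zone
        self.data = dict(data)
        self.up = True

    def read(self, key):
        if not self.up:
            raise ConnectionError(f"{self.zone} is unavailable")
        return self.data[key]


def read_with_failover(replicas, key):
    """Try each replica in turn; fail only if every zone is down."""
    for replica in replicas:
        try:
            return replica.read(key)
        except ConnectionError:
            continue  # this zone is down, try the next one
    raise RuntimeError("all replicas are down")


def chaos_monkey(replicas, rng):
    """Shut down one randomly chosen replica and return it."""
    victim = rng.choice(replicas)
    victim.up = False
    return victim


def survives(replicas):
    """True if the application can still serve a read."""
    try:
        read_with_failover(replicas, "title")
        return True
    except RuntimeError:
        return False


data = {"title": "example"}
# The cheap option: everything in one location.
single_zone = [Replica("zone-a", data)]
# The "design for failure" option: copies in several locations.
multi_zone = [Replica(z, data) for z in ("zone-a", "zone-b", "zone-c")]

rng = random.Random(0)  # assumed seed; any seed gives the same outcome here
chaos_monkey(single_zone, rng)
chaos_monkey(multi_zone, rng)

print(survives(single_zone))  # False: the only copy is gone
print(survives(multi_zone))   # True: two other zones still answer
```

The single-location site goes dark the moment its one copy is switched off, while the multi-location site keeps answering. A real application has far more to worry about (keeping copies in sync, for a start), but the principle is the same.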
So with IaaS, it’s not the Cloud that matters; it’s how your application is built.
And finally, but most importantly to me, I don’t want these failures to be associated with SaaS. We go to enormous lengths to ensure that our services are reliable and available to our clients, backed up by a 100% availability SLA. In essence, we put our money where our mouth is.
And yes, “Cloud”, or whatever part of the stack you’re talking about, is ready for primetime. It’s what you use and how you use it that matters.