A few months back, I took the outrageous (to some) position of comparing the public cloud to a black hole. I claimed it was like a black hole because it would suck all the IT capabilities and expertise out of your organization, and eventually suck in your business as well. Or perhaps, like Marc Andreessen, I should say “eat”, as in: “software is eating entire industries.”
Amazon has done an amazing job of promoting their ancillary business, AWS, as the “public cloud”. This is astounding because it is only public in the sense that anyone can enter if they pay the price of admission. By this definition, Disneyland™ would be a public park.
Actually, the primary difference between Disneyland and AWS is that Disneyland is far easier to leave. AWS has been furiously deploying “services” that seduce you and your developers into using it. It is easy to check in but almost impossible to leave.
Well, this rant was two months ago, and I don’t think you want to hear it again. Or if you do, you can re-read my December blog, “Public Cloud as a Black Hole: There are Choices before the Event Horizon”; I wouldn’t change a word.
However, I am back because AWS just gave me another great opportunity to revisit my black hole analogy. On Tuesday, February 28, 2017, large portions of AWS went down, or should I say “black”. As a consequence, literally thousands of web sites went black, as their content disappeared into the black hole that S3 had become. In some cases, it was only parts of web sites, in others the service just became very slow, while others were just “gone” — disappeared over the event horizon. I don’t know if the rest of the world will call this “black Tuesday,” but I will for now.
But wait! The cloud is not supposed to be like that. It is just supposed to be floating along somewhere, nice and white and fluffy, taking care of our computing needs, tended to by meticulous Bezos minions who know exactly how to keep it floating. Returning to the comparison to Disneyland, this park is Fantasy Land.
This story is another triumph of marketing. Let’s get real. Years ago, cloud used to have a bunch of foggy meanings in computing. “Your head is in the clouds” or “cloudy reasoning”. Or, “marketing is where the rubber meets the clouds.” What is this “public cloud” (oops, I mean Jeff Bezos’s Cloud), in reality?
In reality, the “public cloud” is a large complex set of data centers, each hosting thousands of servers and complex networking, growing rapidly to suck up as much business as possible. After all, that’s the strategy — get big fast. I don’t blame them for this one bit. Yet, rapid growth always ups the risk of instability. New systems have to be installed and configured, and connected to existing networks and systems, all running the risk of incorrect connection and misconfiguration. And, new operators need to be trained and deployed and learn from “experience,” i.e. making mistakes.
I value old sayings because, if they are old and we know them, there is a lot of truth in them that keeps them alive. How about the one: “The bigger they are, the harder they fall.” AWS is the biggest cloud by far. Or, as my mother would say, “Don’t put all your eggs in one basket”. This was not qualified by some phrase like “unless the basket is managed in the cloud” or “unless it is a public basket”.
I was once quoted as saying “the only reason all the computers in the world haven’t crashed at the same time is because they aren’t all connected yet.” (I’m not taking it back). We are working on solving that connectivity problem with the Internet. The cloud makes them strongly connected.
But perhaps more prophetic for “Black Tuesday” is a comment by famed computer scientist Leslie Lamport who said, “Cloud computing is having a computer you never heard of bringing your work to a halt.” Actually, he said this about distributed computing, but that was before the “cloud” term became the word du jour. Nonetheless, are you willing to have your work come to a halt because of some computer or person completely outside of your organization? Apparently, almost 148,000 sites got “lamported” on Black Tuesday.
I can’t resist pointing out another analogy between AWS and a black hole. You can’t see a black hole because light can’t escape from it. That is, it is not only dark but keeps you in the dark, so to speak. With AWS drinking their own Kool-Aid, the AWS dashboard is apparently hosted on S3. So when S3 was down, the dashboard was still showing it as being up. You have to laugh when the http://isitdownrightnow.com page is down, except when the rest of your business is also down. So, don’t count on the cloud telling you when it is raining. You just know it when your web presence and enterprise productivity get soaked.
Now, everyone can make mistakes and everyone can have hardware failures. We can’t expect, and don’t expect, the AWS folks to be perfect. However, how much better are they really than what you can accomplish in your private cloud? Here, I think the rapid growth of AWS is working against them. What is the average number of years of experience of operators at AWS if they are doubling as rapidly as they claim? I think the average has to be going down because they are hiring faster than experienced operators can be produced, which takes years. Moreover, rapid growth means instability. You don’t need the growth, but AWS shareholders do, so you are being exposed to instability to your detriment, to benefit them.
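To put rough numbers on that dilution argument, here is a minimal back-of-the-envelope sketch in Python. The assumptions are mine, not AWS data: the operations team grows by a fixed factor each year, nobody leaves, and every new hire starts with zero years of experience. The function name and the starting figure of five years are purely illustrative.

```python
def average_experience(initial_avg_years, growth_factor, years):
    """Return the team's average experience (in years) at the end of each year.

    Hypothetical model: every existing operator gains one year of experience,
    nobody leaves, and all new hires start with zero experience.
    """
    averages = []
    avg = initial_avg_years
    for _ in range(years):
        # Existing staff gain a year, then headcount is multiplied by
        # growth_factor with inexperienced hires, which dilutes the average.
        avg = (avg + 1.0) / growth_factor
        averages.append(avg)
    return averages


# A team averaging five years of experience that doubles every year:
print(average_experience(5.0, 2.0, 4))  # [3.0, 2.0, 1.5, 1.25]
```

Under these assumptions the average does not just dip once; with annual doubling it keeps sliding toward a single year, no matter how seasoned the original team was. Real hiring is messier, since some hires arrive experienced and some veterans leave, but the direction of the effect is the point.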
And, it is well known that most failures are due to operator error. Let me pay homage to a hero of mine, the beloved Jim Gray, who wrote 30 years ago: “We can’t hope for better people. The only hope is to simplify and reduce human intervention in these aspects of the system.” We humans just make mistakes in dealing with these complex systems, especially when everything is changing around us, like at AWS. We can only reduce human intervention with automation.