(Cross-posted @ The Loose Couple's Blog)
Cloud Complexity ? It’s A Wrench.
Posted by Christian Reilly // January 10th, 2012 // Uncategorized
A new year, a old topic. Complexity. “CLOUD IS COMPLEX” screamed the headline of two recent blog posts from my Clouderati alumni, James Urquhart and Sam Johnston. Really ? Who would have thought ? There really is nothing that gets past these two guys, is there ? Joking aside, their respective copy brings a sharp focus on a topic that has, in my opinion, two very different sets of meanings and two very different sets of challenges, depending upon which side of the proverbial cloudfence one is resting one’s posterior upon. If, dear reader, you are a vocal proponent of public cloud, renowned and famed to, upon occasion, theatrically wave your arms around while openly condemning the very notion of private cloud to be evil and reprehensible, then I would metaphorically place you in the “don’t know, don’t care” bucket when it comes to understanding how (to quote Sam Johnston) “the delivery of information technology as a service rather than a product” is brought to your browser and your credit card, respectively. Hmm, the classic “power grid” analogy – just plug it in and it works. Nothing wrong with that. Not at all. If, however, like this author, you are somewhat willing to entertain the concept of private cloud, either through experience or hope, or even as a much needed and logical evolution of the galling monotony and crippling legacy of today’s large enterprise IT environments, then I might suggest that the complexity thrust upon you via veritable pot-pourri of technologies, services operating models and organizational challenges will place you, either wittingly or unwittingly, somewhere between Levels 1 and 2 of the Conscious Competence Ladder. I’ve never been a particularly big fan of the “cloud / utility computing is the same as the move away from building your own power station to the public grid” analogy. It’s fine for an incredibly basic mental picture of the difference between having a substation located at the bottom of your garden versus a medium voltage cable and a meter connected to the local provider, but as far as the depth of the analogy’s relevance to the practical application of any cloud strategy goes, it would be easier to say “someone else provides the capacity”. Job done. Same result. Not much use at all. The major flaw in the analogy is that in today’s rapidly changing enterprise IT world, where virtualization has only just begun to take hold (arguably) but is widely accepted as being a cornerstone of any cloud, it sadly isn’t as simple as just sticking a fat pipe into a service provider and letting someone else provide the capacity. If it was, then everyone would do it. Imagine if every organization since time immaterial had asked the electricity provider to take over running the rest of the machinery, plant or robotic equipment that it’s invisible juice powered ? Well, to me, that’s the crux of the application of cloud infrastructure. Hardly apples to apples. There is complexity in public cloud, there is complexity in private cloud – it’s simply a case of who owns and manages the complexity and how much appetite you have for running your services on each – but equally as service providers are doing a better and better job at managing “simplexity”, most enterprises continue to wrestle with their strategies, egged on by a myriad of vendors who now have the word “cloud” in every piece of marketing literature. It’s not a one size fits all model, there is no either / or. In my incredibly humble opinion, it is increasingly arguable that the case for private clouds is stronger than ever, yet, as the struggle to keep up to date with technology trends and models gains momentum, I don’t see any sign of the landscape becoming simpler to design, implement or operate. In fact, I think many enterprises, in their best efforts to implement all that they are told they can’t do without, are heading for more complexity than they ever dreamed possible - creating an environment so complex, that even Rube Goldberg might raise a mechanical eyebrow. Physical servers, virtual servers, physical switches, virtual switches, physical interfaces, virtual interfaces, physical storage, virtual storage, physical load balancers, virtual load balancers, physical firewalls, virtual firewalls, physical networks (?), virtual networks, physical interfaces, virtual interfaces, physical IP address (?), virtual IP address, physical data centers, virtual data centers, physical people, virtual people, Mechanical Turks. Mechanical what ? Mechanical Turks. I thought that’s what you said. And so, it goes on and on, something like this. “IT ? Where’s my server ? Oh where did it go ?” (We are all losing money and patience is low) “I want it to work, you must call the Turk.” (We should call the Turk, he’ll get it to work) “We have something to tell you, the Turk isn’t real.” (The Turk isn’t real ? That’s quite a big deal.) “The server is somewhere, it just can’t have gone.” (We just need to find out which storage it’s on) “The CMDB ! Now this one’s in the bag.” (But the CMBD has just waved a white flag) “It can’t be the DevOps, it can’t be those guys !” (The DevOps are admins ? That’s quite a surprise) “So it must be the network, it’s eaten my app.” (As the Net guys will tell you, that’s monstrous crap) “We’ve found it, don’t worry, we’ll just bring it back.” (Now several young admins are facing the sack) “That downtime has killed us, we’ve lost fifty grand.” (..as the CEO enters still waving his hand) “It’s much more efficient !” IT screams out loud. (But the moral is simply “shit happens in cloud”) Today, there isn’t a CMBD tool on earth (yet) that can realistically and efficiently keep pace with the inherent fluidity, agility and flexibility of even the most well intended cloud deployments and the ditty above is a not-so-tongue-in-cheek example of what happens when complexity is mixed with a lack of clear visibility. Interestingly, this problem isn’t unique to technology, nor cloud. In my every day life, I come across a E&C (Engineering & Construction) industry wide problem that relates to a concept called “wrench time”. Typically, wrench time is a measure of crafts personnel at work, using tools, in front of jobs. Wrench time does not include obtaining parts, tools or instructions, or the travel associated with those tasks. In some cases, wrench time can be as low as 20% of a total working week, meaning roughly 8 hours spent fixing problems with the remaining 32 hours spent with non-value-added tasks including finding and qualifying maintenance record information. The parallels are obvious. Difficulty in finding and qualifying information, in and amongst these complex systems – clouds or power stations – leads to inefficient maintenance, poor RTO times and eventually to revenue or reputation loss. Spanner in the works, anyone ?