Monday, June 20, 2016

How Not To Build A Cloud

Thomas Bittman of Gartner wrote a blog post last year about why OpenStack projects fail. In that article, he outlined three factors that together cause 60% of OpenStack projects to fall short of expectations:
  • Wrong people (31% of failures): a successful cloud needs commitment from both the operations team and “anchor” tenants.
  • Wrong processes (19% of failures): a successful cloud automates across silos in the software development lifecycle, not just within silos.
  • Wrong metrics (10% of failures): a successful cloud focuses on top-line transformation by accelerating delivery of innovative applications and services, not merely on squeezing bottom-line costs.


Wrong people
“Agile clouds need agile processes - and people are your biggest supporters, or your biggest roadblocks.” - Thomas Bittman
Many OpenStack projects start as technology pilots run by part-time technical staff. Unless one champion is responsible for the success of the OpenStack cloud initiative as a full-time job, the chance of failure is high. Two critical roles govern cloud success:
  • Cloud operations champion: this champion is responsible not just for building and operating the cloud (supplying cloud capacity) but equally for on-boarding developers and workloads onto it (building cloud demand). The job is to work closely with developer tenants so that the developer on-boarding process is smooth and key developer tools are available in the cloud application catalog.
  • Cloud anchor tenant: developers are overwhelmingly the most important early adopters of private cloud, and accelerating the software development lifecycle through DevOps automation is by far its highest value. The most important validation for a private cloud is therefore to on-board a key set of developers and show the impact of accelerating the development and go-live process for their applications. An anchor tenant committed to using the cloud is a key prerequisite for success.

Wrong processes
“Is this really cloud? Or just virtualization? And what about the stuff running inside the VMs?” - Thomas Bittman
Many OpenStack projects start with very limited goals, such as provisioning generic VMs or delivering a narrow set of development services. This automates just one silo within the software development lifecycle. Business value comes from automating not just within but also across the silos of that lifecycle.
  • Beware of automating silos: for many IT organizations, the tragedy of virtualization has been that developers can provision a VM within 20 minutes, but getting a fully configured development environment still takes over 6 weeks.
  • Aim to automate the entire go-live process: the ultimate goal of a private cloud should be to accelerate the delivery of applications and features by automating the entire process from code check-in to go-live. This level of automation is also the only way a traditional enterprise can compete with “born in the cloud” SaaS businesses.

Wrong metrics
“Not putting the right metrics in place - usually, this is focusing on cost-savings, not agility.” - Thomas Bittman

Private cloud has often been sold as a natural extension of virtualization, so customers have often justified their OpenStack investments on the basis of IT cost savings. While cost savings are one benefit of a successful cloud, enabling business agility is the core value delivered by OpenStack.

OpenStack projects should measure business value not just for the cloud overall but for each tenant. In particular, they should focus on two tenant metrics:
  • Uptime dashboard: public clouds have long delivered detailed uptime metrics. Private clouds must do the same if they are to build trust with tenants and create a business case that justifies additional cloud investments.
  • Value dashboard: private cloud value is primarily driven by its ability to accelerate the software development lifecycle. McKinsey has documented that DevOps automation can accelerate the go-live process by 80%, which in turn can deliver top-line revenue growth, for example by enabling greater innovation in customer-facing apps. Tracking continuous integration deployments serves as a proxy for the overall acceleration enabled by private cloud.
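As a rough illustration of the two tenant dashboards, the sketch below computes an uptime percentage and an acceleration factor for one tenant. The class name, field names and sample numbers are all hypothetical, and measuring acceleration as the ratio of current to baseline deployments per week is only one possible reading of the "continuous integration deployments" proxy.

```python
from dataclasses import dataclass


@dataclass
class TenantMetrics:
    """Per-tenant inputs for the uptime and value dashboards (hypothetical fields)."""
    name: str
    minutes_in_period: int            # length of the reporting period
    minutes_down: int                 # minutes the tenant's cloud services were unavailable
    baseline_deploys_per_week: float  # CI deployments before moving to the cloud
    current_deploys_per_week: float   # CI deployments on the private cloud

    def uptime_percent(self) -> float:
        """Share of the reporting period during which the cloud was available."""
        up = self.minutes_in_period - self.minutes_down
        return 100.0 * up / self.minutes_in_period

    def acceleration(self) -> float:
        """Deployment frequency relative to the pre-cloud baseline (assumed proxy)."""
        return self.current_deploys_per_week / self.baseline_deploys_per_week


# Made-up example: a 30-day period with 432 minutes of downtime,
# and deployments rising from 2 to 3 per week.
anchor = TenantMetrics("anchor-tenant", 30 * 24 * 60, 432, 2.0, 3.0)
print(f"{anchor.name}: {anchor.uptime_percent():.2f}% uptime, "
      f"{anchor.acceleration():.1f}X acceleration")
```

Reporting both numbers per tenant, rather than only for the cloud as a whole, keeps the dashboard aligned with the per-tenant focus described above.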

Planning for OpenStack Success
The antidote to OpenStack project failure is to build a business case for private cloud that addresses the people, process and metrics issues. This business case should lay out a phased approach for rolling out the private cloud.

The starting point is to identify a full-time cloud champion and team them with an anchor tenant who will use the cloud and provide input on how to deliver value by accelerating delivery of new applications and features. The next step is to define a phased set of investments, each with clear success metrics that govern the timing of the subsequent investment:
  • Phase 1: stand up the cloud and on-board the anchor tenant. Success metrics: 99% uptime, 1.5X software development acceleration. Once these metrics are achieved, the company should invest in phase 2 of its rollout.
  • Phase 2: on-board additional tenants. Success metrics: 99.9% uptime, 2.0X software development acceleration.
  • Phase 3: automate the go-live process from code check-in to production. Success metrics: 99.99% uptime, 4.0X software development acceleration.
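The phase gates above can be sketched as a simple check. The thresholds come from the three phases listed; the function names are hypothetical, and requiring both metrics to pass before moving on is an assumption about how the gates would be applied. The helper also converts an uptime target into the downtime it allows over a 365-day year, which shows how much stricter each phase is than the one before it.

```python
# (phase name, minimum uptime %, minimum acceleration factor) from the phases above
PHASES = [
    ("Phase 1", 99.0, 1.5),
    ("Phase 2", 99.9, 2.0),
    ("Phase 3", 99.99, 4.0),
]

MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes_per_year(uptime_percent: float) -> float:
    """Downtime budget implied by an uptime target over a 365-day year."""
    return (1 - uptime_percent / 100) * MINUTES_PER_YEAR


def highest_phase_met(uptime_percent: float, acceleration: float):
    """Return the last phase whose gate is met, checking phases in order.

    Assumption: a phase counts only if both metrics pass and every
    earlier phase has passed too. Returns None if Phase 1 is not met.
    """
    met = None
    for name, min_uptime, min_acceleration in PHASES:
        if uptime_percent >= min_uptime and acceleration >= min_acceleration:
            met = name
        else:
            break
    return met


for name, min_uptime, _ in PHASES:
    budget = allowed_downtime_minutes_per_year(min_uptime)
    print(f"{name}: {min_uptime}% uptime allows about {budget:.0f} minutes of downtime per year")

print(highest_phase_met(99.95, 2.3))
```

With these thresholds, moving from Phase 1 to Phase 3 shrinks the yearly downtime budget from roughly 3.65 days to under an hour, so the uptime targets tighten far faster than the acceleration targets grow.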


An ideal approach for a company looking to make a strategic investment in private cloud is to conduct a short pilot in an OpenStack lab to validate the business case. This kind of pilot also lets the cloud champion and “anchor” tenant work together to clarify the requirements for successfully on-boarding an initial application to the private cloud.