The spectacular success of Facebook, Amazon, Apple, Netflix, Google and other digitally enabled companies has encouraged many businesses to seek similar agility through Digital Transformation. Many organizations are striving to become more agile - able to rapidly respond to market opportunities and demands - by converting existing, discrete, often manual processes into digital, automated, continuous processes. The result is agility, but the unexpected price is having to manage continuous change. New strategies are needed for managing the uncertainty and risk inherent in ever-changing systems, and new tools are required that provide situational awareness and decision support to those grappling with continuous change.
Digital transformation strategies require the conversion of existing manual, or semi-digital, processes to fully automated services that orchestrate core business activities. Digital transformation requires businesses to become proficient at software development and digital service operation. Successful digital companies draw their success from a loose collection of business and engineering practices that enable their agility. Some of these practices, and their consequences, are described below.
Agile Development Practices have flattened large organizations, creating many small autonomous teams that each own a portion of the overall offering. These autonomous teams can closely monitor their narrow area of responsibility and quickly respond to sudden events or emerging trends. When an organization has more than a handful of self-organizing agile teams collaborating on a single solution, it becomes more difficult to maintain a coherent strategy. Trade shows, earnings calls, and competitors’ new offerings demand a coordinated response. But when each team is focused on different objectives and must deal with the continuous distraction of maintenance and emergency support, predictable product management becomes a challenging task.
Cloud Deployment of solutions has allowed many companies to focus more closely on their core competencies while outsourcing less differentiated features of their offering. Instead of buying and operating hardware, organizations can outsource their compute, storage and network infrastructure to IaaS providers. Instead of deploying and managing common applications, they can compose offerings from third-party PaaS managed applications like databases, message buses, and frameworks. Instead of building common services that won’t differentiate their offerings, they can delegate entire business services, such as billing and credit card processing, to SaaS providers. The resulting hybrid solutions deliver more and better features, more reliably, than any one organization could develop alone. Hybrid solutions allow organizations to focus on what their customers want, not the digital table stakes needed to deliver a basic service.
Microservices are independently deployable units of business capability that can be “podded” (grouped into gangs that share the work) to increase reliability and scalability. Microservices collaborate with each other to provide features and have enabled further acceleration by modularizing development, deployment, and operation of services. Microservices are usually owned by a single team and can be upgraded independently of other deployed services, thus freeing the team that owns the service to move at its own pace. By breaking up monolithic solutions into many microservices, large organizations have empowered small teams to take ownership and act without the usual constraints of organizational scale. However, microservices can increase complexity, making it hard to diagnose problems.
Containers standardize the deployment and orchestration of applications and lend themselves to automation. Container orchestration software can automate the deployment and upgrade of highly complex solutions. Instead of directly changing the deployment, system engineers now modify the specification and let the orchestration software take care of deploying the changes. This “Infrastructure as code” approach can massively increase consistency and reliability, thus enabling more frequent upgrades of partial or complete systems, without interruption to the service.
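As a rough illustration of this declarative approach, the sketch below compares a desired specification with the running state and works out which changes are needed. It is a simplified model only, not the behavior of any particular orchestration product; the spec format (service name mapped to version), the function names `plan_changes` and `reconcile`, and the sample services are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical spec format: service name -> desired version.
Spec = Dict[str, str]


def plan_changes(desired: Spec, running: Spec) -> List[Tuple[str, str, str]]:
    """Compare the desired spec with the running state and list the actions needed."""
    actions = []
    for name, version in sorted(desired.items()):
        if name not in running:
            actions.append(("deploy", name, version))
        elif running[name] != version:
            actions.append(("upgrade", name, version))
    for name in sorted(running):
        if name not in desired:
            actions.append(("remove", name, running[name]))
    return actions


def reconcile(desired: Spec, running: Spec) -> Spec:
    """Apply the planned actions to a copy of the running state and return the result."""
    state = dict(running)
    for action, name, version in plan_changes(desired, running):
        if action == "remove":
            del state[name]
        else:
            state[name] = version
    return state


# Engineers edit only the desired spec; the plan is derived from it.
desired = {"billing": "2.1", "catalog": "1.4", "search": "3.0"}
running = {"billing": "2.0", "catalog": "1.4", "reports": "0.9"}
plan = plan_changes(desired, running)
```

Here `plan` contains an upgrade of `billing`, a deployment of `search`, and the removal of `reports`, while `catalog` is left untouched because it already matches the specification. Running the comparison again after reconciliation yields an empty plan, which is what makes repeated, automated application safe.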
Open Source adoption is intrinsic to digital transformation. Most of the tools and many of the IaaS, PaaS, and SaaS offerings are based on, or enabled by, open-source projects. Open source offers many advantages: more eyes looking at the source code make for more robust, secure solutions with fewer defects. Open-source projects move quickly; major versions are released regularly and patches to fix defects are frequent. Most significantly, open-source projects tend not to support many prior versions. If you are on an older version, the only way to get a fix is to upgrade. There is no extended support for open-source software. Adoption of open-source software, which is often a requirement for digital transformation projects, requires you to mount a never-ending upgrade treadmill.
By bringing together previously separate development and operations teams and processes into a single, unified DevOps approach, the efficiency of the overall service delivery life-cycle can be vastly improved. A DevOps mindset means system reliability engineers and software developers collaborate to ensure the delivered service is continuously monitored and, if necessary, frequently upgraded so that it remains, at all times, within acceptable service levels. DevOps teams seek to automate the flow of software releases through the software delivery tool chain. Continuous Delivery Pipelines sequence delivery activities, ensuring they are consistently and rapidly executed, steps are never missed, and software can be released in minutes versus the days, weeks, or even months of traditional manual approaches.
When agile DevOps teams create microservices composed of open-source software and deploy them to the cloud using containers, multiple accelerators combine. By leveraging IaaS, PaaS and SaaS providers, teams can focus exclusively on high-value differentiators. This can result in highly agile solutions that can rapidly adapt to market demands. The agility that allows an organization to make 50 controlled releases to production on a single day is, no doubt, a potential competitive advantage. But such speed means the production system is continually changing. Tomorrow’s systems are guaranteed to be different from today’s. Continuous daily or weekly change of this nature accumulates rapidly and creates its own problems.
Which brings us to the elephant in the room, seldom mentioned in articles on digital transformation: staff turnover! The average tenure of engineering staff in Silicon Valley is less than 24 months and not much better elsewhere. This means that half of most organizations’ engineering staff complement has less than 24 months’ experience at the company, and one third has less than nine months.
Loyal staff must train newcomers to extend and maintain partially understood, poorly documented systems that must be continually upgraded to keep up with the latest security patches and open source releases. In such environments, it is often easier to add a new service than try to enhance an existing poorly understood one. This can lead to a proliferation of new services that replicate, or partially replicate, an ever-accumulating collection of legacy services.
Enterprise transformation initiatives create business agility by leveraging the technologies and practices described above to enable and empower organizations to rapidly evolve their offerings in the face of fierce competition. But the same rapid change creates legacy technology faster than ever before! The modern enterprise typically has as many services in production as it has engineers to maintain them, and both the engineers and the services they maintain are rapidly changing.
Recognizing and accepting this state of affairs as the “new normal” and then turning it to your advantage is the second step of digital transformation.
Many industries already employ strategies for effectively functioning in environments characterized by continuous change, uncertainty, and risk. There is much to be learned from these examples: the investment strategies of financial institutions, clinical decision making in hospitals, mission command approaches employed by the armed forces, and risk management strategies of insurance companies all have lessons for companies embarking on digital transformation strategies. We have found three strategies, in particular, to be highly valuable when starting to grapple with these issues.
Designers, developers, and operators of solutions must understand the systems they are asked to manage. In rapidly changing environments decision-makers require good situational awareness - access to accurate, timely, structured information that they can use to understand the situation and then draw inferences and forecast likely future states. Given good situational awareness, they can decide on an effective course of action. Instead, we often find we are overwhelmed by rapidly changing, low-level, disorganized data that is usually incomplete, out of date, or just wrong. Fortunately, this problem has been faced before in a wide range of environments: vehicles, financial systems, aircraft cockpits, clinical systems, and process control rooms must all provide effective situational awareness and decision support. Similar capabilities must be provided for engineers who manage and develop today’s ever-changing software systems.
Situational Awareness Level 1: Ensure engineers can observe the elements in the situation. Engineers must have access to a snapshot of the current systems, services, and technologies that comprise the solution, and their co-dependencies. In addition, they must understand how the teams and individuals in their organization influence and interact with these elements. Armed with such an inventory, they can begin to form opinions and hypotheses.
Situational Awareness Level 2: Help decision-makers understand the situation. Knowing the static components is a start, but understanding how they interact and affect each other is the next stage of situational awareness. An understanding of the dynamics of the solution, over varying timescales, is required to make sound decisions about the evolution of the system over days, weeks and months.
Situational Awareness Level 3: Provide tools so engineers can forecast the future state of the system based on current trends and planned interventions. This capability supports the creation and evaluation of alternate strategies and allows for what-if scenario planning and financial modeling.
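To make the three levels more concrete, here is a minimal sketch of how each might look in code: an inventory of services and owners (Level 1), a query over the dependencies between them (Level 2), and a deliberately naive trend projection (Level 3). The service names, team names, data layout, and the functions `impacted_by` and `forecast` are all hypothetical and chosen only for illustration; a real tool would draw this data from live systems and use far richer models.

```python
from typing import Dict, List, Set

# Level 1: a hypothetical inventory of services, their owners, and direct dependencies.
INVENTORY: Dict[str, Dict] = {
    "checkout": {"owner": "team-payments", "depends_on": ["billing", "catalog"]},
    "billing": {"owner": "team-payments", "depends_on": ["ledger"]},
    "catalog": {"owner": "team-browse", "depends_on": []},
    "ledger": {"owner": "team-finance", "depends_on": []},
}


def impacted_by(service: str, inventory: Dict[str, Dict]) -> Set[str]:
    """Level 2: every service that directly or indirectly depends on `service`."""
    impacted: Set[str] = set()
    frontier = [service]
    while frontier:
        current = frontier.pop()
        for name, entry in inventory.items():
            if current in entry["depends_on"] and name not in impacted:
                impacted.add(name)
                frontier.append(name)
    return impacted


def forecast(history: List[float], periods_ahead: int) -> float:
    """Level 3: naive linear extrapolation using the average change per period."""
    if len(history) < 2:
        raise ValueError("need at least two observations")
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + slope * periods_ahead


# Which services, and therefore which teams, are affected by a change to the ledger?
affected = impacted_by("ledger", INVENTORY)
owners = {INVENTORY[name]["owner"] for name in affected}

# If the number of services in production grew 40, 44, 48, 52 over four quarters,
# a straight-line projection three quarters ahead gives:
projected = forecast([40, 44, 48, 52], periods_ahead=3)
```

In this example a change to `ledger` affects `billing` and `checkout`, both owned by `team-payments`, and the straight-line projection is 64 services. The point is not the arithmetic but the layering: forecasting is only as good as the understanding of dynamics beneath it, which in turn depends on an accurate inventory.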
Once engineers have good situational awareness based on observation, understanding, and forecasting, they can make informed decisions. Decision Support systems help them formulate and select strategies for specific use cases such as onboarding new employees, designing system enhancements, or planning upgrades of aging technologies. Decision support systems further help execute these complex use cases in a consistent manner and create a record of planned and completed tasks for higher-level monitoring purposes.
Knowledge lives in the minds of people, while information is stored in some medium. Knowledge Management is the process of transferring information from a storage medium to the mind of a person who needs it, or capturing knowledge from one person and converting it into information so that it can be shared with others. The average employment tenure of a software engineer is currently less than 24 months. New engineers must be on-boarded and up-skilled efficiently to reduce the time it takes until they can make productive contributions. At the same time, the valuable knowledge of experts must be harvested and transformed into information so that others can learn from their experience.
Knowledge management is a strategically important business process that must be managed holistically from the first day an employee joins the organization to the day they leave. Modern agile practices have allowed far too much knowledge that should be explicit - captured as information - to remain tacit - undefined and experientially learned. As a result, when senior staff leave organizations, irreplaceable knowledge leaves with them. Successful services that have endured for more than the average tenure of an employee become harder and harder to maintain as knowledge of their design, construction, and operation gradually evaporates. The band-aid solution of providing a wiki and hoping staff will ‘do the right thing’ and document their work just leads to a chaotic, repetitive mess that is not fit for purpose. Information and knowledge must be managed and curated by dedicated staff.
Engineering Knowledge can be divided into three main categories based on the intended audience and scope of distribution. Adapting the information asset life-cycle for each type of knowledge is an essential part of a good strategy for knowledge management.
External information is targeted at third-party partners and customers. This is usually the best-managed knowledge: it is structured and formally published, comprising manuals, instructional content, and training courses. This content must be kept in line with major and minor product and service releases and is usually planned along with those releases.
Inter-team information is targeted at other teams within the organization - teams that consume the deliverables of the team generating the information. In flat organizations with many teams, there are many such dependencies. This information includes overviews, instructions on getting started, key concepts, API specifications, best practices, typical usage scenarios and use cases, common problems, FAQs, and other such documentation. The benchmark for this type of content has been set high by the many successful open-source projects that software developers use on a daily basis. If a team expects their colleagues to use their solution, it must be documented to at least the standard of the open-source modules and libraries they already use. This area is the worst defined in most enterprise organizations today, and yet, with the flattening of organizations and adoption of microservices, it is the fastest-growing.
Intra-team knowledge is the team’s internal operational knowledge. It includes day-to-day plans and road-maps, and lists of defects and tasks. This knowledge is the most volatile and is often kept tacit or can be inferred from well-written code and a few design documents. There are many specialized solutions for this type of information in use today, from wikis to task management systems.
A balanced portfolio is key to a good investment strategy: it spreads risk and allows for periodic readjustments to maintain balance. Both investment and insurance companies balance their portfolios to manage risk. A similar approach can be used for managing a tech stack in a changing environment. Like investments in a portfolio, you should re-balance your tech stack on a regular basis. Unlike investments, the cost of switching technologies can vary significantly - some technologies are much harder to swap out than others. The tendency of some technologies to create lock-in should drive your selection strategy.
Organizations should place long-term core technology bets conservatively. Before making such a choice, consider the maturity of the technology, the size of the ecosystem it supports, how easy it is to hire qualified people, and the stability of the organization that stands behind the technology. Organizations should embrace open source and plan to upgrade continuously. Open source is a good path to high-quality, feature-rich code, but it is not free: the cost is continuous upgrades. Establish a regular cadence for managing the life-cycle of technologies, and invest in keeping them up to date. If you don’t, your troubles will multiply and your options for fixing them will diminish.
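A regular life-cycle review of this kind can start very simply. The sketch below, offered only as an assumption-laden illustration, flags technologies that have fallen too far behind their latest release and orders them for attention. The `Technology` fields, the 1-to-5 switching-cost scale, the `max_lag` threshold, and the sample portfolio are all invented for the example; each organization would need to choose its own measures.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Technology:
    name: str
    major_versions_behind: int  # hypothetical: lag between deployed and latest release
    switching_cost: int         # hypothetical scale: 1 (easy to replace) to 5 (strong lock-in)


def review(portfolio: List[Technology], max_lag: int = 1) -> List[str]:
    """Return the names of technologies lagging by more than `max_lag` major versions.

    The most out-of-date come first; ties go to the technology that is harder
    to replace, since lock-in leaves fewer alternatives to upgrading.
    """
    overdue = [t for t in portfolio if t.major_versions_behind > max_lag]
    overdue.sort(key=lambda t: (-t.major_versions_behind, -t.switching_cost, t.name))
    return [t.name for t in overdue]


portfolio = [
    Technology("database", major_versions_behind=2, switching_cost=5),
    Technology("web-framework", major_versions_behind=3, switching_cost=3),
    Technology("message-bus", major_versions_behind=0, switching_cost=4),
    Technology("logging-library", major_versions_behind=2, switching_cost=1),
]
upgrade_queue = review(portfolio)
```

With this sample data the review queue is `web-framework`, then `database`, then `logging-library`, and `message-bus` is omitted because it is current. Run on a fixed cadence, even a crude report like this keeps upgrade debt visible before options for addressing it narrow.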
Organizations should provide support for teams using preferred technologies, but tolerate technical diversity. They should encourage the internal adoption of recommended technologies by providing support and training but not mandate the use of these technologies. Organizations should allow teams to deviate from the recommended approach, but make each team accountable for the cost of going it alone. This is how engineering organizations learn. Accept that some of these experiments will fail and set hard limits on how long experiments can run before showing a return. Invest in helping teams migrate off failed technologies.
Digital Transformation is the beginning of a journey that leads to the next challenge - mastering continuous change. As companies embark on digital transformations, they should look further down the road at the current challenges of companies that are already fully digital. Digital transformation is not a destination; it is an enabler for business agility. The price of business agility is continuous change, but with the correct strategy and tooling, continuous change can be harnessed and turned to a competitive advantage.