Innovating with the (Azure) Public Cloud, and how to set ourselves up for success.

David Lee
12 min read · Nov 7, 2020

Building a brand-new product is never an easy task, especially in this ever-changing environment. Imagine that you are now responsible for a green-field project, with a mandate to develop solutions that can help your organization deliver the next generation of exciting and innovative products. The possibilities for turning this project into a success story are endless. There are, of course, the age-old limitations of budget, time, and resources, and you must make the best of them.

Whether your organization is an established enterprise or a new startup, the ability to innovate quickly is a key ingredient in the success of your project. To be clear, I am not referring to the ability to create a solution quickly. The outcome of an exercise in innovation is often failure, not success. Edison and his team of researchers had thousands of failures before the light bulb was invented. Many unknowns exist in a green-field project, and the objective in the early days is often to prove that a solution can work and deliver the expected results. You do not have weeks for each step along the way, but maybe a couple of days. Every moment counts. Therefore, failing fast is key as you innovate.

During an exercise in innovation, you could be incrementally validating a series of questions about a solution. This means you are taking small risks, learning a bit about the solution along the way. Therefore, you will need the ability to drop work and restart from a step you have already validated. It is also possible you will need to start from the beginning when it becomes clear that the solution is not viable, either because it did not work or because it did not deliver the expected result.

The mechanics of innovation mean that you do not have the luxury of knowing what you need for the long term, given that anything you are developing may be thrown away at the next moment if the solution does not deliver the expected result. From a capital expenditure perspective, you will be hard pressed to sink investments into hardware or software when you do not know whether it will be needed long term. In fact, you have to consider that any hardware or software would have to be maintained and serviced by an infrastructure team. If it involves an unfamiliar technology, there would be additional training costs on top of your investment. For example, if your Development team decides on a NoSQL technology and your Infrastructure team does not already have in-house expertise, that knowledge will have to be supplemented.

This is where the Public Cloud, with its capability for dynamic resource consumption, comes into the picture. Many organizations have begun to adopt it, or have already considered it, as part of their offerings to internal teams, and this is evident in the increasing market size of the Public Clouds. Before going further, let us agree on a definition of what a Public Cloud means, because we have all heard many definitions of Public Cloud, or simply Cloud.

The National Institute of Standards and Technology (NIST) defines five characteristics of cloud computing: on-demand self-service, broad network access, resource pooling, rapid elasticity or expansion, and measured service. NIST also lists three service models: Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS). Lastly, NIST lists four deployment models: Public, Private, Community and Hybrid. Here, we are focused on the Public Cloud, which, according to NIST, is infrastructure provisioned for open use by the public, existing on the premises of the cloud provider. The reason for focusing on the Public Cloud is that, comparatively, this model carries the fewest management concerns around hardware and software. This is not to say innovation cannot exist under the other deployment models, but the Public Cloud is where we can achieve the highest impact.

Based on the characteristics of Cloud provided by NIST, we can find capabilities that are key for innovation. Self-service provisioning means no one must talk to the Infrastructure team to stand up a new service, which shortens the time spent learning how to install and set up a service. It also means we can quickly experiment with a new technology for a solution. Broad network access, resource pooling and elasticity mean we can quickly scale our solution up and down and run performance tests quickly. A measured service means we pay for exactly what we use. Because metrics are collected, it also means we do not have to worry about how to measure whether services meet our requirements, or how to make adjustments to scale. All these capabilities are uniquely suited to rapid innovation. The ability to innovate without fear on the Public Clouds is a game changer because any risks you are taking are minimized. You can fail fast and iterate quickly to find the right solution.

Now that we understand how the Public Cloud can help us innovate, we need to consider the best practices to follow, and the pitfalls to avoid, which can help your project lay a solid foundation for innovation to thrive on. I will be providing examples in Azure because it is the Public Cloud that I have worked on most throughout my career. However, the ideas and principles can be applied to any Public Cloud.

Automate your environments so you do not lose innovation momentum.

Any work that involves manual intervention by a human being, no matter how careful, will lead to a mistake at some point. You cannot afford to lose time due to a human mistake. Therefore, it becomes important for you to be able to automate the management of your environments. This means that you can consistently reproduce the exact environment in a deterministic manner, and you are guaranteed that all the necessary services are provisioned each time, thus ensuring minimal loss of time in any innovation work that needs to be replicated.

Azure Resource Manager (ARM) is one tool that can help you create the necessary resources in your subscription in an automated fashion. You can write an ARM Template that defines the resources needed and execute it with robust scripting tools such as PowerShell or Bash. An even better approach is to integrate your ARM Template into Azure DevOps, which can execute it via Azure Pipelines. Using Azure Pipelines means you can extend the creation and update of environments with triggers, such as a schedule or even a check-in. Azure DevOps integrates tightly and securely with Azure, which means you can create secure service connections to different Azure Subscriptions for deployment. For example, if you have a different environment per subscription, you can use this to create and maintain each environment.
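To illustrate, here is a minimal sketch of an Azure Pipelines definition that deploys an ARM Template on every check-in to the main branch. The service connection name, resource group, region and template path are placeholders you would replace with your own.

```yaml
# azure-pipelines.yml (sketch): redeploy the environment on each check-in.
trigger:
  branches:
    include:
      - main

pool:
  vmImage: ubuntu-latest

steps:
  - task: AzureResourceManagerTemplateDeployment@3
    inputs:
      deploymentScope: Resource Group
      azureResourceManagerConnection: my-azure-service-connection  # placeholder
      resourceGroupName: rg-innovation-dev                         # placeholder
      location: West Europe                                        # placeholder
      templateLocation: Linked artifact
      csmFile: infra/environment.json                              # placeholder
      deploymentMode: Incremental
```

With a second service connection pointing at another subscription, the same pipeline can be reused to maintain a separate environment per subscription.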

An alternative to ARM Templates is Terraform by HashiCorp. Instead of writing an ARM Template, you would write in the HashiCorp Configuration Language (HCL). The advantage is having fewer lines of code to write. There is also an integration of Terraform into Azure DevOps via an extension in the Marketplace.
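As a rough comparison, the same kind of environment can be described in HCL along these lines. The resource names and region are illustrative placeholders.

```hcl
# Sketch of an environment in HCL using the azurerm provider.
terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
    }
  }
}

provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "innovation" {
  name     = "rg-innovation-dev"   # placeholder
  location = "West Europe"         # placeholder
}

resource "azurerm_storage_account" "data" {
  name                     = "stinnovationdev"  # placeholder
  resource_group_name      = azurerm_resource_group.innovation.name
  location                 = azurerm_resource_group.innovation.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}
```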

Not everything needs to be automated. There is an expectation of what is a reasonable use of the team's valuable time. The rule of thumb here is whether the work is repeated more than once. For example, it could be important for the team to have different types of Virtual Machines created for testing different workloads. In Azure, we may consider creating an ARM Template for creating Virtual Machines, with inputs from the requestor. Common resources would be configured in the ARM Template to use an existing Virtual Network and a Subnet that already has a Network Security Group applied. In the ARM Template, the Virtual Machine can be configured to be accessed only via Azure Bastion. The ARM Template will take in parameters, such as the size and type of the Virtual Machine, from the requestor. In a different case, however, we might be creating shared services such as Azure SQL Server or a Virtual Network only once. It would be a simple effort to create these services using the Azure portal, which can be accomplished within minutes.
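The parameters section of such a VM template might look like the following fragment. The parameter names and the list of allowed sizes are illustrative assumptions, not a prescribed set.

```json
{
  "parameters": {
    "vmName": {
      "type": "string"
    },
    "vmSize": {
      "type": "string",
      "defaultValue": "Standard_D2s_v3",
      "allowedValues": [
        "Standard_D2s_v3",
        "Standard_E4s_v3",
        "Standard_F8s_v2"
      ]
    }
  }
}
```

Constraining `allowedValues` doubles as a lightweight guard rail: requestors get self-service VMs, but only in sizes the team has agreed to pay for.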

Ensure the Public Cloud Reference Architecture is part of your innovation process.

As we look to create a solution, very often we get to choose from an array of services that we can stitch together. Before we begin, we should consider the architectural patterns that Public Cloud vendors publish for common scenarios. The Azure Architecture Center is a good source of reference architectures.

One important thing to note here is that the reference architectures serve as a guide rather than the final solution. You will need to understand the limitations of the services proposed in the reference architecture. For example, you may have a requirement to ingest JSON documents in near-real-time, and according to the reference architecture, Azure Cosmos DB would be a perfect ingestion data store because of its reliability and near-real-time capabilities. However, there is also a cost to this reliability: a limit on the maximum document size that you may not have considered. At the time of this writing, the maximum document size for Cosmos DB is 2 MB. This does not mean you should not consider Cosmos DB; it means you need to account for the limit in your solution. Your solution might include supplemental services that fork large documents onto a different path, while smaller documents are still handled by Cosmos DB.
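As a sketch of that forking idea, a small routing helper could check the serialized size of each document before choosing a sink. The function and the "blob" path name here are hypothetical; the 2 MB figure reflects the Cosmos DB limit at the time of writing.

```python
import json

# Cosmos DB caps documents at 2 MB (at the time of writing), so anything
# larger is routed onto a different path, e.g. blob storage.
COSMOS_MAX_DOCUMENT_BYTES = 2 * 1024 * 1024

def choose_sink(document: dict) -> str:
    """Return 'cosmos' for documents under the size limit, 'blob' otherwise."""
    size = len(json.dumps(document).encode("utf-8"))
    return "cosmos" if size < COSMOS_MAX_DOCUMENT_BYTES else "blob"
```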

Mileage will vary based on your business requirements and use cases as you consult the reference architectures. One size does not necessarily fit all, and in some cases you are trailblazing a brand-new path. There will be unknowns and uncertainties that you will have to consider, account for, and adjust to as you move along on the project and innovate. However, time is of the essence, and any existing reference architecture will help move us along in our innovation efforts.

Establish guard rails and governance, with support from your organization, for an effective innovation environment.

Innovation on the Public Cloud is not without responsibilities for you and your organization. Any organization starting on its cloud journey should have controls and governance to manage expectations, risks, compliance, and costs. In other words, you need guard rails established by the organization at an enterprise level for your project to have effective environments for innovation.

At the enterprise level, Public Cloud vendors have already thought about guard rails and have the necessary tools to help organizations today. For example, you may require that all resources be created only in a few selected regions; Azure already has the concept of Azure Policy, which can be applied at the management group level. These tools enable organizations to build out and implement the necessary policies at an enterprise level that projects can rely on.
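For instance, a policy restricting resource locations could look roughly like this custom Azure Policy definition; the two regions listed are just examples.

```json
{
  "properties": {
    "displayName": "Allowed locations for this organization",
    "policyRule": {
      "if": {
        "not": {
          "field": "location",
          "in": [ "westeurope", "northeurope" ]
        }
      },
      "then": {
        "effect": "deny"
      }
    }
  }
}
```

Azure also ships a built-in "Allowed locations" policy, so in practice you may only need to assign it with your own region list rather than author a custom definition.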

A caveat for the organization to consider is that policies can be too restrictive, resulting in ineffective conditions for innovation to thrive in. Overly strict control measures are especially common in organizations at the beginning of their Cloud journey, as they try to figure out the way forward. As the organization matures, it becomes clearer what works and what does not. It is important to have a process to periodically evaluate and adjust internal controls to improve the guard rails, so that the organization and the project can enjoy the true benefits of what the Public Cloud offers. For example, Cloud Shell was previously limited to a few regions, and your policy may not have allowed resources to be created in those regions.

Create a budget on the Subscription and Resource Group for a cost-effective outcome of your innovation process.

The Public Cloud is consumption based, and you pay only for what you use. Knowing what you are paying for, and what you are not, is key for your project budget. Every service you use has a pricing page that gives you an overview of the cost. You can be charged by the hour, or even by the second in some cases. You could be charged on compute (execution counts), memory, data size, ingress or egress, or other metrics. Some services offer a free tier. However, you may find that there is no precise amount you can calculate in advance: unless you have a guaranteed fixed workload, usage patterns can change, and there may be a spike you did not anticipate. On an Azure Subscription or a Resource Group, you can set budgets and raise alerts when a percentage of your expected budget is breached. You can use a Resource Group to delimit a specific solution you are testing and set a budget for it. Having a cost-specific guard rail minimizes the risk that an unexpected bill derails your project. You will also want to consider creating a dashboard that helps you understand your consumption over time. With automation scripts, it is possible to export your monthly spend into a visualization tool such as Power BI so that you can track your costs over time and adjust your budget.
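As a sketch, a monthly budget with an 80% alert can be declared as a `Microsoft.Consumption/budgets` resource along these lines. The amount, start date, e-mail address and apiVersion shown are illustrative placeholders.

```json
{
  "type": "Microsoft.Consumption/budgets",
  "apiVersion": "2019-10-01",
  "name": "innovation-monthly-budget",
  "properties": {
    "category": "Cost",
    "amount": 500,
    "timeGrain": "Monthly",
    "timePeriod": {
      "startDate": "2020-11-01T00:00:00Z"
    },
    "notifications": {
      "alertAt80Percent": {
        "enabled": true,
        "operator": "GreaterThan",
        "threshold": 80,
        "contactEmails": [ "team@example.com" ]
      }
    }
  }
}
```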

Establish a Tech Radar to ensure continuous learning for innovation to thrive on.

At the enterprise level, it is good practice for projects to have an avenue to share knowledge about services. By providing different levels of opinionated guidance on a service, different projects on various innovation paths can learn from each other's pitfalls as well as successes. The use of a Tech Radar by the enterprise to provide guidance helps us pause and deliberate on the merits of a service for the solution we are building. Having this information before we invest our valuable time during innovation allows us to have good, measurable conversations and seek out the best path for our solution. The tool by ThoughtWorks is a good reference for how to create your own Tech Radar.
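A Tech Radar can start as something as simple as shared data. The sketch below uses the ThoughtWorks ring names (Adopt, Trial, Assess, Hold); the entries and notes are illustrative examples, not recommendations.

```python
# A minimal internal Tech Radar as a lookup table; entries are illustrative.
RADAR = {
    "ARM Templates": {"ring": "Adopt", "note": "Standard for environment automation."},
    "Azure Cosmos DB": {"ring": "Trial", "note": "Validate the 2 MB document limit against your payloads."},
    "Terraform": {"ring": "Assess", "note": "Fewer lines than ARM; evaluate team familiarity with HCL."},
}

def guidance(service: str) -> str:
    """Return the enterprise's current opinion on a service, if any."""
    entry = RADAR.get(service)
    if entry is None:
        return "Not yet assessed; propose it for the next radar review."
    return f"{entry['ring']}: {entry['note']}"
```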

Establish security governance and controls for a secure innovation process.

What about security on the Public Cloud? The fruits of your innovation may be compromised when data and/or intellectual property are exposed unknowingly. This could have a disastrous impact on your project. One fundamental piece of infrastructure the Public Cloud vendors have in common is a comprehensive set of built-in Identity and Access Management (IAM) controls, in many cases based on one or more roles (with built-in privileges) granted to a user, a service, or a service principal. This gives you a powerful way to implement least-privileged access for your organization. It is therefore key for your organization, at the enterprise level, to be ready to review and provide the necessary guidance, tools and workflows for securing various services and assets.

From a network and data security perspective, there are various ways to secure data and data access, such as with encryption keys that can be managed or controlled exclusively by your organization. A good example is that you can enable disk encryption in Azure and have the keys stored securely in Azure Key Vault, which provides a layer of access controls on top.

Azure Security Center provides a good overview, recommendations and scoring of your workloads so that you can strengthen your security posture with best practices. In other words, just as an organization takes active steps to ensure security within its own enterprise networks, it will need to take similar steps on the Public Cloud. Whether your innovation happens on premises or in the cloud, security controls are not diminished in any way. With Azure Security Center, your project can rely on support from some of the best experts in cloud security to give you the protections you need for your innovations, provided you actively design, plan for, and execute upon the recommended practices.

Put data on the Public Cloud with PaaS services so you can easily evolve your solution for continuous innovation.

Solutions will evolve. Over time, requirements may change, different and more effective solutions may present themselves, or new usage patterns may emerge. With data on the Public Cloud, it becomes possible to evolve your solution easily and quickly. With PaaS services, you have endless possibilities to replicate data and discover new ways to shape it without impacting the performance of your existing production workloads. In Azure SQL, there are many ways to replicate your SQL database, as well as to get point-in-time copies of it when you have a question about data after it has changed. Another noteworthy service is Azure Synapse, which allows you to connect data from various services and run analytics at massive scale. Comparatively, if you host your data in Virtual Machines running SQL Server, you can still use internal tooling to do the same, but that requires more time and effort than the built-in capabilities of PaaS services.

Consider services with built-in High Availability (HA) and Disaster Recovery (DR) capabilities to maximize the reliability of your solution during innovation.

Questions around HA and DR will pop up at some point in the project. The answers may lead to unexpected costs, depending on the technology stack you choose. In Azure, many services have already considered this as part of their offerings. For example, Azure Storage has several built-in replication options, such as Locally Redundant Storage, Zone-Redundant Storage, and Geo-Redundant Storage. Azure Cosmos DB offers replication to a different region at the click of a button. This means that when you choose the service or services in your solution, you already have answers for maintaining your solution effectively. You can then focus your resources on true innovation work, so it is important for your innovation process to include time spent in the Public Cloud's documentation investigating these important questions.
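As a small illustration of that decision, the helper below maps the Azure Storage redundancy options to the failure scope each protects against. The helper itself is a hypothetical sketch, though the replication descriptions match Azure's documented options.

```python
# Azure Storage redundancy options and the failure scope each covers.
REDUNDANCY = {
    "LRS": "three copies within a single datacenter (protects against drive or rack failure)",
    "ZRS": "three copies across availability zones in one region (protects against datacenter outage)",
    "GRS": "copies replicated to a paired secondary region (protects against regional outage)",
}

def pick_redundancy(needs_region_failover: bool, needs_zone_resilience: bool) -> str:
    """Choose the cheapest option that still meets the stated resilience needs."""
    if needs_region_failover:
        return "GRS"
    if needs_zone_resilience:
        return "ZRS"
    return "LRS"
```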

Takeaway

The Public Cloud offers capabilities that are well suited to innovation, and by understanding and following these best practices, we set ourselves up for turning our project into a success story.


David Lee

Cloud Solution Architect/Software/DevOps Engineer