You're probably no stranger to the pain of unexpected cloud bills. You know the feeling: you're cruising along, minding your own business, when suddenly you get hit with a bill that makes your heart stop. Too often, finance teams are left scrambling to explain budget variances to executives when cloud spend runs out of control.
Cloud cost optimization is the topic du jour, and executive pressure is on to rein in these costs. But with a lack of ownership and ever-increasing complexity, costs can quickly spin out of control. It's no wonder that finance teams feel like they're stuck in a never-ending cycle of forecasting, guessing, and hoping for the best.
But what if I told you that there was a better way? What if I told you that you could improve your cloud cost visibility and take control of your cloud costs?
I’ll share tips for what finance needs to include in a cloud cost forecast. It's time to stop feeling like your forecast is at the mercy of out-of-control cloud costs. Let's dive in!
First, I want to address the obvious but critical: actuals. Nothing warms the heart of a financial analyst more than when the actuals tie out in a financial model. Actuals are the foundation on which all good financial models are built.
For cloud costs, actuals are available in the reporting offered by the large cloud vendors, but that reporting is often insufficient for the needs of a finance team. Cloud costs present a few unique challenges around gathering actuals for finance.
Purchased direct or via reseller: Cloud services can be procured directly from the vendor or through a reseller. Resellers often offer value-added services such as support, professional services, and discount programs, which sit outside the standard reporting offered by cloud vendors directly.
Invoice date vs. usage date: Accounting tends to look at the invoice date to record the expense in the GL, whereas engineering operations tends to look at the actual usage date as reported in the cloud provider’s reporting solution. Depending on who you ask and when you ask them, the answer to “how much did we spend on cloud?” will be different.
Discounts, credits, market development funds (MDF): Cloud vendors use various incentives to attract customers to their platform. These incentives can be tracked at various levels of an account (for example, payer level or project level), so to get an apples-to-apples comparison, it is important to employ a strategy that allocates these incentives consistently.
Allocations, showbacks, chargebacks: Companies employ these tactics in order to gain a “true cost of ownership” and to bring cost accountability to teams for their cloud usage.
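To make the allocation point concrete, here is a minimal Python sketch of one possible strategy: spreading a payer-level credit across projects in proportion to their gross usage. The function name, the project names, and the proportional rule itself are my assumptions for illustration; your company may allocate incentives differently, and the important part is applying one rule consistently.

```python
def allocate_credit(usage_by_project, credit):
    """Spread a payer-level credit across projects in proportion to gross usage.

    Hypothetical helper: proportional allocation is only one possible policy.
    Returns the net cost per project after the credit is applied.
    """
    total = sum(usage_by_project.values())
    if total == 0:
        raise ValueError("no usage to allocate the credit against")
    return {
        project: round(cost - credit * cost / total, 2)
        for project, cost in usage_by_project.items()
    }


# Illustrative numbers only, not real billing data.
gross_usage = {"checkout": 6000.0, "search": 3000.0, "internal-tools": 1000.0}
net_cost = allocate_credit(gross_usage, credit=1000.0)
print(net_cost)
```

Because each project absorbs the credit in proportion to its usage, the net costs still add up to the payer-level total after the credit, which is what lets the showback tie out to the invoice.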
So what's a finance team to do? The key is to base your forecast on the data that will inform you when things go wrong (and trust me, they will go wrong), and provide the information needed to help you navigate future risks and opportunities.
Here are the buckets that need to be in your cloud cost forecast:
Vendors
Services
Environments
Products/Teams
Initiatives
Vendors
What is it? Vendors are providers of cloud services that underpin SaaS applications. As companies grow, they adopt a multi-cloud strategy and build increasingly complex tech stacks, causing the vendor list to explode. To calm the chaos, I tend to group cloud vendors into two buckets: Infrastructure and SaaS.
Infrastructure vendors: Provide the infrastructure on which SaaS applications run. These are the large public cloud providers such as AWS, GCP, and Azure.
SaaS vendors: Provide core services that help businesses run their end-user applications effectively, efficiently, and securely. The large public cloud vendors offer competing solutions in this bucket, but other companies include HashiCorp, MongoDB, Datadog, Cloudflare, GitHub, and Harness.
Why include it? Understanding spend by vendor allows finance teams to estimate their future spend with a vendor and leverage that information in contract negotiations. These contract commitments are material (enough so that they are often reported in an S-1 or 10-K), so better visibility leads to better optimization.
Services
What is it? These are the core cloud infrastructure services that run an application. Depending on the types of activities an application performs, the number of services can be large and the mix of services can vary widely. Infrastructure vendors in particular offer hundreds of SKUs within services, which complicates any analysis of cost trends. Here are the high-level buckets in which I tend to view these services:
Compute: These services provide virtual machines, containers, and serverless environments for running applications and workloads.
Storage and Databases: These services offer various storage solutions, including block, file, and object storage, depending on the system architecture. They also include a variety of database offerings, including relational and NoSQL databases.
Networking: These services help you set up, manage, and secure your networks and connectivity in the cloud.
Other: Examples include big data, analytics, AI/ML, security and identity, management, and developer tools. If spend in one of these categories is significant enough (>10% of overall cloud spend), then it is likely worth the effort to give it its own category. I try to keep the total services grouping to fewer than five categories, with the Other bucket being less than 10% of overall cloud spend.
Why include it? The underlying services that make up an application provide insights into the architecture of an application. Including these services empowers finance teams to accurately predict, optimize, and control cloud-related expenses.
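The grouping rule above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `CORE` tuple, the function name, and the sample figures are hypothetical, and the 10% threshold is the rule of thumb from the Other bucket described above.

```python
# Core buckets that are always reported on their own (assumed from the list above).
CORE = ("Compute", "Storage and Databases", "Networking")


def group_services(spend, threshold=0.10):
    """Roll service spend into a few buckets.

    Core categories keep their own line. Any other category gets its own
    line only if it exceeds `threshold` of total spend; the rest is
    summed into "Other".
    """
    total = sum(spend.values())
    grouped = {"Other": 0.0}
    for category, amount in spend.items():
        if category in CORE or (total and amount / total > threshold):
            grouped[category] = grouped.get(category, 0.0) + amount
        else:
            grouped["Other"] += amount
    return grouped


# Illustrative monthly spend, not real data.
monthly_spend = {
    "Compute": 500.0,
    "Storage and Databases": 250.0,
    "Networking": 100.0,
    "AI/ML": 120.0,
    "Security": 20.0,
    "Developer tools": 10.0,
}
print(group_services(monthly_spend))
```

In this example AI/ML is 12% of spend, so it graduates to its own line, while Security and Developer tools stay in Other at 3% combined.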
Environments
What is it? Environments are the groupings into which the development process is broken up.
Production: These are the environments that contribute to your cost of sales, as they are directly related to customer usage. Several environments can make up production, but common examples include production and staging. The categorization of production environments can be a gray area, but one way to cut through the ambiguity comes from Chris Ham, Head of Engineering Operations at Harness: “If these go down, we wake people up.”
Non-Production for product development: These environments are directly tied to the research and development of the product and therefore should be recorded as an R&D expense.
Other non-production environments: These environments are used by various teams across the company for specific reasons, and the recording of the expense should follow the team. Examples include demo environments for customer calls, POC environments for trials, customer success environments for testing, and training environments for educating users on the product.
Why include it? Separating costs by environment gives finance insight into how costs will scale over time as various cost drivers move up or down. Below are some examples of cost drivers used for forecasting.
Spend as a % of revenue: This is the ratio of cloud spend to revenue. While this metric may seem like an easy forecasting strategy, it's a lousy approach. Revenue may not track cloud spend because of onboarding times, license utilization, and pricing strategy, so this approach can lead to a false sense of security. Correlation != causation.
Customer usage metrics: These metrics are best used to forecast production environment costs. They track customer actions taken inside of a product, which trigger workloads to run and ultimately produce cloud costs. It is important to partner with engineering to identify a usage unit that is correlated with costs as this bucket has a direct impact on gross margin.
Number of developers writing code: This measure is best used to forecast non-production costs from product development activities. Analyze this trend over time to establish your company’s spend per developer and then work with engineering operations to push cost accountability down to engineering teams.
Active proofs of concept: This figure can be used to forecast sales and marketing environment costs, as customers use the product during a given time period. It is important to establish gatekeepers for these environments so presales costs don’t spin out of control.
Number of support tickets / number of CSMs: This can be used to predict environment costs related to customer support issues.
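Here is a rough Python sketch of what driver-based forecasting could look like for one environment, using spend per developer as the example. It assumes cost scales linearly with the driver, which is a simplification you should validate against your own actuals; the function names and figures are hypothetical.

```python
def unit_cost(actual_costs, driver_values):
    """Average cost per driver unit over the trailing periods."""
    total_driver = sum(driver_values)
    if total_driver == 0:
        raise ValueError("driver has no volume in the trailing periods")
    return sum(actual_costs) / total_driver


def forecast_costs(actual_costs, driver_values, planned_driver):
    """Project future cost as trailing unit cost times the planned driver.

    Assumes a linear relationship between the driver and cost.
    """
    rate = unit_cost(actual_costs, driver_values)
    return [round(rate * planned, 2) for planned in planned_driver]


# Illustrative non-production actuals and headcount, not real data.
nonprod_cost = [40_000.0, 42_000.0, 44_000.0]   # last three months of spend
developers = [100, 105, 110]                    # developers writing code
hiring_plan = [120, 130]                        # planned developers, next two months

print(unit_cost(nonprod_cost, developers))
print(forecast_costs(nonprod_cost, developers, hiring_plan))
```

The same two functions work for any of the drivers above: swap in customer usage units for production, active POCs for presales environments, or support tickets for customer success environments.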
Products/Teams
What is it? As companies grow and scale it is likely they will begin to add additional products and/or core features to their SaaS platform that will change the cloud cost spend profile. Additionally, the company may introduce new go-to-market strategies that will incentivize the customer to try the product before purchase.
Multi-product: All of the SaaS products your company sells for revenue that generate cloud costs.
Freemium: Limited use products that are offered free of charge (but still incur cloud costs).
Why include it? The mix of products and go-to-market strategies will have an outsized impact on the gross margin profile of your business, through not only the cost to serve but also the pricing strategy of each product.
Cost to serve: The underlying cost to run an application. Each application is a unique combination of services that produce cloud costs. Understanding how these costs scale is important to the expected future gross margin of a company. Additionally, engineering organizations are increasingly organized around products, which allows companies to benchmark against themselves and drive accountability across teams.
Pricing & Packaging: The biggest levers in a company's toolkit to expand gross margin. While pricing and packaging may not simply be a function of cost, it is crucial to understand the correlation between a licensed unit and a dollar of cloud cost growth.
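A small Python sketch shows why the product mix matters for gross margin. For simplicity it treats cloud cost as the only cost of sales, which is an assumption (real gross margin includes other costs), and the product names and numbers are made up.

```python
def gross_margin(revenue, cloud_cost):
    """Gross margin as a fraction of revenue, or None when there is no revenue.

    Simplification: cloud cost is treated as the only cost of sales.
    """
    if revenue == 0:
        return None
    return (revenue - cloud_cost) / revenue


# Illustrative annual figures, not real data.
products = {
    "Product A": {"revenue": 800_000.0, "cloud_cost": 160_000.0},
    "Product B": {"revenue": 200_000.0, "cloud_cost": 90_000.0},
    "Free tier": {"revenue": 0.0, "cloud_cost": 50_000.0},
}

for name, p in products.items():
    print(name, gross_margin(p["revenue"], p["cloud_cost"]))

total_revenue = sum(p["revenue"] for p in products.values())
total_cost = sum(p["cloud_cost"] for p in products.values())
print("Blended", gross_margin(total_revenue, total_cost))
```

The free tier has no margin of its own, but its cloud cost still drags down the blended figure, which is why freemium deserves its own line in the forecast.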
Initiatives
What is it? Initiatives are net new activities that will impact your cloud cost run rate. I tend to view these in two buckets:
Net New Workloads: These can be a mix of new products, features, or services that your engineering team needs to build. It's best to find a friend in FinOps or engineering who can relay these decisions in a finance-friendly way.
Cloud Cost Optimization Strategies: This bucket is worth a separate blog post in itself, but these approaches boil down to the following:
Contractual Discounts: The lowest-hanging fruit of the cost optimization strategies from a finance perspective. This strategy has no impact on the actual infrastructure controlled by engineering. Instead, it lets companies take advantage of discount programs such as reserved instances, savings plans, or enterprise discounts, and it is driven by finance and FinOps teams.
Resource Optimization: One step up from contractual discounts, resource optimization strategies begin to touch the actual infrastructure of a product. Examples include rightsizing, resource downtime scheduling, leveraging spot instances and cost anomaly detection.
Architectural: The hardest of the cost optimization strategies. These are longer-term engineering decisions that result from engineering choosing to do its work in a different way.
Why include it? Tracking initiatives allows finance teams to articulate and quantify the impact of strategic decisions.
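One way to quantify initiatives is to layer them on top of a baseline run rate. The Python sketch below assumes each initiative has a flat monthly impact that starts in a given month; the field names, the initiatives, and the dollar amounts are all hypothetical, and real initiatives usually ramp rather than switch on overnight.

```python
def apply_initiatives(baseline, months, initiatives):
    """Adjust a flat monthly baseline by each initiative's monthly impact.

    Each initiative is a dict with a `start_month` (1-based) and a
    `monthly_impact` (positive adds cost, negative is a saving).
    """
    forecast = []
    for month in range(1, months + 1):
        delta = sum(
            item["monthly_impact"]
            for item in initiatives
            if item["start_month"] <= month
        )
        forecast.append(baseline + delta)
    return forecast


# Illustrative initiatives, not real data.
initiatives = [
    {"name": "Savings plan", "start_month": 1, "monthly_impact": -8_000},
    {"name": "New product launch", "start_month": 2, "monthly_impact": 15_000},
    {"name": "Rightsizing", "start_month": 3, "monthly_impact": -5_000},
]
print(apply_initiatives(baseline=100_000, months=4, initiatives=initiatives))
```

Keeping each initiative as its own line makes it easy to show executives a bridge from the current run rate to the forecast, and to explain variances later when an initiative slips.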
In conclusion, the pain of unexpected cloud costs is a reality that finance teams are all too familiar with. But there is hope! By building a cloud cost model that allows you to forecast cloud costs at the right granularity, you can gain a clear understanding of what you're spending and where. In my next blog post, I’ll share practical advice on how to implement these factors and produce a cloud cost forecast. Stay tuned!