A FinOps framework for 50-500 person companies

Mid-market FinOps is the discipline of managing cloud spend at companies large enough that the bill is material (typically $30k-$1M per month) but too small to justify a dedicated FinOps team. The standard FinOps Foundation framework was designed for enterprise; the standard cost-tool playbooks were designed for hyperscale. Neither maps cleanly to mid-market. This piece describes the framework we use across mid-market engagements — four practices, run by an existing platform engineer with 25-30% of their time, organized around the constraints that actually bind at this scale.

Why mid-market FinOps is different

Three properties define the mid-market position and shape what works:

Bill size is interesting but not strategic. A $200k/month cloud bill is enough to merit attention from the CFO but not enough to justify a director-level FinOps hire. The investment that pays back has to be delivered by existing staff.
The engineering team is the FinOps team. There is no separate function. The platform engineer who sets up the EKS cluster is the same person who has to think about the cost of the EKS cluster.
The bill is dominated by a small number of services. At enterprise scale, cloud spend is distributed across hundreds of services. At mid-market, 80% of spend is typically in 5-8 services. The optimization surface is smaller and more tractable.

These three properties collapse the FinOps problem into something more focused than enterprise frameworks suggest. We do not need 47 capabilities; we need four practices, run well, by people who also do other things.

The Atlas approach below is not the only framework that works. CloudKeeper, Vantage, and Apptio all publish quality material — when they own a topic, we route readers there. What we are doing here is the layer those vendors deprioritize because their economics target larger customers.

The four practices

Practice	What it produces	Who owns it	Time cost
Visibility	Clear, current view of where the money goes, by team and product	Platform engineering	Setup: 1 week. Ongoing: 1 hr/week
Commitment	The right level of reserved capacity / savings plans for steady-state workloads	Platform engineering + finance	Setup: 2 weeks. Ongoing: 1 hr/month
Rightsizing	Compute, storage, and managed-service tier selection that matches actual usage	Platform engineering	Setup: 2 weeks. Ongoing: 2 hrs/week
Lifecycle	Decommissioning of unused resources; review of the resources that remain	Platform engineering	Setup: 1 day. Ongoing: 1 hr/week

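The scheduled tasks in the table add up to a little over four hours per week. The budget we plan for is higher, roughly 25-30% of one platform engineer, to leave room for the unscheduled work the table does not itemize: investigating anomalies, following up with workload owners, and carrying out the changes the reviews recommend. This is the mid-market budget. Anything that requires more is over-engineered for this scale.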

We will walk through each practice: what good looks like, what failure looks like, and how to assess where you are.

Practice 1: Visibility

Visibility is the answer to "where does the money actually go." Without it, every other practice is guesswork.

The minimum useful visibility model:

Per-team / per-product attribution. Every dollar of cloud spend should map to a named team and a named product. The standard mechanism is tagging — every resource carries team, product, and environment tags applied at provision time. Untagged resources go into an "unattributed" bucket that should be under 5% of spend.
Monthly trend per service. A simple line chart per service (EC2, RDS, S3, EKS, the ones that matter for you) showing month-over-month spend. Not a sophisticated dashboard — just the basics, visible to the engineering team and to finance.
Anomaly alerts. Email or Slack notification when a service's daily spend deviates >30% from its 30-day average. AWS Cost Anomaly Detection, Vantage, or a simple cron job querying Cost Explorer all work.
Per-engineer cost awareness loop. When an engineer provisions resources, they should see the cost implication. This is harder than the others to build; the easiest version is a quarterly review where each team sees their attributed spend.
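The anomaly rule in the list above (daily spend more than 30% away from its 30-day average) is simple enough to sketch directly. This is a minimal illustration, assuming you already have one service's daily spend figures from a billing export; `find_anomalies`, its parameters, and the sample numbers are hypothetical and not part of any cloud SDK.

```python
from statistics import mean


def find_anomalies(daily_spend, window=30, threshold=0.30):
    """Return (day_index, spend, trailing_avg) for days that deviate more
    than `threshold` from the average of the previous `window` days.

    `daily_spend` is a list of daily costs for one service, oldest first.
    """
    anomalies = []
    for i in range(window, len(daily_spend)):
        trailing_avg = mean(daily_spend[i - window:i])
        if trailing_avg == 0:
            continue  # no baseline to compare against
        deviation = abs(daily_spend[i] - trailing_avg) / trailing_avg
        if deviation > threshold:
            anomalies.append((i, daily_spend[i], trailing_avg))
    return anomalies


# Illustrative data: 30 flat days at $100, one slightly higher day, then a spike.
history = [100.0] * 30 + [110.0, 160.0]
for day, spend, avg in find_anomalies(history):
    print(f"day {day}: ${spend:.2f} vs trailing average ${avg:.2f}")
```

A cron job built on this would post the flagged days to Slack or email. A managed service such as AWS Cost Anomaly Detection does the same job with less to maintain.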

What good looks like: an engineering manager can answer "what is my team's monthly cloud spend, and why" in under five minutes without help. A finance partner can answer "which product line is driving the cloud cost growth this quarter" without an engineer's help.

What failure looks like: the monthly bill arrives, finance forwards it to engineering leadership, leadership forwards it to the platform team to "look into," and every month 80% of the discussion is about the same three services, with no underlying instrumentation to say why.

Maturity self-check:

Question	Yes / No
Can you produce a per-team monthly spend report in under 30 minutes?
Is "unattributed" spend under 5% of total?
Do anomaly alerts trigger before someone notices on the bill?
Has each engineering team seen their spend in the last 90 days?

Three or four "yes" — visibility is in good shape. Two or fewer — start here.
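The "unattributed under 5%" question can be answered mechanically once billing line items carry tags. The sketch below assumes each line item has been loaded as a dict with a `cost` and a `tags` mapping. That shape, the function names, and the sample data are illustrative assumptions, not the format of any particular billing export.

```python
from collections import defaultdict

REQUIRED_TAGS = ("team", "product", "environment")


def attribute_spend(line_items):
    """Sum cost per team; items missing any required tag go to 'unattributed'."""
    totals = defaultdict(float)
    for item in line_items:
        tags = item.get("tags", {})
        if all(tags.get(key) for key in REQUIRED_TAGS):
            totals[tags["team"]] += item["cost"]
        else:
            totals["unattributed"] += item["cost"]
    return dict(totals)


def unattributed_share(totals):
    """Fraction of total spend sitting in the 'unattributed' bucket."""
    total = sum(totals.values())
    return totals.get("unattributed", 0.0) / total if total else 0.0


# Hypothetical line items for illustration only.
items = [
    {"cost": 600.0, "tags": {"team": "payments", "product": "checkout", "environment": "prod"}},
    {"cost": 360.0, "tags": {"team": "search", "product": "catalog", "environment": "prod"}},
    {"cost": 40.0, "tags": {"team": "search"}},  # missing product and environment
]
totals = attribute_spend(items)
share = unattributed_share(totals)
print(totals)
print(f"unattributed: {share:.1%} (target: under 5%)")
```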

Practice 2: Commitment

Once you can see steady-state spend, the next question is: how much of it should be on committed capacity (Reserved Instances, Savings Plans, Committed Use Discounts) versus on-demand?

The mid-market trap: companies either over-commit (locking up capital on capacity they won't use) or under-commit (paying on-demand premium for predictable workloads). The right answer depends on workload predictability, growth trajectory, and free cash flow.

Our default for mid-market: commit to 60-70% of trailing 90-day baseline compute spend with 1-year terms (not 3-year). This captures most of the discount while preserving optionality. The 3-year terms make sense at enterprise scale where the workload mix is more stable; at mid-market, where a re-architecture or a vendor change can shift the workload mix substantially in 18 months, 1-year terms hedge that risk.

The mechanics:

AWS: Compute Savings Plans for flexibility across instance families and regions. EC2 Instance Savings Plans for steady-state on a specific instance family. Reserved Instances for RDS, ElastiCache, OpenSearch.
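GCP: Committed Use Discounts (CUDs). Flexible (spend-based) CUDs for compute that may move across machine series and regions; resource-based CUDs for predictable workloads on a specific machine series in a specific region.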
Azure: Reserved VM Instances and Azure Savings Plan for compute.

The mistake we see most often: committing based on a single peak month. If your spend was $100k in March because of a launch, don't commit to $100k baseline. Commit to the 90-day trailing average minus a buffer for new workloads coming online.
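To make the difference between a peak-month and a trailing-baseline commitment concrete, here is a rough sketch. It converts the result to an hourly figure because AWS Savings Plans are purchased as an hourly commitment. It ignores the gap between on-demand-equivalent spend and the discounted commitment rate, and it omits the buffer mentioned above, so treat it as sizing intuition rather than a purchase calculation. `commitment_target` and the sample figures are hypothetical.

```python
from statistics import mean


def commitment_target(daily_compute_spend, coverage=0.65):
    """Hourly on-demand-equivalent spend to cover with commitments.

    `daily_compute_spend` is the trailing window of daily compute cost.
    `coverage` is the share of the baseline to commit (0.60-0.70 in the text).
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    baseline_per_day = mean(daily_compute_spend)
    return coverage * baseline_per_day / 24


# Illustrative: 60 ordinary days at $2,000/day followed by a 30-day launch at $3,300/day.
trailing_90 = [2000.0] * 60 + [3300.0] * 30
peak_month_only = trailing_90[-30:]

print(f"from 90-day baseline: ${commitment_target(trailing_90):.2f}/hour")
print(f"from peak month only: ${commitment_target(peak_month_only):.2f}/hour")
```

Sizing from the launch month alone produces a noticeably larger commitment than the 90-day view. If spend falls back after the launch, that difference is committed capacity that may go unused.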

The other common mistake: ignoring commitment because "we are growing too fast." High-growth companies have *more* steady-state baseline workloads — the legacy services that keep running while new ones get added. Commit to that floor; let growth happen on on-demand.

Commitment is the single highest-ROI practice for most mid-market companies in our engagement data. The work is one or two days of analysis; the savings are 10-25% of the relevant compute spend, recurring monthly.

Practice 3: Rightsizing

Rightsizing is matching the size and type of resources to actual workload requirements. It is the practice that produces the most engineering content because it is the most technically interesting; it is also the practice where the cost-to-savings ratio is least favorable.

The mid-market discipline: rightsize the largest resources by spend (the top 20 is a workable cut) and ignore the long tail. A $40/month over-provisioned RDS instance is not worth an engineer's afternoon. A $4000/month over-provisioned EKS cluster is.

The standard rightsizing surface:

Compute: Are EC2 / GCE / Azure VM instances matched to workload CPU and memory utilization? AWS Compute Optimizer, GCP Recommender, and equivalent surfaces will tell you. The rightsizing engine recommends; the engineer decides whether the workload is actually steady or has burst characteristics that argue for a slightly larger instance.
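Storage: Are EBS volumes / GCP persistent disks at appropriate tiers? Moving from gp2 to gp3 on AWS is usually a free win: gp3 is priced lower per GB and lets you provision IOPS and throughput independently of volume size. Provisioned throughput relative to capacity is often wrong for the workload when left at defaults.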
Managed services: RDS instance class, OpenSearch sizing, MSK broker count. Each has its own rightsizing surface and each is worth periodic review.
Containers: Pod requests and limits in Kubernetes. The gap between requested resources and actual usage is often 3-5x at companies that have not tuned them. Vertical Pod Autoscaler in recommender mode will surface this.
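The request-versus-usage gap for containers can be checked with a few lines once you have both numbers per workload. The sketch below assumes CPU requests and an observed usage figure (for example a p95 from your metrics system) in millicores. The field names, the 3x cutoff, and the sample workloads are illustrative assumptions, and Vertical Pod Autoscaler's own recommendations remain the better source for target values.

```python
def overprovisioned(workloads, min_ratio=3.0):
    """Return (name, ratio) for workloads whose CPU request is at least
    `min_ratio` times observed usage, largest ratio first."""
    flagged = []
    for w in workloads:
        if w["cpu_used_m"] <= 0:
            continue  # no usage data; treat as a lifecycle question instead
        ratio = w["cpu_request_m"] / w["cpu_used_m"]
        if ratio >= min_ratio:
            flagged.append((w["name"], round(ratio, 1)))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Hypothetical workloads; values are millicores.
workloads = [
    {"name": "api", "cpu_request_m": 2000, "cpu_used_m": 400},
    {"name": "worker", "cpu_request_m": 1000, "cpu_used_m": 800},
    {"name": "cron", "cpu_request_m": 500, "cpu_used_m": 150},
]
flagged = overprovisioned(workloads)
print(flagged)
```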

What good looks like: the largest 20 resources by spend get a quarterly rightsizing review. Recommendations are evaluated by the team that owns the workload. Decisions to act or not act are documented.

What failure looks like: rightsizing recommendations accumulate in a dashboard nobody reads. Or, conversely, an over-zealous monthly review that produces churn — instances resized smaller, workloads degrade, instances resized larger, with no net effect on spend.

The discipline is to act on the high-confidence cases (the recommendations have been stable for 30+ days, the workload is well-understood) and ignore the noise.
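The "high-confidence cases only" rule can be written down as a filter. This sketch assumes each recommendation has been exported with a monthly cost, the number of days the recommendation has been unchanged, and a flag set by the owning team. All field names and sample rows are hypothetical, not the schema of AWS Compute Optimizer or GCP Recommender.

```python
def actionable_recommendations(recs, top_n=20, min_stable_days=30):
    """Keep recommendations that are in the top_n resources by spend,
    have been stable for min_stable_days, and have an understood workload."""
    top = sorted(recs, key=lambda r: r["monthly_cost"], reverse=True)[:top_n]
    return [
        r for r in top
        if r["stable_days"] >= min_stable_days and r["workload_understood"]
    ]


# Hypothetical recommendations for illustration.
recs = [
    {"resource": "eks-prod", "monthly_cost": 4000, "stable_days": 45, "workload_understood": True},
    {"resource": "rds-main", "monthly_cost": 2500, "stable_days": 10, "workload_understood": True},
    {"resource": "rds-dev", "monthly_cost": 40, "stable_days": 90, "workload_understood": True},
]
selected = actionable_recommendations(recs, top_n=2)
print([r["resource"] for r in selected])
```

Here `rds-dev` is dropped as long tail despite a stable recommendation, and `rds-main` is deferred until its recommendation has held for 30 days.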

Practice 4: Lifecycle

Lifecycle is the practice of decommissioning resources that nobody uses anymore, and reviewing the ones that remain to confirm they should still exist.

This is the unglamorous practice. It is also the one that produces the largest one-time savings in most engagements.

The lifecycle review checklist:

Idle resources. Compute instances at <5% CPU for 14+ days. Storage volumes attached to nothing. Load balancers with zero traffic. Each public cloud has a "trusted advisor"-style surface for these; check it monthly.
Orphaned resources. Snapshots whose source volume is gone. Old AMIs nobody references. NAT gateways in environments that have been deprecated.
Test and dev resources outside business hours. Non-production workloads do not need to run 24/7. Scheduling them down nights and weekends is a 50-65% cost reduction on those resources.
Forgotten projects. Workloads launched for an initiative that ended. The team that owned them has moved on. Nobody has thought to turn them off.
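The idle-compute criterion from the checklist (under 5% CPU for 14 or more days) translates into a short check. This is a sketch assuming you have one average CPU percentage per day for an instance, oldest first. `is_idle` and its defaults are illustrative, and the native advisor surfaces apply their own definitions.

```python
def is_idle(daily_avg_cpu_pct, threshold_pct=5.0, min_days=14):
    """True if each of the most recent `min_days` days averaged below
    `threshold_pct` CPU. Returns False when there is too little history."""
    if len(daily_avg_cpu_pct) < min_days:
        return False  # not enough data to call it idle
    return all(cpu < threshold_pct for cpu in daily_avg_cpu_pct[-min_days:])


# Hypothetical series: busy for ten days, then quiet for fourteen.
went_quiet = [50.0] * 10 + [1.0] * 14
one_busy_day = [2.0] * 13 + [40.0]

print(is_idle(went_quiet))    # quiet for the full 14-day window
print(is_idle(one_busy_day))  # a single busy day breaks the streak
```

CPU alone can mislead for memory-bound or network-bound workloads, so a flagged instance is a prompt to ask the owner, not proof that it is unused.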

The lifecycle audit on a typical mid-market environment uncovers savings of 7-14% of compute spend in the first month. Once the recurring monthly review is in place, savings continue at a smaller rate.

The reason this practice is not done well at mid-market: it requires saying "this is unused, we are turning it off" with confidence. Engineers who don't own the workload are reluctant to recommend the kill. The fix is process — a monthly lifecycle review with a default-to-decommission rule for resources that have been flagged for two consecutive cycles, with the workload owner having the burden of justifying continued existence.
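The default-to-decommission rule is easier to apply consistently if the review keeps a small amount of state between cycles. The sketch below is one possible reading of the rule, assuming a flag streak resets when a resource is not flagged or when its owner justifies keeping it. The function, its arguments, and the resource ids are hypothetical.

```python
def review_cycle(flag_counts, flagged_now, justified=frozenset(), limit=2):
    """Advance one monthly lifecycle review.

    flag_counts: dict of resource id -> consecutive cycles flagged so far
    flagged_now: set of resource ids flagged in this cycle
    justified:   ids whose owner has justified continued existence
    Returns (new_flag_counts, to_decommission).
    """
    new_counts = {}
    to_decommission = []
    for rid in sorted(flagged_now):
        if rid in justified:
            continue  # owner met the burden of justification; streak resets
        count = flag_counts.get(rid, 0) + 1
        if count >= limit:
            to_decommission.append(rid)
        else:
            new_counts[rid] = count
    # Resources not flagged this cycle are dropped, which resets their streak.
    return new_counts, to_decommission


# Cycle 1: three resources flagged for the first time.
counts, kill = review_cycle({}, {"vol-a", "lb-b", "nat-c"})
# Cycle 2: two flagged again; the owner of lb-b justifies keeping it.
counts, kill = review_cycle(counts, {"vol-a", "lb-b"}, justified={"lb-b"})
print(kill)
```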

Where these practices interact

The four practices are not independent. They reinforce in specific ways:

Visibility enables commitment. You can't commit appropriately without knowing the steady-state baseline by service.
Visibility enables lifecycle. You can't decommission what you can't see.
Lifecycle precedes rightsizing. Don't rightsize a resource you might be turning off.
Commitment follows rightsizing. Don't commit to capacity you're about to reduce.

The order to attack: visibility first, lifecycle second, rightsizing third, commitment fourth. Note that this is the order of execution, not the order in which the practices are numbered above. We have done it in this order on every mid-market engagement we have run. Other orders work too; this one minimizes wasted work.
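If the four interaction statements are treated as strict ordering constraints, the recommended sequence falls out of a topological sort. The sketch below uses Python's standard-library `graphlib` (Python 3.9+); the `prerequisites` mapping is our encoding of those four statements.

```python
from graphlib import TopologicalSorter

# Each practice maps to the practices that should come before it.
prerequisites = {
    "commitment": {"visibility", "rightsizing"},  # needs a baseline; follows rightsizing
    "lifecycle": {"visibility"},                  # can't decommission what you can't see
    "rightsizing": {"lifecycle"},                 # don't rightsize what you may turn off
}

order = list(TopologicalSorter(prerequisites).static_order())
print(order)  # ['visibility', 'lifecycle', 'rightsizing', 'commitment']
```

With these constraints the ordering is unique, because each practice has exactly one valid successor. In practice the constraints are softer than this, which is why other orders can still work.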

Implementation: a 90-day plan

For a mid-market company with no existing FinOps practice, a realistic 90-day plan:

Days 1-14: Visibility setup. Tag enforcement on new resources via CI policy (Terraform module, Cloud Custodian, or equivalent). Per-team monthly attribution report. Anomaly alerts wired to the engineering Slack.

Days 15-30: Lifecycle audit. Idle resources, orphaned resources, after-hours scheduling for non-prod. Document what was killed and the savings.
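The after-hours scheduling item can be sized before doing the work. The sketch below computes the fraction of always-on hours avoided when non-production resources run only for a set number of hours on weekdays. It covers hourly compute charges only, since attached storage generally keeps billing while instances are stopped. `schedule_savings` is a hypothetical helper.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def schedule_savings(hours_on_per_day, days_on_per_week=5):
    """Fraction of always-on hours avoided by running only on a schedule."""
    hours_on = hours_on_per_day * days_on_per_week
    if not 0 <= hours_on <= HOURS_PER_WEEK:
        raise ValueError("scheduled hours must fit within one week")
    return 1 - hours_on / HOURS_PER_WEEK


for hours in (12, 14, 16):
    print(f"{hours}h x 5 weekdays: {schedule_savings(hours):.0%} of hours avoided")
```

Schedules of 12 to 16 hours on weekdays avoid roughly 52-64% of running hours, which is where the 50-65% range for non-production scheduling comes from.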

Days 31-60: Rightsizing pass on the top 20 resources by spend. Workload-owner review of each recommendation. Apply the high-confidence ones; defer the rest.

Days 61-90: Commitment analysis. Ninety-day baseline computation, commitment recommendation, finance review, purchase. Set up the next quarterly review.

After 90 days, the practice runs ongoing at the 25-30% of one engineer's time we mentioned. The first 90 days are higher intensity; sustaining costs are lower.

Tooling considerations

The mid-market tooling decision usually comes down to: native cloud surfaces, third-party platform, or in-house dashboards.

Native cloud surfaces (AWS Cost Explorer + Compute Optimizer, GCP Cost Management + Recommender, Azure Cost Management + Advisor) are sufficient for most mid-market needs. Free, integrated, fine.
Third-party platforms (CloudKeeper, Vantage, Apptio Cloudability, Anodot, Spot.io, ProsperOps for commitment automation) add value when you have multi-cloud, when finance needs a cleaner UI than the native surfaces provide, or when commitment management is a meaningful percentage of spend and the automation pays back.
In-house dashboards built on the cloud's billing exports (CUR for AWS, BigQuery exports for GCP, Cost Management exports for Azure) are appropriate if you have one engineer who wants to own the data model and you have non-standard attribution requirements.

We have run engagements with all three. For most mid-market customers, native surfaces plus one specialist tool (often ProsperOps for commitment automation) is the right configuration. CloudKeeper or Vantage become more valuable as you cross $500k/month spend or add a second cloud.

Where this framework breaks

Honest limitations:

It primarily assumes a single cloud. Multi-cloud at mid-market scale adds attribution and commitment complexity that the framework does not detail.
It does not address data egress as a primary line item. For media or data-heavy companies where egress is a significant share of spend, that needs separate treatment.
It does not cover SaaS-tool spend. This is cloud infrastructure; SaaS license optimization (Datadog, Snowflake, Mongo Atlas) is adjacent and benefits from related practices but is not the same problem.
It assumes the engineering team has time to give to this. At companies in active fundraising or a scaling crisis, the 25-30% of an engineer doesn't exist. The framework still applies; the timeline stretches.

How to assess where you are

Score 0-3 on each practice:

0: Not in place
1: Documented intention; no implementation
2: Implemented for new workloads; legacy workloads grandfathered
3: Implemented + measured + reviewed regularly

Practice	0	1	2	3
Visibility	No per-team attribution	Tagging policy exists, not enforced	Tags enforced via CI; monthly reports run	+ per-engineer awareness loop
Commitment	All on-demand	One-off RI / SP purchases ad hoc	1-year commitments matched to baseline	+ quarterly reassessment
Rightsizing	No recommendations consumed	Recommendations reviewed but rarely acted on	Top 20 rightsized quarterly	+ autoscaling tuned + container-resource right-sized
Lifecycle	No process	Sporadic cleanups	Monthly lifecycle audit	+ scheduled non-prod shutdown

Score interpretation:

0-4: Foundational. The work is mostly process and tagging discipline. An internal lead can drive this without external help.
5-8: Walking. Largest remaining ROI is usually commitment + lifecycle. Pick the lower-scored practice and bring it to a 2 before adding sophistication elsewhere.
9-12: Mature for mid-market. Sustaining work; consider whether scale (toward enterprise FinOps) justifies investment in advanced practices the framework does not detail.
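The scoring above is small enough to automate for a recurring self-assessment. This sketch sums the four scores, maps the total to the bands just listed, and lists the practices still below a 2 in the attack order recommended earlier. The band logic comes from the text; the function name, the `below_two` output, and the sample scores are our own illustration.

```python
ATTACK_ORDER = ("visibility", "lifecycle", "rightsizing", "commitment")


def assess(scores):
    """Return (total, band, below_two) for a dict of practice -> score (0-3)."""
    for name in ATTACK_ORDER:
        if scores.get(name) not in (0, 1, 2, 3):
            raise ValueError(f"{name} needs a score from 0 to 3")
    total = sum(scores[name] for name in ATTACK_ORDER)
    if total <= 4:
        band = "Foundational"
    elif total <= 8:
        band = "Walking"
    else:
        band = "Mature for mid-market"
    # Practices to bring up to a 2 before adding sophistication elsewhere.
    below_two = [name for name in ATTACK_ORDER if scores[name] < 2]
    return total, band, below_two


# Hypothetical self-assessment.
result = assess({"visibility": 2, "commitment": 1, "rightsizing": 1, "lifecycle": 2})
print(result)
```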

FAQ

Q: At what cloud spend level does this framework start to pay back? Roughly $30k/month. Below that, the procedural overhead exceeds the savings. The break-even rises slightly for companies with very small, simple cloud footprints.

Q: How does this differ from the FinOps Foundation framework? The FinOps Foundation framework is comprehensive and capability-based, written for organizations large enough to staff a FinOps team. Ours is a focused subset designed to be runnable by an existing platform engineer with no dedicated FinOps role. We use the FinOps Foundation framework for enterprise engagements; the four-practice version is the mid-market specialization.

Q: We're growing 100% year-over-year. Is this framework still useful? Yes, with timeline adjustments. High-growth companies benefit most from visibility (cost-per-customer attribution becomes critical for unit economics) and commitment (the steady-state floor still grows even as growth happens on top of it). Rightsizing matters less because workloads keep shifting.

Q: We're using a serverless-heavy architecture. Does this still apply? Mostly. Visibility and lifecycle apply directly. Commitment is reduced: Lambda has no Reserved Instances, although Compute Savings Plans do cover Lambda and Fargate usage, and some serverless databases offer reserved capacity. Rightsizing shifts toward function memory tuning and database tier selection. The framework structure is portable; the specific controls under each practice differ.

Q: How do we handle the case where one team's workloads dominate cloud spend? The framework still organizes the work; the per-team attribution makes the dominance visible and informs where to focus. Sometimes the answer is to push cost-awareness to that team's leadership. Sometimes the answer is that the workload is doing important work and the spend is justified.

Q: What's the right tool budget for mid-market FinOps? Most mid-market companies should start with native cloud surfaces (free) plus possibly one specialist tool. Realistic budget for tooling: 1-2% of cloud spend. If you are spending more on FinOps tooling than 2% of cloud spend, the tooling has crossed into the "not worth it for your scale" zone.

*If you are evaluating where your organization sits against this framework and would like a working session to map the next 90 days, our team can help. The framework is open; you can run it without engaging us.*

Mohit Sharma