Between Q3 2024 and Q1 2026 we ran seven AI enablement engagements for mid-market companies (50-500 employees, varied industries: B2B SaaS, e-commerce, healthcare-tech, financial services, manufacturing). Each engagement followed a customized version of our AI enablement roadmap. Across the seven, six produced sustained AI capabilities; one was halted at month 4 by leadership change. This piece draws the cross-engagement lessons — what worked across all of them, what failed in specific contexts, and what we have updated in our framework as a result.
The engagements at a glance
| Engagement | Industry | Employees | Engagement length | Outcome |
|---|---|---|---|---|
| A | B2B SaaS (analytics) | 140 | 6 months | 4 production AI features, ongoing capability team |
| B | E-commerce (specialty retail) | 280 | 9 months | 2 customer-facing AI features, 1 internal automation |
| C | Healthcare-tech (devices) | 180 | 12 months | Internal AI tooling, no customer-facing AI deployment |
| D | Financial services (lending) | 360 | 8 months | 1 customer-facing AI; governance-heavy implementation |
| E | B2B SaaS (vertical CRM) | 95 | 5 months | 3 production AI features, AI-as-product-differentiator |
| F | Manufacturing (industrial software) | 420 | 10 months (planned) | Halted at month 4 due to CEO change; partial deliverables |
| G | B2B SaaS (developer tools) | 220 | 7 months | 5 production AI features, AI-native re-architecture |
The engagements varied in scope, duration, industry, and starting maturity. The cross-engagement patterns are what hold despite the variation.
Lesson 1: Discovery is undervalued; skip it at your peril
Of the seven engagements, three started by trying to skip Discovery — leadership had already decided what AI to build, and we were brought in to "execute the plan." In two of the three cases (Engagements C and F), the pre-decided plan turned out to be the wrong target.
Engagement C started with a mandate to build a customer-facing diagnostic-suggestion AI for medical devices. The Discovery work we insisted on revealed that:
- The data the AI would need was held by hospital partners under contracts that did not permit AI use
- The regulatory pathway for the customer-facing application was 18+ months and required clinical evidence the customer did not have
- Internal tooling for the medical device support engineers had data available, regulatory clarity, and clear ROI
We pivoted to internal tooling. The customer-facing application stayed on the roadmap as a 24-month plan with appropriate clinical and regulatory work.
Engagement F was halted before this lesson played out, but the same dynamic was visible: the pre-decided AI initiative was leadership's reaction to industry buzz rather than a use case grounded in customer or operational need.
The pattern: organizations are better at identifying that AI matters than they are at identifying which specific AI work matters for them. Structured Discovery surfaces the gap. We have updated our framework to make Discovery explicit and non-skippable, even when the customer believes they know the answer.
Lesson 2: Foundation work pays back across applications
The Foundation stage (data access, tooling baseline, governance, skill development) accounted for 30-45% of engagement effort across all seven. It produced no visible customer-facing output during the foundation phase. It also enabled every subsequent application to launch faster than it would have otherwise.
Engagement A invested heavily in evaluation infrastructure during weeks 5-10. The first AI feature launch in week 16 used the infrastructure end-to-end. The second feature launch in week 22 reused it without re-implementation. The third and fourth features by month 5 each took 3-4 weeks to ship instead of the 8-10 they would have taken without the infrastructure.
The pattern repeats. The temptation is to skip Foundation and ship the first AI feature faster. The cost is paid on every subsequent feature.
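The payback arithmetic from Engagement A can be sketched in a few lines. This is a rough illustration, not measured data: it uses only the ranges quoted above (3-4 weeks per feature with the evaluation infrastructure, 8-10 without), and the function name `weeks_saved` is hypothetical.

```python
def weeks_saved(with_infra, without_infra, features):
    """Return (low, high) weeks saved across `features` later features.

    `with_infra` and `without_infra` are (min_weeks, max_weeks) per feature.
    The low bound pairs the slowest "with" case against the fastest
    "without" case; the high bound does the opposite.
    """
    low = (without_infra[0] - with_infra[1]) * features
    high = (without_infra[1] - with_infra[0]) * features
    return low, high


# Ranges quoted for Engagement A's third and fourth features.
low, high = weeks_saved(with_infra=(3, 4), without_infra=(8, 10), features=2)
print(f"Estimated weeks saved across 2 features: {low}-{high}")
```

On these figures, two later features recover 8-14 weeks against roughly six weeks of infrastructure work (weeks 5-10), before counting any features shipped after the engagement ended.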
The framework update: we now budget Foundation work explicitly at 30-40% of engagement scope, and we say so upfront so the customer's expectations match the rhythm. The first feature ships in month 4-5, not month 1-2.
Lesson 3: Governance saves more time than it costs
This was counter-intuitive when we started. Documenting an acceptable use policy (AUP), vendor evaluation criteria, an incident response process, and a risk register felt like overhead that slowed engineering velocity. In practice, the engagements that built governance early shipped faster overall.
The mechanism: governance answers questions before they slow down individual decisions. When an engineer asks "can I send customer data to this AI vendor?", the AUP answers in seconds rather than triggering a multi-week security review. When an AI feature has a question about appropriate behavior, the use case review process resolves it in days rather than escalating to leadership.
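One way to picture "the AUP answers in seconds" is a policy table that engineers can query directly. The sketch below is a minimal illustration under stated assumptions: the vendor tiers, data classes, and the `can_send` function are all hypothetical and do not come from any engagement's actual policy. "DPA" here stands for a data processing agreement.

```python
from dataclasses import dataclass

# Hypothetical policy table: the data classes each vendor tier may receive.
# Tier and data-class names are illustrative, not taken from a real AUP.
ALLOWED_DATA = {
    "approved_with_dpa": {"public", "internal", "customer"},
    "approved_no_dpa": {"public", "internal"},
    "unreviewed": {"public"},
}


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def can_send(data_class: str, vendor_tier: str) -> Decision:
    """Answer 'can I send this data to this vendor?' from the policy table."""
    permitted = ALLOWED_DATA.get(vendor_tier)
    if permitted is None:
        # Tiers missing from the policy are denied by default.
        return Decision(False, f"vendor tier '{vendor_tier}' is not in the policy; request a review")
    if data_class in permitted:
        return Decision(True, f"'{data_class}' data is permitted for '{vendor_tier}' vendors")
    return Decision(False, f"'{data_class}' data is not permitted for '{vendor_tier}' vendors")


print(can_send("customer", "unreviewed"))
print(can_send("customer", "approved_with_dpa"))
```

The design choice worth noting is deny-by-default: anything the table does not cover goes to review, so the fast path only exists for questions the policy has already answered.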
Engagement D had the heaviest governance investment (financial services regulatory environment). The investment paid back when a customer's procurement team asked detailed AI governance questions in a contract negotiation. Having the documents ready saved 6 weeks of contract delay.
Engagement F (the halted one) had not yet built governance. When leadership changed, the new CEO asked "what are our AI policies?" and there was nothing to show. Governance work might not have saved the engagement, but its absence accelerated the loss of confidence.
The framework update: governance is now part of Foundation, not a later stage. We start the AUP and vendor evaluation criteria in week 2 of every engagement.
Lesson 4: The AI capability function structure depends on company stage and culture
Across the seven engagements, three different AI capability function structures landed; the halted engagement did not get far enough to choose one:
- Engagements A and E (smaller companies): Guild model, with one part-time coordinator
- Engagements B, D, and G (mid-stage companies): Hub-and-Spoke with 2-3 dedicated AI engineers
- Engagement C (regulated environment): Hybrid with a dedicated team for governance plus distributed engineers for application work
- Engagement F: Did not reach this stage
We had assumed at the start that the structure choice was primarily driven by company size. The engagements taught us that culture matters as much:
- Companies with strong traditions of product team autonomy (Engagements E and G) resisted central AI teams; Guild and Hub-and-Spoke fit better
- Companies with central platform engineering traditions (Engagements A and D) accepted dedicated AI structures more readily
- Companies in regulated industries (Engagement C, and partly D) needed dedicated governance functions even at smaller sizes
The framework update: our AI Center of Excellence piece now addresses culture as a primary input alongside size. The decision framework includes both.
Lesson 5: Customer-facing AI is harder to ship than internal AI; both have their place
The seven engagements produced 14 customer-facing AI features and 9 internal-facing AI tools. On average, the customer-facing features took 1.6x longer than the internal tools to go from concept to production. The reasons cluster:
- Customer expectations of AI behavior are higher than internal-user expectations
- Quality bars are stricter (a wrong answer in production has external visibility)
- Brand alignment requires more iteration on tone, voice, behavior
- Multi-tenancy and per-tenant configuration add engineering overhead
This is not a recommendation against customer-facing AI; sometimes that is the strategic priority. It is a recommendation to be realistic about timeline and to consider internal AI tools as faster paths to demonstrating value.
Engagement E went customer-facing-first because AI was the product differentiator. Engagements A and B went internal-first to build capability before exposing AI to customers. Both worked. The mistake we have not yet seen anyone make: shipping customer-facing AI as the first-ever AI capability with no internal AI experience at all.
The framework update: for engagements where the customer wants customer-facing AI as the first deliverable, we now recommend a parallel internal AI deployment to build the capability and surface the learnings before customer launch.
Lesson 6: Cost surfaces matter and surface late
Five of the seven engagements had AI cost surprises in production. The pattern: AI features launched, usage exceeded projections, monthly bills came in higher than budgeted, leadership asked questions, engineering scrambled to optimize.
The optimizations consistently produce a 30-60% cost reduction (per the GenAI cost framework piece). The work is not technically hard. The problem is that nobody owned cost during the build, so it surfaced only after launch.
The framework update: cost ownership is now assigned during build, not post-launch. Each AI application has a named cost owner from week 1. Cost reviews happen monthly, not quarterly. The framework adds explicit budget projection in the use case review process.
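A budget projection with a named owner can be very small. The sketch below is illustrative only: the `CostProjection` class, its fields, and the sample figures are hypothetical, and it assumes a flat blended cost per request, which real usage-based pricing rarely matches exactly.

```python
from dataclasses import dataclass


@dataclass
class CostProjection:
    application: str
    cost_owner: str            # named owner, assigned during build
    requests_per_month: int    # projected usage
    cost_per_request: float    # assumed flat blended cost, in dollars
    monthly_budget: float      # agreed in the use case review

    def projected_monthly_cost(self) -> float:
        return self.requests_per_month * self.cost_per_request

    def review(self, actual_requests: int) -> str:
        """Monthly review: compare actual spend with the agreed budget."""
        actual_cost = actual_requests * self.cost_per_request
        if actual_cost > self.monthly_budget:
            overrun = actual_cost - self.monthly_budget
            return (f"{self.application}: over budget by ${overrun:,.2f}; "
                    f"escalate to {self.cost_owner}")
        return (f"{self.application}: within budget "
                f"(${actual_cost:,.2f} of ${self.monthly_budget:,.2f})")


# Hypothetical application and numbers, for illustration only.
projection = CostProjection(
    application="support-assistant",
    cost_owner="named cost owner",
    requests_per_month=200_000,
    cost_per_request=0.01,
    monthly_budget=2_500.00,
)
print(f"Projected: ${projection.projected_monthly_cost():,.2f}")
print(projection.review(actual_requests=320_000))
```

The point is less the arithmetic than the record: a projection written down during build gives the monthly review something to compare against, and the owner field says who acts when usage exceeds it.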
Lesson 7: The buy-vs-build decision is per-use-case and reverses sometimes
Across the seven engagements, the build-vs-buy mix:
| Approach | Use cases | Outcome |
|---|---|---|
| Buy (turnkey vendor) | 11 | 8 successful, 3 abandoned/replaced |
| Build (custom on base models) | 6 | 5 successful, 1 abandoned |
| Hybrid (vendor model, custom application) | 14 | 12 successful, 2 in optimization |
The Buy abandonments cluster around customization limits — the vendor was good at the generic case but didn't support the specific behavior the customer wanted as the application matured. Two of the three abandoned Buy cases were rebuilt as Hybrid with reasonable success.
The Build abandonment was a case where the team built before validating with customers; the application worked but customers didn't use it. The fix would have been more Discovery, not different technology.
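For readers who prefer rates to counts, the table above can be summarized with a short calculation. The counts are copied from the table; the helper name `success_rate` is ours for illustration. With 31 use cases in total, the differences between approaches are small-sample observations, not statistically firm results.

```python
# Counts copied from the build-vs-buy table above.
outcomes = {
    "Buy": {"use_cases": 11, "successful": 8},
    "Build": {"use_cases": 6, "successful": 5},
    "Hybrid": {"use_cases": 14, "successful": 12},
}


def success_rate(row):
    """Share of use cases counted as successful in the table."""
    return row["successful"] / row["use_cases"]


for approach, row in outcomes.items():
    print(f"{approach}: {row['successful']}/{row['use_cases']} "
          f"successful ({success_rate(row):.0%})")

total = sum(row["use_cases"] for row in outcomes.values())
print(f"Total use cases: {total}")
```

Note that the two remaining Hybrid cases are still in optimization, not abandoned, so the Hybrid rate understates how many of those may yet succeed.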
The framework update: our build-vs-buy piece now explicitly acknowledges that the decision can reverse. Buy is the right starting point for unfamiliar use cases; Hybrid or Build becomes appropriate as the use case matures. Architectural choices in the Buy phase (data export, contract terms) should preserve the option to migrate.
Lesson 8: Executive sponsorship determines engagement success more than technical choices
The clearest cross-engagement signal: engagements with engaged executive sponsorship throughout produced strong outcomes; engagements where sponsorship faded mid-engagement produced weaker outcomes; the one engagement that was halted (F) lost sponsorship entirely.
What "engaged sponsorship" looks like in practice:
- The sponsor blocks competing priorities from displacing AI work
- The sponsor attends quarterly governance reviews
- The sponsor is willing to defend the engagement to other executives
- The sponsor has resources to allocate when surprises emerge
Technical choices matter, but they matter less than the political-organizational reality of who is defending the work in executive conversations.
The framework update: we now require named executive sponsor confirmation as part of engagement entry. We confirm the sponsor's continuing engagement at every monthly review. If sponsorship is wavering, we surface it immediately rather than continuing to ship deliverables that may not land.
What we are still working on
Three areas where the framework has not converged:
Agent capabilities. The agent worker pattern we describe in GenAI deployment patterns is moving fast. None of the seven engagements deployed agents in production at scale; one (G) shipped a limited agent-style internal tool. Our framework guidance on agents is currently tentative; we expect to update it materially as the underlying capability matures and we run more engagements with agent components.
Multi-modal applications. Two of the seven engagements considered multi-modal AI features (image understanding for product catalogs, audio for customer interactions). Both deferred. The capability exists; the operational maturity around multi-modal evaluation, governance, and cost is less developed than text-only.
AI-native organizational design. Engagement G (developer tools company) explicitly redesigned aspects of the engineering organization around AI capability. The lessons from this re-architecture are still developing. Whether mid-market should pursue AI-native re-architecture or absorb AI capability into existing structures is an open question we will write more on as the evidence matures.
What we won't do differently
A few things that worked across all seven engagements that we will not change:
- Discovery as a non-skippable stage. Strengthened, not removed.
- Foundation as 30-40% of engagement effort. Right level; not less.
- Governance from week 2. Has saved time on every engagement; will continue.
- Quarterly maturity assessments using the maturity model. Provides anchoring; we will keep.
- Honest pre-engagement scoping about timeline. Customers prefer realistic estimates that hold over optimistic estimates that slip.
What this means for organizations starting now
If you are starting an AI enablement program at a mid-market company today:
- Start with Discovery, even if you think you know the answer. Especially if you think you know the answer.
- Budget Foundation at 30-40% of effort. First customer-visible AI feature ships in month 4-5, not month 1-2.
- Build governance from the start, not after launch. It saves time you would otherwise spend later.
- Pick an AI capability function structure aligned to your company size and culture. Don't import the model from a company 10x your size.
- Plan customer-facing AI after internal AI experience. Use internal AI to build the capability before exposing AI to customers.
- Assign cost ownership during build. Don't let cost surface as a post-launch problem.
- Buy first; build later as the use case matures. Preserve the option to migrate.
- Confirm executive sponsorship is real and ongoing. It determines outcomes more than any technical choice.
FAQ
Q: Which industry's engagement was hardest? The healthcare-tech engagement (C) had the longest timeline due to regulatory complexity. The financial services engagement (D) had the most governance overhead. Both produced strong outcomes; the timelines and resource intensities were higher than in non-regulated environments.
Q: What's the typical engagement cost relative to the value created? Engagement cost was a small fraction of the year-1 value created in the successful engagements. The value calculus depends on what the AI capability enables (revenue, cost reduction, retention, etc.); the engagement cost was a one-time investment with multi-year payoff in successful cases.
Q: How did Engagement F's halt affect the framework? We now insist on documented executive sponsor confirmation at engagement start and at each monthly review. We also surface concerns about sponsorship more directly when we observe them. Whether this would have saved Engagement F is uncertain; the leadership change was outside the engagement's scope.
Q: Are there engagements you would not take on? Yes. We decline engagements where: AI is being adopted as a reactive response to industry buzz without strategic grounding; the executive sponsor is not engaged at the start; the timeline expectation is unrealistic (weeks, when the work takes months); or the technical foundation is so weak that the engagement would be entirely platform engineering. We have declined four engagements in the last 18 months for these reasons.
Q: How do you decide between in-house AI hire vs consulting engagement? For sustained internal AI capability building, an in-house hire is the right path. For a defined-scope program with clear deliverables, an engagement makes sense. We sometimes work alongside in-house AI hires — they handle ongoing operations while we provide framework, structure, and outside perspective during the program's setup window.
*For the broader framework these engagements applied, see our pillar on the AI enablement roadmap for mid-market. For the specific structures and practices that emerged from cross-engagement learning, see our pieces on AI Center of Excellence structure, LLM governance framework, and the AI enablement maturity model.*


