Stop Counting How Many People Use ChatGPT. Start Counting Agents That Ship Work.
Your last board update had a slide with an adoption percentage on it. Somewhere underneath that number was a count: how many of your people logged into ChatGPT, or Copilot, or Gemini, at least once that month.
That number is worthless.
It tells your board nothing about whether AI is shipping work. It tells your CFO nothing about ROI. It tells your COO nothing about throughput. It is the AI-era version of measuring developer productivity by lines of code. Countable. Comfortable. Wrong.
The metric your board actually wants, even if they have not figured out how to ask for it yet, is this: how many agents do you have in production today that ship work no human had to do?
That is a smaller number. It is also the only one that matters.
Why "seats" became the default metric
Vendors sell seats. So vendors report on seats. The dashboard your CIO bought from the vendor measures seats because the vendor's ARR is tied to seats.
When the CIO walks into the board meeting, the only data they have at hand is the seat data. So that becomes the adoption metric. Then it becomes the success metric. Then it becomes the budget justification.
None of those transitions were ever debated. They happened because the dashboard existed and nobody built a better one.
This is how you end up reporting near-universal ChatGPT adoption while your billing rates, throughput, and margin sit where they were a year and a half ago. The metric was measuring activity. The board thought it was measuring outcome. Nobody noticed until the second year of spend, when the CFO asked the question out loud.
Three composite patterns of the wrong-metric trap
The three situations below are illustrative composites, not single clients. Names, dollar figures, and headcounts are not real. The shape of each failure is. I have seen versions of each one repeat across service-firm AI rollouts.
The mid-sized agency that adopted Copilot and got nothing
A mid-market digital agency rolls out Microsoft Copilot company-wide. Six months in, the IT director reports high monthly active usage. The CEO puts it in the board deck.
Employees save time on email drafting and meeting summaries. Across the company that aggregates to a number that sounds enormous.
None of it converts into billable work. None into new client wins. None into reduced headcount. The savings disappear into longer lunches, more Slack, and a slightly less stressful Friday afternoon. The agency pays meaningful license money for Copilot and gets back essentially zero P&L impact.
The metric said success. The P&L said nothing happened.
The legal services firm at full ChatGPT adoption
A mid-market legal services firm gives every attorney and paralegal a ChatGPT Enterprise seat. Adoption is essentially universal within months. The Managing Partner cites it in industry interviews.
Billing rates do not move. Realization rates do not move. Matter throughput does not move. The associates use ChatGPT to draft faster, which means they bill fewer hours per matter, which means revenue per matter goes down even as productivity goes up. The firm's comp model rewards hours billed, not work shipped. So the AI investment is actively eroding revenue while the adoption dashboard says everything is green.
The metric said success. The compensation model said the opposite of what AI was doing.
The PE-backed services co with many stalled pilots and one shipping agent
A PE-backed business services company runs a portfolio of pilots across most of its functions. The Chief Transformation Officer reports "AI activity in every function." The board gets a green slide.
In reality, only one agent is shipping work. A contract review agent in legal-ops processes a steady volume of contracts, replacing meaningful paralegal capacity. The other pilots are dormant or producing summaries no decision-maker is using.
The Chief Transformation Officer has been reporting pilot count as the metric. The board thought the company had AI capabilities across the portfolio. It had one. The rest were theater.
The metric said breadth. The reality was concentration.
The metric that actually works
Replace adoption with this:
Production agents in operation × average time saved per task × average tasks shipped per agent per period = value created, in hours.
Then map every production agent to one of three horizons. This is the McKinsey Three-Horizon framework, originally from Baghai et al., 1999, applied here to AI portfolio allocation [1].
- Horizon 1: Defend. Agents that protect current margin by eliminating cost, time, or error in work the company already does. Contract review, invoice processing, intake triage, ticket routing, expense audit. Measured in hours saved per period, FTE-equivalent, error rate reduction.
- Horizon 2: Expand. Agents that grow current revenue lines by increasing throughput, raising win rate, or expanding into capacity the company could not previously serve. Lead qualification at scale, proposal generation, account expansion analysis, customer success automation. Measured in revenue lift per period, win rate delta, accounts served per FTE.
- Horizon 3: Transform. Agents that open product lines, service offerings, or business models the company could not deliver before. New service tiers, productized advisory, data-as-a-service offerings, AI-native versions of existing services. Measured in net new revenue from offerings that did not exist a year ago.
The pattern I see when I run this exercise: most companies have the bulk of their AI investment chasing Horizon 1 (because it is easy to measure), a smaller slice in Horizon 2, and almost nothing in Horizon 3. The CFO will accept that mix as long as you can show the math on H1 and a credible plan for H2 and H3 inside the next 18 months.
What the CFO will not accept, once you have shown them this framework, is going back to a "seats" report.
A 3-number board defense template
These are the three numbers your CFO will accept, and the only three you should put in front of your board. Steal this template.
Number 1: Production agents shipping work today
A count. Not pilots. Not POCs. Agents that processed real work in the last 30 days against a real production workflow with a real owner.
The math: list each agent, name the workflow, name the owner, state the volume processed last month.
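If it helps to make the filter concrete, here is a minimal sketch of that inventory in Python. The field names, the sample agents, and the rule that "no named owner" means "not production" are my assumptions for illustration, not a standard. Swap in whatever your own systems record.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Agent:
    name: str
    workflow: str
    owner: str               # empty string means nobody owns it (assumption)
    last_real_run: date      # last day it processed real production work
    volume_last_month: int   # tasks processed last month


def production_agents(agents, today, window_days=30):
    """Keep only agents with an owner and real volume inside the window."""
    cutoff = today - timedelta(days=window_days)
    return [
        a for a in agents
        if a.owner and a.volume_last_month > 0 and a.last_real_run >= cutoff
    ]


# Hypothetical inventory: one shipping agent, one ownerless pilot, one dormant pilot.
inventory = [
    Agent("contract-review", "legal-ops contract review", "Head of Legal Ops",
          date(2025, 6, 28), 410),
    Agent("summary-pilot", "meeting summaries", "", date(2025, 6, 29), 50),
    Agent("intake-pilot", "intake triage", "Ops Manager", date(2025, 3, 1), 0),
]

for a in production_agents(inventory, today=date(2025, 6, 30)):
    print(a.name, "|", a.workflow, "|", a.owner, "|", a.volume_last_month)
```

Only the first agent survives the filter. That is Number 1.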
Number 2: FTE-equivalent capacity created in the last quarter
For each production agent, calculate: tasks shipped per quarter × average human time per task = hours equivalent. Divide by the hours one FTE works in a quarter to get FTE-equivalents.
Total across all production agents. That is your real adoption number. It is almost always far smaller than the seats number, and it is the one your CFO can defend.
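The arithmetic is simple enough to sketch in a few lines of Python. The 520-hour quarter (2,080 hours a year divided by four) is my assumption, and the two agents and their volumes are made up. Use your own FTE definition and your own measured task times.

```python
# Assumption: one FTE works 2,080 hours a year, so 520 hours a quarter.
FTE_HOURS_PER_QUARTER = 520.0


def fte_equivalent(tasks_per_quarter, human_minutes_per_task,
                   fte_hours_per_quarter=FTE_HOURS_PER_QUARTER):
    """Tasks shipped x human time per task, expressed as FTE-equivalents."""
    hours_equivalent = tasks_per_quarter * human_minutes_per_task / 60.0
    return hours_equivalent / fte_hours_per_quarter


# Hypothetical portfolio: (agent, tasks shipped last quarter, human minutes per task)
portfolio = [
    ("contract-review", 1200, 45),     # 900 hours equivalent
    ("invoice-processing", 2600, 6),   # 260 hours equivalent
]

total_fte = sum(fte_equivalent(tasks, minutes) for _, tasks, minutes in portfolio)
print(round(total_fte, 2))  # 2.23
```

Two agents, about 2.2 FTE-equivalents. Small next to a seat count, but every input is something a CFO can audit.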
Number 3: Horizon mix of the current portfolio
Percentage of AI spend mapped to H1, H2, H3. Plus the percentage you target a year out.
The math: if your current mix is H1-heavy and your plan a year out moves real weight into H2 and H3, you have a credible transformation thesis. If your current mix is H1-heavy and your plan a year out is the same, you are not transforming. You are optimizing, and the board should know the difference.
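Here is one way to sketch that check in Python. The spend figures are invented, and the 10-point threshold for what counts as "real weight" moving into H2 and H3 is purely my assumption. Pick a threshold your board would find credible.

```python
def horizon_mix(spend_by_horizon):
    """Convert spend per horizon into percentages of total AI spend."""
    total = sum(spend_by_horizon.values())
    if total <= 0:
        raise ValueError("total AI spend must be positive")
    return {h: 100.0 * s / total for h, s in spend_by_horizon.items()}


def is_transforming(current_mix, target_mix, min_shift_points=10.0):
    """True if the H2 + H3 share rises by at least min_shift_points.

    min_shift_points is a hypothetical threshold, not part of the framework.
    """
    now = current_mix["H2"] + current_mix["H3"]
    later = target_mix["H2"] + target_mix["H3"]
    return later - now >= min_shift_points


# Hypothetical current spend in dollars, and a target mix a year out in percent.
current = horizon_mix({"H1": 700_000, "H2": 250_000, "H3": 50_000})
target = {"H1": 50.0, "H2": 35.0, "H3": 15.0}

print(current)                           # {'H1': 70.0, 'H2': 25.0, 'H3': 5.0}
print(is_transforming(current, target))  # True
print(is_transforming(current, current)) # False: same mix means optimizing
```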
This is the slide. Three numbers. Defensible math. No vanity metrics.
The self-test before your next board meeting
Score yourself. 1 point each.
- Can you name every agent in production today, who owns it, and what workflow it ships?
- Can you calculate FTE-equivalent capacity created in the last quarter, with the math on one slide?
- Can you state your current H1/H2/H3 spend mix, and your target mix a year out?
- Have you replaced "AI adoption" or "AI usage" with one of these three numbers in your last board update?
- Can your CFO defend each of those three numbers to an auditor with the math?
- 4 or 5 yeses: you are running AI as a P&L investment. Keep going.
- 2 or 3 yeses: you are mid-transition. Finish the work.
- 0 or 1 yeses: your board is being shown vanity metrics. Fix that before the next quarter.
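If you want to run the self-test across several business units, the scoring can be sketched as a small Python function. The function name and the idea of passing five booleans in question order are my own illustration.

```python
def self_test_verdict(answers):
    """Score five yes/no answers (True = yes) and return (score, verdict)."""
    if len(answers) != 5:
        raise ValueError("the self-test has exactly five questions")
    score = sum(1 for a in answers if a)
    if score >= 4:
        verdict = "Running AI as a P&L investment. Keep going."
    elif score >= 2:
        verdict = "Mid-transition. Finish the work."
    else:
        verdict = "Board is being shown vanity metrics. Fix before next quarter."
    return score, verdict


# Hypothetical answers: yes to questions 1 and 3 only.
print(self_test_verdict([True, False, True, False, False]))
```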
What to do next
The Three-Horizon allocation framework is the operating model I install inside every Fractional engagement. It replaces the seats dashboard, gives the CFO numbers they can defend, and gives the CEO a strategic conversation about AI that does not start with "how many people are using it."
Stop reporting adoption. Start reporting shipped work.
Sources
1. The Three-Horizon framework comes from McKinsey, originally published in The Alchemy of Growth by Mehrdad Baghai, Stephen Coley, and David White (Perseus Books, 1999). Its application here to AI portfolio allocation is my own adaptation.