
Score your AI portfolio in an afternoon. Quote it in your next exec meeting.

A 6-dimension self-scoring rubric for service firms. Each dimension scores 1 to 5 with a named failure mode per band. Designed to surface where AI is stuck, where the org chart is fighting the work, and whether you need a diagnostic, a build, or a redesign. Director-altitude. One screen. Free.


Pull the exec team into one room. Score each dimension 1 to 5 by consensus, not survey average. Disagreement is the signal. Where the COO scores 4 and the head of engineering scores 2, that's where the next conversation lives. The spread beats the total. Plan 60 to 90 minutes.
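If you want the spread on paper before the room settles on a number, a minimal sketch follows. It assumes each executive's opening score is recorded per dimension. The function names and the sample room are hypothetical. Only the COO-at-4, head-of-engineering-at-2 gap comes from the rubric.

```python
from typing import Dict, List


def score_spread(scores_by_exec: Dict[str, int]) -> int:
    """Gap between the highest and lowest score given for one dimension."""
    for role, score in scores_by_exec.items():
        if not 1 <= score <= 5:
            raise ValueError(f"{role}: score must be 1 to 5, got {score}")
    return max(scores_by_exec.values()) - min(scores_by_exec.values())


def rank_by_disagreement(room: Dict[str, Dict[str, int]]) -> List[str]:
    """Dimension names, widest disagreement first."""
    return sorted(room, key=lambda dim: score_spread(room[dim]), reverse=True)


# Hypothetical opening scores, for illustration only.
room = {
    "Context layer": {"COO": 3, "Head of engineering": 3},
    "Eval discipline": {"COO": 4, "Head of engineering": 2},
}

print(rank_by_disagreement(room))
```

The dimension at the top of the list is where the conversation starts.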

Bands collapse 1 to 5 into three: 1 to 2 low, 3 medium, 4 to 5 high. Score the band first, then place the 1 to 5 inside it.
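The band mapping is small enough to write down. A sketch, with a hypothetical function name:

```python
def band(score: int) -> str:
    """Collapse a 1 to 5 score into the rubric's three bands."""
    if score in (1, 2):
        return "low"
    if score == 3:
        return "medium"
    if score in (4, 5):
        return "high"
    raise ValueError(f"score must be 1 to 5, got {score}")
```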

Pilot-to-production conversion

What it measures: The share of AI pilots cleared in the last 12 months that are running in production today against a live workflow.

Low (1-2): Pilot purgatory. Pilots run for months in an innovation sandbox, never reach the team whose workflow they were supposed to change, and quietly stop being reported on.

Medium (3): Selective handoff. One or two pilots have made it into production, but ownership is unclear and the rest sit waiting on someone to decide.

High (4-5): Default to production. Pilots have a named production owner before they start. The default path is shipping, not piloting.
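The share this dimension measures is a single ratio. A minimal sketch, assuming you can count both numbers from your own pilot log. The function name and the sample counts are hypothetical, and the rubric sets no numeric cutoff per band, so the score is still the room's call.

```python
def conversion_share(pilots_cleared: int, in_production: int) -> float:
    """Fraction of pilots cleared in the last 12 months now running in production."""
    if pilots_cleared <= 0:
        raise ValueError("need at least one cleared pilot")
    if not 0 <= in_production <= pilots_cleared:
        raise ValueError("in_production must be between 0 and pilots_cleared")
    return in_production / pilots_cleared


# Hypothetical counts: 8 pilots cleared, 2 live in production.
print(conversion_share(8, 2))
```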

Decision-flow clarity

What it measures: Whether the decisions an AI workflow has to make are written down, or live in someone's head as oral tradition.

Low (1-2): Oral tradition. The senior IC who knows how a case gets triaged or a contract gets redlined cannot describe the rule. The workflow lives in pattern matching, not in a documented decision tree.

Medium (3): Half-mapped. Some decisions are written in policy docs that nobody reads. Others are tribal. AI gets bolted onto the written part and trips over the tribal part.

High (4-5): Explicit decision flow. Each step in the workflow has a named input, a named decision, a named owner, and a documented escalation rule. An agent can read it. A new hire can read it.
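One way to make "an agent can read it" concrete is to hold each step as a record with the four named parts. This is a sketch, not a prescribed schema. The class, field names, and sample step are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class DecisionStep:
    step: str             # what happens at this point in the workflow
    named_input: str      # what the step receives
    decision: str         # the rule being applied
    owner: str            # who is accountable for the call
    escalation_rule: str  # when and to whom it goes up


def missing_fields(step: DecisionStep) -> List[str]:
    """Names of fields left blank, i.e. still living in someone's head."""
    return [f.name for f in fields(step) if not getattr(step, f.name).strip()]


# Hypothetical step with no written escalation rule.
triage = DecisionStep(
    step="Triage inbound case",
    named_input="Case form",
    decision="Route by case type",
    owner="Support lead",
    escalation_rule="",
)

print(missing_fields(triage))
```

Any step that comes back with blanks is still oral tradition.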

Agent-human co-actor design

What it measures: Whether agents are designed as co-actors with humans in the loop, or bolted onto an unchanged workflow.

Low (1-2): Bolt-on automation. AI replaces a button click in an existing tool. The work, the org, and the headcount math are unchanged. The team treats AI as a chore, not a co-actor.

Medium (3): Augmented IC. One or two roles use AI as a real assist (drafting, summarizing, research) but the workflow upstream and downstream of them hasn't been redesigned around it.

High (4-5): Co-actor by design. Agent and human responsibilities are partitioned explicitly. The agent owns the keystrokes; the human owns the judgment calls and the relationships. The org chart reflects the split.

Eval discipline

What it measures: Whether AI bets are measured against real examples and regression-tested, or shipped on vibes.

Low (1-2): Vibes. Quality is judged by a demo and a feeling. There is no eval set. Prompt changes hit production without regression testing. Failure shows up as a customer escalation.

Medium (3): Spot checks. One person on the team runs a handful of test cases when a prompt changes. The eval set isn't versioned. Coverage is unclear.

High (4-5): Versioned eval suite. Real examples, labeled, versioned, run automatically on every prompt or model change. Regression failures block merge. The eval suite is part of the team's deliverable.
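The high band can be sketched as a gate: run the labeled examples against the current and the candidate version, and block the merge if the candidate scores lower. Everything below is a toy under stated assumptions. The cases, the two keyword predictors, and the function names are hypothetical stand-ins for a real eval set and real prompts or models.

```python
from typing import Callable, List, Tuple

EvalCase = Tuple[str, str]  # (input text, expected label)

# Hypothetical labeled examples. A real suite would be versioned with the code.
CASES: List[EvalCase] = [
    ("refund request", "billing"),
    ("password reset", "account"),
    ("invoice copy", "billing"),
]


def pass_rate(predict: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Share of cases where the prediction matches the expected label."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = sum(1 for text, expected in cases if predict(text) == expected)
    return passed / len(cases)


def blocks_merge(candidate_rate: float, baseline_rate: float) -> bool:
    """True when the candidate regresses against the baseline."""
    return candidate_rate < baseline_rate


def baseline(text: str) -> str:
    # Stand-in for the version currently in production.
    return "billing" if ("refund" in text or "invoice" in text) else "account"


def candidate(text: str) -> str:
    # Stand-in for a changed prompt that dropped the invoice rule.
    return "billing" if "refund" in text else "account"


print(blocks_merge(pass_rate(candidate, CASES), pass_rate(baseline, CASES)))
```

Comparing against the baseline rather than a fixed threshold is a design choice in this sketch. Either works, as long as the check runs on every change.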

Context layer

What it measures: Whether there is one source of truth your agents and your humans both read from, or scattered docs and chat threads.

Low (1-2): Scattered context. Positioning lives in a deck. Policy lives in a wiki page nobody updates. Decisions live in Slack threads. Agents are prompted with whatever the engineer remembered to paste.

Medium (3): Wiki, but not for agents. The humans have a knowledge base. The agents don't read from it. The two surfaces drift.

High (4-5): Shared knowledge base in place. A versioned, searchable knowledge base your team and your AI both read and write to. Drift is a bug, not a steady state.

Allocation discipline

What it measures: Whether AI investment is split explicitly across three horizons (H1 efficiency, H2 new capabilities, and H3 transformation), or all in one bucket.

Low (1-2): One bucket. Every AI bet is sold as transformational. Nothing is funded as a 6-month efficiency play. The CFO has no allocation defense.

Medium (3): Implicit split. The team knows which bets are short-term and which are long-term but the split isn't documented. Board decks describe everything as strategic.

High (4-5): Explicit Three-Horizon mix. Budget is split across H1, H2, H3 with named owners, named payback windows, and named failure modes per horizon. The CFO can defend it.

Total scores hide the problem. The signal: which dimensions are stuck at 1 or 2, and which executives disagree on the score.

Three or more dimensions at 1 or 2

You're in pilot purgatory. The fix is a redesign, not another tool. Start with the AI-Native Org Audit.

Strong on eval, weak on context

Strong on eval and pilot-to-production, weak on context and decision flow. You ship AI but it's brittle. Fix: the shared knowledge base plus decision-flow mapping, inside a fractional retainer.

4 or 5 across the board

Skip the diagnostic. You need execution velocity. Move to a Scoped Build or Fractional Head of AI Transformation.
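The three patterns above can be read off a score sheet mechanically. A sketch with loud assumptions: "strong" is taken to mean 4 or 5, "weak" to mean 1 or 2, and the three-or-more-low check runs first. The rubric states none of these. The dimension keys and function name are hypothetical.

```python
from typing import Dict

DIMENSIONS = (
    "pilot_to_production",
    "decision_flow",
    "co_actor_design",
    "eval_discipline",
    "context_layer",
    "allocation_discipline",
)


def next_step(scores: Dict[str, int]) -> str:
    """Map six consensus scores to the pattern they match, if any."""
    for dim in DIMENSIONS:
        if dim not in scores:
            raise ValueError(f"missing score for {dim}")
        if not 1 <= scores[dim] <= 5:
            raise ValueError(f"{dim}: score must be 1 to 5")

    # Assumption: this check takes precedence over the other two.
    low_count = sum(1 for dim in DIMENSIONS if scores[dim] <= 2)
    if low_count >= 3:
        return "AI-Native Org Audit"

    if all(scores[dim] >= 4 for dim in DIMENSIONS):
        return "Scoped Build or Fractional Head of AI Transformation"

    # Assumption: strong means 4 or 5, weak means 1 or 2.
    strong = scores["eval_discipline"] >= 4 and scores["pilot_to_production"] >= 4
    weak = scores["context_layer"] <= 2 and scores["decision_flow"] <= 2
    if strong and weak:
        return "Shared knowledge base plus decision-flow mapping, inside a fractional retainer"

    return "No named pattern"
```

A sheet that matches no pattern still has a signal: look at which dimensions sit lowest and where the room disagreed.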

The rubric tells you where you are. It doesn't tell you how to move bands. The AI-Native Org Audit engagement does that: structured interviews, workflow mapping, board-defense language, named owners, a Phase 1 plan for next quarter. The rubric is free. The redesign is the engagement.

The rubric

Use the rubric to

Surface where the exec team disagrees about AI maturity
Audit yourself before a board meeting
Decide whether to start with a diagnostic, a build, or a retainer
Quote a dimension and a failure mode in your next AI conversation

The engagement

Use the engagement to

Map the org redesign that moves three or more dimensions from low to high
Translate the failure modes into a named owner, named workflow, named deliverable
Get the board-defense language for a 6 to 12 month redesign
Run Phase 1 inside the next quarter without breaking the team

Ready to talk?

This is the diagnostic. The AI-Native Org Audit maps the redesign.

If the rubric surfaced three or more dimensions in the low band, the next step is the audit. 3 to 4 weeks, board-ready output, named Phase 1 plan.

Book the Audit