The 4 Stages of AI Maturity: From Prompts to Agents

Or: how one Instagram carousel reveals where you really are.

Most people think they’re “doing AI” because they’ve used ChatGPT. This framework shows the full map — and the fastest way to see it is to watch the same simple job (a client’s Instagram carousel) get made four radically different ways.

Two questions, two tools

This framework answers two questions — and they’re not the same question.

What stage are you at? → The Staircase (Part 1). Which type of AI relationship do you have: operator, architect, engineer, or strategist?
Within your stage, how are you doing? → The Capability × Ease Grid (Part 2). How good is your output, and how much of it runs without you?

One tool tells you where to go. The other tells you what to fix before you get there. Don’t confuse them.

PART 1 — The Staircase

Which stage are you at?

                                    ╔══════════════════════════════╗
                                    ║  LEVEL 4: AGENTIC            ║ you supervise
                                    ║  Paperclip · HEARTBEAT.md    ║ an AI workforce
                        ╔═══════════╩══════════════════════════════╣
                        ║  LEVEL 3: SYSTEMS                        ║ you configure
                        ║  Claude Code · CLAUDE.md · MCPs          ║ the environment
            ╔═══════════╩══════════════════════════════════════════╣
            ║  LEVEL 2: WORKFLOWS                                  ║ you architect
            ║  Claude Projects · Figma + Weave · n8n               ║ flows across tools
╔═══════════╩══════════════════════════════════════════════════════╣
║  LEVEL 1: PROMPTING                                              ║ you use AI for
║  ChatGPT · Gemini · Canva · copy-paste between tabs              ║ specific sub-tasks
╚══════════════════════════════════════════════════════════════════╝

One job. Four worlds.

Stage 1 — Prompt · You are the memory

Five tabs open. ChatGPT for angles, Gemini for images, Canva for layout. You paste the brand guidelines in — again. Download four PNGs, drag them into Canva, tweak, export. Two weeks later you screenshot Insights into a new ChatGPT session because the old one forgot who the client was.

Who’s thinking: you
What persists: nothing
The cost: your full attention, every time

[ you ] ──prompt──▶ [ AI session ] ──output──▶ [ you ]
                          │
                          └── forgets on close

The trap: Great prompters produce excellent work here. It just doesn’t scale past them.

Stage 2 — Workflow · AI in the plumbing

Your Claude Project already knows the brand book. Paste the brief, get angles back. A Figma + Weave flow populates the slide template with draft copy and images. Every Monday an n8n workflow pulls Instagram Insights into a Google Sheet and posts a summary to Slack.

Who’s thinking: you (AI does defined pieces)
What persists: triggers and connections
The cost: three things break before lunch

[ trigger ] ──▶ [ tool ] ──▶ [ AI node ] ──▶ [ tool ] ──▶ [ output ]
                                  │
                    (knows its job, not the whole story)

The trap: Your Claude Project knows the brand. Your n8n flow knows the data. Neither knows the other exists.
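
To make the trigger → tool → AI node → tool → output chain concrete, here is a minimal Python sketch. It is not n8n and calls no real Instagram or Slack API: every function name (`monday_trigger`, `fetch_insights`, `summarise`, `post_summary`, `run_flow`) and every number is a hypothetical placeholder. The point it illustrates is that the AI step receives only the payload handed to it.

```python
from typing import Any, Callable

# Every function here is a hypothetical stand-in. None of them call
# n8n, Instagram, or Slack; the numbers are placeholder data.

def monday_trigger() -> dict:
    """Trigger: emits the event that starts the run."""
    return {"client": "example-client"}

def fetch_insights(event: dict) -> dict:
    """Tool: pretend to pull last week's metrics for the client."""
    return {"client": event["client"], "reach": 1000, "saves": 50}

def summarise(metrics: dict) -> str:
    """AI node: placeholder for a model call. It sees only `metrics`,
    so it has no access to the brand book or earlier conversations."""
    return f"{metrics['client']}: reach {metrics['reach']}, saves {metrics['saves']}."

def post_summary(text: str) -> str:
    """Tool: pretend to post the summary to a team channel."""
    return f"[weekly-report] {text}"

def run_flow(trigger: Callable[[], Any], steps: list[Callable[[Any], Any]]) -> Any:
    """Run the trigger, then pass each step's output to the next step."""
    payload = trigger()
    for step in steps:
        payload = step(payload)
    return payload

if __name__ == "__main__":
    print(run_flow(monday_trigger, [fetch_insights, summarise, post_summary]))
```

Notice that `summarise` has no parameter for brand voice. In this sketch, giving it that context would mean wiring in another step by hand, which is the Stage 2 trap in miniature.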

Stage 3 — System · The folder is the team

Everything lives in clients/{client}/. /plan-carousel reads the client’s CLAUDE.md for voice and current campaigns. /create-post fires subagents in parallel — copy, images, layout. /analyse-month reads the full metrics history and writes findings into learnings.md. Next month’s run reads that file automatically.

Who’s thinking: the system you built
What persists: context, rules, skills, memory
The cost: deep upfront setup — then the system gets smarter every session

              ┌── CLAUDE.md      (voice, rules)
              ├── skills/        (/plan-carousel /create-post /analyse-month)
clients/{c}/ ─┼── metrics/       (thickens monthly)
              ├── competitors/   (living database)
              └── learnings.md   (feeds next cycle)

The trap: Rich context rebuilt for every project. Systems everywhere, none of them deep.
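
The "folder is the team" idea can be sketched in a few lines of Python. This is an illustration of the pattern only, not how Claude Code itself loads files: `build_context` and `record_learning` are hypothetical helpers, and the file contents are placeholders. It shows one run writing to learnings.md and a later run picking that note up without anyone pasting it in.

```python
import tempfile
from pathlib import Path

def build_context(client_dir: Path) -> str:
    """Gather the files a command would read before doing any work."""
    parts = []
    for name in ("CLAUDE.md", "learnings.md"):
        path = client_dir / name
        if path.exists():
            body = path.read_text(encoding="utf-8").strip()
            parts.append(f"## {name}\n{body}")
    return "\n\n".join(parts)

def record_learning(client_dir: Path, note: str) -> None:
    """Append one finding so the next run starts with it in context."""
    with (client_dir / "learnings.md").open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")

def demo() -> str:
    """Build a throwaway client folder, log a finding, then read it back."""
    with tempfile.TemporaryDirectory() as root:
        client_dir = Path(root) / "clients" / "example-client"
        client_dir.mkdir(parents=True)
        # Placeholder content, not a real brand guide.
        (client_dir / "CLAUDE.md").write_text("Voice: warm, direct.\n", encoding="utf-8")
        record_learning(client_dir, "placeholder finding from last month")
        return build_context(client_dir)

if __name__ == "__main__":
    print(demo())
```

The design choice worth noting is that memory is plain files in the client folder. Nothing here depends on a chat session staying open.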

Stage 4 — Agent · You stopped operating

A CEO agent owns the client’s marketing outcomes. Its heartbeat fires daily: it reads performance, decides what to commission, and hires functional managers (Social, Paid, Web, Analytics), who brief worker agents for each task. Monday morning, an email: “Here’s this week’s plan across socials, ads, and site updates. Approve?” Friday: “Here’s what shipped, what performed, what I’m adjusting for next week.”

You didn’t write a caption. You didn’t open Canva. You didn’t log into Ads Manager. You approved a plan and reviewed a report.

Who’s thinking: an AI org you govern
What persists: goals
The cost: trust — built slowly

          ┌──▶ Social Media Mgr  ──▶ post workers
          │    (IG, LinkedIn, TikTok content)
          │
          ├──▶ Paid Media Mgr    ──▶ campaign workers
          │    (Google, Meta ads)
  CEO ────┤
          ├──▶ Web Engineer      ──▶ deploy workers
          │    (site updates, SEO)
          │
          └──▶ Analytics Mgr     ──▶ report workers
               (metrics, dashboards)

  heartbeat:  wake → observe → act → log → sleep

The spine: HEARTBEAT.md (state between cycles) · Paperclip or Anthropic Agent Teams (orchestration) · cron + RemoteTrigger (the pulse).

The trap: Agents doing “book my calendar” theatre while revenue goes untouched.
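
The heartbeat loop (wake → observe → act → log → sleep) can be sketched as a single function that an external scheduler calls once per cycle. This is a minimal sketch under stated assumptions: the log-line format in HEARTBEAT.md, the `observe` and `decide` callables, and the 0.03 threshold are all invented for illustration and are not taken from Paperclip or any other orchestration tool.

```python
import tempfile
from pathlib import Path
from typing import Callable

def heartbeat(state_file: Path,
              observe: Callable[[], dict],
              decide: Callable[[dict], str]) -> str:
    """Run one cycle and return the log entry it wrote."""
    # wake: recover what earlier cycles left behind
    previous = []
    if state_file.exists():
        previous = state_file.read_text(encoding="utf-8").splitlines()
    cycle = sum(1 for line in previous if line.startswith("- cycle ")) + 1
    # observe: read current performance
    metrics = observe()
    # act: choose what to commission
    action = decide(metrics)
    # log: persist state for the next cycle
    entry = f"- cycle {cycle}: observed {metrics}, action: {action}"
    with state_file.open("a", encoding="utf-8") as f:
        f.write(entry + "\n")
    # sleep: return; a scheduler such as cron fires the next cycle
    return entry

def demo() -> list:
    """Run two cycles against a temporary HEARTBEAT.md."""
    def observe() -> dict:
        return {"engagement_rate": 0.02}  # placeholder metric

    def decide(metrics: dict) -> str:
        # Arbitrary illustrative threshold, not a recommendation.
        if metrics["engagement_rate"] < 0.03:
            return "commission a new carousel"
        return "hold course"

    with tempfile.TemporaryDirectory() as root:
        state_file = Path(root) / "HEARTBEAT.md"
        return [heartbeat(state_file, observe, decide) for _ in range(2)]

if __name__ == "__main__":
    for line in demo():
        print(line)
```

Because the cycle number is recovered from the file rather than held in memory, the process can exit between runs. That is the sense in which the state file, not the agent, carries continuity.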

PART 2 — The Capability × Ease Grid

Within your stage, where are you?

Knowing your stage is half the map. The other half: within this stage, how are you doing on two axes?

Capability: how good, on-brand, and decision-quality your AI output is.
- Low: generic, templated, could’ve come from anyone.
- High: nuanced, strategic, sounds like you.

Ease: how much of it runs without you babysitting every step.
- Low: manual prompts, constant tab-switching, you’re the glue.
- High: systematised; you set it up, it delivers.

              ┌────────────────┬─────────────────┐
         HIGH │   Automator    │   ★ Advance     │
    EASE      │   scaled       │   quality at    │
              │   mediocrity   │   scale — go up │
              ├────────────────┼─────────────────┤
         LOW  │   Dabbler      │   Craftsman     │
              │   basic and    │   great output  │
              │   manual       │   all manual    │
              └────────────────┴─────────────────┘
                LOW                         HIGH
                  CAPABILITY  (output quality)

★ Advance — high quality, running without you. Level up.
Craftsman — brilliant output, all manual. Systematise before levelling.
Automator — running without you, but the output is generic. Fix quality before scaling more.
Dabbler — pick one lever: craft or one simple automation. Don’t try both.

Craftsman and Automator both feel like progress. Neither is.
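
The grid is effectively a two-input lookup, which a short Python sketch makes explicit. The 0-to-1 self-assessment scale and the 0.5 cut-off are assumptions added for illustration; the framework itself only distinguishes low from high.

```python
# Next moves, paraphrased from the four quadrant descriptions above.
NEXT_MOVE = {
    "Advance": "level up",
    "Craftsman": "systematise before levelling",
    "Automator": "fix quality before scaling more",
    "Dabbler": "pick one lever: craft or one simple automation",
}

def quadrant(capability: float, ease: float, threshold: float = 0.5) -> str:
    """Map two self-assessed scores (assumed 0 to 1) onto the grid."""
    high_capability = capability >= threshold
    high_ease = ease >= threshold
    if high_capability and high_ease:
        return "Advance"
    if high_capability:
        return "Craftsman"
    if high_ease:
        return "Automator"
    return "Dabbler"

if __name__ == "__main__":
    name = quadrant(capability=0.8, ease=0.2)
    print(name, "->", NEXT_MOVE[name])
```

Scoring honestly matters more than the exact threshold: a generous capability score moves an Automator into the Advance cell without changing anything about the output.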

Where are you? (60-second diagnostic)

Interaction mode?
Conversations → L1 · Automations → L2 · Configured environment → L3 · Autonomous goals → L4
Capability?
Is the output decision-quality, or does it still need heavy editing?
Ease?
If you stopped for two weeks, would it still work?
Your move:
★ → advance · Craftsman → systematise · Automator → improve quality · Dabbler → start somewhere

Don’t skip stages. A Stage 3 system on weak Stage 1 foundations is a rich environment around shallow thinking. A Stage 4 agent built too early is an autonomous agent doing the wrong things, fast.

Framework at a glance

	Prompt (L1)	Workflow (L2)	System (L3)	Agent (L4)
Mode	Conversation	Automation	Environment	Autonomy
Persists	Nothing	Triggers	Context & rules	Goals
Your role	Operator	Architect	Engineer	Strategist
Craftsman trap	Slow, great prompts	High-maintenance flows	Rebuilt every project	Needs babysitting
Automator trap	Scheduled generic prompts	Scaled forgettable output	Shallow prompts everywhere	Trivial agents
★ Ready when	Fast, quality prompts	Reliable on-brand automations	Always-on, deep context	Complex goals, minimal oversight