Convergent Software Development Workflow
An AI agent-centric software development framework for Convergent's internal team
1. Introduction
This document describes the Convergent Software Development Workflow (SDW), a structured, AI agent-centric framework designed for Convergent's internal software development team. It is adapted from Matt Pocock's engineering methodology (a solo-developer workflow) and extended for a multi-person team context with AI agents as first-class participants.
Why Matt Pocock?
Pocock's skills (grill-with-docs, to-prd, to-issues, triage, tdd, diagnose) represent some of the most battle-tested AI-agent interaction patterns available. They were built from real engineering experience and have been adopted by over 100,000 developers. The decision to start from Pocock was made because:
- His skills are model-agnostic: they work with any coding agent
- They encode decades of engineering wisdom into composable, repeatable patterns
- The repository has 109k+ stars and is actively maintained
- His framework addresses the four common failure modes of AI-assisted development: misalignment, verbosity, broken code, and architectural rot
The Three-Layer Convergent Capability Stack
The architecture is organized as three distinct domains, each with a specific scope and participant:
Layer Descriptions
| Layer | Role | Key Participants |
|---|---|---|
| Human Domain | Defines what to build and why. Provides business context, domain expertise, and acceptance. Does not specify how. | Client |
| Orchestration Domain | Manages the workflow: gating decisions, dispatching agents, reviewing output, signing off. The Developer is the conductor, not the builder. | Developer |
| Execution Domain | Carries out implementation work. Agents write code, run tests, file issues, and produce artifacts. Agents escalate to the Developer when ambiguity or judgment is needed. | Agent |
The Basement: Compounding Organizational Intelligence
The Basement is the shared knowledge infrastructure that grows with every project. It is the organizational memory that prevents repeated mistakes and accelerates future work.
Every session updates the Basement. Terms are added to CONTEXT.md as they are resolved. ADRs are created when a decision is hard to reverse, surprising without context, and the result of a real trade-off (all three, per the rule quoted in Section 6). Rejected enhancements go into .out-of-scope/ so they are never reconsidered blindly. Retros feed back into how the workflow itself improves.
Core Philosophy: "Agent as Participant, Not Processing Stage"
The agent is not a passive tool that processes human instructions. The agent is a participant in the development process: it asks questions, proposes alternatives, challenges assumptions, and takes initiative within its domain. The Developer's job shifts from doing the work to orchestrating the work: setting direction, making judgment calls, and reviewing outcomes.
This philosophy has concrete implications:
- Agents grill back: they challenge ambiguous requirements, sharpening fuzzy language
- Agents propose: they suggest architectures, slice plans, and test strategies for human approval
- Agents discover: when they find a better approach, they file a proposal PR rather than silently diverging
- Agents reflect: the build session is agent-only; the Developer only gets involved on escalation
- Humans gate: no merge without review, no slice without approval, no release without sign-off
2. The 7-Session Lifecycle
Every feature (except those routed through triage; see Section 4) passes through seven sequential sessions. Each session has defined participants, inputs, outputs, and a specific agent role.
| # | Session | Participants | Agent Role | Gate |
|---|---|---|---|---|
| 1 | Shared Language | Client, Dev, Agent | Glossary builder, term challenger | CONTEXT.md updated |
| 2 | Requirement Grilling | Client, Dev, Agent | Investigator, scenario stress-tester | All branches resolved |
| 3 | PRD Formation | Dev, Agent | PRD drafter | Developer sign-off |
| 4 | Architecture & Slicing | Dev, Agent | Architecture proposer, slice list creator | Slice list approved |
| 5 | Build | Agent | Implementer (autonomous) | Developer on escalation |
| 6 | Code Review | Dev, Agent | Review preparer, revision loop | Human review gate |
| 7 | Ship & Retro | Dev, Agent | Test designer, report producer | Release tag = sign-off |
Session 1: Shared Language
Participants: Client, Developer, Agent
Purpose
Establish the domain vocabulary before any requirements are discussed. The Client, Developer, and Agent collaboratively build a shared glossary in CONTEXT.md. This session prevents the most expensive type of miscommunication: using the same word to mean different things.
Input
Initial problem statement from the Client. Existing CONTEXT.md if any.
Output
Updated CONTEXT.md with precise definitions for all domain terms. Client and Developer agree on the vocabulary.
Agent Role
The agent actively challenges terminology. When the Client uses a term that conflicts with existing glossary entries, the agent calls it out immediately. When the Client uses vague or overloaded terms ("account", "user", "item"), the agent proposes precise canonical alternatives. The agent probes edge cases with concrete scenarios to sharpen boundaries between concepts.
What this looks like in practice
Agent: "You just said 'cancel the order,' but CONTEXT.md defines cancellation as irreversible order voiding. A moment ago you mentioned 'pausing a subscription'. Is cancellation the same as pausing, or are those separate concepts?"
Session 2: Requirement Grilling
Participants: Client, Developer, Agent
Purpose
Stress-test every aspect of the plan against the shared domain model. The agent interviews the Client relentlessly, walking down every branch of the design tree, resolving dependencies between decisions one-by-one.
Input
The shared glossary from S1. The Client's initial feature description.
Output
All decision branches resolved. Updated CONTEXT.md (new terms captured inline). Zero ambiguity remaining: every question the agent could ask has been asked.
Agent Role
The agent acts as an investigator. It asks one question at a time, waiting for feedback on each before continuing. For each question, it provides a recommended answer. If a question can be answered by exploring the codebase, the agent explores instead of asking. The agent cross-references stated behavior against actual code, surfacing contradictions.
Two key techniques
1. Scenario stress-testing: The agent invents specific scenarios that probe edge cases and force precision about boundaries between concepts.
2. Code cross-referencing: When the Client states how something works, the agent checks whether the code agrees: "Your code cancels entire Orders, but you just said partial cancellation is possible. Which is right?"
Session 3: PRD Formation
Participants: Developer, Agent (plus Client only if needed)
Purpose
Synthesize the grilling conversation and shared language into a formal Product Requirements Document. The agent writes the first draft; the Developer reviews, scopes, and signs off.
Input
Resolved decision tree from S2. Updated CONTEXT.md. Codebase state.
Output
Published PRD issue on the tracker with: Problem Statement, Solution, User Stories, Implementation Decisions (module sketches, architecture notes), Testing Decisions, Out of Scope, Further Notes. Labeled ready-for-agent.
Agent Role
The agent drafts the PRD by synthesizing what it already knows, with no further interviewing. It explores the codebase to understand current state, sketches major modules, identifies opportunities for deep modules (small interface, powerful implementation), and checks module expectations with the Developer before writing the final PRD.
Q2 Resolution (May 25, 2026)
Decision: Agent writes the first PRD draft. Developer reviews, scopes, and signs off. The agent does not interview further; it synthesizes what's already known.
Session 4: Architecture & Slicing
Participants: Developer, Agent
Purpose
Translate the PRD into a concrete implementation plan: architecture decisions, module design, and, most critically, a vertical slice breakdown of the work into independently implementable units. Slicing is covered in depth in Section 7: The Skill of Slicing.
Input
Approved PRD. Existing ADRs. Codebase architecture.
Output
Approved slice list with each slice showing: title, type (AFK or HITL), dependencies (blocked by), user stories covered. Published issues on the tracker.
Agent Role
The agent proposes the architecture and slice list. It presents the breakdown as a numbered list and quizzes the Developer on granularity, dependency relationships, and AFK/HITL classification. The agent iterates until the Developer approves, then publishes issues in dependency order.
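The slice list and its "publish in dependency order" step can be sketched as a small data structure plus a topological sort. This is a minimal illustration, not the team's tooling: the `Slice` class, its field names, and `publish_order` are hypothetical, and slice #2 is invented for the example (slices #1 and #3 borrow titles from the comic in Section 7).

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Slice:
    """One vertical slice, carrying the fields listed in the Session 4 output."""
    number: int
    title: str
    kind: str  # "AFK" or "HITL"
    blocked_by: tuple[int, ...] = ()
    user_stories: tuple[str, ...] = ()


def publish_order(slices: list[Slice]) -> list[int]:
    """Return slice numbers so that every blocker precedes the slices it blocks."""
    # TopologicalSorter expects {node: set of predecessors}.
    graph = {s.number: set(s.blocked_by) for s in slices}
    # static_order() raises graphlib.CycleError if the blockers form a cycle.
    return list(TopologicalSorter(graph).static_order())


# Hypothetical slice list, deliberately given out of order.
slices = [
    Slice(3, "User sees revenue chart on dashboard", "AFK", blocked_by=(1,)),
    Slice(1, "Auth middleware", "AFK"),
    Slice(2, "Developer picks charting library", "HITL", blocked_by=(1,)),
]
order = publish_order(slices)
print(order)  # slice 1 comes first; 2 and 3 follow in either order
```

Making `blocked_by` an explicit field reflects the rule that blockers are never left for the agent to infer.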
Q5a & Q5b Resolutions (May 26, 2026)
Q5a: Agent proposes the slice list. Human validates priorities and coupling, then approves.
Q5b: Human dispatches. Agent offers maximum recommendations and is available for consultation, but the dispatch decision rests with the Developer.
Session 5: Build
Participants: Agent (Developer on escalation)
Purpose
Implementation work. Agents build autonomously from the approved slices. The Developer is not involved unless an agent escalates.
Input
Approved, published issues from S4. Existing codebase.
Output
Pull requests with implemented code, following test-driven development (where applicable) and respecting all ADRs and glossary terms.
Agent Role
Agents work autonomously. One agent per AFK issue, dispatched in waves by module affinity (Q3 resolution). Agents use vertical tracer bullet slices (one end-to-end path at a time) rather than building horizontal layers. No merge without human review (Q4a). No SLA on review response: the agent waits for the reviewer (Q4d).
Q3 Resolution (May 26, 2026)
Decision: 1 agent per AFK issue. Dispatched in waves by module affinity: agents are assigned to issues that touch modules they've already worked on, reducing context-switch overhead.
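Module affinity can be sketched as a recommendation function: given the modules an issue touches and each agent's history, prefer the agent with the most overlap. The function name, the history format, and the alphabetical tie-break are all assumptions made for illustration. Per Q5b the result is only a recommendation; the Developer still dispatches.

```python
def pick_agent(issue_modules: set[str], agent_history: dict[str, set[str]]) -> str:
    """Recommend the agent whose past modules overlap most with the issue.

    Ties (including zero overlap) are broken alphabetically by agent name
    so the recommendation is deterministic.
    """
    if not agent_history:
        raise ValueError("no agents available")
    return min(
        agent_history,
        key=lambda agent: (-len(issue_modules & agent_history[agent]), agent),
    )


# Hypothetical history: which modules each agent has already worked on.
history = {
    "agent-a": {"billing", "invoices"},
    "agent-b": {"auth"},
}
chosen = pick_agent({"billing"}, history)
print(chosen)  # agent-a
```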
Session 6: Code Review
Participants: Developer, Agent
Purpose
Human-gated, AI-assisted code review. No code enters the main branch without human approval. The agent prepares the review; it does not replace the human.
Input
Pull requests from S5.
Output
Approved (or rejected-with-feedback) PRs. If rejected, agent revises and resubmits (Pocock's review loop).
Agent Role
The agent prepares the review: it checks for behavior not asked for (scope creep = spec failure per Pocock), verifies ADR compliance, and flags issues. The agent presents its findings to the Developer, who makes the final call. On review failure, the agent enters a revise-and-resubmit loop (Q4c: Pocock's loop).
Q4a-Q4d Resolutions (May 26, 2026)
Q4a: No merge without human review. Non-negotiable.
Q4b: Human-gated, AI-assisted review. Agent preps the review findings; no false alarms (agent must be confident before flagging).
Q4c: Pocock's loop: agent revises and resubmits on review failure. Cycle repeats until human approves.
Q4d: No SLA. Agent waits for reviewer. Build queue may back up; that's acceptable.
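Taken together, Q4a, Q4c, and Q4d describe a loop whose only exit is human approval. The sketch below is one way to express that, assuming hypothetical callables: `human_review` returns `None` on approval or feedback text on rejection, and `revise` produces the next revision. PRs are plain strings here purely for demonstration.

```python
from typing import Callable, Optional


def review_loop(
    first_pr: str,
    human_review: Callable[[str], Optional[str]],
    revise: Callable[[str, str], str],
) -> tuple[str, int]:
    """Revise and resubmit until the human reviewer approves.

    Returns the approved PR and the number of review rounds it took.
    """
    pr, rounds = first_pr, 1
    while True:
        # Blocks for as long as the reviewer needs: there is no SLA (Q4d).
        feedback = human_review(pr)
        if feedback is None:
            return pr, rounds  # approval is the only way out (Q4a)
        pr = revise(pr, feedback)  # Pocock's loop (Q4c)
        rounds += 1


# Stub reviewer and reviser, for illustration only.
def reviewer(pr: str) -> Optional[str]:
    return "Revert the email config change" if "email config" in pr else None


def reviser(pr: str, feedback: str) -> str:
    return pr.replace(" + email config", "")


approved, rounds = review_loop("revenue chart + email config", reviewer, reviser)
print(approved, rounds)  # revenue chart 2
```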
Session 7: Ship & Retrospective
Participants: Developer, Agent
Purpose
Final verification, release, and learning capture. The agent designs and runs E2E tests, files PRs for test gaps, and produces a retrospective report. The Developer tests independently and signs off.
Input
Approved PRs from S6. Feature-complete codebase.
Output
Release tag. E2E test reports. Retrospective document with recommendations for future work.
Agent Role
The agent designs E2E tests (browser-based for web apps), files PRs for any gaps found, and produces a report with recommendations. The Developer tests independently; the release tag is the sign-off (Q7b).
Q7a & Q7b Resolutions (May 28, 2026)
Q7a: Agent designs E2E tests, files PRs for gaps, produces a report + recommendations. Human tests independently.
Q7b: Release tag = sign-off. No separate approval ceremony. If the Developer tags it, it ships.
3. Q1–Q8 Resolutions
All decisions made across the May 25–29, 2026 sessions. Each resolution documents a specific design decision for the workflow itself.
Q1: Grill & Lexicon Combined (Resolved May 25)
Decision: Session 1 (Shared Language) and Session 2 (Requirement Grilling) are separate sessions conceptually but may be combined in practice. They are multi-session as needed: if the scope is large, these sessions can span multiple meetings. The shared glossary always comes first conceptually, even if it happens in the same conversation.
Q2: PRD Drafting (Resolved May 25)
Decision: Agent writes the first PRD draft. Developer reviews, scopes, and signs off. The agent does not conduct additional interviews; it synthesizes from what's already known from Sessions 1 and 2.
Q3: Agent Dispatch Model (Resolved May 26)
Decision: 1 agent per AFK issue. Agents are dispatched in waves by module affinity: an agent that worked on the billing module gets the next billing-related issue, minimizing context-switch overhead.
Q4a: No Merge Without Human Review (Resolved May 26)
Decision: Non-negotiable. Every PR must be reviewed by a human before merging. No automated merge paths.
Q4b: Human-Gated, AI-Assisted Review (Resolved May 26)
Decision: The agent prepares the review: it checks for scope creep, ADR compliance, and test coverage, then presents findings to the Developer. No false alarms: the agent must be confident before flagging an issue.
Q4c: Pocock's Review Loop (Resolved May 26)
Decision: On review failure, the agent revises and resubmits, exactly as Pocock defines it. The loop repeats until the Developer approves.
Q4d: No SLA on Review (Resolved May 26)
Decision: No service-level agreement. The agent waits for the reviewer. Build queue may back up; that's acceptable. This prioritizes review quality over throughput.
Q5a: Slice Validation (Resolved May 26)
Decision: Agent proposes the slice list. Human validates priorities and coupling relationships, then approves. The agent does not have authority to start building without slice approval.
Q5b: Human Dispatch (Resolved May 26)
Decision: Human dispatches the work. Agent offers maximum recommendations (which agent should take which slice) and is available for consultation, but the dispatch decision rests with the Developer.
Q6: Better Approach Discovered (Resolved May 27)
Decision: If an agent discovers a better approach during implementation, it files a proposal PR. For refactoring with no behavior change, it just does it (no proposal needed). For any change that alters behavior, architecture, or interface, it files a proposal PR for human review.
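The Q6 rule reduces to a single predicate. The sketch below uses hypothetical names (`Action`, `better_approach_action`) and assumes the agent can classify its own change along the three axes the decision names.

```python
from enum import Enum


class Action(Enum):
    JUST_DO_IT = "refactor in place, no proposal needed"
    PROPOSAL_PR = "file a proposal PR for human review"


def better_approach_action(
    changes_behavior: bool,
    changes_architecture: bool,
    changes_interface: bool,
) -> Action:
    """Apply the Q6 rule: any behavioral, architectural, or interface change needs a proposal."""
    if changes_behavior or changes_architecture or changes_interface:
        return Action.PROPOSAL_PR
    return Action.JUST_DO_IT


pure_refactor = better_approach_action(False, False, False)
new_interface = better_approach_action(False, False, True)
print(pure_refactor.name, new_interface.name)  # JUST_DO_IT PROPOSAL_PR
```

A pure refactor still lands through a normal pull request, so the Q4a human review gate applies either way.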
Q7a: E2E Test Design (Resolved May 28)
Decision: Agent designs E2E tests (browser-based for web applications), files PRs for any gaps it discovers, produces a report with recommendations. The Developer tests independently alongside the agent's work.
Q7b: Release Tag = Sign-Off (Resolved May 28)
Decision: No separate approval ceremony or sign-off meeting. When the Developer creates the release tag, that is the sign-off. If it's tagged, it ships.
Q8: Triage Router (Resolved May 29)
Decision: Bugs and small features enter through the triage router and skip the full 7-session lifecycle. Small items route through three sessions (Triage → Build → Review) instead of all seven. Large features go through all 7. The triage router determines the path.
4. Triage Router
The Triage Router is the entry point for all incoming work: bugs, small enhancements, and feature requests. It determines whether an item needs the full 7-session lifecycle or can take an accelerated path.
Triage flow (diagram): incoming work (Bug / Enhancement / Feature) enters the Triage Session, which asks "Bug or Enhancement?"
- Bug: Reproduce (Pocock: read steps, trace code, run tests).
  - Reproduced, full spec not needed: Fast Path (3 sessions): Triage → Build → Review.
  - Reproduced, full spec needed: Grill Session (S2 equivalent) → ready-for-agent (agent brief posted) → Fast Path.
  - Not reproduced: needs-info (post triage notes). If the reporter responds, the item returns to Triage; if not, it is closed as wontfix.
- Enhancement: "Is it small?"
  - Yes, small: Fast Path.
  - No, new feature: Full Path (7 sessions): S1 → S2 → S3 → S4 → S5 → S6 → S7.
Triage State Machine
| State | Category | Description |
|---|---|---|
| needs-triage | State | Maintainer needs to evaluate the issue |
| needs-info | State | Waiting on reporter for more information |
| ready-for-agent | State | Fully specified, ready for AFK agent to pick up |
| ready-for-human | State | Needs human implementation (judgment calls, external access) |
| wontfix | State | Will not be actioned (rejected with rationale) |
| bug | Category | Something is broken |
| enhancement | Category | New feature or improvement |
Fast Path (3 Sessions)
Small, well-understood items skip the full lifecycle:
- Triage: route, categorize, write agent brief
- Build: agent implements autonomously
- Review: human gates the merge
No shared language session, no grilling session (unless triage determines one is needed), no PRD, no architecture slicing. The triage notes serve as the specification.
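The routing rules can be sketched as a function from a triaged item to the steps it will take. The `Item` fields and the `route` function are hypothetical; the branch structure follows the triage flow described in this section (bugs are reproduced first, enhancements are sized).

```python
from dataclasses import dataclass

FAST_PATH = ["Triage", "Build", "Review"]
FULL_PATH = ["S1", "S2", "S3", "S4", "S5", "S6", "S7"]


@dataclass
class Item:
    category: str                  # "bug" or "enhancement"
    reproduced: bool = False       # bugs only
    needs_full_spec: bool = False  # bugs only
    is_small: bool = False         # enhancements only


def route(item: Item) -> list[str]:
    """Return the ordered steps for an item, or ["needs-info"] if it stalls."""
    if item.category == "bug":
        if not item.reproduced:
            return ["needs-info"]  # wait for the reporter
        if item.needs_full_spec:
            # A grill session (S2 equivalent) precedes the build.
            return ["Triage", "Grill", "Build", "Review"]
        return list(FAST_PATH)
    if item.category == "enhancement":
        return list(FAST_PATH) if item.is_small else list(FULL_PATH)
    raise ValueError(f"unknown category: {item.category!r}")


print(route(Item("bug", reproduced=True)))  # ['Triage', 'Build', 'Review']
print(len(route(Item("enhancement"))))      # 7
```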
5. Full Pipeline
The complete workflow from incoming work to shipped feature. This diagram shows every gateway, every exit, and every possible path through the system.
Pipeline (diagram):
- Fast Path: after triage, Build (Agent Only) → Review (Human Gate) → Deploy.
- Full Path (7 Sessions):
  1. S1: Shared Language (Client + Dev + Agent)
  2. S2: Requirement Grilling (Client + Dev + Agent)
  3. S3: PRD Formation (Dev + Agent)
  4. S4: Architecture & Slicing (Dev + Agent)
  5. S5: Build (Agent Only)
  6. S6: Code Review (Dev + Agent); a "revise" outcome sends the work back to S5
  7. S7: Ship & Retro (Dev + Agent) → Deploy
- Deploy produces two outputs: the Retro Log (→ Basement) and the Release Tagged step.
Pipeline Decision Points
Triage Router
Determines fast vs. full path. Bugs get a reproduction attempt first (Pocock rule). Small enhancements with a clear spec skip ahead.
PRD Sign-Off
Developer reviews and approves the agent-drafted PRD. Without this, no architecture work begins.
Slice Approval
Developer validates the agent's proposed slice list: priorities, dependency coupling, AFK/HITL split.
Dispatch Decision
Human dispatches agents to slices. Agent recommends, Developer decides.
Code Review
No merge without human review. Agent preps the review; Developer makes the call. Revise loop on rejection.
Release Tag
Release tag = sign-off. No separate ceremony. The Basement is updated with retro findings.
Basement Updates Throughout
6. Pocock Quotes
Direct quotations from Matt Pocock's skill documentation. These principles underpin the Convergent workflow.
"Break the plan into tracer bullet issues - thin vertical slices that cut through ALL integration layers"
Source: to-issues skill, from mattpocock/skills
This is the foundation of Session 4 (Architecture & Slicing) and Session 5 (Build). Each slice delivers a complete, demoable end-to-end path through schema, API, UI, and tests. Prefer many thin slices over few thick ones. Prefer AFK (autonomous) slices over HITL (human-in-the-loop).
"Reproduce (bugs only). Before any grilling, attempt reproduction."
Source: triage skill, from mattpocock/skills
In Convergent's triage router (Q8), the very first step for bugs is reproduction: read the reporter's steps, trace the relevant code, run tests or commands. A confirmed repro makes a much stronger agent brief. If reproduction fails, the issue goes to needs-info.
"Behaviour in the diff that wasn't asked for (scope creep) is a Spec failure"
Source: review (code review conventions), from mattpocock/skills
This is the primary check the agent performs during Session 6 (Code Review). The agent scans each PR for behavior that wasn't in the spec. If found, it flags it as a spec failure, not a minor deviation. The PR must be revised.
"Challenge against the glossary - when the user uses a term that conflicts with the language in CONTEXT.md, call it out immediately"
Source: grill-with-docs skill, from mattpocock/skills
This is the agent's primary behavior in Session 1 (Shared Language) and Session 2 (Requirement Grilling). Every term conflict is surfaced immediately. The agent also sharpens fuzzy language and probes edge cases with concrete scenarios.
"When the user uses a vague or overloaded term, propose a precise canonical term. 'You're saying account - do you mean the Customer or the User? Those are different things.'"
Source: grill-with-docs skill, from mattpocock/skills
"Only offer to create an ADR when all three are true: (1) Hard to reverse, (2) Surprising without context, (3) The result of a real trade-off."
Source: grill-with-docs skill, from mattpocock/skills
Convergent applies this rule strictly: ADRs are not created for every decision. If the cost of reversing is low, if the rationale is obvious, or if there were no real alternatives, skip the ADR.
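As a check on the logic, the three-condition ADR rule is a plain conjunction: failing any one condition means no ADR. The function name and parameter names below are hypothetical.

```python
def should_offer_adr(
    hard_to_reverse: bool,
    surprising_without_context: bool,
    real_trade_off: bool,
) -> bool:
    """An ADR is offered only when all three conditions hold."""
    return hard_to_reverse and surprising_without_context and real_trade_off


# Hard to reverse and a real trade-off, but the rationale is obvious: skip the ADR.
obvious_choice = should_offer_adr(True, False, True)
# All three conditions hold: offer the ADR.
contested_choice = should_offer_adr(True, True, True)
print(obvious_choice, contested_choice)  # False True
```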
"A deep module (as opposed to a shallow module) is one which encapsulates a lot of functionality in a simple, testable interface which rarely changes."
Source: to-prd skill, from mattpocock/skills
In Session 3 (PRD Formation), the agent actively identifies opportunities for deep modules during the architecture sketch phase. The Developer validates these module proposals before the PRD is finalized.
"The rate of feedback is your speed limit."
Source: Matt Pocock, quoting The Pragmatic Programmer
This principle pervades the entire Convergent workflow: fast feedback loops in Build (one test → one implementation → repeat), deterministic feedback loops in Diagnosis, and human feedback loops in Review.
"Build the right feedback loop, and the bug is 90% fixed. Spend disproportionate effort here. Be aggressive. Be creative. Refuse to give up."
Source: diagnose skill, from mattpocock/skills
"A 30-second flaky loop is barely better than no loop. A 2-second deterministic loop is a debugging superpower."
Source: diagnose skill, from mattpocock/skills
"Do NOT write all tests first, then all implementation. This is 'horizontal slicing' - treating RED as 'write all tests' and GREEN as 'write all code.' This produces crap tests."
Source: tdd skill, from mattpocock/skills
Correct approach: vertical slices via tracer bullets. One test → one implementation → repeat. This is how Session 5 (Build) operates at the micro level.
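The "one test → one implementation → repeat" cadence might look like the sketch below for a "user can update email" slice. `UserService` and its methods are hypothetical. Each test was conceptually written before the code that makes it pass, and each one exercises only the public interface.

```python
class UserService:
    """Hypothetical public API for the 'user can update email' slice."""

    def __init__(self) -> None:
        self._emails: dict[int, str] = {}

    def register(self, user_id: int, email: str) -> None:
        self._emails[user_id] = email

    def update_email(self, user_id: int, new_email: str) -> None:
        if "@" not in new_email:  # added in cycle 2
            raise ValueError("invalid email address")
        self._emails[user_id] = new_email  # added in cycle 1

    def email_of(self, user_id: int) -> str:
        return self._emails[user_id]


# Cycle 1: write this test, watch it fail, implement update_email().
def test_user_can_update_email() -> None:
    service = UserService()
    service.register(1, "old@example.com")
    service.update_email(1, "new@example.com")
    assert service.email_of(1) == "new@example.com"


# Cycle 2: write this test, watch it fail, add the validation.
def test_invalid_email_is_rejected() -> None:
    service = UserService()
    service.register(1, "old@example.com")
    try:
        service.update_email(1, "not-an-email")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
    assert service.email_of(1) == "old@example.com"


test_user_can_update_email()
test_invalid_email_is_rejected()
```

Neither test inspects `_emails` directly, so the storage could change without touching the tests.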
"Good tests are integration-style: they exercise real code paths through public APIs. They describe WHAT the system does, not HOW it does it."
Source: tdd skill, from mattpocock/skills
The Basement Grows With Every Ship
Every release updates the glossary, archives new ADRs, logs rejected ideas to .out-of-scope/, and records retro findings. Over time, the organization's intelligence compounds: each project is faster and better than the last.
7. The Skill of Slicing
"Slicing is the replacement for context that agents don't have."
A senior developer carries the whole system in their head. An agent arrives with zero knowledge and 8K tokens of instructions. Slicing bridges that gap.
Why Slicing Matters: The Cake Analogy
Imagine you're baking a three-layer cake: sponge, cream, frosting.
Horizontal: "You bake all the sponge layers. Then all the cream. Then all the frosting. Then we assemble."
You won't know if the cake works until the very end. If the sponge is too dry, the whole cake is ruined and you can't taste-test anything along the way.
Vertical: "Bake one complete slice: sponge → cream → frosting. Taste it. Then bake the next slice."
Every slice is a complete, edible piece. If slice #1 is dry, you know immediately and fix it before baking 10 more. Each slice is independently demoable.
In software: horizontal = "do all the database work, then all the API, then all the UI." Vertical = "build 'user can update email' end-to-end (DB → API → UI → tests), then build 'user can reset password' end-to-end."
The Agent Without Slicing: A Comic
DEVELOPER: "Build the user dashboard."
AGENT (arrives with zero context): "Ok... dashboard. I'll need:
- A new route in app.py
- A template for the sidebar
- A database migration for the analytics table
- The charting library (which one?)
- Auth middleware (how does auth work here?)
- The CSS framework (Tailwind? Bootstrap?)
- Email notification config
- ...and I should probably refactor the nav component while I'm at it"
45 minutes later:
AGENT submits a PR touching 47 files across 8 modules. The PR description says "built dashboard."
DEVELOPER: "Why did you change the email config?!"
AGENT: "I thought it was related."
DEVELOPER: "It's not. Revert that. And the nav thing."
AGENT: "Ok. PR updated."
DEVELOPER: "...you also broke the login page."
AGENT: "I refactored the auth middleware."
DEVELOPER: ...
VS
DEVELOPER: "Here's Slice #3: 'User sees revenue chart on dashboard.' It touches:
- analytics.py (add get_revenue())
- dashboard.jsx (add RevenueChart)
- test_analytics.py (3 test cases)
Blocked by: Slice #1 (auth middleware). Type: AFK. Go."
AGENT: "Revenue chart. analytics.py → dashboard.jsx. Three test cases. Got it."
8 minutes later:
AGENT submits a PR touching 3 files. All tests pass. No surprises.
DEVELOPER: "Merged. Next slice."
Deep vs. Shallow โ Why Module Design Matters for Slicing
Pocock's to-prd skill draws on John Ousterhout's A Philosophy of Software Design:
"A deep module encapsulates a lot of functionality in a simple, testable interface which rarely changes."
Source: to-prd skill
This distinction is everything for AI agents. Deep modules = easy to slice. Shallow modules = agent chaos.
How AI Thinks About Our Software Architecture
Shallow module: the AI has to memorize call orders and magic column indices like row[3].
Deep module: all SQL and filtering is hidden inside. One clear entry point → right data.
The agent can't get the deep module wrong. The shallow module, it definitely will.
Visual Structure Comparison
Shallow module call chain (diagram): get_raw_sales_data (returns raw SQL rows) → filter_by_region(rows) (needs rows from get_raw) → calculate_total(rows) (needs rows from filter) → format_currency(amt) (call after calculate). Each arrow is an ordering rule the caller must know: call first, call second, call third.
With the shallow module, the agent must figure out:
- Which function to call first
- What each function needs as input
- That row[3] means region
- That row[5] means amount
With the deep module, the agent just does:
- r = SalesReport(start, end, "APAC")
- r.formatted_total()
- r.by_product()
No call order. No magic numbers.
The shallow module chains four functions with invisible rules. The deep module wraps everything in one class, so the agent can't mess it up.
Shallow Module: The Agent's Nightmare

    # shallow_report.py - Shallow: lots of tiny functions,
    # each one exposes internal details.
    # Assumption: `db` is an open database connection defined elsewhere.

    def get_raw_sales_data(start_date, end_date):
        """Returns raw SQL rows. Caller must know schema."""
        return db.execute(
            "SELECT * FROM sales WHERE date BETWEEN ? AND ?",
            [start_date, end_date]
        ).fetchall()  # ← raw SQL rows leak out

    def filter_by_region(rows, region):
        """Caller must have called get_raw_sales_data first."""
        return [r for r in rows if r[3] == region]  # ← magic index 3

    def calculate_total(rows):
        """Caller must know column index 5 is amount."""
        return sum(r[5] for r in rows)  # ← magic index 5

    def format_currency(amount):
        """Caller must remember to call this before display."""
        return f"${amount:,.2f}"

    # The agent has to know: the call order, the column indices,
    # which functions pair with which, what to call before what.
    # That's 4 functions with 4 pieces of hidden knowledge.
    # The agent WILL get this wrong.

Deep Module: The Agent's Friend

    # deep_report.py - Deep: one public class,
    # simple interface, complex internals hidden.
    # Assumption: `db` is an open database connection, defined elsewhere,
    # whose rows expose columns as attributes (r.amount, r.region).

    class SalesReport:
        """Sales data for a given period and region."""

        def __init__(self, start_date, end_date, region=None):
            # region is selected as well, so the filter below has a column to read
            self._rows = db.execute(
                "SELECT date, product, region, amount FROM sales "
                "WHERE date BETWEEN ? AND ?",
                [start_date, end_date]
            ).fetchall()
            if region:
                self._rows = [r for r in self._rows if r.region == region]

        def total(self):
            return sum(r.amount for r in self._rows)

        def formatted_total(self):
            return f"${self.total():,.2f}"

        def by_product(self):
            """Returns {product_name: total_amount} dict."""
            result = {}
            for r in self._rows:
                result[r.product] = result.get(r.product, 0) + r.amount
            return result

    # The agent just does:
    #   report = SalesReport("2026-01-01", "2026-06-30", "APAC")
    #   total = report.formatted_total()
    #   breakdown = report.by_product()
    #
    # One interface, everything works. No call order to memorize.
    # No magic indices. No hidden knowledge.
    # The agent can't get this wrong.
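To try the deep-module idea end to end, here is a self-contained variant that runs against an in-memory SQLite database. It is a sketch under stated assumptions, not the project's code: the `sales` schema (date, product, region, amount), the sample rows, and passing the connection as a constructor argument instead of a global `db` are all choices made for this example.

```python
import sqlite3


class SalesReport:
    """Deep module: one constructor, three read-only methods, SQL hidden inside."""

    def __init__(self, conn, start_date, end_date, region=None):
        query = "SELECT product, amount FROM sales WHERE date BETWEEN ? AND ?"
        params = [start_date, end_date]
        if region is not None:
            # Filter in SQL so callers never see column positions.
            query += " AND region = ?"
            params.append(region)
        self._rows = conn.execute(query, params).fetchall()

    def total(self):
        return sum(amount for _, amount in self._rows)

    def formatted_total(self):
        return f"${self.total():,.2f}"

    def by_product(self):
        """Returns {product_name: total_amount} dict."""
        result = {}
        for product, amount in self._rows:
            result[product] = result.get(product, 0) + amount
        return result


# Hypothetical sample data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (date TEXT, product TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2026-01-15", "widget", "APAC", 1200.0),
        ("2026-02-01", "gadget", "APAC", 300.5),
        ("2026-03-10", "widget", "EMEA", 999.0),
        ("2026-07-04", "widget", "APAC", 50.0),  # outside the date range
    ],
)

report = SalesReport(conn, "2026-01-01", "2026-06-30", "APAC")
print(report.formatted_total())  # $1,500.50
print(report.by_product())       # {'widget': 1200.0, 'gadget': 300.5}
```

The caller supplies dates and a region and reads results; the table layout can change without any caller noticing.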
The Slicing Decision Flow
Slicing decision flow (diagram). The agent:
1. Scans the PRD and codebase.
2. Finds natural seams: where does one feature end and another begin?
3. Traces the dependency chain: what must exist before what can be built?
4. Cuts vertically (DB → API → UI → tests), NOT horizontally.
5. Classifies each slice: AFK or HITL?
6. Proposes a numbered list with title, type, blockers.
7. Quizzes the Developer: granularity? dependencies? AFK/HITL correct? On "Revise", the agent returns to step 2. On "Approved", it publishes issues in dependency order.
The Five Rules of Slicing for AI Agents
| # | Rule | Why It Matters |
|---|---|---|
| 1 | Each slice touches every layer | Vertical, not horizontal. Schema → API → UI → tests in one slice. If you can't demo it, it's not a slice. |
| 2 | Narrow scope, limited blast radius | If the agent hallucinates, the damage is contained to one slice. A bad 3-file PR is fixable. A bad 47-file PR is a rewrite. |
| 3 | Explicit blockers, never implicit | Agents don't infer dependencies. "Blocked by #3" must be explicit. If you think an agent will "figure out" the dependency, it won't. |
| 4 | Prefer AFK over HITL | Every HITL slice is a human bottleneck. Challenge: "Does this REALLY need a human decision, or can we be more specific in the spec?" |
| 5 | Wave-dispatch by module affinity | Two slices touching the same module go in different waves. An agent that built auth.py gets the next auth-related slice. Module affinity reduces merge conflicts. |
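Rules 3 and 5 together suggest a simple wave planner: a slice joins a wave only once all of its blockers finished in earlier waves, and no two slices in the same wave touch the same module. The greedy sketch below is one possible reading of those rules. The function name, the input format, and the example slices are hypothetical.

```python
def plan_waves(
    modules: dict[int, set[str]],
    blocked_by: dict[int, set[int]],
) -> list[list[int]]:
    """Group slice numbers into waves.

    modules:    {slice_number: modules the slice touches}
    blocked_by: {slice_number: slice numbers that must ship first}
    """
    remaining = sorted(modules)
    done: set[int] = set()
    waves: list[list[int]] = []
    while remaining:
        wave: list[int] = []
        busy: set[str] = set()  # modules already claimed in this wave
        for n in remaining:
            ready = blocked_by.get(n, set()) <= done  # blockers shipped earlier
            if ready and not (modules[n] & busy):
                wave.append(n)
                busy |= modules[n]
        if not wave:
            raise ValueError("unresolvable blockers (cycle or unknown slice)")
        waves.append(wave)
        done.update(wave)
        remaining = [n for n in remaining if n not in done]
    return waves


# Hypothetical slices: 1 and 2 both touch auth; 3 is blocked by 1.
modules = {
    1: {"auth"},
    2: {"auth", "profile"},
    3: {"analytics", "dashboard"},
    4: {"billing"},
}
waves = plan_waves(modules, blocked_by={3: {1}})
print(waves)  # [[1, 4], [2, 3]]
```

Slices 1 and 2 land in different waves because they share `auth`, which is where affinity applies: the agent that built slice 1 is the natural candidate for slice 2.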
The Tracer Bullet: Where the Metaphor Comes From
Pocock didn't invent the term. It's from The Pragmatic Programmer by Andy Hunt and Dave Thomas (1999):
"Tracer bullets are loaded at intervals alongside regular ammunition. When fired, they burn phosphorus - you can see where your shots are going in the dark. In software, a tracer bullet is a thin, vertical implementation that touches every layer of the system. It shows you the path before you commit to building the whole thing."
Source: The Pragmatic Programmer (paraphrased for the slicing context)
In our workflow: each slice is one tracer bullet. It proves the path works (schema → API → UI → tests) for ONE specific feature. After the tracer proves the path exists, the rest of the feature fills in around it. This is why we build thin complete slices first, not thick horizontal layers.