- The Shape: What Are We Actually Caching?
- Commands vs. Skills: The Invocation Design Space
- The Devlog Problem: Categories Kill Chronology
- The Meta-Pattern
Caching Understanding Is a Design Problem, Not a Storage Problem
Building a /trace command exposed the real bottleneck in AI-assisted development: not exploration speed, but the absence of persistent architectural memory between sessions.
Every new Claude Code session starts the same way. The model reads your CLAUDE.md, maybe greps a few files, spawns an explore agent -- and ten minutes later it has a working understanding of your codebase. Then the session ends. Next time, it starts from zero.
The problem isn't that exploration is slow. It's that understanding evaporates. And fixing that turned out to require more design thinking than implementation work.
The Shape: What Are We Actually Caching?
I ran /shape on the idea: a command that persists codebase understanding across sessions. The shaping process itself surfaced the interesting design questions -- not "where do we store it" but "what counts as understanding?"
Six decisions crystallized through the shaping phase:
- Appetite: Medium bet. One to two days, not a week.
- Staleness: No automatic detection. Explicit refresh only -- the user knows when architecture changes.
- Scope: Global command (~/.claude/commands/), not project-specific.
- Orchestration: Heuristic agent spawning. Opus assesses complexity and decides how many explore agents to fan out.
- Format: Mermaid diagrams as the primary representation -- spatial relationships compress better than prose for architecture.
- Interaction model: Fire-and-forget. Run /trace, walk away, come back to a .traces/default.md file ready for loading.
The staleness decision is the interesting one. I initially assumed you'd want the trace to detect drift -- compare file mtimes, check git log, flag sections that might be stale. But that's solving a problem that doesn't exist yet. If architecture changed enough to invalidate a trace, the developer knows. Run /trace --update when you need it, and skip the automatic churn.
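To make the "explicit refresh only" rule concrete, here is a minimal Python sketch of the decision it implies. The real /trace is a markdown prompt, not a script, so the function name `plan_trace_action` and its three return values are assumptions made up for illustration. The point is what the function leaves out: no mtime comparison, no git log inspection.

```python
from pathlib import Path


def plan_trace_action(trace_path: Path, update: bool = False) -> str:
    """Decide what a /trace run should do (hypothetical helper).

    Deliberately performs no staleness detection: an existing trace is
    trusted until the user explicitly asks for a refresh.
    """
    if not trace_path.exists():
        return "generate"  # nothing cached yet, so explore from scratch
    if update:
        return "refresh"   # the user passed --update, so re-explore
    return "reuse"         # trust the cached trace as-is
```

Calling `plan_trace_action(Path(".traces/default.md"))` on a project that already has a trace returns `"reuse"`, however many files have changed since the trace was written. That is the trade-off the decision accepts.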
Commands vs. Skills: The Invocation Design Space
Building /trace required understanding a distinction I hadn't thought carefully about: the difference between slash commands and skills in the Claude Code ecosystem.
---
description: Cache codebase understanding as a trace file
argument-hint: [name] [--update] [--list]
allowed-tools: Bash(git:*), Read, Write, Task
---
A slash command is a single markdown file. The user types /trace and the prompt fires. A skill is a directory with a SKILL.md and supporting files -- invoked automatically when Claude recognizes a matching situation.
The distinction maps to a general design principle: explicit invocation for expensive operations, automatic invocation for ambient capabilities. You don't want Claude auto-running a ten-minute codebase exploration every time you open a session. You want to choose when to pay that cost. But you do want it to automatically load an existing trace when it's relevant.
This meant /trace (the generator) is a command, but trace-loading (the consumer) lives in CLAUDE.md as a behavioral instruction: "Before exploring unfamiliar code, check .traces/ first."
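The generator/consumer split can be sketched in Python. This is not how Claude Code handles arguments (the command is a prompt that the model interprets). The helpers `parse_trace_args` and `find_trace` are hypothetical, and the `<name>.md` layout under .traces/ is inferred from the .traces/default.md example rather than specified anywhere.

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence


def parse_trace_args(argv: Sequence[str]) -> argparse.Namespace:
    """Mirror the argument hint `[name] [--update] [--list]` (illustrative only)."""
    parser = argparse.ArgumentParser(prog="/trace")
    parser.add_argument("name", nargs="?", default="default")
    parser.add_argument("--update", action="store_true")
    # Stored as `list_traces` to avoid shadowing the built-in name `list`.
    parser.add_argument("--list", action="store_true", dest="list_traces")
    return parser.parse_args(argv)


def find_trace(project_root: Path, name: str = "default") -> Optional[Path]:
    """Consumer side: check .traces/ before exploring unfamiliar code."""
    candidate = project_root / ".traces" / f"{name}.md"
    return candidate if candidate.is_file() else None
```

The generator side is the expensive path and runs only when the user invokes it. The consumer side is a cheap file lookup, so it is safe to perform automatically at the start of any exploration.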
The Devlog Problem: Categories Kill Chronology
With /trace working, I realized sessions needed a symmetric tool -- something to capture what happened, not just what the codebase looks like. I wrote /devlog.
The first draft organized output into rolled-up categories: Problem, Approaches Considered, What Didn't Work, Solution, Key Learnings. Clean taxonomy. Terrible for recall.
The feedback was immediate: "less rolling up into categories, more linear as it happened." Categories strip the causal chain. You lose the moment where fixing the health check revealed the fetch failure, because those get sorted into different buckets. Chronological headers preserve causation:
## Starting Point
{what we had, what we wanted}
## {First thing we tried}
{actual output/error}
{what this told us}
## {Next thing that happened}
...
## Where We Landed
{what shipped, what changed}
Headers reflect what actually happened, not generic labels. "Health Check Passes But Fetches Fail" tells you more than "Problem #2." This is the same insight that makes git commit messages better as narratives than as categories -- the sequence is the explanation.
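As a rough illustration of the chronological template, the Python sketch below renders a session as an ordered list of steps, each with a descriptive header. The `Step` dataclass and `render_devlog` function are assumptions for this example. The actual /devlog is a prompt, and the model writes the narrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    title: str     # what actually happened, used verbatim as the header
    evidence: str  # the actual output or error observed
    lesson: str    # what this told us


def render_devlog(starting_point: str, steps: List[Step], landed: str) -> str:
    """Render steps in the order they occurred; never regroup by category."""
    sections = [f"## Starting Point\n{starting_point}"]
    for step in steps:  # list order is the causal chain, so preserve it
        sections.append(f"## {step.title}\n{step.evidence}\n{step.lesson}")
    sections.append(f"## Where We Landed\n{landed}")
    return "\n\n".join(sections) + "\n"
```

Because each step keeps its evidence and lesson together under one header, the link between "we saw this" and "so we tried that" survives. A category-based layout would file them in separate buckets.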
The Meta-Pattern
Building tools to cache understanding taught me something about the understanding itself. The trace command works because architectural knowledge has a specific shape: it's spatial (components and their relationships), hierarchical (layers of abstraction), and surprisingly stable (the database schema changes less often than the API handlers).
That shape determines the format. Mermaid diagrams aren't a stylistic choice -- they're the natural encoding for spatial-hierarchical knowledge. Prose is better for decisions and trade-offs. Code blocks are better for interface contracts. A good trace uses all three, each where it fits.
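To show why Mermaid suits this kind of knowledge, here is a small Python sketch that turns a component dependency map into a Mermaid flowchart. The function `to_mermaid` and the component names are hypothetical. A real trace is written by the model after exploration, not generated from a dictionary.

```python
from typing import Dict, List


def to_mermaid(dependencies: Dict[str, List[str]]) -> str:
    """Encode 'component depends on target' edges as a top-down Mermaid graph.

    Assumes component names are valid Mermaid node IDs (no spaces or
    punctuation); anything else would need quoting or aliasing.
    """
    lines = ["graph TD"]
    for component, targets in dependencies.items():
        for target in targets:
            lines.append(f"    {component} --> {target}")
    return "\n".join(lines)


# Hypothetical three-layer architecture, for illustration only.
print(to_mermaid({"api": ["handlers"], "handlers": ["db"]}))
# graph TD
#     api --> handlers
#     handlers --> db
```

Each relationship costs one short line, and the layering is visible at a glance once rendered. The same information in prose would take a paragraph per component.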
The /devlog command works because session knowledge has a different shape: it's temporal, causal, and volatile. You need the sequence to understand why things ended up where they did. Strip the sequence and you have a FAQ. Keep it and you have a story that future-you can actually follow.
| Tool | Knowledge shape | Format | Invocation |
|---|---|---|---|
| /trace | Spatial, hierarchical, stable | Mermaid + prose + interfaces | Explicit (expensive to generate) |
| /devlog | Temporal, causal, volatile | Chronological narrative | Explicit (captures session context) |
| CLAUDE.md | Declarative, prescriptive | Instructions + conventions | Automatic (always loaded) |
Three tools, three knowledge shapes, three invocation patterns. The design followed from the shape of what each one caches -- not from the storage mechanism, which is just markdown files in every case. The real design problem was never "where to put the file." It was understanding what kind of knowledge you're trying to preserve.