Your Bash Script Isn't Slow — Your Subprocesses Are
A Claude Code startup hook took 5.2 seconds. The actual work took 0.46 seconds. The other 4.7 seconds was the cost of asking bash to do that work in subshells.
I had a startup hook for Claude Code that scanned my ~/Code directory: 121 projects, detecting languages, pulling descriptions from package.json, and emitting a manifest. It took 5.2 seconds. I assumed the work itself was expensive: 93 git log calls, 15 node -e invocations, hundreds of file checks. I profiled the individual operations, added up the costs, and the numbers didn't explain the total. The math was wrong because I was measuring the wrong thing.
The misleading profile
The hook runs at session start, blocking Claude from responding until it finishes. My first instinct was to profile what the script does:
| Operation | Time |
|---|---|
| 93 git log calls | ~1.0s |
| 15 node -e calls | ~0.4s |
| File existence checks | negligible |
That accounts for maybe 1.5 seconds. The script takes 5.2. Where's the gap?
Inlining reveals the truth
I rewrote the script with one constraint: no command substitution. Every $(basename ...) became parameter expansion. Every $(stat ...) became a single bulk call. Every function invocation that captured stdout became inline logic. Same computation, same output, zero subshells.
```
Section 1 (scan+stat):      0.181s
Section 2 (sort):           0.057s
Section 3 (top 15 output):  0.026s
Section 4 (ref-templates):  0.149s
Section 5 (heredoc):        0.047s
Total:                      0.461s
```
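One way to get per-section numbers like these is to split the script into functions and time each with bash's time keyword, which is part of the shell and so adds no subprocesses of its own. A hypothetical sketch — the section names and sleep bodies are stand-ins, not the hook's real code:

```shell
#!/bin/bash
# Hypothetical instrumentation sketch. `time` here is the bash keyword,
# not /usr/bin/time, so timing a function costs no fork/exec.
section_scan() { sleep 0.1; }   # stand-in for the scan+stat pass
section_sort() { sleep 0.05; }  # stand-in for the sort pass

for fn in section_scan section_sort; do
  echo "== $fn"
  time "$fn"   # prints real/user/sys to stderr
done
```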
The actual work — reading files, checking git repos, building the manifest — is 0.46 seconds. The original script spent 4.7 seconds asking for that work to be done. Each $(...) in bash forks a subprocess: fork(), set up pipes, exec() the command, capture stdout, clean up. On macOS, that costs 20-40ms per invocation. The original script spawned over 300 subprocesses across 121 project directories.
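That per-invocation cost is easy to verify in isolation. A minimal benchmark (not from the original script): run a trivial external command N times through command substitution and time the loop — /bin/true does nothing, so everything measured is fork, pipe setup, exec, and cleanup:

```shell
#!/bin/bash
# Measure pure command-substitution overhead: /bin/true does no work,
# so all elapsed time is process bookkeeping.
N=100
start=$SECONDS            # SECONDS is a bash builtin integer-seconds counter
for ((i = 0; i < N; i++)); do
  out=$(/bin/true)        # fork + pipe + exec + reap, zero useful work
done
echo "$N substitutions in $((SECONDS - start))s"
```

At the 20-40ms per call the original script was paying on macOS, 100 of these is 2-4 seconds of pure overhead; exact numbers vary by machine and OS.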
The expensive patterns
Three patterns accounted for nearly all the overhead:
basename in a loop. Every iteration called $(basename "$dir") — a fork+exec to strip a path prefix that parameter expansion handles as a builtin:
```bash
# 121 subprocesses (~2.4s)
name=$(basename "$dir")

# Zero subprocesses
name="${dir%/}"
name="${name##*/}"
```

node -e for JSON parsing. Each invocation pays Node's ~27ms cold start to extract a single field. Bash regex against the raw file content is ugly but instantaneous:
```bash
# 15 Node.js cold starts (~0.4s)
desc=$(node -e "const p=require('$dir/package.json'); console.log(p.description)")

# No external command: $(< file) reads in-shell ($dir keeps its trailing slash from the glob)
pkg_content=$(< "${dir}package.json")
[[ "$pkg_content" =~ \"description\"[[:space:]]*:[[:space:]]*\"([^\"]+)\" ]] && desc="${BASH_REMATCH[1]}"
```

Individual stat calls. One subprocess per file, when stat happily accepts an array of paths:
```bash
# 93 subprocesses (~1.9s)
for dir in ...; do mtime=$(stat -f %m "$dir/.git/index"); done

# 1 subprocess
stat -f "%m %N" "${stat_paths[@]}"
```

The bash 3.2 constraint
My first attempt at the bulk stat approach used associative arrays (declare -A) to map paths back to project names. macOS ships bash 3.2 — no associative arrays, no readarray, no mapfile. The fix was simpler anyway: extract the project name from the stat output path using parameter expansion, no mapping needed.
```bash
name="${path#$CODE_DIR/}"
name="${name%%/*}"
```

This is a recurring theme with macOS shell scripts. The version Apple ships is frozen at 3.2 due to GPLv3 licensing. Any script that needs to run without Homebrew bash has to work within those constraints.
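Putting the bulk call and the name extraction together, here is a runnable sketch. The directory layout and project names are demo stand-ins, not the original hook's values, and the format flag is probed once because BSD/macOS stat and GNU stat spell it differently:

```shell
#!/bin/bash
# Demo stand-in for the hook's bulk-stat pass; CODE_DIR and the
# project names are fabricated for this example.
CODE_DIR=$(mktemp -d)
mkdir -p "$CODE_DIR/proj-a/.git" "$CODE_DIR/proj-b/.git"
touch "$CODE_DIR"/proj-{a,b}/.git/index

stat_paths=("$CODE_DIR"/*/.git/index)

# GNU stat (Linux) and BSD stat (macOS) take different format flags.
if stat --version >/dev/null 2>&1; then
  fmt=(-c '%Y %n')   # GNU: mtime, name
else
  fmt=(-f '%m %N')   # BSD/macOS: mtime, name
fi

# One subprocess for all files; recover each project name from the path
# with parameter expansion -- no associative array needed on bash 3.2.
stat "${fmt[@]}" "${stat_paths[@]}" | while read -r mtime path; do
  name="${path#$CODE_DIR/}"
  name="${name%%/*}"
  echo "$name $mtime"
done
```

Indexed arrays and `[[ ... ]]` are both fine on bash 3.2; it's only `declare -A`, `readarray`, and `mapfile` that are off the table.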
The result
```
# Before
5.165 total (1.80s user, 2.88s system)

# After
0.662 total (0.22s user, 0.38s system)
```
Same output. Same manifest. 7.7x faster. Claude Code startup from ~/Code dropped from 10.1 seconds to 4.4 seconds — the remaining time is the API handshake and CLAUDE.md parsing, neither of which I control.
The pattern
The insight isn't "use parameter expansion" or "avoid node -e." It's that bash subprocess overhead is invisible until you measure it in aggregate. Each $(...) looks cheap in isolation — 25ms, who cares? But compound it across a loop body that runs 100+ times, and fork/exec dominates your runtime by an order of magnitude.
The diagnostic technique: take your slow script, inline everything (eliminate all command substitution), and time it again. If the inlined version is dramatically faster, your bottleneck isn't computation — it's coordination. The shell is spending more time setting up work than doing it.
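A minimal version of that experiment, with synthetic paths rather than the original hook: the same loop written both ways, timed with bash's SECONDS counter:

```shell
#!/bin/bash
# Same computation twice: once through $(basename ...), once through
# parameter expansion. Paths are synthetic; only the relative cost matters.
paths=()
for ((i = 0; i < 200; i++)); do paths+=("/home/user/Code/proj$i/"); done

t0=$SECONDS
for p in "${paths[@]}"; do name=$(basename "$p"); done
echo "subshell loop: $((SECONDS - t0))s"

t0=$SECONDS
for p in "${paths[@]}"; do name="${p%/}"; name="${name##*/}"; done
echo "inline loop:   $((SECONDS - t0))s"
```

The two loops produce identical names; the gap between the two timings is pure fork/exec overhead, and it is widest on macOS, where process creation is slowest.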
| Pattern | Cost per call | At 100 iterations |
|---|---|---|
| $(basename ...) | ~20ms | 2.0s |
| $(stat ...) | ~20ms | 2.0s |
| $(node -e ...) | ~27ms | 2.7s |
| $(func_name) (subshell capture) | ~20ms | 2.0s |
| Parameter expansion | ~0ms | ~0ms |
| Bulk stat (1 call for N files) | ~20ms | 0.02s |
When your loop body spawns three subprocesses per iteration across 100 directories, you're paying 6 seconds of overhead for 0.5 seconds of work. The fix is always the same: push the fork boundary outside the loop.