- 01. The Obvious Problem
- 02. The Profiler Lied
- 03. The Real Bottleneck
- 04. The Node Manager Problem
- 05. The Completion Cache
- 06. What Frameworks Hide
- 07. The Pattern
Your Profiler Has a Blind Spot
Shell startup took 557ms. zprof said compinit was the problem — but the real bottleneck lived in a file zprof can't see.
Every zsh optimization guide starts the same way: run zprof, find the slow function, fix it. I followed that playbook and got a 12x speedup in an isolated test. Then I applied it to my real config and nothing changed. The profiler was right about what was slow — but wrong about where to look.
The Obvious Problem
The baseline was bad. 557ms to start a shell, which meant nearly a full second from opening Ghostty to typing a command.
```shell
hyperfine 'zsh -i -c exit'
# Time (mean +/- σ): 557.3 ms +/- 31.5 ms
```

zprof pointed directly at compinit — the completion system initializer. It was spending 1652ms loading completions, calling compdef 997 times across 19 Oh My Zsh plugins. Most of those plugins existed because they were there when I first set up the machine, not because I used them.

```shell
num  calls                time                       self            name
-----------------------------------------------------------------------------------
 1)    997         587.80     0.59   33.10%    587.80     0.59   33.10%  compdef
 2)      1        1652.17  1652.17   93.03%    559.22    559.22   31.49%  compinit
```
The fix seemed straightforward: drop Oh My Zsh, load the three plugins I actually use (autosuggestions, syntax highlighting, z), and let Starship handle the prompt. I tested this in isolation using ZDOTDIR to point zsh at a temporary config directory — a trick that lets you benchmark configs without touching your real dotfiles:
- Baseline (19 plugins): 557 ms
- Trimmed OMZ (6 plugins): 76 ms
- No OMZ + Starship: 45 ms
- Nuclear (no plugins): 9.5 ms
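The ZDOTDIR trick can be sketched like this — a minimal example where the sandbox location and the candidate config contents are illustrative, not the exact ones used above:

```shell
# Benchmark a candidate zsh config in isolation, without touching real dotfiles.
BENCH_DIR="$(mktemp -d)"                 # throwaway config directory
cat > "$BENCH_DIR/.zshrc" <<'EOF'
# candidate config under test (illustrative)
eval "$(starship init zsh)"
EOF

# ZDOTDIR points zsh at the sandbox for this invocation only
if command -v hyperfine >/dev/null 2>&1; then
  ZDOTDIR="$BENCH_DIR" hyperfine --warmup 2 'zsh -i -c exit'
fi
```

Because ZDOTDIR is set only for the benchmarked command, your real shell keeps its normal config throughout.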
Starship at 45ms felt right — fast enough that you'd never notice, with autosuggestions and syntax highlighting intact. I applied it.
The Profiler Lied
```shell
hyperfine --warmup 2 --runs 5 'zsh -i -c exit'
# Time (mean +/- σ): 524.2 ms +/- 31.5 ms
```

524ms. Barely faster than the original. I'd confirmed the isolated config ran at 45ms. Something was loading outside .zshrc.
I ran zprof again. It reported 21ms of work. But wall-clock was 524ms. That 500ms gap meant the slowness was happening before zprof could see it.
Zsh loads files in a specific order: .zshenv runs first, for every shell — interactive, non-interactive, login, script. Then .zprofile for login shells. Then .zshrc for interactive shells. zprof only instruments what happens during .zshrc. If your bottleneck lives in .zshenv, zprof will happily report "all clear" while you stare at a half-second delay.
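One way to widen that scope: zprof starts instrumenting wherever the module is loaded, so moving the zmodload to the top of .zshenv captures the earlier phases too. A sketch (one caveat: zprof times shell *functions*, so plain top-level commands still won't appear — but function-heavy loaders like NVM's will):

```shell
# First line of ~/.zshenv — start profiling before any other startup file runs
zmodload zsh/zprof

# Last line of ~/.zshrc — print the report once everything has loaded
zprof
```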
The Real Bottleneck
```shell
cat ~/.zshenv
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"
. "$HOME/.cargo/env"
```

NVM's installer puts its loader in .zshenv by default. That loader takes 137ms — not catastrophic on its own, but it runs for every shell invocation, including non-interactive subshells that will never use Node. And because it ran before .zshrc, all the lazy-loading tricks in the world couldn't help.
The fix was two lines: remove the NVM eager-load from .zshenv, keep only the fast Cargo PATH export. NVM gets lazy-loaded in .zshrc if you ever need it directly.
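The lazy-load half can be a small self-replacing stub in .zshrc — a common pattern, sketched here with NVM's standard paths (adjust NVM_DIR if yours differs):

```shell
# ~/.zshrc — a stub that loads NVM only on first use, then gets out of the way
export NVM_DIR="$HOME/.nvm"
nvm() {
  unset -f nvm                                       # remove this stub
  [ -s "$NVM_DIR/nvm.sh" ] && . "$NVM_DIR/nvm.sh"    # load the real NVM
  nvm "$@"                                           # replay the original call
}
```

The first call to nvm pays the 137ms; every shell that never touches Node pays nothing.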
```shell
# ~/.zshenv — keep this minimal
. "$HOME/.cargo/env"
```

Result: 43ms. The 12x speedup that the isolated test promised, now real.
The Node Manager Problem
Removing eager NVM loading broke global npm packages. Binaries like e2b live inside ~/.nvm/versions/node/v22.20.0/bin/, a path that only exists after NVM initializes. With lazy loading, that never happens unless you explicitly run node first.
The deeper fix: replace NVM with fnm. Written in Rust, fnm initializes in 2.6ms — fast enough to load eagerly without impacting startup. That's 52x faster than NVM's 137ms, which means you can skip the lazy-loading dance entirely.
```shell
eval "$(fnm env --shell zsh)"
```

The final timing landed at 76ms — fnm adds about 30ms over the absolute minimum, but global packages work without workarounds.
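If you also want per-directory version switching (picking up a .nvmrc or .node-version on cd), fnm's env command takes a --use-on-cd flag — worth confirming with `fnm env --help` on your installed version:

```shell
eval "$(fnm env --shell zsh --use-on-cd)"
```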
The Completion Cache
The other piece worth keeping: compinit doesn't need to rebuild its completion database on every shell start. The -C flag skips the security audit and reuses the cached .zcompdump file. On a personal machine, this is safe and saves roughly 100ms. Rebuilding once per day is plenty:
```shell
autoload -Uz compinit
() {
  local zcompdump="${ZDOTDIR:-$HOME}/.zcompdump"
  # stat -f is BSD/macOS syntax; GNU stat uses -c with different format codes
  if [[ -f "$zcompdump" && $(date +'%j') == $(stat -f '%Sm' -t '%j' "$zcompdump" 2>/dev/null) ]]; then
    compinit -C -d "$zcompdump"   # cache was touched today: skip the audit
  else
    compinit -d "$zcompdump"      # stale or missing: do the full rebuild
  fi
}
```

What Frameworks Hide
Dropping Oh My Zsh exposed a second category of invisible work: keybindings and shell integration. Up-arrow history search, Option+Delete for word deletion, directory reporting for new terminal tabs — OMZ provided all of these silently. Without it, you need explicit bindkey calls and an OSC 7 hook to tell Ghostty which directory to inherit.
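What that looks like in practice — a sketch of the replacements, with the caveat that key escape sequences vary by terminal and keyboard layout (these are common on macOS; verify yours by pressing the key after Ctrl+V), and that a fully correct OSC 7 sequence percent-encodes special characters in the path, omitted here for brevity:

```shell
# ~/.zshrc — conveniences OMZ provided implicitly
bindkey '^[[A' history-beginning-search-backward  # Up: search history by typed prefix
bindkey '^[[B' history-beginning-search-forward   # Down: search forward
bindkey '^[^?' backward-kill-word                 # Option+Delete: delete previous word

# OSC 7: advertise the working directory so new tabs/splits inherit it
_osc7_pwd() {
  printf '\e]7;file://%s%s\e\\' "${HOST:-$(hostname)}" "$PWD"
}
autoload -Uz add-zsh-hook
add-zsh-hook chpwd _osc7_pwd
_osc7_pwd  # report once at startup too
```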
This is the trade-off of framework removal. You trade startup cost for configuration awareness. Every convenience the framework provided becomes something you either replicate manually or decide you don't need.
The Pattern
| Layer | What Hid the Cost |
|---|---|
| .zshenv NVM load | zprof's scope (only profiles .zshrc) |
| 997 compdef calls | Plugin list inertia (never audited) |
| Eager compinit | Missing -C flag (cache exists, unused) |
| NVM itself | "Standard" tool (slow assumed normal) |
| OMZ keybindings | Invisible until removed |
The shell optimization story has a surface reading — "remove stuff, go faster" — but the interesting part is about measurement scope. zprof is a good tool with a blind spot: it only sees .zshrc. NVM is slow but its real cost is placement in .zshenv, not its absolute runtime. The lesson generalizes beyond shells: if your profiler reports "fast" but the system is slow, the bottleneck is in a phase the profiler doesn't instrument.
| Metric | Before | After |
|---|---|---|
| Shell startup | 557ms | 76ms |
| Profiler-visible cost | 1652ms | 21ms |
| Profiler-invisible cost | ~500ms | ~3ms |
| Node manager | NVM (137ms) | fnm (2.6ms) |