- 01. The Obvious Problem
- 02. The Profiler Lied
- 03. The Real Bottleneck
- 04. The Node Manager Problem
- 05. The Completion Cache
- 06. What Frameworks Hide
- 07. The Pattern
Your Profiler Has a Blind Spot
Shell startup took 557ms. zprof said compinit was the problem — but the real bottleneck lived in a file zprof can't see.
Every zsh optimization guide starts the same way: run zprof, find the slow function, fix it. I followed that playbook and got a 12x speedup in an isolated test. Then I applied it to my real config and nothing changed. The profiler was right about what was slow — but wrong about where to look.
The Obvious Problem
The baseline was bad. 557ms to start a shell, which meant nearly a full second from opening Ghostty to typing a command.
```shell
hyperfine 'zsh -i -c exit'
# Time (mean +/- σ): 557.3 ms +/- 31.5 ms
```

zprof pointed directly at compinit — the completion system initializer. It was spending 1652ms loading completions, calling compdef 997 times across 19 Oh My Zsh plugins. Most of those plugins existed because they were there when I first set up the machine, not because I used them.

```shell
num  calls                time                       self            name
-----------------------------------------------------------------------------------
 1)    997         587.80     0.59   33.10%    587.80     0.59   33.10%  compdef
 2)      1        1652.17  1652.17   93.03%    559.22    559.22   31.49%  compinit
```
The fix seemed straightforward: drop Oh My Zsh, load the three plugins I actually use (autosuggestions, syntax highlighting, z), and let Starship handle the prompt. I tested this in isolation using ZDOTDIR to point zsh at a temporary config directory — a trick that lets you benchmark configs without touching your real dotfiles:
- Baseline (19 plugins): 557 ms
- Trimmed OMZ (6 plugins): 76 ms
- No OMZ + Starship: 45 ms
- Nuclear (no plugins): 9.5 ms
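The ZDOTDIR trick can be sketched like this — a minimal example where the sandbox location and the candidate config contents are illustrative, not the exact ones used above:

```shell
# Benchmark a candidate zsh config in isolation, without touching real dotfiles.
BENCH_DIR="$(mktemp -d)"                 # throwaway config directory
cat > "$BENCH_DIR/.zshrc" <<'EOF'
# candidate config under test (illustrative)
eval "$(starship init zsh)"
EOF

# ZDOTDIR points zsh at the sandbox for this invocation only
if command -v hyperfine >/dev/null 2>&1; then
  ZDOTDIR="$BENCH_DIR" hyperfine --warmup 2 'zsh -i -c exit'
fi
```

Because ZDOTDIR is set only for the benchmarked command, your real shell keeps its normal config throughout.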
Starship at 45ms felt right — fast enough that you'd never notice, with autosuggestions and syntax highlighting intact. I applied it.
The Profiler Lied
```shell
hyperfine --warmup 2 --runs 5 'zsh -i -c exit'
# Time (mean +/- σ): 524.2 ms +/- 31.5 ms
```

524ms. Barely faster than the original. I'd confirmed the isolated config ran at 45ms. Something was loading outside .zshrc.
I ran zprof again. It reported 21ms of work. But wall-clock was 524ms. That 500ms gap meant the slowness was happening before zprof could see it.
Zsh loads files in a specific order: .zshenv runs first, for every shell — interactive, non-interactive, login, script. Then .zprofile for login shells. Then .zshrc for interactive shells. zprof only instruments what happens during .zshrc. If your bottleneck lives in .zshenv, zprof will happily report "all clear" while you stare at a half-second delay.
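One way to widen that scope: zprof starts instrumenting wherever the module is loaded, so moving the zmodload to the top of .zshenv captures the earlier phases too. A sketch (one caveat: zprof times shell *functions*, so plain top-level commands still won't appear — but function-heavy loaders like NVM's will):

```shell
# First line of ~/.zshenv — start profiling before any other startup file runs
zmodload zsh/zprof

# Last line of ~/.zshrc — print the report once everything has loaded
zprof
```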
The Real Bottleneck
```shell
cat ~/.zshenv
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"
. "$HOME/.cargo/env"
```

NVM's installer puts its loader in .zshenv by default. That loader takes 137ms — not catastrophic on its own, but it runs for every shell invocation, including non-interactive subshells that will never use Node. And because it ran before .zshrc, all the lazy-loading tricks in the world couldn't help.
The fix was two lines: remove the NVM eager-load from .zshenv, keep only the fast Cargo PATH export. NVM gets lazy-loaded in .zshrc if you ever need it directly.
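The lazy-load half can be a small self-replacing stub in .zshrc — a common pattern, sketched here with NVM's standard paths (adjust NVM_DIR if yours differs):

```shell
# ~/.zshrc — a stub that loads NVM only on first use, then gets out of the way
export NVM_DIR="$HOME/.nvm"
nvm() {
  unset -f nvm                                       # remove this stub
  [ -s "$NVM_DIR/nvm.sh" ] && . "$NVM_DIR/nvm.sh"    # load the real NVM
  nvm "$@"                                           # replay the original call
}
```

The first call to nvm pays the 137ms; every shell that never touches Node pays nothing.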
```shell
# ~/.zshenv — keep this minimal
. "$HOME/.cargo/env"
```

Result: 43ms. The 12x speedup that the isolated test promised, now real.
The Node Manager Problem
Removing eager NVM loading broke global npm packages. Binaries like e2b live inside ~/.nvm/versions/node/v22.20.0/bin/, a path that only exists after NVM initializes. With lazy loading, that never happens unless you explicitly run node first.
The deeper fix: replace NVM with fnm. Written in Rust, fnm initializes in 2.6ms — fast enough to load eagerly without impacting startup. That's 52x faster than NVM's 137ms, which means you can skip the lazy-loading dance entirely.
```shell
eval "$(fnm env --shell zsh)"
```

The final timing landed at 76ms — fnm adds about 30ms over the absolute minimum, but global packages work without workarounds.
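If you also want per-directory version switching (picking up a .nvmrc or .node-version on cd), fnm's env command takes a --use-on-cd flag — worth confirming with `fnm env --help` on your installed version:

```shell
eval "$(fnm env --shell zsh --use-on-cd)"
```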
The Completion Cache
The other piece worth keeping: compinit doesn't need to rebuild its completion database on every shell start. The -C flag skips the security audit and reuses the cached .zcompdump file. On a personal machine, this is safe and saves roughly 100ms. Rebuilding once per day is plenty:
```shell
autoload -Uz compinit
() {
  local zcompdump="${ZDOTDIR:-$HOME}/.zcompdump"
  # stat -f is BSD/macOS syntax; GNU stat uses -c with different format codes
  if [[ -f "$zcompdump" && $(date +'%j') == $(stat -f '%Sm' -t '%j' "$zcompdump" 2>/dev/null) ]]; then
    compinit -C -d "$zcompdump"   # cache was touched today: skip the audit
  else
    compinit -d "$zcompdump"      # stale or missing: do the full rebuild
  fi
}
```

What Frameworks Hide
Dropping Oh My Zsh exposed a second category of invisible work: keybindings and shell integration. Up-arrow history search, Option+Delete for word deletion, directory reporting for new terminal tabs — OMZ provided all of these silently. Without it, you need explicit bindkey calls and an OSC 7 hook to tell Ghostty which directory to inherit.
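What that looks like in practice — a sketch of the replacements, with the caveat that key escape sequences vary by terminal and keyboard layout (these are common on macOS; verify yours by pressing the key after Ctrl+V), and that a fully correct OSC 7 sequence percent-encodes special characters in the path, omitted here for brevity:

```shell
# ~/.zshrc — conveniences OMZ provided implicitly
bindkey '^[[A' history-beginning-search-backward  # Up: search history by typed prefix
bindkey '^[[B' history-beginning-search-forward   # Down: search forward
bindkey '^[^?' backward-kill-word                 # Option+Delete: delete previous word

# OSC 7: advertise the working directory so new tabs/splits inherit it
_osc7_pwd() {
  printf '\e]7;file://%s%s\e\\' "${HOST:-$(hostname)}" "$PWD"
}
autoload -Uz add-zsh-hook
add-zsh-hook chpwd _osc7_pwd
_osc7_pwd  # report once at startup too
```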
This is the trade-off of framework removal. You trade startup cost for configuration awareness. Every convenience the framework provided becomes something you either replicate manually or decide you don't need.
The Pattern
| Layer | What Hid the Cost |
|---|---|
| .zshenv NVM load | zprof's scope (only profiles .zshrc) |
| 997 compdef calls | Plugin list inertia (never audited) |
| Eager compinit | Missing -C flag (cache exists, unused) |
| NVM itself | "Standard" tool (slow assumed normal) |
| OMZ keybindings | Invisible until removed |
The shell optimization story has a surface reading — "remove stuff, go faster" — but the interesting part is about measurement scope. zprof is a good tool with a blind spot: it only sees .zshrc. NVM is slow but its real cost is placement in .zshenv, not its absolute runtime. The lesson generalizes beyond shells: if your profiler reports "fast" but the system is slow, the bottleneck is in a phase the profiler doesn't instrument.
| Metric | Before | After |
|---|---|---|
| Shell startup | 557ms | 76ms |
| Profiler-visible cost | 1652ms | 21ms |
| Profiler-invisible cost | ~500ms | ~3ms |
| Node manager | NVM (137ms) | fnm (2.6ms) |