As models get smarter, engineers spend more time staring at a screen, playing glorified QA tester. This lesson hands that time back.
Linters, IDEs, formatters, type checkers, compilers — nearly all of it was written to make human teams faster. But humans aren't writing most of the code anymore. Agents are. So the toolchain deserves a fresh look.
Formatters, linters, and symbol servers serve agents just as well as they serve us. Claude wields them effectively.
Humans make silent assumptions about a codebase that an agent simply doesn't share.
The question to carry through this lesson: what does an agent need from your codebase that a human takes for granted?
This is advanced material. Before the real techniques land, these prerequisites should already be in place.
Each builds on the one before it. Taken together, they let you work in a way we simply haven't worked before.
Teach Claude to check its own work, so its output becomes reliable.
Once it's reliable, run many Claudes at once with confidence.
Take your keyboard out of the hot path entirely. Claude works while you don't.
Think about the last feature you shipped. How did you verify it — not just the final output, but each iteration along the way? Most software work breaks into the same sequence.
The exact same playbook works for Claude. Nothing about this sequence is uniquely human. The only requirement is giving Claude the right tools and the right instructions to walk through it the way you would.
A loop is an autonomous circuit you complete for Claude so it can hill-climb toward a success criterion. Give it tools to write code and to verify that code, and it cycles (write → check → debug → write again) until it reaches a success state. The pull request that lands in your inbox is higher quality because it has already passed those checks.
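The cycle described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how Claude Code is implemented: `verification_loop`, `write_code`, `run_checks`, and `max_iterations` are hypothetical names, and the toy writer and checker stand in for an agent and a real lint or test suite.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class LoopResult:
    succeeded: bool
    iterations: int
    final_code: str


def verification_loop(
    write_code: Callable[[Optional[str], List[str]], str],
    run_checks: Callable[[str], List[str]],
    max_iterations: int = 5,
) -> LoopResult:
    """Cycle write -> check -> debug until the checks report no failures."""
    code: Optional[str] = None
    failures: List[str] = []
    for iteration in range(1, max_iterations + 1):
        # The writer sees the previous attempt and the failures it produced.
        code = write_code(code, failures)
        failures = run_checks(code)
        if not failures:
            return LoopResult(True, iteration, code)
    # Give up after a bounded number of attempts instead of spinning forever.
    return LoopResult(False, max_iterations, code or "")


# Toy stand-ins (hypothetical): the "agent" fixes the reported failure on the next pass.
def toy_writer(previous: Optional[str], failures: List[str]) -> str:
    if previous is None:
        return "def add(a, b): return a - b"
    if "wrong operator" in failures:
        return previous.replace("a - b", "a + b")
    return previous


def toy_checks(code: str) -> List[str]:
    return [] if "a + b" in code else ["wrong operator"]


result = verification_loop(toy_writer, toy_checks)
print(result)
```

The iteration cap is a design choice worth keeping in any real loop: a success criterion that can never be met should surface as a failure, not as an endless session.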
The core concept never changes: give Claude the tools and instructions to enter a loop. Get that right and the kinds of verification below all merge into one capability.
Drive the browser, prove a fix visually.
Hit endpoints, check responses and state.
The whole app, infrastructure included.
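Of the kinds of verification listed above, the endpoint check is the easiest to show in miniature. The sketch below is an assumption-laden toy: the `/results` path, the `wpm` field, and the in-memory `STATE` are hypothetical stand-ins for a real API and database. It checks both the response and the state the request left behind.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory state standing in for a database.
STATE = {"results": []}


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        STATE["results"].append(body)
        self._send(201, {"saved": True})

    def do_GET(self):
        self._send(200, STATE)

    def _send(self, status, payload):
        data = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):
        pass  # keep the output quiet


def call(url, payload=None):
    """Send GET (no payload) or POST (JSON payload); return (status, parsed body)."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(url, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.status, json.loads(resp.read())


server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: let the OS pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

# Check the response to the write, then check the state it left behind.
post_status, post_body = call(f"{base}/results", {"wpm": 92})
get_status, state = call(f"{base}/results")
server.shutdown()
server.server_close()

print(post_status, post_body, get_status, state)
```

Checking state as well as the response matters: an endpoint can return a success code while persisting nothing.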
Concretely, it boils down to four moves:
None of this is novel — state-setup scripts are old hat in end-to-end testing. The twist: give Claude access to those scripts and keep them dynamic, not prescriptive, so it can do far more than a static script ever could.
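One way to read "dynamic, not prescriptive" is a setup script that takes parameters instead of hard-coding a single fixture. The sketch below is illustrative only: `build_seed_state`, its parameters, and the record shapes are hypothetical and not taken from any real project.

```python
import json
import random
from typing import Any, Dict, List


def build_seed_state(
    users: int = 1, results_per_user: int = 0, seed: int = 0
) -> Dict[str, List[Dict[str, Any]]]:
    """Build a seed dataset sized by the caller rather than fixed in the script."""
    rng = random.Random(seed)  # seeded so a failing scenario can be reproduced
    state: Dict[str, List[Dict[str, Any]]] = {"users": [], "results": []}
    for u in range(users):
        user_id = f"user-{u}"
        state["users"].append({"id": user_id, "name": f"Test User {u}"})
        for _ in range(results_per_user):
            state["results"].append({"user_id": user_id, "wpm": rng.randint(30, 140)})
    return state


# A static script would always create the same fixture. Parameters let the
# agent request the state a given check needs, for example a user with no
# history versus users with several recorded results.
empty_history = build_seed_state(users=1)
busy_history = build_seed_state(users=2, results_per_user=3, seed=42)
print(json.dumps(empty_history))
print(len(busy_history["results"]))
```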
A skill stores arbitrary context about a topic; here, your verification loop. That means you can hand it to a teammate, or to your future self. The best part: you can make it self-improving.
Tell the skill to edit itself every time Claude hits a blocker, and it becomes self-documenting. The Claude Code team runs exactly one verification skill, explicitly told to keep documenting itself. Hit a wall once, and the next person never hits it.
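In practice Claude makes this edit itself when the skill instructs it to. The sketch below only illustrates the effect: a skill file that accumulates blockers and their fixes. The `record_blocker` helper, the "Known blockers" heading, the file contents, and the example blocker are all hypothetical.

```python
import tempfile
from datetime import date
from pathlib import Path

BLOCKERS_HEADING = "## Known blockers"


def record_blocker(skill_path: Path, problem: str, fix: str, today: date) -> bool:
    """Append a blocker and its fix to the skill file; skip exact duplicates.

    Assumes the blockers section is the last section in the file.
    """
    text = skill_path.read_text(encoding="utf-8") if skill_path.exists() else ""
    if f": {problem} -> {fix}" in text:
        return False  # already documented
    if BLOCKERS_HEADING not in text:
        prefix = text.rstrip("\n") + "\n\n" if text.strip() else ""
        text = prefix + BLOCKERS_HEADING + "\n"
    entry = f"- {today.isoformat()}: {problem} -> {fix}"
    skill_path.write_text(text.rstrip("\n") + "\n" + entry + "\n", encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp) / "SKILL.md"
    skill.write_text(
        "# Verify the app\n\nStart the dev server, then drive the browser.\n",
        encoding="utf-8",
    )
    problem = "dev server port already in use"
    fix = "stop the stale process first"
    first = record_blocker(skill, problem, fix, date(2025, 1, 15))
    second = record_blocker(skill, problem, fix, date(2025, 1, 16))  # duplicate
    final_text = skill.read_text(encoding="utf-8")

print(final_text)
```

Skipping duplicates keeps the file readable as it grows, which is what lets the next person benefit from a blocker someone else already hit.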
The live demo used Monkeytype — an open-source typing tester (TypeScript + Express, with MongoDB and Redis). A realistic full-stack app. The arc: drive it by hand with the Chrome MCP, distill the session into a skill, then build a new feature and let the skill verify it.
That's the loop in the wild: write code, hit lint errors, fix them, re-verify — circling until it reaches a good state. Set one up yourself and you'll likely be running in 5–10 minutes.
Hold Claude's hand once. Then let it fold the lesson into a skill it can reuse forever.
Once each Claude is reliable, you can parallelize. The catch: every open session eats your attention, and attention is scarce. Past four or five live sessions, most people stall.
So the whole game is protecting attention. Four surfaces help:
A sidebar with every session across every surface — local terminal, cloud, all git repos. Pin, rename, and color them so you remember what each was doing.
Love the terminal? Run claude agents. Same sidebar idea, sorted by attention needed — anything blocked on a prompt floats to the top.
Decouple sessions from your laptop. Walk between meetings, lose your wifi — it runs on. Start at claude.ai/code.
Run /remote-control and steer any session from your phone. It buzzes when Claude needs input — answer from the car.
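The "sorted by attention needed" idea from the list above amounts to a sort key. The sketch below is not the actual ordering any Claude surface uses: the `Status` values, their ranking, and the `idle_seconds` tiebreaker are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Status(IntEnum):
    # Lower value = needs attention sooner (an assumed ordering).
    BLOCKED_ON_PROMPT = 0
    FAILED = 1
    RUNNING = 2
    DONE = 3


@dataclass
class Session:
    name: str
    status: Status
    idle_seconds: int


def by_attention(sessions: List[Session]) -> List[Session]:
    """Most urgent status first; within a status, longest-waiting first."""
    return sorted(sessions, key=lambda s: (s.status, -s.idle_seconds))


sessions = [
    Session("refactor-auth", Status.RUNNING, 12),
    Session("fix-flaky-test", Status.BLOCKED_ON_PROMPT, 300),
    Session("docs-update", Status.DONE, 900),
    Session("api-migration", Status.BLOCKED_ON_PROMPT, 45),
]
ordered = [s.name for s in by_attention(sessions)]
print(ordered)
```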
The old way: tmux with four panes, each on its own git worktree. It works, but it's a lot to manage. The claude agents view replaced it.
Even multi-Claude isn't enough — you still have to spin up each session with a goal in mind. But a lot of engineering isn't features or bugs. It's bookkeeping that needs a loop, just not you in it.
Review comments, merge conflicts, CI failures. At twenty PRs a day, that eats hours.
Velocity goes up; docs have to keep pace.
Monitoring feedback, keeping the build green.
The /loop command wakes a session on a schedule, runs your prompt, and — if your CLAUDE.md and tools are set up — figures out the rest itself.
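To make the shape of a scheduled background task concrete, here is a generic sketch of "wake on an interval, run a task, go back to sleep". It does not show how /loop is implemented or invoked. `run_on_schedule`, `triage`, the PR names, and the state labels are hypothetical, and the task is a plain function where the real thing would be a prompt.

```python
import time
from typing import Callable, Dict, List


def triage(prs: Dict[str, str]) -> List[str]:
    """Map each PR state to the follow-up it needs; clean PRs need none."""
    actions = {
        "ci_failed": "investigate CI failure",
        "merge_conflict": "rebase and resolve conflicts",
        "review_comments": "address review comments",
    }
    return [
        f"{name}: {actions[state]}"
        for name, state in sorted(prs.items())
        if state in actions
    ]


def run_on_schedule(
    task: Callable[[], List[str]],
    interval_seconds: float,
    runs: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[List[str]]:
    """Run the task `runs` times, sleeping between runs but not after the last."""
    outputs = []
    for i in range(runs):
        outputs.append(task())
        if i < runs - 1:
            sleep(interval_seconds)
    return outputs


# Hypothetical snapshots of PR state seen on two consecutive wake-ups.
snapshots = iter([
    {"pr-101": "ci_failed", "pr-102": "clean"},
    {"pr-101": "clean", "pr-102": "review_comments"},
])
waits: List[float] = []  # record requested sleeps instead of actually waiting
log = run_on_schedule(lambda: triage(next(snapshots)), 600, runs=2, sleep=waits.append)
print(log)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; in normal use the default `time.sleep` applies.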
Routines are /loop running in the cloud, in the same containers as Claude Code on the web. Set one up from the web or desktop app under the Routines tab. Triggers come in two kinds:
Routine work that doesn't need you — handled, on a schedule, without a keyboard.
Reliable verification makes parallelism safe. Parallelism makes background loops worth running. Together they form a system that does real work while you're nowhere near the keyboard.
Spend your attention on the work you actually care about. Delegate the rest — with high reliability and high confidence.