The harness, not the model — Claude Code in large codebases

Core · for everyone

01 — How Claude reads code

It searches like an engineer, not a database.

Claude Code walks the file system, reads files, greps for what it needs, and follows references across the tree. There is no index to build, maintain, or upload. Every instance works from the live codebase on your machine.

That choice — agentic search instead of retrieval over a pre-built index — is the foundation everything else rests on. Flip between the two approaches below to see why it matters as a codebase grows and changes.

your prompt→ grep · read · follow refs→ live codebase

No pipeline, no central index. The same files the team committed five minutes ago are the files Claude reads. Nothing to fall out of date. The tradeoff: Claude needs enough starting context to know where to look — which is what the rest of this page is about.

your prompt→ embedded index→ retrieved chunks

Embeddings lag behind a busy team. At scale, the pipeline can't re-embed fast enough. Retrieval hands back a function renamed two weeks ago, or a module deleted last sprint — with no flag that either is stale.

returns: getUserToken() — renamed 14 days ago

The quality of Claude's navigation is shaped by how well the codebase is set up. Teams that invest in setup see better results.

Ask it to find a vague pattern across a billion lines and you'll hit context limits before the work starts. The fix isn't a bigger model — it's giving Claude a map. That map is the harness.

Core · for everyone

02 — The five extension points

The harness matters as much as the model.

The most common misconception is that Claude Code's ability is defined by the model alone. In practice the ecosystem around the model — the harness — determines performance more than the model does.

Five extension points, plus two capabilities that round out the setup. Order matters: each layer builds on the one before it. Tap any row to expand it.

↓ build in this order — each layer assumes the one above ↓

CLAUDE.md filesevery session+

Context files Claude reads automatically — come first.

A root file for the big picture, subdirectory files for local conventions. They give Claude the codebase knowledge it needs to do anything well.

Because they load every session regardless of task, keep them focused on what applies broadly — or they become a drag on performance.

Common mistake: stuffing reusable expertise here that belongs in a skill.

Hookson events+

Scripts at key moments — and they make the setup self-improving.

Most teams think of hooks as guardrails that stop Claude doing something wrong. The more valuable use is continuous improvement: a stop hook can reflect on a session and propose CLAUDE.md updates while context is fresh; a start hook can load team-specific context dynamically.

For linting and formatting, hooks enforce rules deterministically — more consistent than asking Claude to remember an instruction.

Common mistake: using a prompt for something that should run automatically.

Skillson demand+

Expertise available when relevant, without bloating every session.

In a codebase with dozens of task types, not all expertise belongs in every session. Skills use progressive disclosure — specialized workflows load only when the task calls for them. A security-review skill loads when assessing vulnerabilities; a doc-processing skill loads when a change needs docs updated.

Skills can be scoped to paths: a payments team binds their deploy skill to that directory, so it never auto-loads elsewhere in the monorepo.

Common mistake: loading everything into CLAUDE.md instead.

Pluginsalways, once set+

Bundle skills, hooks, and MCP configs into one installable package.

Good setups tend to stay tribal. A plugin bundles everything so a new engineer gets the same context and capabilities on day one. Updates distribute org-wide through managed marketplaces.

Common mistake: letting good setups stay tribal instead of packaging them.

LSP integrationsalways, once set+

Symbol-level navigation — search by symbol, not by string.

The Language Server Protocol is what powers "go to definition" and "find all references" in your IDE. Surfaced to Claude, it gives symbol-level precision: follow a call to its definition, trace references across files, distinguish identically named functions. Without it, Claude pattern-matches on text and can land on the wrong symbol.

For multi-language codebases (especially C / C++), this is one of the highest-value investments. LSP is accessed through the plugin layer.

Common mistake: assuming it's automatic — it isn't.

MCP serversalways, once set+

Connect Claude to internal tools, data, and APIs it can't otherwise reach.

The Model Context Protocol is how Claude reaches internal documentation, ticketing systems, or analytics platforms. The most sophisticated teams build MCP servers exposing structured search as a tool Claude calls directly.

Common mistake: building MCP connections before the basics are working.

Subagentswhen invoked+

Split exploration from editing — a delegation capability.

A subagent is an isolated Claude instance with its own context window. It takes a task, does the work, and returns only the final result to the parent. Once the harness is in place, spin up a read-only subagent to map a subsystem and write findings to a file — then the main agent edits with the full picture.

Common mistake: running exploration and editing in the same session.

Core · for everyone

03 — Making code navigable

Give Claude a map it can read in one glance.

Too much context in every session degrades performance; too little leaves Claude navigating blind. These are the upfront investments that make a codebase legible. Tap each one as you put it in place.

✓

Keep CLAUDE.md lean and layered

Root file = pointers and critical gotchas only. Subdirectory files = local conventions. Claude loads them additively as it walks the tree; everything else drifts into noise.

✓

Initialize in subdirectories, not the repo root

Claude works best scoped to the relevant part. It walks up the tree and loads every CLAUDE.md it finds, so root context is never lost — even when you start deep.

✓

Scope test & lint commands per subdirectory

Running the full suite for a one-service change causes timeouts and wastes context. Put the commands that apply locally in that directory's CLAUDE.md.

✓

Use ignore rules to exclude noise

Commit permissions.deny rules in .claude/settings.json to version-control the exclusions (generated files, build artifacts, third-party code), so every developer gets the same noise reduction.

✓

Build a codebase map when structure doesn't

A lightweight markdown file listing each top-level folder with a one-line description gives Claude a table of contents to scan before opening files. Layer it for hundreds of folders.

✓

Run LSP servers so search is by symbol

Grep for a common function name returns thousands of matches; Claude burns context figuring out which one matters. LSP filters to the same symbol before Claude reads anything.

0 / 6 in place — start with the top of the list.

One caveat: even the hierarchical CLAUDE.md approach has edge cases — codebases with hundreds of thousands of folders, or legacy systems on non-Git version control. Those are coming in future installments of the series.

Core · for everyone

04 — Maintenance over time

Yesterday's workaround is tomorrow's straitjacket.

As models improve, instructions written for today's model can work against a future one. A rule that kept an earlier model on track can stop a newer one from doing work it now handles well. Toggle the model below and watch the same rule flip from helpful to harmful.

CLAUDE.md · line 47

"Break every refactor into single-file changes."

Helpful. This earlier model loses the thread on multi-file edits, so the constraint keeps it on track.

The same applies to skills and hooks built to compensate for a specific limitation. A hook that intercepted writes to enforce p4 edit in a Perforce repo became redundant the moment Claude Code shipped native Perforce mode.

Plan a real configuration review every three to six months — and whenever performance feels like it's plateaued after a major model release.

At scale · for teams & orgs

05 — The organizational layer

Technical config alone doesn't drive adoption.

The rollouts that spread fastest made a dedicated infrastructure investment before broad access. A small team — sometimes one person — wired up the tooling so Claude already fit developer workflows on first contact. First experiences were productive, and adoption spread from there.

Who owns it — pick the level that fits

Minimum viable

A DRI — one directly responsible individual

One person with ownership over the configuration and the authority to make calls on settings, permissions policy, the plugin marketplace, and CLAUDE.md conventions — plus the responsibility to keep them current.

Emerging role

An agent manager

A hybrid PM / engineer function dedicated to managing the Claude Code ecosystem. Tends to sit under developer experience or developer productivity — the team already responsible for onboarding and tooling.

Full investment

A dedicated team, in place before rollout

At one company, a couple of engineers built a suite of plugins and MCPs available on day one. At another, a whole team had the infrastructure ready before access opened.

Bottoms-up enthusiasm fragments without someone to centralize what works. Someone has to assemble and evangelize the conventions — a standardized CLAUDE.md hierarchy, a curated set of skills and plugins. Without that, knowledge stays tribal and adoption plateaus.

At scale · for teams & orgs

06 — Governance, early

Answer the hard questions before they're asked.

In large organizations — especially regulated ones — governance questions surface fast. Three come up almost every time:

Who controls which skills and plugins are available?

How do you stop thousands of engineers rebuilding the same thing?

How does AI-generated code go through the same review as human code?

The pattern that works: start with a defined set of approved skills, required code-review processes, and limited initial access — then expand as confidence builds. The smoothest deployments stood up a cross-functional working group early, bringing engineering, infosec, and governance together to define requirements and build a rollout roadmap.

07 — Check yourself

Did the layering land?

Three questions. The traps are the real-world confusions the article calls out.

1. You have reusable expertise for security reviews. Where does it belong?

Right: skills use progressive disclosure. Putting it in CLAUDE.md loads it every session and competes for context — the most common mistake.

2. In a monorepo, where should you initialize Claude for a task?

Right: scoping to the relevant subdirectory keeps Claude focused. It automatically walks up the tree and loads every CLAUDE.md along the way, so nothing is lost.

3. Performance plateaus right after a major model release. First move?

Right: instructions written for an older model can fight a newer one. A config review every 3–6 months — and after big releases — is the recommended habit.

It searches like an engineer, not a database.

The harness matters as much as the model.

Give Claude a map it can read in one glance.

Yesterday's workaround is tomorrow's straitjacket.

Technical config alone doesn't drive adoption.

Who owns it — pick the level that fits

Answer the hard questions before they're asked.

Did the layering land?

Want this adapted to your repo and workflow?