Your CLAUDE.md loads into every request, survives every /compact, and sits in context for the entire session. At 200 lines, it costs roughly 1,500–2,000 tokens — before you have typed a single character. At 500 lines, that number climbs past 5,000. And since Claude Code’s effective working context is closer to 170K tokens after system prompts and tool schemas claim their share of the 200K window, a bloated CLAUDE.md is a tax you pay on every single exchange.
This guide covers how to measure what you actually have, what to cut without breaking things, and a handful of structural patterns that recover 30–50% of token spend while keeping — or improving — Claude’s actual behavior.
Browse real CLAUDE.md examples from open-source projects in our gallery to see how lean files look in practice.
Measure First
Before cutting anything, know what you are working with.
Count tokens in your current CLAUDE.md
# Using the tiktoken CLI (install with: pip install tiktoken)
python3 -c "
import tiktoken
enc = tiktoken.get_encoding('cl100k_base')
with open('.claude/CLAUDE.md', 'r') as f:
text = f.read()
tokens = enc.encode(text)
print(f'{len(tokens)} tokens, {len(text)} chars, {len(text.splitlines())} lines')
"
Claude uses its own tokenizer, but cl100k_base is close enough for planning purposes. Within 5–10%.
Understand the full context tax
CLAUDE.md is not the only cost before your first message. Here is what typically occupies context before you type anything:
| Component | Typical Token Cost |
|---|---|
| Claude Code system prompt | 8,000–12,000 |
| Tool schemas (built-in) | 4,000–6,000 |
| MCP server tool schemas | 1,000–3,000 per server |
Global ~/.claude/CLAUDE.md | 500–2,000 |
Project ./CLAUDE.md | 1,000–5,000+ |
| Total before first message | 25,000–40,000 |
If you have four MCP servers loaded and a 400-line CLAUDE.md, you may be starting every session with 45,000 tokens already consumed. That is 22% of your context window gone before any code is read.
What to Cut (And Why It Is Safe)
Most CLAUDE.md files have grown over time, accumulating rules that felt important when added but rarely affect behavior in practice. Here is where to look.
1. Obvious-to-Claude instructions
Experienced Claude Code users often write instructions that Claude already follows by default:
# These are usually safe to delete
- Write clean, readable code
- Add comments to explain complex logic
- Follow the existing code style in the file
- Use meaningful variable names
- Don't leave console.log statements in production code
Claude Code does these by default. They cost tokens and change nothing. Delete them and test — if behavior degrades, add the specific instruction back.
2. Role-definition preambles
# Before (expensive and ineffective)
You are an expert senior software engineer with 15 years of experience
in TypeScript, React, Node.js, and distributed systems. You write
production-quality code that is tested, documented, and maintainable.
You approach problems methodically and...
# After (5 tokens instead of 50+)
Expert-level TypeScript/React/Node.js. Production quality only.
Role preambles came from the prompt engineering days when they mattered more. With Claude 3.7+, the model understands its role. The preamble adds context weight without changing output quality meaningfully.
3. Redundant negations
# Verbose (every line costs ~15-20 tokens)
- Do not use var, always use const or let
- Do not use require(), always use import
- Do not write functions longer than 50 lines
- Do not use any, always type explicitly
- Do not commit directly to main
# Tighter (same rules, fewer tokens)
TypeScript strict: const/let, import, explicit types, max 50 lines/fn. No direct commits to main.
Same effective behavior, roughly 70% fewer tokens.
4. Instructions that belong in the code
Some rules are better enforced by tooling than by asking Claude to remember them:
# Remove from CLAUDE.md — configure in your tools instead:
- Always run prettier before committing
- Ensure no TypeScript errors before marking a task done
- Run tests after every significant change
Move these to hooks instead:
{
"hooks": {
"PostToolUse": [
{
"matcher": "Write|Edit|MultiEdit",
"hooks": [{
"type": "command",
"command": "cd $CLAUDE_CODE_DIR && npx tsc --noEmit 2>&1 | head -20"
}]
}
]
}
}
A hook that runs TypeScript checks automatically is more reliable than an instruction that Claude may or may not follow. Remove the instruction, keep the enforcement.
The Lazy-Loading Pattern
Not every rule applies to every task. A CLAUDE.md that covers database migrations, API design, frontend components, and CLI tooling loads all of that context even when you are fixing a typo in a README.
The solution is to split your CLAUDE.md into a lean core plus specialized imports.
Core CLAUDE.md (always loaded, target: under 100 lines)
# Project: [name]
## Stack
TypeScript 5.3, React 18, PostgreSQL 16, Node.js 22
## Critical rules
- Never modify schema files without migration
- All API routes require auth middleware
- Environment variables: see .env.example
## Subdirectory guides
- Frontend work → read ./src/app/CLAUDE.md
- API work → read ./src/api/CLAUDE.md
- Database work → read ./prisma/CLAUDE.md
- CI/CD changes → read ./.github/CLAUDE.md
## Run commands
- Dev: `pnpm dev`
- Test: `pnpm test`
- Build: `pnpm build`
Subdirectory CLAUDE.md files (loaded only when relevant)
# ./src/api/CLAUDE.md
## API conventions
- Route handlers in handlers/, business logic in services/
- Return type: `{ data: T } | { error: string }`
- Auth: all routes use `requireAuth()` middleware from lib/auth
- Rate limiting: apply `rateLimit('standard')` to public endpoints
## Common patterns
[specific code examples for API work only]
Claude Code automatically reads the nearest CLAUDE.md in the directory tree. When Claude is working in ./src/api/, it reads both the root CLAUDE.md and ./src/api/CLAUDE.md. When working in ./src/app/, it reads the root and ./src/app/CLAUDE.md. The database rules never load during frontend work.
This cuts effective CLAUDE.md size by 40–70% for any given task, while making the relevant rules more focused and thus more likely to be followed.
Compression Techniques That Work
When a rule needs to stay in the core file, compress it aggressively.
Use reference format instead of prose
# Before: 8 lines, ~80 tokens
When writing API endpoints, always follow RESTful conventions.
Use HTTP verbs correctly: GET for reads, POST for creates,
PUT for full updates, PATCH for partial updates, DELETE for
removes. Return appropriate status codes: 200 for success,
201 for created, 400 for bad request, 404 for not found,
500 for server errors.
# After: 2 lines, ~25 tokens
REST: GET/POST/PUT/PATCH/DELETE. Status: 200 ok, 201 created,
400 bad req, 404 not found, 500 server err.
Claude Code does not need grammatically correct sentences. It needs the information.
Use code blocks for examples, not prose descriptions
A 10-line code example often conveys more than 30 lines of prose — and a code block gets the point across faster. If you need to explain a pattern, show it:
# Component pattern:
```tsx
// Correct
export function UserCard({ user }: { user: User }) {
return <div className="card">{user.name}</div>
}
// Wrong — no default exports, no inline styles
export default function UserCard() { ... }
The code block is unambiguous and more efficient than describing the convention in words.
### Delete the archaeology
CLAUDE.md files accumulate historical context: "we migrated from Redux to Zustand in Q3 2025 so you may see old Redux patterns", "the auth module was rewritten after the security audit — avoid the /legacy/ directory". This context served a purpose when the migration was fresh. After a few months, it is just dead weight.
Audit your file for time-sensitive context that no longer applies. Delete it.
## Measuring the Impact
After restructuring, measure again:
```bash
# Count tokens before and after
python3 -c "
import tiktoken
enc = tiktoken.get_encoding('cl100k_base')
files = ['.claude/CLAUDE.md', 'src/api/CLAUDE.md', 'src/app/CLAUDE.md']
total = 0
for f in files:
try:
with open(f) as fh:
t = len(enc.encode(fh.read()))
print(f'{f}: {t} tokens')
total += t
except FileNotFoundError:
pass
print(f'Total: {total} tokens')
"
A well-optimized setup typically lands at:
- Core CLAUDE.md: 400–800 tokens
- Per-subdirectory CLAUDE.md: 200–500 tokens each
- Any given task loads 600–1,300 tokens of project context
Compare that to a monolithic 500-line CLAUDE.md costing 5,000+ tokens on every request. The savings compound across hundreds of exchanges in a long session.
What Not to Cut
Some content earns its token cost and should stay even in a lean file.
Security and access controls. “Never commit .env files”, “do not log authentication tokens”, “production database credentials are not in this repo — use Vault”. These prevent costly mistakes. Keep them prominent.
Architecture constraints that are non-obvious. “The event bus is single-threaded — never await inside an event handler” is not something Claude will infer. Document it.
The counterintuitive decisions. “We use server-side rendering for the dashboard but CSR for the editor — this is intentional, do not suggest converting to SSR”. Context for decisions that look wrong saves repeated conversations.
Run commands. Claude Code reads these constantly. They should stay in the root CLAUDE.md, not buried in a subdirectory file.
Quick-Start Checklist
Run through this when trimming an existing CLAUDE.md:
- Count current token cost with tiktoken
- Delete all “obvious to Claude” instructions and test
- Convert role preambles to single compressed lines
- Move automated checks to hooks
- Split project-area rules into subdirectory CLAUDE.md files
- Compress remaining prose rules to reference format
- Delete stale migration/historical context
- Recount — target is under 1,000 tokens for the root file
- Test with a real task to verify behavior is unchanged
The goal is not minimalism for its own sake. It is making sure every token in your CLAUDE.md is doing useful work.