Claude Code Complete Source Code Leaked: What I Found After Reading All 510,000 Lines

Author: Yufeng He
Original Link: https://zhuanlan.zhihu.com/p/2022389695955346888

Today's Big News (March 31, 2026): Anthropic's Claude Code Complete Source Code Leaked

The cause was quite absurd. Someone discovered residual `.map` files in the `@anthropic-ai/claude-code` npm package, containing a download link pointing to an Anthropic R2 storage bucket: no authentication required, direct download. Unzip the archive and 1,903 files, 510,000 lines of TypeScript, are all laid bare.
As a heavy Claude Code user (and an AI Agent engineer), I cloned it immediately upon hearing the news and read through the entire codebase. Many articles online are already recapping the directory structure and tech stack, so I won't cover those. This piece focuses only on what I found genuinely interesting in the code.

Tech Stack First

Bun + TypeScript + React + Ink. Using React for the TUI layer isn't new—Ink has been around since 2017, used by Gatsby CLI and Prisma CLI. However, Claude Code's scenario is far more complex than typical CLIs: multiple Agents running in parallel, streaming output, user interrupts during tool execution, permission pop-ups. At this level of state management complexity, React makes more sense than hand-rolling everything.

Agentic Loop: A Single while(true) Powers the Entire Agent

The core file of Claude Code is src/query.ts at 1,729 lines. Note: this isn't QueryEngine.ts (that's the outer session management). The real "brain" is in query.ts—a while(true) loop.
```typescript
async function* queryLoop(params) {
  let state = { messages, toolUseContext, turnCount: 1, ... }
  while (true) {
    // 1) Preprocessing: trim history, compress context, prefetch memory and skills
    // 2) Call Claude API (streaming)
    // 3) Stream output while watching for tool_use blocks
    // 4) If tool_use found → check permissions → execute → push result to messages → back to while
    // 5) No tool calls → exit
  }
}
```
It looks simple, but the devil is in the details.
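To make the shape concrete, here is a minimal, self-contained sketch of that loop with a stubbed model and tool. `callModel`, `runTool`, and the string-based message format are my inventions for illustration, not Anthropic's actual API:

```typescript
// Minimal agentic-loop sketch mirroring the query.ts shape described above.
// callModel and runTool are stand-in stubs, not the real implementation.

type ToolUse = { name: string; input: string };
type ModelTurn = { text: string; toolUses: ToolUse[] };

// Stub model: asks for one tool on the first turn, then stops.
async function callModel(messages: string[]): Promise<ModelTurn> {
  const sawToolResult = messages.some((m) => m.startsWith("tool_result:"));
  return sawToolResult
    ? { text: "done", toolUses: [] }
    : { text: "let me check", toolUses: [{ name: "read", input: "a.txt" }] };
}

// Stub tool execution.
async function runTool(t: ToolUse): Promise<string> {
  return `contents of ${t.input}`;
}

// The core loop: call the model, execute any tool uses, feed results back,
// and exit only when the model produces no tool calls.
async function queryLoop(initial: string[]): Promise<string[]> {
  const messages = [...initial];
  while (true) {
    const turn = await callModel(messages);   // call model (streaming elided)
    messages.push(`assistant: ${turn.text}`);
    if (turn.toolUses.length === 0) break;    // no tool calls -> exit
    for (const use of turn.toolUses) {        // execute, push result, loop again
      messages.push(`tool_result: ${await runTool(use)}`);
    }
  }
  return messages;
}
```

The termination condition is the whole design: the model, not the harness, decides when the task is finished.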

Context Management: Not One-Size-Fits-All, But Four Surgical Knives

Anyone who's used Claude Code knows it automatically "compresses" long conversations. I previously assumed it just summarized early messages. Reading the source revealed four different compression mechanisms working simultaneously:
  1. HISTORY_SNIP: The finest granularity—directly deletes certain messages without summarization. For example, if a tool returns 500 lines of search results but the model only uses 3 lines, the remaining 497 lines are pure noise. Keeping them wastes tokens; summarizing them also wastes tokens. Deletion is the most economical choice.
  2. Microcompact: Leverages API-layer capabilities to edit at the cache level. It doesn't modify message content but tells the API "these tokens are in your cache but don't use them." This reduces token count without touching the messages.
  3. CONTEXT_COLLAPSE: "Archives" old conversation turns into summaries, maintaining a git log-like structure. Each new query replays this structure. Unlike autocompact, it preserves structure—what happened in which turn and what conclusions were reached remain clear, not mashed into a single summary blob.
  4. Autocompact: The final fallback—calls the model once to compress the entire history into a single paragraph.
These four mechanisms execute sequentially; if earlier ones handle it, the later ones don't trigger. So most of the time, autocompact never runs.
My takeaway: Agent context management can't rely on a single strategy. Intermediate tool outputs might become useless after a few turns, but user requirements need preservation throughout the session. Information has different "expiration dates" and deserves different treatments.
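The "strategies in order, stop when one suffices" pipeline can be sketched roughly as follows. The two stand-in knives and the token accounting here are invented for illustration and do not mirror the real implementations:

```typescript
// Sketch: apply compression strategies from finest to coarsest, stopping
// as soon as the context fits the budget. Names are illustrative only.

type Msg = { role: string; text: string; disposable?: boolean };
const tokens = (ms: Msg[]) => ms.reduce((n, m) => n + m.text.length, 0);

// Finest knife: delete messages marked disposable (e.g. stale tool noise).
function snip(ms: Msg[]): Msg[] {
  return ms.filter((m) => !m.disposable);
}

// Coarser knife: collapse all but the last few messages into a summary stub.
function collapse(ms: Msg[], keep = 2): Msg[] {
  if (ms.length <= keep) return ms;
  const summary: Msg = { role: "system", text: `[summary of ${ms.length - keep} msgs]` };
  return [summary, ...ms.slice(-keep)];
}

// Earlier knives run first; later ones only trigger if still over budget.
function compress(ms: Msg[], budget: number): Msg[] {
  for (const knife of [snip, collapse]) {
    if (tokens(ms) <= budget) return ms;  // an earlier knife sufficed
    ms = knife(ms);
  }
  return ms;
}
```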

Streaming Tool Parallelism: Tools Start Working While the Model Is Still Talking

Typical Agent implementation: wait for model to finish → check for tool calls → execute → return results → next turn. There's obvious waiting time in between.
Claude Code doesn't wait.
```typescript
// StreamingToolExecutor.ts
export class StreamingToolExecutor {
  // Model streams a tool_use block, execution starts immediately
  addTool(block: ToolUseBlock, message: AssistantMessage): void { ... }

  // Concurrency-safe tools run in parallel, write operations are exclusive
  // Results queued in receive order to ensure deterministic output
  async *getRemainingResults(): AsyncGenerator<MessageUpdate> { ... }
}
```
While the model is still streaming subsequent content, earlier tools are already running. Each tool has an isConcurrencySafe flag: read-only operations like file reading and grep can run in parallel; write operations like file editing and bash require exclusive access. Results are buffered in receive order to prevent misordering.
Users of Claude Code may notice its tool execution feels fast—this is one reason. Tool execution latency is hidden within model inference time, barely perceptible to users.
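The interplay of parallel-safe reads, exclusive writes, and submit-order results can be sketched like this. `MiniExecutor` is a simplified stand-in (writes here are serialized only against other writes), not the real StreamingToolExecutor:

```typescript
// Sketch: concurrency-safe tools start immediately and run in parallel;
// unsafe ones queue behind an exclusive chain; results are always yielded
// in the order the tools were added, regardless of finish order.

type Tool = { name: string; isConcurrencySafe: boolean; run: () => Promise<string> };

class MiniExecutor {
  private results: Promise<string>[] = [];
  private writeChain: Promise<unknown> = Promise.resolve();

  addTool(t: Tool): void {
    if (t.isConcurrencySafe) {
      this.results.push(t.run());                  // start right away, in parallel
    } else {
      const p = this.writeChain.then(() => t.run()); // exclusive: wait for prior writes
      this.writeChain = p;
      this.results.push(p);
    }
  }

  // Yield results in submit order to keep the transcript deterministic.
  async *getResults(): AsyncGenerator<string> {
    for (const r of this.results) yield await r;
  }
}
```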

Refusing to Give Up When Hitting Output Limits

```typescript
const MAX_OUTPUT_TOKENS_RECOVERY_LIMIT = 3
```
Model output hit max_output_tokens? The loop doesn't error out; it swallows the error message and silently retries, up to 3 times, completely invisible to users.
Above this code is a comment written in medieval wizard style:
"Heed these rules well, young wizard. For they are the rules of thinking, and the rules of thinking are the rules of the universe. If ye does not heed these rules, ye will be punished with an entire day of debugging and hair-pulling."
Whoever maintains this code has clearly been burned many times.
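The recovery behavior can be sketched as a small wrapper. Only the constant's name and value come from the source; the `Result` shape and `callWithRecovery` helper are illustrative:

```typescript
// Sketch: retry silently (up to the limit) whenever the model call reports
// that it stopped because it hit max_output_tokens.

const MAX_OUTPUT_TOKENS_RECOVERY_LIMIT = 3;

type Result = { text: string; stoppedAtLimit: boolean };

async function callWithRecovery(call: () => Promise<Result>): Promise<Result> {
  let last = await call();
  // Each retry is invisible to the user; nothing is surfaced unless all fail.
  for (let i = 0; i < MAX_OUTPUT_TOKENS_RECOVERY_LIMIT && last.stoppedAtLimit; i++) {
    last = await call();
  }
  return last;
}
```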

Tool System: 40+ Tools, Zero Inheritance

Those who've built Agent frameworks might be used to writing a BaseTool base class and inheriting from it. Claude Code has no inheritance whatsoever—all 40+ tools are pure functional factory functions via buildTool():
```typescript
type ToolDef<T> = {
  name: string
  description: string
  inputSchema: ZodSchema<T>           // Zod v4 for validation + auto-generated JSON Schema
  call(input: T, ctx: ToolUseContext): AsyncGenerator<...>
  isReadOnly(): boolean
  getPermissions(): ToolPermission[]
  renderToolUse?(input: T): ReactNode  // Renders directly to terminal
  getToolUseSummary?(input, result): string  // Summary for context compression
}
```
Each tool is completely self-contained: schema, permissions, execution logic, UI rendering, compression summary—all in one file. No global registry; each session dynamically assembles the tool pool, mixing static tools, MCP tools, and Agent-defined tools.
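A minimal sketch of the inheritance-free factory pattern follows, with a plain validator standing in for the real Zod schema; the `echoTool` example and its field names are hypothetical:

```typescript
// Sketch: no base class, no registry. A tool is just a plain object
// produced by a factory, bundling schema, permissions, and execution.

type ToolDef<T> = {
  name: string;
  description: string;
  validate: (input: unknown) => T;   // stand-in for the real Zod inputSchema
  call: (input: T) => Promise<string>;
  isReadOnly: () => boolean;
};

function buildTool<T>(def: ToolDef<T>): ToolDef<T> {
  return def; // the definition IS the tool; nothing to inherit or register
}

// A toy tool defined entirely in one place: schema, logic, permission flag.
const echoTool = buildTool({
  name: "echo",
  description: "Echo the given text back.",
  validate: (input) => {
    if (typeof input !== "object" || input === null || typeof (input as any).text !== "string") {
      throw new Error("invalid input");
    }
    return input as { text: string };
  },
  call: async ({ text }) => text,
  isReadOnly: () => true,
});
```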
The most complex is BashTool at 1,143 lines. It does far more than exec(command):
  • Auto-parses commands into search/read/write categories for permission matching
  • Uses sandbox-exec sandbox on macOS, seccomp on Linux
  • Auto-backgrounds blocking commands exceeding 15 seconds
  • Large outputs stored to disk, only a file path reference given to the model
  • Built-in sed command parser: detects `sed -i` and switches the UI from "Bash" style to file-editing style
  • Compound commands like ls && git push are split and evaluated for safety segment by segment
The complexity of BashTool alone rivals that of many small Agent frameworks.
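Segment-by-segment evaluation of compound commands might look roughly like this. The naive splitter and the tiny allow-list are illustrations only; a real parser must respect quoting, which this regex does not:

```typescript
// Sketch: split a compound command on shell connectors and approve the whole
// command only if every segment is on a read-only allow-list.

const READ_ONLY = new Set(["ls", "cat", "grep", "pwd", "git status"]);

// Naive split on && || ; -- does NOT handle quoting or subshells.
function splitCompound(cmd: string): string[] {
  return cmd.split(/\s*(?:&&|\|\||;)\s*/).filter(Boolean);
}

function isSafeSegment(seg: string): boolean {
  return [...READ_ONLY].some((ok) => seg === ok || seg.startsWith(ok + " "));
}

// The whole command is auto-approvable only if every segment is safe.
function evaluate(cmd: string): { segments: string[]; allSafe: boolean } {
  const segments = splitCompound(cmd);
  return { segments, allSafe: segments.every(isSafeSegment) };
}
```

This is why `ls && git push` cannot sneak a write past a read-only approval: the `git push` segment fails on its own.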

Feature Flag: The Cleanest Feature Gating I've Ever Seen

Compile-Time: Code Physically Disappears

```typescript
import { feature } from 'bun:bundle'
const voiceModule = feature('VOICE_MODE') ? require('./voice/index.js') : null
```
feature() is a Bun compile-time macro. At build time, it's replaced with true or false; false branches are physically deleted. Not "runtime non-execution"—physically gone from the binary, string literals and all.
Why? Because security researchers decompile binaries looking for hidden features. Runtime flags, even when disabled, leave strings behind. Compile-time DCE (Dead Code Elimination) is true "non-existence."
Ironically, despite all this binary-level protection, it was all undone by a forgotten .map file. I found over 20 compile-time flags, each corresponding to an unreleased feature: VOICE_MODE, BRIDGE_MODE, DAEMON, KAIROS, COORDINATOR_MODE, PROACTIVE, ABLATION_BASELINE, CONTEXT_COLLAPSE, CHICAGO_MCP...

Runtime: GrowthBook A/B Testing

```typescript
const enabled = checkStatsigFeatureGate_CACHED_MAY_BE_STALE(
  'tengu_streaming_tool_execution2'
)
```
Used for gradual rollouts and emergency kill switches. All gate names start with tengu_—Tengu (天狗) is Claude Code's internal project codename. Read from disk cache, accepts stale reads, non-blocking at startup.
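A stale-tolerant, non-blocking gate check can be sketched as follows. Only the `_CACHED_MAY_BE_STALE` naming convention comes from the source; the cache mechanics and `refreshGates` helper are my assumptions:

```typescript
// Sketch: answer gate checks synchronously from a (possibly stale) cache,
// never blocking the caller; refresh happens in the background.

const gateCache = new Map<string, boolean>();

function checkGate_CACHED_MAY_BE_STALE(gate: string, fallback = false): boolean {
  // Never block: use the last-known value, or the fallback if never fetched.
  return gateCache.get(gate) ?? fallback;
}

async function refreshGates(fetchRemote: () => Promise<Record<string, boolean>>): Promise<void> {
  // Background refresh; callers keep getting stale answers meanwhile.
  const fresh = await fetchRemote();
  for (const [k, v] of Object.entries(fresh)) gateCache.set(k, v);
}
```

The loud function name is the point: every caller is forced to acknowledge that the answer may be stale.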

Ablation Experiments: Scientific Method Applied to Product Engineering

This discovery surprised me. There's a flag called ABLATION_BASELINE that, when enabled, turns off thinking mode, context compression, auto-memory, and background tasks all at once:
```typescript
if (feature('ABLATION_BASELINE') && process.env.CLAUDE_CODE_ABLATION_BASELINE) {
  for (const k of [
    'CLAUDE_CODE_DISABLE_THINKING',
    'DISABLE_COMPACT',
    'DISABLE_AUTO_COMPACT',
    'CLAUDE_CODE_DISABLE_AUTO_MEMORY',
    'CLAUDE_CODE_DISABLE_BACKGROUND_TASKS',
  ]) {
    process.env[k] ??= '1';
  }
}
```
Anyone who's done ML research knows ablation studies: turn off components one by one to measure their impact on final results. But applying this methodology to production engineering—in industrial code—is a first for me.
This means Anthropic can run controlled experiments quantifying the value of every new feature (thinking, compact, memory...) before launch. Not "feels useful so ship it"—"data proves it's useful so ship it."

Hidden Features: What's in the Source But Not Yet Released

The features gated by compile-time flags are invisible in public binaries, but the source exposes everything.

Voice Mode (Codename: Amber Quartz)

The src/voice/ directory confirms voice mode exists:
  • Only supports Claude.ai OAuth authentication (API keys, Bedrock, Vertex don't work)
  • Uses dedicated voice_stream endpoint
  • Has emergency kill switch: tengu_amber_quartz_disabled
  • Comments suggest development is complete, just not yet public

Bridge Mode: Turn Your Computer into Claude's Remote Terminal

src/bridge/ contains 31 files implementing a complete remote control system. Run claude remotecontrol and your local environment becomes a "bridge environment" remotely controllable by claude.ai.
Bridge mode supports up to 32 concurrent sessions, with JWT authentication and a trusted-device mechanism. Enterprise admins can disable it via policy. This likely enables the claude.ai web version to operate your local dev environment directly, without manual copy-paste.

Buddy: A Virtual Pet in Your Terminal

This was absolutely the most unexpected discovery in the entire source.
Claude Code contains a complete virtual pet system—and it's not feature-flagged, it's already in every user's binary:
```typescript
// 18 species
export const SPECIES = [
  duck, goose, blob, cat, dragon, octopus, owl, penguin,
  turtle, snail, ghost, axolotl, capybara, cactus, robot,
  rabbit, mushroom, chonk
] as const

// 5 rarity tiers
export const RARITY_WEIGHTS = {
  common: 60, uncommon: 25, rare: 10, epic: 4, legendary: 1,
}

// RPG-style stats
export const STAT_NAMES = ['DEBUGGING', 'PATIENCE', 'CHAOS', 'WISDOM', 'SNARK'] as const
```
18 species, 5 rarity tiers (1% legendary chance), 1% shiny variants. Plus a hat system (crown, top hat, propeller hat, halo, wizard hat, beanie, duck-on-head) and different eye styles. Pet stats are deterministically calculated from user ID using a Mulberry32 PRNG—each user gets one pet, no rerolling.
Finding a complete gacha pet-raising system in a 510,000-line serious engineering project is quite amusing.
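Mulberry32 is a well-known 32-bit PRNG, so the deterministic derivation can be sketched. The seed derivation from a user ID (a naive string hash) and the stat range are my assumptions, and the species list here is truncated:

```typescript
// Sketch: deterministic per-user pet via a Mulberry32 PRNG seeded from the
// user ID. Same ID in, same pet out -- no rerolling.

function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Naive string hash to turn a user ID into a 32-bit seed (illustrative).
function hashId(id: string): number {
  let h = 0;
  for (const ch of id) h = (Math.imul(h, 31) + ch.charCodeAt(0)) | 0;
  return h >>> 0;
}

const SPECIES = ["duck", "goose", "blob", "cat", "dragon", "octopus"] as const;

function petFor(userId: string) {
  const rand = mulberry32(hashId(userId));
  return {
    species: SPECIES[Math.floor(rand() * SPECIES.length)],
    debugging: Math.floor(rand() * 100), // one RPG-style stat, 0-99
  };
}
```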

A Secret Hidden in Pet Names

But more interesting than the pets themselves is how species names are encoded:
```typescript
const c = String.fromCharCode
export const duck = c(0x64, 0x75, 0x63, 0x6b) as 'duck'
export const goose = c(0x67, 0x6f, 0x6f, 0x73, 0x65) as 'goose'
export const capybara = c(0x63, 0x61, 0x70, 0x79, 0x62, 0x61, 0x72, 0x61) as 'capybara'
```
All 18 species names are hex-encoded, none in plaintext. The comment explains: "One species name collides with a model-codename canary in [wurrzag]"
One pet name happens to be Anthropic's internal codename for an unreleased model. The build system greps binaries for blacklist strings, so hex encoding is required to bypass detection.
Which one is it? This requires combining with another leak to figure out.

Model Codenames: Two Leaks Complete the Puzzle

Three days before this npm leak (March 28), Anthropic had another incident: a CMS database with open permissions was accessed by a Fortune reporter, revealing nearly 3,000 internal documents. Among them was mention of an unreleased model called Claude Mythos, internal codename Capybara, positioned as a new tier above Opus.
In Claude Code's source prompts.ts, I found numerous @[MODEL LAUNCH] comments (TODO checklists for new model releases) repeatedly mentioning this name:
```typescript
// @[MODEL LAUNCH]: Update comment writing for Capybara
// —remove or soften once the model stops over-commenting by default
// @[MODEL LAUNCH]: False-claims mitigation for Capybara v8
// (29-30% FC rate vs v4's 16.7%)
```
These TODOs are incomplete, indicating Capybara hasn't launched publicly. But main.tsx already contains model aliases like capybara-fast and capybara-v2-fast, showing Anthropic employees are already using them internally.
Putting it together: Capybara is the internal codename for Mythos, the next-generation flagship stronger than Opus. Opus 4.6 is called claude-opus-4-6 in code—no animal name. The collision in the pet system is this capybara.
The leak also revealed internal data: Capybara v8's false claim rate is 29-30%, nearly double v4's 16.7%. Anthropic didn't roll back versions but added prompt-level instructions as patches, using internal employees as guinea pigs to verify effectiveness.
Another model codename appears in the source:
```typescript
// @[MODEL LAUNCH]: Remove this section when we launch numbat.1
```
Numbat is another pending release; its relationship to Capybara is currently unclear.
Additionally, Claude Code's project codename is Tengu (天狗)—not sure what the moon refers to 😂

Skill System and Multi-Agent Coordination

Briefly mentioning two designs worth referencing.
Skills are just Markdown files. Place .md files in the .claude/skills/ directory and put the description, trigger conditions, allowed tools, and which model to use in YAML frontmatter. Claude Code automatically loads skills when it detects them in the directory, with no explicit registration needed. "Convention over configuration," very Rails.
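For illustration, a hypothetical skill file might look like the following; the exact frontmatter field names are my guesses based on the description above, not the real schema:

```markdown
---
description: Review a pull request for common security issues
trigger: when the user asks for a security review
allowed-tools: [Read, Grep]
model: sonnet
---

Read the changed files, grep for hard-coded secrets and unsafe
eval/exec patterns, and summarize findings by severity.
```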
The multi-Agent coordinator is surprisingly simple. In Coordinator mode, the main Agent has only three tools: spawn worker, send message to worker, stop worker. Workers don't get TeamCreate or SendMessage, preventing infinite nesting. The backend supports three modes: tmux pane, in-process, and remote.

Several Easily Overlooked Engineering Details

Privacy protection written into type names: Analytics data types are named AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS. Using the type name itself to remind developers: "Have you confirmed this isn't code or file paths?" Simple but effective.
Speculative execution: AppState contains speculationState, tracking how each turn ended (bash/file edit/normal exit/permission denied) to predict next actions and pre-execute. This explains why Claude Code sometimes "thinks" then instantly starts working.
Cold start optimization: The --version path achieves zero imports, directly reading compile-time inlined version numbers, exiting without loading any modules. Other subcommands take independent paths. Only the main loop loads the full React app. An 800KB React app—features you don't use don't get loaded, naturally fast startup.

Cause of the Leak

Forgot to delete .map files when publishing to npm; the map referenced an R2 source zip URL with no access control. That simple.
A reminder for all npm package publishers:
  1. Use whitelist approach for package.json files field—only include what you intend to publish
  2. Add CI step to check for .map files in release artifacts
  3. Source archive URLs must have authentication—don't leave them naked on CDN
  4. Build artifacts and source access control should be managed independently
They did so much leak prevention at the binary level (compile-time DCE, hex-encoded species names, excluded-strings blacklist), all undone by a forgotten map file. Security is only as strong as its weakest link—99 out of 100 correct still equals zero if one fails.
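Item 2 on the checklist can be sketched as a small guard. Feeding it the file list reported by `npm pack --dry-run --json` is one plausible wiring; the function names here are mine:

```typescript
// Sketch: CI guard that fails the release if the npm tarball's file list
// contains any sourcemap. The caller supplies the list of files that would
// actually be published.

function findSourcemaps(publishedFiles: string[]): string[] {
  return publishedFiles.filter((f) => f.endsWith(".map"));
}

function assertNoSourcemaps(publishedFiles: string[]): void {
  const leaks = findSourcemaps(publishedFiles);
  if (leaks.length > 0) {
    throw new Error(`refusing to publish, sourcemaps found: ${leaks.join(", ")}`);
  }
}
```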

Final Thoughts

After reading 510,000 lines of code, my biggest takeaway isn't any specific technical brilliance—it's that this team approaches engineering with research methodology.
Ablation experiment infrastructure, dual-layer feature flags, four-granularity context management, streaming tool parallelism—none were added on a whim; each likely has data backing it. This "every feature has quantitative validation" engineering culture is more worth learning than any single technique.
Except for the pet system. That was purely for fun.
I'm still reading the source and may write deeper analysis of specific modules later (the 1,143-line BashTool sandbox and command security mechanisms, or Coordinator mode multi-Agent orchestration details—which would you prefer?).
This is my first time receiving so many likes and favorites as a Zhihu nobody—humbled and grateful. I may post more updates on AI Agent frontiers and thoughts, currently focusing on agents and multi-agents.
My GitHub: https://github.com/he-yufeng
My LinkedIn: www.linkedin.com/in/yufenghe
This article is based on technical analysis of publicly leaked source code. All code copyright belongs to Anthropic.