Claude Code Architecture

AI Agent from IPO Model to Extension Mechanisms


01. IPO — The Universal Model

Mọi AI agent đều tuân theo mô hình Input → Process → Output, nhưng khác với chương trình truyền thống (1 lần I→P→O), agent chạy vòng lặp ReAct[^1]: output của tool call quay lại thành input cho lần inference tiếp theo. Vòng lặp kết thúc khi response không chứa tool call nào.

---
config:
  theme: base
  themeVariables:
    primaryColor: "#d4edda"
    primaryTextColor: "#1a1a1a"
    primaryBorderColor: "#28a745"
    lineColor: "#666"
    secondaryColor: "#e8daef"
    tertiaryColor: "#d1ecf1"
---
flowchart LR
    subgraph I["📥 INPUT"]
        direction TB
        I1["User prompt"]
        I2["CLAUDE.md / Memory"]
        I3["Conversation history"]
        I4["Tool definitions"]
        I5["Skill descriptions"]
    end

    subgraph P["🧠 PROCESS"]
        direction TB
        P1["LLM Inference"]
        P2["ReAct Loop"]
    end

    subgraph O["📤 OUTPUT"]
        direction TB
        O1["Text response"]
        O2["File changes"]
        O3["Shell commands"]
        O4["Agent results"]
    end

    I ==>|"context window"| P ==>|"generation"| O
    O -.->|"tool results feed back"| I

    style I fill:#d4edda,stroke:#28a745,stroke-width:2px
    style P fill:#e8daef,stroke:#8e44ad,stroke-width:2px
    style O fill:#d1ecf1,stroke:#17a2b8,stroke-width:2px

ReAct Loop

---
config:
  theme: neutral
  look: classic
---
flowchart LR
    A(["Input"]):::green --> B["LLM Inference"]:::purple --> C["PreToolUse Hook"]:::blue --> D["Tool Execution"]:::cyan --> E(["Observation"]):::orange
    E -.->|"loop until done"| A

    classDef green fill:#d4edda,stroke:#28a745,color:#1a1a1a
    classDef purple fill:#e8daef,stroke:#8e44ad,color:#1a1a1a
    classDef blue fill:#cce5ff,stroke:#004085,color:#1a1a1a
    classDef cyan fill:#d1ecf1,stroke:#0c5460,color:#1a1a1a
    classDef orange fill:#ffeeba,stroke:#856404,color:#1a1a1a

    linkStyle 4 stroke:#e07070,stroke-width:2px,stroke-dasharray:5

02. AI Agent Brain — 3 Components

---
config:
  theme: neutral
  look: classic
---
flowchart TB
    LLM["🧠 LLM — The Brain
Reasoning · Planning · Judgment
Code generation · Tool selection"]:::purple CTX["📋 Context Window — Working Memory
1M tokens[^6] · auto-compact at ~95%[^2]
System Prompt + Tools + History + Results"]:::blue MEM["💾 Memory — Long-term Storage
CLAUDE.md · MEMORY.md · Topic files
Load on-demand into context"]:::orange LLM <-->|"inference reads
from context"| CTX MEM -->|"load on-demand"| CTX classDef purple fill:#e8daef,stroke:#8e44ad,stroke-width:2px,color:#1a1a1a classDef blue fill:#cce5ff,stroke:#004085,stroke-width:2px,color:#1a1a1a classDef orange fill:#ffeeba,stroke:#856404,stroke-width:2px,color:#1a1a1a

Human Brain ↔ AI Agent

Human Brain AI Agent
Neurons + Synapses LLM (weights + inference)
Working Memory (7±2 items) Context Window (token limit)
Long-term Memory Memory (CLAUDE.md, MEMORY.md)
Hands / Eyes / Mouth Tools (Read, Edit, Bash...)
Habits / Reflexes Hooks (auto event triggers)
Delegation to others Sub-agents (isolated workers)

WHY CONTEXT WINDOW MATTERS

Context window là nút cổ chai trung tâm của mọi AI agent. Mọi cơ chế mở rộng đều nhằm quản lý hiệu quả tài nguyên này: Tools thực thi ngoài context nhưng kết quả phải quay về. Skills chỉ load mô tả ngắn (budget 1% context window, mỗi skill ≤250 chars)[^10], full load on-demand. MCP defer tool schemas qua ToolSearch[^8] — chỉ tên tool trong context cho đến khi cần. Sub-agents cách ly context riêng[^7], chỉ trả summary. Hooks chạy ngoài context — command hooks không tốn token, nhưng output inject vào context thì có (≤10K chars)[^9].

Context Window Allocation (~1M tokens)

---
config:
  theme: base
  themeVariables:
    pie1: "#8e44ad"
    pie2: "#d4a017"
    pie3: "#2874a6"
    pie4: "#5060c0"
    pie5: "#148f77"
    pie6: "#bdc3c7"
---
pie title What fills the context window
    "System prompt" : 12
    "CLAUDE.md + Memory" : 8
    "Tool schemas" : 6
    "Skill descriptions" : 5
    "Conversation + Results" : 35
    "Available space" : 34

At ~95% usage → auto-compact: summarize old history, preserve recent[^2]


03. Evolution — How Each Mechanism Evolved

Track 1: Actions — Tool → MCP

---
config:
  theme: neutral
  look: classic
---
flowchart LR
    A["Hardcoded
Functions"]:::dim -->|"standardize"| B["Function
Calling"]:::dim -->|"ship built-in"| C["Built-in
Tools"]:::green -->|"open protocol"| D["🌐 MCP"]:::cyan classDef dim fill:#f0f0f0,stroke:#999,color:#666 classDef green fill:#d4edda,stroke:#28a745,color:#1a1a1a classDef cyan fill:#d1ecf1,stroke:#0c5460,color:#1a1a1a,stroke-width:2px

Track 2: Workflows — Command → Skill

---
config:
  theme: neutral
  look: classic
---
flowchart LR
    A["Built-in
/commands"]:::dim -->|"customizable"| B["Custom
Commands"]:::yellow -->|"structured
+ proactive"| C["📘 Skills"]:::accent classDef dim fill:#f0f0f0,stroke:#999,color:#666 classDef yellow fill:#fff3cd,stroke:#856404,color:#1a1a1a classDef accent fill:#d6d9f0,stroke:#5060c0,color:#1a1a1a,stroke-width:2px

Track 3: Context — Single → Multi-Agent

---
config:
  theme: neutral
  look: classic
---
flowchart LR
    A["Single
Context"]:::dim -->|"auto-manage"| B["Auto-
Compaction"]:::orange -->|"isolate"| C["🤖 Sub-Agents"]:::red classDef dim fill:#f0f0f0,stroke:#999,color:#666 classDef orange fill:#ffeeba,stroke:#856404,color:#1a1a1a classDef red fill:#f8d7da,stroke:#721c24,color:#1a1a1a,stroke-width:2px

Track 4: Automation — Manual → Hooks

---
config:
  theme: neutral
  look: classic
---
flowchart LR
    A["Manual
Only"]:::dim -->|"persistent"| B["CLAUDE.md"]:::orange -->|"deterministic"| C["⚙ Hooks"]:::purple classDef dim fill:#f0f0f0,stroke:#999,color:#666 classDef orange fill:#ffeeba,stroke:#856404,color:#1a1a1a classDef purple fill:#e8daef,stroke:#8e44ad,color:#1a1a1a,stroke-width:2px

All 4 Tracks Converge → Plugin

---
config:
  theme: neutral
  look: classic
---
flowchart TB
    T["🌐 MCP"]:::cyan
    S["📘 Skills"]:::accent
    A["🤖 Sub-Agents"]:::red
    H["⚙ Hooks"]:::purple

    T & S & A & H --> PL(["📦 Plugin Package"]):::dim

    classDef cyan fill:#d1ecf1,stroke:#0c5460,color:#1a1a1a,stroke-width:2px
    classDef accent fill:#d6d9f0,stroke:#5060c0,color:#1a1a1a,stroke-width:2px
    classDef red fill:#f8d7da,stroke:#721c24,color:#1a1a1a,stroke-width:2px
    classDef purple fill:#e8daef,stroke:#8e44ad,color:#1a1a1a,stroke-width:2px
    classDef dim fill:#e9ecef,stroke:#6c757d,color:#1a1a1a,stroke-width:3px

04. Sub-Agent — Context Isolation

---
config:
  theme: neutral
  look: classic
---
flowchart TD
    P["🖥 Parent Session
context: task + summaries"]:::accent subgraph agents["Sub-Agents — isolated context each"] direction LR E["🔍 Explore
haiku · read-only"]:::red G["⚡ General
inherit · full tools"]:::red C["🎯 Custom
plugin-defined"]:::red W["🌲 Worktree
git isolation"]:::red end R(["✅ Summary
~500 tok ÷40x"]):::green P -->|"spawn + brief"| agents agents -->|"summary only"| R R -->|"result"| P classDef accent fill:#d6d9f0,stroke:#5060c0,color:#1a1a1a,stroke-width:2px classDef red fill:#f8d7da,stroke:#721c24,color:#1a1a1a classDef green fill:#d4edda,stroke:#28a745,color:#1a1a1a,stroke-width:2px style agents fill:#fff5f5,stroke:#c0392b,stroke-width:1px,stroke-dasharray:5

Context savings: 20K work → 500 tok result = ~40x reduction

  • Problem: Grep 40 files = 20,000 tokens vào context → loãng reasoning.
  • Solution: Agent có context riêng, trả summary only. Parent chỉ tăng 500 tokens.
  • Trade-off: 1.2x-25x API cost, nhưng bảo toàn context quality.

05. Full Architecture Stack

---
config:
  theme: neutral
  look: classic
---
flowchart TB
    subgraph L1["🖥 USER INTERFACE"]
        direction LR
        U1["Terminal"] ~~~ U2["VS Code"] ~~~ U3["JetBrains"] ~~~ U4["claude.ai"] ~~~ U5["Desktop"]
    end

    subgraph L2["⚙ AGENT HARNESS"]
        direction LR
        H1["ReAct Loop"] ~~~ H2["Permissions"] ~~~ H3["Hook Engine"] ~~~ H4["Compaction"] ~~~ H5["Plugin Loader"]
    end

    subgraph L3["📋 CONTEXT WINDOW"]
        direction LR
        C1["System Prompt"] ~~~ C2["CLAUDE.md"] ~~~ C3["Tool Schemas"] ~~~ C4["Skills Meta"] ~~~ C5["Conv + Results"]
    end

    subgraph L4["🛠 CAPABILITIES"]
        direction LR
        T1["Read/Edit/Write"] ~~~ T2["Bash/Glob/Grep"] ~~~ T3["MCP Servers"] ~~~ T4["Skill tool"] ~~~ T5["Agent tool"]
    end

    subgraph L5["💾 PERSISTENCE"]
        direction LR
        P1["Context"] ~~~ P2["CLAUDE.md"] ~~~ P3["Memory files"] ~~~ P4["Git history"] ~~~ P5["External"]
    end

    subgraph L6["🤖 MULTI-AGENT"]
        direction LR
        A1["Explore"] ~~~ A2["General"] ~~~ A3["Plan"] ~~~ A4["Custom"] ~~~ A5["Worktree"] ~~~ A6["Teams"]
    end

    subgraph L7["🧠 LLM FOUNDATION"]
        direction LR
        M1["Opus 4.6"] ~~~ M2["Sonnet 4.6"] ~~~ M3["Haiku 4.5"] ~~~ M4["Prompt Cache"]
    end

    L1 --- L2 --- L3 --- L4 --- L5 --- L6 --- L7

    style L1 fill:#d1ecf1,stroke:#0c5460,stroke-width:2px
    style L2 fill:#e8daef,stroke:#8e44ad,stroke-width:2px
    style L3 fill:#cce5ff,stroke:#004085,stroke-width:2px
    style L4 fill:#d4edda,stroke:#28a745,stroke-width:2px
    style L5 fill:#ffeeba,stroke:#856404,stroke-width:2px
    style L6 fill:#f8d7da,stroke:#721c24,stroke-width:2px
    style L7 fill:#e8daef,stroke:#8e44ad,stroke-width:2px

06. Plugin — The Distribution Unit

---
config:
  theme: neutral
  look: classic
---
flowchart TB
    subgraph PL["📦 PLUGIN PACKAGE"]
        direction LR
        SK["📘 skills/
SKILL.md + refs"]:::accent AG["🤖 agents/
agent-name.md"]:::red CM["📝 commands/
legacy → skills"]:::yellow HK["⚙ hooks/
hooks.json"]:::purple MC["🌐 .mcp.json
server config"]:::cyan end classDef accent fill:#d6d9f0,stroke:#5060c0,color:#1a1a1a classDef red fill:#f8d7da,stroke:#721c24,color:#1a1a1a classDef yellow fill:#fff3cd,stroke:#856404,color:#1a1a1a classDef purple fill:#e8daef,stroke:#8e44ad,color:#1a1a1a classDef cyan fill:#d1ecf1,stroke:#0c5460,color:#1a1a1a style PL fill:#f8f9fa,stroke:#6c757d,stroke-dasharray:5,stroke-width:2px

NATURAL EVOLUTION

  • ToolMCP — chuẩn hoá protocol, giải N×M problem
  • CommandSkill — workflow phức tạp + Claude tự kích hoạt
  • Single contextSub-agent — cách ly context, parallel execution
  • InstructionsHook — deterministic, chạy 100% khi event fire

07. Summary — Everything Manages Context

Mechanism Problem Solved Context Cost Lineage
Built-in Tool LLM cần hành động ngoài text Schema tĩnh + results tích luỹ core primitive
MCP[^3] N×M integration, portability ToolSearch defer[^8], load on-demand Tool → standardized
Command Reusable prompts Full content inject khi gõ core primitive
Skill[^10] Complex workflows + auto-activate 1% context / 8K chars, on-demand Command → proactive
Hook[^9] Deterministic automation Cmd hooks: zero. Inject: ≤10K CLAUDE.md → deterministic
Sub-Agent[^7] Context pollution, parallel exec ~500 tok / 20K work Context → isolated
Plugin Distribution + packaging Bundles all above meta-unit

How Everything Connects

---
config:
  theme: base
  themeVariables:
    primaryColor: "#d6d9f0"
    primaryTextColor: "#1a1a1a"
    primaryBorderColor: "#5060c0"
    lineColor: "#666"
    secondaryColor: "#d4edda"
    tertiaryColor: "#d1ecf1"
    clusterBkg: "#f8f9fa"
    clusterBorder: "#5060c0"
---
flowchart TB
    USER(["👤 User"])

    USER -->|"/slash"| CMD["📝 Command"]:::yellow
    USER -->|"prompt"| CW

    subgraph SESSION["🖥 Claude Code Session"]
        LLM["🧠 LLM"]:::purple
        CW["📋 Context Window"]:::blue
        LLM <--> CW
    end

    CMD -->|"inject prompt"| CW

    CW -->|"Skill tool"| SKILL["📘 Skill"]:::accent
    SKILL -->|"inject workflow"| CW

    CW -->|"tool call"| TOOL["🛠 Built-in Tool"]:::green
    TOOL -->|"result"| CW

    CW -->|"MCP call"| MCP["🌐 MCP Server"]:::cyan
    MCP -->|"result"| CW

    CW -->|"Agent tool"| SUB["🤖 Sub-Agent"]:::red
    SUB -->|"summary only"| CW

    HOOK["⚙ Hook"]:::purple2
    HOOK -->|"auto-fire"| SESSION

    PLUGIN["📦 Plugin"]
    PLUGIN -.->|"provides"| SKILL & SUB & HOOK & MCP

    classDef yellow fill:#fff3cd,stroke:#856404,color:#1a1a1a
    classDef purple fill:#e8daef,stroke:#8e44ad,color:#1a1a1a
    classDef blue fill:#cce5ff,stroke:#004085,color:#1a1a1a,stroke-width:2px
    classDef green fill:#d4edda,stroke:#28a745,color:#1a1a1a
    classDef cyan fill:#d1ecf1,stroke:#0c5460,color:#1a1a1a
    classDef accent fill:#d6d9f0,stroke:#5060c0,color:#1a1a1a
    classDef red fill:#f8d7da,stroke:#721c24,color:#1a1a1a
    classDef purple2 fill:#e8daef,stroke:#8e44ad,color:#1a1a1a

    style SESSION fill:#f8f9fa,stroke:#5060c0,stroke-width:2px
    style PLUGIN fill:#e9ecef,stroke:#6c757d,stroke-dasharray:5
    style USER fill:#fff,stroke:#666

THE UNIFIED PRINCIPLE

Mọi cơ chế mở rộng của Claude Code đều phục vụ một mục tiêu: tối đa hoá chất lượng reasoning trong giới hạn context window.[^11]

Context window là tài nguyên quý nhất. Mọi thiết kế đều xoay quanh nó.


08. References

Academic Papers

[^1]: Yao, S. et al. (2023). ReAct: Synergizing Reasoning and Acting in Language Models. ICLR 2023. arXiv:2210.03629

Official Documentation

[^2]: Claude Code — Auto-compaction threshold ~95%. anthropics/claude-code#15719

[^3]: Model Context Protocol — Architecture: JSON-RPC 2.0, stdio + Streamable HTTP, 3 server primitives (Tools, Resources, Prompts). Launched November 25, 2024. modelcontextprotocol.io

[^6]: Anthropic (2025). Claude Opus 4.6 & Sonnet 4.6 — 1M token context window. platform.claude.com

[^7]: Claude Code — Sub-agents: isolated context, no nesting. code.claude.com/sub-agents

[^8]: Claude Code — MCP tool definitions deferred by default, loaded on-demand via ToolSearch. code.claude.com

[^9]: Claude Code — Hooks: output injected into context capped at 10,000 characters. code.claude.com/hooks

[^10]: Claude Code — Skills: budget 1% context window (fallback 8K chars), each entry ≤250 chars. code.claude.com/skills

[^11]: Claude Code — Agent loop: gather context → take action → verify results. code.claude.com

[^12]: Anthropic (2025). Prompt Caching — cache read tokens at 0.1x base input price. platform.claude.com


CT AI Blog — Cập nhật 04/2026

Bài viết liên quan