CC-303: Persistent Agent Memory
Learning Guide
Claude Code sessions are ephemeral by default. Every conversation starts from zero -- no knowledge of past decisions, learned preferences, or project history. Persistent agent memory changes this fundamentally. It gives Claude Code a durable knowledge layer that spans sessions, enabling it to accumulate expertise about your projects, remember your preferences, and build on past work rather than rediscovering it. This module covers the full memory architecture: from low-level storage tools to high-level workflow patterns.
MCP-Based Memory Systems
Claude Code's memory system is built on the Model Context Protocol (MCP), the same open standard used for tool integration. An MCP memory server runs alongside Claude Code, providing specialized tools for storing and retrieving memories. This architecture means memory is not a proprietary black box -- it's a server you control, with data stored where you choose, in formats you can inspect and modify.
The memory server exposes tools through MCP just like any other tool server. Claude Code calls memory_store to save information and memory_recall to retrieve it. The server handles persistence, indexing, and semantic search behind the scenes.
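The shape of this tool surface can be pictured with a minimal in-memory sketch. Naive keyword overlap stands in for the server's semantic search here, and the function names merely mirror the tool names -- a real MCP memory server persists to disk and ranks by embedding similarity:

```typescript
// Illustrative sketch only: an array plays the role of the persistent store,
// and keyword overlap plays the role of semantic search.
type Memory = { content: string; type: string; tags: string[] };

const store: Memory[] = [];

function memoryStore(mem: Memory): void {
  store.push(mem);
}

function memoryRecall(query: string): Memory[] {
  const words = query.toLowerCase().split(/\s+/);
  // Rank memories by how many query words appear in the content or tags.
  return store
    .map((m) => ({
      m,
      score: words.filter(
        (w) => m.content.toLowerCase().includes(w) || m.tags.includes(w)
      ).length,
    }))
    .filter((s) => s.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((s) => s.m);
}
```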
The memory_store and memory_recall Tools
memory_store
This tool persists a piece of information with metadata that aids future retrieval. When you store a memory, you provide:
- Content: The information itself -- a fact, preference, decision, or context note.
- Type: The memory category (preference, fact, context, decision, procedure, trajectory, learning).
- Tags/Keywords: Metadata that helps the recall engine surface this memory when relevant.
// Storing a project decision
memory_store({
  content: "PostgreSQL 16 chosen over MySQL for JSONB support and row-level security. Decision made 2025-01-15.",
  type: "decision",
  tags: ["database", "postgresql", "architecture"]
})
memory_recall
This tool retrieves relevant memories based on a natural language query. The recall engine uses semantic similarity (not just keyword matching) to find memories that are contextually relevant, even if they don't share exact words with the query.
// Retrieving relevant context
memory_recall({
  query: "database schema conventions for this project"
})
// Returns: decision about PostgreSQL, schema naming conventions,
// column naming patterns, index strategies -- all previously stored
Memory Types in Depth
Different types of information serve different purposes. Categorizing memories by type enables smarter recall and lifecycle management.
- Preference: How the user likes things done. Editor choices, coding style, communication preferences, tool configurations. These rarely change and should be recalled frequently. Example: "User prefers vi, never nano."
- Fact: Objective information about the project or environment. Schema details, API patterns, architecture decisions. Facts are durable and form the foundation of project knowledge. Example: "The users table has a firstName column, not displayName."
- Context: Session-specific information that may be relevant to future sessions. Current state of work, blockers, in-progress features. Context memories have shorter lifespans than facts. Example: "Migration from class components to hooks is 60% complete, 80 components remaining."
- Decision: Recorded choices with rationale. Why a particular library was chosen, why an approach was rejected, what tradeoffs were accepted. Decisions prevent re-litigating settled questions. Example: "Chose Drizzle ORM over Prisma for type safety and migration control."
- Procedure: Step-by-step processes that worked and should be repeated. Deployment workflows, debugging runbooks, setup sequences. Example: "To deploy: run tests, build, push to staging, verify health check, promote to production."
- Trajectory: Records of multi-step problem-solving paths. What was tried, what failed, what worked. Trajectories help avoid repeating failed approaches. Example: "Fixed the SSE memory leak by adding a cancel() handler to the ReadableStream constructor."
- Learning: Insights gained from experience. Gotchas, anti-patterns, non-obvious behaviors. Example: "Drizzle gt(column, value) not gt(value, column) -- column always comes first."
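The seven categories above can be captured as a union type. The field names in this sketch are illustrative, not the actual schema of any memory server:

```typescript
// The seven memory categories, as described in the list above.
type MemoryType =
  | "preference"
  | "fact"
  | "context"
  | "decision"
  | "procedure"
  | "trajectory"
  | "learning";

// Hypothetical record shape; real servers may store additional fields.
interface MemoryRecord {
  content: string;
  type: MemoryType;
  tags: string[];
  createdAt: string; // ISO timestamp, useful for lifecycle decisions
}

// Context memories have shorter lifespans than facts or preferences.
function isShortLived(m: MemoryRecord): boolean {
  return m.type === "context";
}
```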
Semantic Embeddings for Recall
The memory system converts both stored memories and recall queries into semantic embeddings -- high-dimensional vector representations that capture meaning rather than just keywords. This means a query about "database connection pooling" can surface a memory stored as "PostgreSQL pool timeout set to 30 seconds" even though they share no exact terms.
The embedding model runs locally or through a configured API. Vector similarity search (typically cosine similarity) ranks memories by relevance. The top-k most relevant memories are returned to the agent, providing rich context without requiring the agent to know the exact wording used when the memory was stored.
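The ranking step can be sketched over toy vectors. Real embeddings have hundreds of dimensions and come from an embedding model; the 3-dimensional vectors below are purely illustrative:

```typescript
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the contents of the k stored memories most similar to the query.
function topK(
  query: number[],
  memories: { content: string; embedding: number[] }[],
  k: number
): string[] {
  return memories
    .map((m) => ({ content: m.content, score: cosine(query, m.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((m) => m.content);
}
```

Because ranking is by vector similarity rather than shared keywords, "database connection pooling" and "PostgreSQL pool timeout set to 30 seconds" end up near each other in embedding space despite sharing no exact terms.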
Cross-Session Persistence
The defining feature of the memory system is persistence across sessions. When you close a Claude Code session, memories survive. When you start a new session tomorrow -- or next month -- the agent can recall everything that was stored. This creates a cumulative knowledge effect: the agent gets smarter about your project over time.
Cross-session persistence is especially powerful for long-running projects. Instead of re-explaining your architecture, conventions, and preferences in every session, you store them once. The memory system's auto-recall can even surface relevant memories at session start based on the project context.
Memory Lifecycle: Temporal Classes, Decay, and Deadlines
Not all memories should live forever. The memory lifecycle system manages this through temporal classes:
- Permanent: Core facts, preferences, and decisions that remain valid indefinitely. "The project uses TypeScript strict mode."
- Long-lived: Information valid for weeks or months but expected to change. "Current sprint focus is notifications feature."
- Session-scoped: Context relevant to a specific work session. "Currently debugging the login redirect loop."
- Deadline-bound: Information with a known expiration. "Feature freeze is March 15."
Memory decay models can automatically reduce the retrieval priority of aging memories. A context memory from three months ago is less likely to be relevant than one from yesterday. The memory system can also prune expired or superseded memories to keep the store clean and retrieval fast.
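One common decay model is exponential, with a half-life per temporal class. This is a sketch under assumed half-life values, not the formula any particular server uses:

```typescript
// Retrieval score = similarity * recency weight, where the recency weight
// halves every `halfLife` days. Permanent memories do not decay.
// Half-life values below are illustrative assumptions.
const HALF_LIFE_DAYS: Record<string, number> = {
  permanent: Infinity,
  "long-lived": 90,
  session: 1,
  deadline: 30,
};

function decayedScore(
  similarity: number,
  temporalClass: string,
  ageDays: number
): number {
  const halfLife = HALF_LIFE_DAYS[temporalClass] ?? 30;
  const weight = halfLife === Infinity ? 1 : Math.pow(0.5, ageDays / halfLife);
  return similarity * weight;
}
```

Under this model, a session-scoped context memory from three days ago scores far below a fresh one even at identical similarity, while a permanent preference is unaffected by age.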
The MEMORY.md Index Pattern
While the MCP memory server handles semantic recall, the MEMORY.md file serves as a structured index of critical project knowledge. It lives alongside CLAUDE.md and is loaded at session start. Unlike the memory server (which is queried dynamically), MEMORY.md is always present in context.
MEMORY.md is best used for information that is needed in virtually every session: schema gotchas, API patterns, test infrastructure details, known limitations, and project statistics. It serves as a quick-reference card that prevents the most common mistakes.
# Project Memory
## Key Schema Facts
- users.firstName (NOT displayName)
- agents table: createdBy/ownedBy (NOT ownerId)
## API Patterns
- authenticateRequest(request) -> AuthenticatedRequest | NextResponse
- requirePermission(authResult, 'resource.action') -> null | NextResponse
## Gotchas
- Drizzle gt(column, value) NOT gt(value, column)
- Content-Type enforcement must skip no-body POSTs
Design Pattern: Use MEMORY.md for "always-on" critical facts (schema gotchas, API conventions). Use the MCP memory server for the long tail of contextual knowledge that is queried on-demand.
Auto-Memory in CLAUDE.md
Your CLAUDE.md configuration can instruct Claude Code to automatically recall relevant memories at specific points in the workflow. The SessionStart hook, for example, can trigger an automatic memory_recall with project-relevant keywords, pre-loading the agent with context before you even ask your first question.
Other auto-memory patterns include:
- PreToolUse: Before executing a tool, check memory for known issues or gotchas related to that tool or file.
- PostToolUse: After an error occurs, check memory for previously encountered errors and their solutions.
- PreCompact: Before context compaction, save important context to memory so it survives the compaction.
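These trigger points can be pictured as a dispatch from hook events to memory actions. The event names follow Claude Code's hook names from the list above, but the dispatch function itself is an illustrative sketch -- a real hook would invoke the memory server's tools rather than return a description:

```typescript
type HookEvent = "SessionStart" | "PreToolUse" | "PostToolUse" | "PreCompact";

// Hypothetical mapping of hook events to the memory action each triggers.
function memoryActionFor(event: HookEvent, detail: string): string {
  switch (event) {
    case "SessionStart":
      return `recall: project context for ${detail}`;
    case "PreToolUse":
      return `recall: known gotchas for ${detail}`;
    case "PostToolUse":
      return `recall: past solutions for error ${detail}`;
    case "PreCompact":
      return `store: important context before compaction (${detail})`;
  }
}
```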
The Memory-First Workflow Rule
The most impactful workflow pattern for memory is simple: always check memory before doing anything else. When you need information about a project, a past decision, a configuration detail, or a debugging approach -- recall first. Only explore the codebase, ask the user, or spawn agents if memory doesn't have the answer.
This rule has compound returns. Every time you store useful information, future sessions benefit. Every time you recall before exploring, you save context window tokens and wall-clock time. Over weeks and months, the memory system becomes an invaluable knowledge base that makes the agent increasingly effective.
When to Use Memory vs. Plans vs. Tasks
Claude Code offers several persistence mechanisms. Choosing the right one matters:
- Memory (memory_store/recall): For knowledge that spans sessions and projects. Facts, preferences, decisions, procedures, learnings. Memory is queried by semantic similarity.
- MEMORY.md: For always-on critical knowledge that should be in every session's context. A curated subset of the most important memories.
- Plans: For multi-step implementation roadmaps with specific ordering and dependencies. Plans are consumed and completed, not recalled. Use plans when you have a concrete sequence of work to execute.
- Tasks: For tracking individual work items within a plan or project. Tasks have status (todo, in-progress, done) and are more granular than plans.
The general heuristic: if the information is reusable knowledge, store it in memory. If it's an action to take, put it in a plan or task. If it's critical enough to be in every session, add it to MEMORY.md.
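The heuristic reduces to a small decision function. The categories are the ones above; the function itself is just a restatement of the rule, not a Claude Code API:

```typescript
type Destination = "plan/task" | "MEMORY.md" | "memory server";

// Apply the heuristic: actions go to plans/tasks; always-needed knowledge
// goes to MEMORY.md; other reusable knowledge goes to the memory server.
function whereToPersist(opts: {
  isAction: boolean;           // something to do, not something known
  neededEverySession: boolean; // critical enough for always-on context
}): Destination {
  if (opts.isAction) return "plan/task";
  return opts.neededEverySession ? "MEMORY.md" : "memory server";
}
```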
For the complete reference on Claude Code's memory capabilities, see the Claude Code Memory documentation.