Memory

What It Is

Memory is a pattern where an agent stores information from past interactions and retrieves it when relevant to current tasks. Without memory, every conversation starts from scratch — the agent has no knowledge of previous interactions, decisions, or user preferences. With memory, the agent builds up context over time, becoming more useful and personalized with each interaction.

Memory in AI agents is inspired by human memory systems, but implemented through a combination of databases, embeddings, and retrieval mechanisms rather than biological processes.

Why It Matters

A stateless agent is like a colleague with amnesia — you have to re-explain everything every time. Memory transforms agents from one-shot tools into persistent collaborators that accumulate knowledge, learn preferences, and build on past work.

The landmark Generative Agents paper (Park et al. 2023) demonstrated that agents with memory could form relationships, remember past conversations, and coordinate activities over time — emergent behaviors that weren’t explicitly programmed but arose naturally from the memory architecture.

How It Works

Agent memory systems typically implement multiple types of memory:

Short-term memory (working context)

The current conversation history and any documents loaded into the context window. This is what the agent can “see” right now. It’s limited by the model’s context window size.

Long-term memory (persistent storage)

Information stored outside the context window in a database or file system, retrieved when relevant. This includes:

Episodic memory — Records of specific past interactions (“Last Tuesday, the user asked about exchange policies and preferred email communication”).
Semantic memory — General knowledge extracted from interactions (“This user prefers concise responses” or “This customer is a VIP account”).
Procedural memory — Learned workflows and procedures (“When processing exchanges for this company, always check the VIP discount table first”).

Retrieval

When the agent starts a new task, it queries long-term memory for relevant context. Common retrieval methods include:

Keyword/semantic search — Finding memories related to the current topic using embeddings
Recency weighting — Prioritizing recent memories over older ones
Importance scoring — Surfacing memories that were flagged as significant

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  Current     │────▸│  Retrieval    │────▸│  Enriched    │
│  Task        │     │  (search      │     │  Context     │
│              │     │   memory)     │     │              │
└─────────────┘     └──────┬───────┘     └─────────────┘
                           │
                    ┌──────▼───────┐
                    │  Long-term    │
                    │  Memory Store │
                    └──────────────┘

Example

Customer exchange scenario

A customer contacts support about an exchange. The agent retrieves from memory:

Episodic: “This customer contacted us 3 weeks ago about a defective Blue Widget. We sent a replacement. They mentioned they prefer phone updates over email.”
Semantic: “This customer has been with us for 4 years and has a lifetime value of $2,400. They’re in the VIP tier.”
Procedural: “VIP customers get free expedited shipping on exchanges and a courtesy discount on the replacement item.”

Armed with this context, the agent provides personalized service without the customer having to repeat their history.

Development workflow

A coding agent with memory:

Remembers the project’s architecture decisions from previous sessions
Recalls that the user prefers TypeScript over JavaScript and tabs over spaces
Knows which test framework the project uses without being told each time
Stores debugging insights (“The authentication module has a known issue with token refresh timing”)

Claude Code’s CLAUDE.md files and memory directory are a practical implementation of this pattern — project context and user preferences persist across sessions.

When to Use It

Customer-facing systems where personalization improves the experience
Long-running projects where context accumulates over time
Agents that need to learn from mistakes (paired with Reflection)
Any workflow where repeating context to the agent is a friction point

Reflection — Reflection generates insights that can be stored in memory for future use
Planning — Memory of past plans helps the agent make better plans in the future
Multi-Agent Collaboration — Shared memory allows agents to coordinate without direct communication
Agent Capability Patterns