How AI Agents Remember Things

AI agents are inherently stateless, with no memory between conversations. This video breaks down how memory systems work in practice, using OpenClaw as a case study to show that you don't need vector databases or complex retrieval pipelines — markdown files and well-timed write mechanisms are enough.

The Problem: Stateless by Default

  • AI models have no memory between calls — each conversation is just an increasingly long context window passed on every turn
  • Without a memory system, every new conversation starts without any context from previous ones

Session vs. Long-Term Memory

Session Memory

  • The history of a single conversation, passed on each subsequent LLM call
  • LLMs have a finite context window, so a process called compaction kicks in as you approach limits
  • Compaction distills conversation history into the most important information to allow the conversation to continue

Three Compaction Strategies

  • Count-based: Triggered when token size or turn count exceeds a threshold
  • Time-based: Triggered in the background when the user stops interacting for a period
  • Event-based / Semantic: Triggered when the agent detects a task or topic has concluded — most intelligent but hardest to implement
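The count-based strategy is the simplest to sketch. Here is a minimal, illustrative implementation (the token heuristic, threshold, and function names are assumptions, not values from the video): when the estimated token count exceeds a limit, older turns are collapsed into a summary while the most recent turns survive verbatim.

```python
# Sketch of count-based compaction. Thresholds and the 4-chars-per-token
# heuristic are illustrative defaults, not anyone's production values.

def estimate_tokens(messages):
    # Rough heuristic: ~4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4

def compact(messages, summarize, max_tokens=8000, keep_recent=6):
    """Summarize older turns when the estimated size exceeds the limit."""
    if estimate_tokens(messages) <= max_tokens:
        return messages  # under the threshold: nothing to do
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = summarize(older)  # in a real system this is an LLM call
    return [{"role": "system",
             "content": f"Summary of earlier turns: {summary}"}] + recent
```

Time-based and event-based triggers would call the same `compact` step; only the condition that fires it changes.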

Long-Term Memory

What survives at the end of a session. Think of the session as a messy desk for your current project, and long-term memory as the filing cabinet where things are categorized and stored.

Google's Memory Framework

From Google's November 2025 white paper "Context Engineering: Sessions and Memory", agent memory breaks into three types:

  • Episodic: What happened in past conversations — events and interactions
  • Semantic: Pure facts and user preferences — what the agent knows about you or a topic
  • Procedural: Workflows and learned routines — how to accomplish a task
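The three types can be modeled as a simple tag on each stored entry. This is a sketch of one possible schema; the white paper defines the taxonomy, not this data structure.

```python
from dataclasses import dataclass
from enum import Enum

class MemoryType(Enum):
    EPISODIC = "episodic"      # what happened: events and interactions
    SEMANTIC = "semantic"      # facts and preferences about the user or a topic
    PROCEDURAL = "procedural"  # how to do things: workflows and routines

@dataclass
class MemoryEntry:
    type: MemoryType
    content: str

# Illustrative entries, one per type:
store = [
    MemoryEntry(MemoryType.EPISODIC, "2026-02-01: debugged the auth flow together"),
    MemoryEntry(MemoryType.SEMANTIC, "User prefers dark mode"),
    MemoryEntry(MemoryType.PROCEDURAL, "Deploy: run tests, tag release, push to main"),
]
```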

What Makes Memory Effective

  • Targeted filtering: Not every detail is worth remembering — extract key concepts and facts, like human memory
  • Consolidation: Collapse duplicate or near-duplicate entries into a single entry (e.g., three separate "dark mode preference" entries become one)
  • Overwriting: Preferences and facts change over time — the system must differentiate and update, or memory becomes noisy and contradictory
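Consolidation and overwriting both fall out naturally if semantic memory is keyed by concept rather than appended as free text. A minimal sketch, with hypothetical key names:

```python
# Keying facts by concept makes duplicates collapse and updates overwrite.
semantic_memory = {}

def remember(key, value):
    """Store or update a fact; repeated writes to the same key consolidate."""
    semantic_memory[key] = value

remember("ui.theme", "dark mode")
remember("ui.theme", "dark mode")   # duplicate: collapses into one entry
remember("ui.theme", "light mode")  # preference changed: overwritten, not contradicted
```

The hard part in a real system is the keying itself: deciding that "likes dark mode" and "prefers the dark theme" are the same concept is an LLM judgment, not a dictionary lookup.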

OpenClaw: A Real-World Example

Three Core Components

  • memory.md file: Semantic memory store with stable facts, preferences, and identity info. Loaded into every prompt with a recommended 200-line cap
  • Daily logs: Episodic memory organized by day. Append-only — entries are added but never removed
  • Session snapshots: Episodic memory triggered when starting a new session. Captures the last 15 meaningful messages (user and assistant only, no tool calls or system messages) as raw conversation text

At its core, OpenClaw's memory is just markdown files. But the files are only half the story — without something that reads and writes them at the right times, they're just sitting there doing nothing.
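The read side of these components can be sketched in a few lines. This assumes a `memory/` directory containing `memory.md` and per-day logs named by ISO date; the layout and function name are illustrative, not OpenClaw's actual paths.

```python
from datetime import date, timedelta
from pathlib import Path

MEMORY_DIR = Path("memory")  # assumed layout, not OpenClaw's exact paths

def load_bootstrap_context(line_cap=200):
    """Build the memory block injected into the prompt at session start."""
    parts = []
    memory_md = MEMORY_DIR / "memory.md"
    if memory_md.exists():
        lines = memory_md.read_text().splitlines()[:line_cap]  # enforce the cap
        parts.append("\n".join(lines))
    # Today's and yesterday's daily logs provide recent episodic context.
    for day in (date.today(), date.today() - timedelta(days=1)):
        log = MEMORY_DIR / f"{day.isoformat()}.md"
        if log.exists():
            parts.append(log.read_text())
    return "\n\n".join(parts)
```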

Four Mechanisms That Make It Work

  • 1. Bootstrap loading at session start: memory.md is automatically injected into the prompt. The agent's instructions tell it to also read today's and yesterday's daily logs for recent context
  • 2. Pre-compaction flush: When nearing the context window limit, a silent agentic turn instructs the LLM to save anything important to the daily log. This turns a destructive operation into a checkpoint, following the write-ahead log pattern from databases
  • 3. Session snapshot on new session: Triggered by /new or /reset commands. A hook grabs the last conversation chunk, filters to meaningful messages, and the LLM generates a descriptive filename
  • 4. User just asks: When a user says "remember this," the agent routes the information to either memory.md (semantic) or the daily log (episodic) based on its instructions
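The filtering step in mechanism 3 can be sketched as: keep only user and assistant turns, drop tool calls and system messages, take the last 15, and render them as plain text. (The descriptive-filename step is an LLM call and is omitted here; function names are hypothetical.)

```python
def snapshot_messages(history, limit=15):
    """Keep the most recent `limit` meaningful messages (user/assistant only)."""
    meaningful = [m for m in history
                  if m["role"] in ("user", "assistant") and m.get("content")]
    return meaningful[-limit:]

def render_snapshot(messages):
    """Render the filtered messages as raw conversation text."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages)
```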

Key Takeaway

You don't need a complex setup to give an agent memory. You just need clear answers to three questions: What's worth remembering? Where does it go? And when does it get written?

Claude Code's memory feature uses the same approach — markdown files. The pattern is simple, effective, and doesn't require vector databases or specialized infrastructure.