How AI Agents Remember Things

AI agents are inherently stateless, with no memory between conversations. This video breaks down how memory systems work in practice, using OpenClaw as a case study to show that you don't need vector databases or complex retrieval pipelines — markdown files and well-timed write mechanisms are enough.

The Problem: Stateless by Default

  • AI models have no memory between calls — each conversation is just an increasingly long context window passed on every turn
  • Without a memory system, every new conversation starts without any context from previous ones

Session vs. Long-Term Memory

Session Memory

  • The history of a single conversation, passed on each subsequent LLM call
  • LLMs have a finite context window, so a process called compaction kicks in as you approach limits
  • Compaction distills conversation history into the most important information to allow the conversation to continue

Three Compaction Strategies

  • Count-based: Triggered when token size or turn count exceeds a threshold
  • Time-based: Triggered in the background when the user stops interacting for a period
  • Event-based / Semantic: Triggered when the agent detects a task or topic has concluded — most intelligent but hardest to implement
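The count-based strategy is the simplest to sketch. Here is a minimal, illustrative implementation (the token heuristic, threshold, and function names are assumptions, not values from the video): when the estimated token count exceeds a limit, older turns are collapsed into a summary while the most recent turns survive verbatim.

```python
# Sketch of count-based compaction. Thresholds and the 4-chars-per-token
# heuristic are illustrative defaults, not anyone's production values.

def estimate_tokens(messages):
    # Rough heuristic: ~4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4

def compact(messages, summarize, max_tokens=8000, keep_recent=6):
    """Summarize older turns when the estimated size exceeds the limit."""
    if estimate_tokens(messages) <= max_tokens:
        return messages  # under the threshold: nothing to do
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = summarize(older)  # in a real system this is an LLM call
    return [{"role": "system",
             "content": f"Summary of earlier turns: {summary}"}] + recent
```

Time-based and event-based triggers would call the same `compact` step; only the condition that fires it changes.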

Long-Term Memory

What survives at the end of a session. Think of the session as a messy desk for your current project, and long-term memory as the filing cabinet where things are categorized and stored.

Google's Memory Framework

From Google's November 2025 white paper "Context Engineering: Sessions and Memory", agent memory breaks into three types:

  • Episodic: What happened in past conversations — events and interactions
  • Semantic: Pure facts and user preferences — what the agent knows about you or a topic
  • Procedural: Workflows and learned routines — how to accomplish a task
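The three types can be modeled as a simple tag on each stored entry. This is a sketch of one possible schema; the white paper defines the taxonomy, not this data structure.

```python
from dataclasses import dataclass
from enum import Enum

class MemoryType(Enum):
    EPISODIC = "episodic"      # what happened: events and interactions
    SEMANTIC = "semantic"      # facts and preferences about the user or a topic
    PROCEDURAL = "procedural"  # how to do things: workflows and routines

@dataclass
class MemoryEntry:
    type: MemoryType
    content: str

# Illustrative entries, one per type:
store = [
    MemoryEntry(MemoryType.EPISODIC, "2026-02-01: debugged the auth flow together"),
    MemoryEntry(MemoryType.SEMANTIC, "User prefers dark mode"),
    MemoryEntry(MemoryType.PROCEDURAL, "Deploy: run tests, tag release, push to main"),
]
```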

What Makes Memory Effective

  • Targeted filtering: Not every detail is worth remembering — extract key concepts and facts, like human memory
  • Consolidation: Collapse duplicate or near-duplicate entries into a single entry (e.g., three separate "dark mode preference" entries become one)
  • Overwriting: Preferences and facts change over time — the system must differentiate and update, or memory becomes noisy and contradictory
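Consolidation and overwriting both fall out naturally if semantic memory is keyed by concept rather than appended as free text. A minimal sketch, with hypothetical key names:

```python
# Keying facts by concept makes duplicates collapse and updates overwrite.
semantic_memory = {}

def remember(key, value):
    """Store or update a fact; repeated writes to the same key consolidate."""
    semantic_memory[key] = value

remember("ui.theme", "dark mode")
remember("ui.theme", "dark mode")   # duplicate: collapses into one entry
remember("ui.theme", "light mode")  # preference changed: overwritten, not contradicted
```

The hard part in a real system is the keying itself: deciding that "likes dark mode" and "prefers the dark theme" are the same concept is an LLM judgment, not a dictionary lookup.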

OpenClaw: A Real-World Example

Three Core Components

  • memory.md file: Semantic memory store with stable facts, preferences, and identity info. Loaded into every prompt with a recommended 200-line cap
  • Daily logs: Episodic memory organized by day. Append-only — entries are added but never removed
  • Session snapshots: Episodic memory triggered when starting a new session. Captures the last 15 meaningful messages (user and assistant only, no tool calls or system messages) as raw conversation text

At its core, OpenClaw's memory is just markdown files. But the files are only half the story — without something that reads and writes them at the right times, they're just sitting there doing nothing.
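The read side of these components can be sketched in a few lines. This assumes a `memory/` directory containing `memory.md` and per-day logs named by ISO date; the layout and function name are illustrative, not OpenClaw's actual paths.

```python
from datetime import date, timedelta
from pathlib import Path

MEMORY_DIR = Path("memory")  # assumed layout, not OpenClaw's exact paths

def load_bootstrap_context(line_cap=200):
    """Build the memory block injected into the prompt at session start."""
    parts = []
    memory_md = MEMORY_DIR / "memory.md"
    if memory_md.exists():
        lines = memory_md.read_text().splitlines()[:line_cap]  # enforce the cap
        parts.append("\n".join(lines))
    # Today's and yesterday's daily logs provide recent episodic context.
    for day in (date.today(), date.today() - timedelta(days=1)):
        log = MEMORY_DIR / f"{day.isoformat()}.md"
        if log.exists():
            parts.append(log.read_text())
    return "\n\n".join(parts)
```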

Four Mechanisms That Make It Work

  • 1. Bootstrap loading at session start: memory.md is automatically injected into the prompt. The agent's instructions tell it to also read today's and yesterday's daily logs for recent context
  • 2. Pre-compaction flush: When nearing the context window limit, a silent agentic turn instructs the LLM to save anything important to the daily log. This turns a destructive operation into a checkpoint, following the write-ahead log pattern from databases
  • 3. Session snapshot on new session: Triggered by /new or /reset commands. A hook grabs the last conversation chunk, filters to meaningful messages, and the LLM generates a descriptive filename
  • 4. User just asks: When a user says "remember this," the agent routes the information to either memory.md (semantic) or the daily log (episodic) based on its instructions
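The filtering step in mechanism 3 can be sketched as: keep only user and assistant turns, drop tool calls and system messages, take the last 15, and render them as plain text. (The descriptive-filename step is an LLM call and is omitted here; function names are hypothetical.)

```python
def snapshot_messages(history, limit=15):
    """Keep the most recent `limit` meaningful messages (user/assistant only)."""
    meaningful = [m for m in history
                  if m["role"] in ("user", "assistant") and m.get("content")]
    return meaningful[-limit:]

def render_snapshot(messages):
    """Render the filtered messages as raw conversation text."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages)
```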

Key Takeaway

You don't need a complex setup to give an agent memory. You just need clear answers to three questions: What's worth remembering? Where does it go? And when does it get written?

Claude Code's memory feature uses the same approach — markdown files. The pattern is simple, effective, and doesn't require vector databases or specialized infrastructure.