Why AI agents need three types of memory: Neo4j's context graphs

Neo4j proposes a 'context graph' architecture with three memory layers for autonomous agents: long-term knowledge, conversation history and decision traces. Without this structure, agents lose track of their goals and become unpredictable in production.

By Neo4j (Jim Webber, Chief Scientist) · June 1, 2026.

The article by Jim Webber, Chief Scientist at Neo4j, opens with a clear premise: the leap from chatbots to autonomous agents is a matter of data infrastructure, not just of the model. According to Webber, current systems fail in production because their 'memory' amounts to a conversation buffer and a static knowledge base. The agent reads the goal, plans actions, queries facts and runs a similarity search, but after several cycles it loses track of the original plan or of the reasoning behind each decision, and ends up doing something different from what it was asked.

**The concept of the context graph**

The solution Neo4j proposes is called the 'context graph.' Foundation Capital had already identified this pattern as a relevant architectural trend in the infrastructure of agentic systems. What Neo4j adds is an operational taxonomy: instead of treating 'context' as a monolithic category, it breaks it down into three functionally distinct but interconnected memory layers, which together form what Webber calls a 'Sim City data model': composable, flexible and independently processable by layer.

**First layer: long-term memory (enterprise knowledge)**

At the base of the context graph lies enterprise knowledge—slow-moving, almost immutable facts: the geographic location of a building, the capital of a country, the interactions between molecules and receptors in bioinformatics, or the state of a public transport network. This is the agent's long-term memory. Many companies already have high-fidelity knowledge graphs, curated by both experts and algorithms. The rate of change of this data is low, but its value is high: it provides the 'domain truth' that corrects the gaps in the model's training and reduces hallucinations.
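
As a rough illustration of this layer, the following Python sketch stores slow-moving facts as typed, directed edges and lets an agent look them up. It is a minimal in-memory stand-in, not Neo4j's implementation: the `KnowledgeGraph` class, its method names and the sample facts are all hypothetical.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Hypothetical in-memory stand-in for the long-term knowledge layer."""

    def __init__(self):
        # subject -> list of (relation, object) pairs
        self._edges = defaultdict(list)

    def add_fact(self, subject, relation, obj):
        """Store one curated fact as a directed, typed edge (no duplicates)."""
        if (relation, obj) not in self._edges[subject]:
            self._edges[subject].append((relation, obj))

    def facts_about(self, subject):
        """Return every (relation, object) pair recorded for a subject."""
        return list(self._edges.get(subject, []))

    def lookup(self, subject, relation):
        """Return the objects linked to a subject by one relation type."""
        return [o for r, o in self._edges.get(subject, []) if r == relation]


# Sample facts, invented for illustration only.
kg = KnowledgeGraph()
kg.add_fact("France", "HAS_CAPITAL", "Paris")
kg.add_fact("HQ Building", "LOCATED_IN", "Paris")
kg.add_fact("HQ Building", "LOCATED_IN", "Paris")  # duplicate is ignored

capital = kg.lookup("France", "HAS_CAPITAL")
```

Because these facts change rarely, an agent could consult such a store before answering instead of relying on what the model happened to learn in training.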

**Second layer: short-term memory (conversation history)**

At the intermediate level sits the conversation history, which manages more volatile information: what the user asked, what the agent is working on, what it has already completed in previous sessions or what knowledge it needs for the current task. This layer captures the agent's state and the history of messages, sessions and conversations. Its function is to prevent 'context drift': the agent forgetting which tasks it has already performed or which knowledge it has consulted. In addition, this layer enables multi-agent orchestration, since agents can see in real time what each of the other agents in the system is working on.
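
A minimal sketch of what this layer tracks, assuming a simple in-memory model: sessions hold messages and task states, so an agent can check what is already done and see what other agents are working on. The `Session` and `ConversationMemory` names, the status values and the sample data are assumptions made for illustration, not part of Neo4j's library.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    agent_id: str
    goal: str
    messages: list = field(default_factory=list)  # (role, text) pairs, oldest first
    tasks: dict = field(default_factory=dict)     # task name -> "in_progress" | "done"


class ConversationMemory:
    """Hypothetical in-memory stand-in for the conversation-history layer."""

    def __init__(self):
        self._sessions = {}

    def open_session(self, session_id, agent_id, goal):
        self._sessions[session_id] = Session(agent_id, goal)

    def log_message(self, session_id, role, text):
        self._sessions[session_id].messages.append((role, text))

    def set_task(self, session_id, task, status):
        if status not in ("in_progress", "done"):
            raise ValueError(f"unknown status: {status}")
        self._sessions[session_id].tasks[task] = status

    def already_done(self, task):
        """True if any recorded session has completed the task."""
        return any(s.tasks.get(task) == "done" for s in self._sessions.values())

    def active_work(self):
        """Map each agent to the tasks it currently has in progress."""
        work = {}
        for s in self._sessions.values():
            running = [t for t, st in s.tasks.items() if st == "in_progress"]
            if running:
                work.setdefault(s.agent_id, []).extend(running)
        return work


memory = ConversationMemory()
memory.open_session("s1", "researcher", "Summarize Q3 incidents")
memory.log_message("s1", "user", "Summarize Q3 incidents")
memory.set_task("s1", "fetch incident list", "done")
memory.set_task("s1", "draft summary", "in_progress")
memory.open_session("s2", "reviewer", "Check the summary")
memory.set_task("s2", "verify figures", "in_progress")
```

Checking `already_done` before each step is one simple guard against repeating work, and `active_work` gives the shared view that multi-agent orchestration depends on.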

**Third layer: reasoning memory (decision traces)**

At the apex of the context graph sit the decision traces, which capture the agent's internal decision-making processes and its historical record of decisions. Once a decision is made, it—together with the reasoning and the tools used—is stored as a 'decision trace.' Through self-referential activity, the agent can link knowledge and conversations to its own traces to make better decisions and improve its reasoning capacity over time. This layer also provides transparency and explainability for auditing, both by humans and by other agents.
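
To make the idea of a decision trace concrete, here is a small sketch under stated assumptions: each trace is an immutable record of the decision, its reasoning, the tools used and references to the facts and messages that informed it. The `DecisionTrace` fields, the `DecisionLog` class and the reference-id format are hypothetical and do not reflect Neo4j's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionTrace:
    decision: str
    reasoning: str
    tools_used: tuple
    fact_refs: tuple      # ids of long-term facts that were consulted
    message_refs: tuple   # ids of conversation messages that prompted the decision
    timestamp: str        # ISO 8601, UTC


class DecisionLog:
    """Hypothetical append-only store for decision traces."""

    def __init__(self):
        self._traces = []

    def record(self, decision, reasoning, tools_used=(), fact_refs=(), message_refs=()):
        trace = DecisionTrace(
            decision,
            reasoning,
            tuple(tools_used),
            tuple(fact_refs),
            tuple(message_refs),
            datetime.now(timezone.utc).isoformat(),
        )
        self._traces.append(trace)
        return trace

    def traces_citing(self, fact_ref):
        """Find earlier decisions that relied on a given fact."""
        return [t for t in self._traces if fact_ref in t.fact_refs]

    def explain(self, trace):
        """Render a trace as a one-line, human-readable audit entry."""
        tools = ", ".join(trace.tools_used) or "no tools"
        return f"Decided '{trace.decision}' because {trace.reasoning} (tools: {tools})"


log = DecisionLog()
trace = log.record(
    decision="route delivery via Paris depot",
    reasoning="the HQ building is located in Paris",
    tools_used=["geocoder"],
    fact_refs=["fact:hq-located-in-paris"],
    message_refs=["s1:msg0"],
)
audit_line = log.explain(trace)
```

The references are what make the layer self-referential: a later decision can look up `traces_citing` for a fact and reuse or revise the earlier reasoning, and an auditor can follow the same links.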

**Why graphs and not conventional databases**

Webber acknowledges the obvious question: did this not already exist in the form of traditional graph databases? His answer is yes, with a caveat. The three layers can be implemented with a graph database and queried with languages such as Cypher or GQL, but the current trend is toward APIs, such as the Neo4j Agent Memory API, that encapsulate the queries and handle tasks such as entity resolution on the agent's behalf while keeping the data well curated. With these APIs, an agent can simply 'remember' a fact (including entities and their associations) without writing explicit queries.
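
The sketch below shows the general shape of such a wrapper, not the Neo4j Agent Memory API itself, whose interface the article does not detail. A hypothetical `remember` function applies a deliberately naive form of entity resolution (trimming, whitespace collapsing and case folding) and returns a parameterized Cypher `MERGE` statement, so the caller never writes a query by hand. The `Entity` label, the `RELATES_TO` relationship and all property names are assumptions.

```python
import re

# Parameterized Cypher: MERGE creates each node or relationship only if it
# does not already exist, so remembering the same fact twice is idempotent.
REMEMBER_QUERY = (
    "MERGE (s:Entity {key: $subject_key}) "
    "ON CREATE SET s.name = $subject_name "
    "MERGE (o:Entity {key: $object_key}) "
    "ON CREATE SET o.name = $object_name "
    "MERGE (s)-[:RELATES_TO {type: $relation}]->(o)"
)


def canonical_key(raw):
    """Naive entity resolution: trim, collapse whitespace, fold case."""
    return re.sub(r"\s+", " ", raw.strip()).casefold()


def remember(subject, relation, obj):
    """Return the (query, parameters) pair that would store one fact."""
    params = {
        "subject_key": canonical_key(subject),
        "subject_name": subject.strip(),
        "object_key": canonical_key(obj),
        "object_name": obj.strip(),
        "relation": relation,
    }
    return REMEMBER_QUERY, params


# Two spellings of the same entity resolve to the same key.
query, params = remember("  Jim   Webber ", "WORKS_AT", "NEO4J")
_, params_again = remember("jim webber", "WORKS_AT", "Neo4j")
```

With the official Neo4j Python driver, the returned pair could be passed to a session's `run(query, params)` call. Real entity resolution is considerably more involved than key normalization; the point here is only that the query stays hidden behind a single function.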

**Neo4j Agent Memory: the practical implementation**

Neo4j has released Neo4j Agent Memory as an open-source library that runs on any Neo4j instance and packages the three types of context graph memory. The library handles graph schema modeling, the writing of Cypher queries, entity resolution, the addition of connections, summaries and metadata, and the maintenance of clean, well-curated data. It integrates with the main agent frameworks: LangChain, Pydantic AI, LlamaIndex, CrewAI and OpenAI Agents, which makes it possible to add a context graph to an already-built agent without rewriting the architecture. There is also a `create-context-graph` utility to build a complete agentic AI application from scratch, with a backend, a frontend with graph visualization and a domain context graph.

**Implications for agentic AI in production**

Neo4j's proposal tackles a real and documented problem in agentic systems: the lack of structured persistence of reasoning causes failures in long, multi-step tasks. The separation into three layers—stable knowledge, volatile context and decision traces—makes it possible to scale and process each component independently according to its temporal and functional needs. This is especially relevant for enterprise use cases where explainability and auditing are non-negotiable requirements: if the agent can show what decision it made, why and with what tools, operational trust increases significantly.

As sector context, memory management in AI agents is one of the most active problems in research and production in 2025-2026. Projects such as MemGPT (now Letta), LangGraph's memory systems, or OpenAI's own Memory for ChatGPT address parts of the same problem. Neo4j's differentiation lies in the use of the graph as a native data structure to represent not only facts, but also relationships between decisions, conversations and entities—something that approaches based on vector embeddings or relational databases do not capture naturally.

**Risks and opportunities**

Neo4j's bet positions the company at the center of enterprise agent infrastructure, a fast-growing market. The main risk is operational complexity: keeping three interconnected memory layers well curated requires a data discipline that many organizations do not yet have. The opportunity, however, is clear: agents that can explain their decisions, recover from errors and improve over time mark the difference between a prototype and a reliable production system. The open-source release of Neo4j Agent Memory and its integration with the ecosystem's standard frameworks signal that Neo4j is betting on broad adoption rather than lock-in.

**Regulatory perspective**

From the perspective of the EU AI Act, the context graph's decision-trace layer is especially relevant for high-risk AI systems, which are subject to record-keeping, transparency and human-oversight obligations. An architecture that systematically records what the agent decided, why and with what information can support compliance by design, rather than relying on external logging layers added after the fact.

**Forward look**

The three-layer model Neo4j proposes could become a reference pattern for the industry if the community adopts it through the most popular agent frameworks. The integration with MCP (Model Context Protocol)—mentioned in the article—further suggests that the proposal is aligned with the emerging standards for interoperability between agents and tools. If the concept of the context graph gains traction, Neo4j would be well positioned as the de facto knowledge layer for next-generation enterprise agents.

Sources & references

Neo4j — Why AI agents need three types of memory: Neo4j's context graphs