Achieving Persistent Memory Across AI Agents with Hooks and Neo4j


In the rapidly evolving landscape of AI coding assistants, developers increasingly rely on tools like Claude Code, Codex, and Cursor to streamline workflows. However, a common challenge emerges: each agent operates in isolation, with no shared, persistent memory. This fragmentation forces users to repeat context and lose progress when switching between harnesses. The solution explored here is a hook-based architecture that uses Neo4j as a graph database to create a unified agentic memory layer. By attaching hooks to these tools, you let them read from and write to a central knowledge store without locking into any single ecosystem. This Q&A covers how hooks work, the role of Neo4j, and the practical benefits of the approach.

# What exactly is unified agentic memory, and why does it matter?

Unified agentic memory refers to a shared, persistent knowledge layer that multiple AI agents—such as Claude Code, Codex, and Cursor—can access and update. Instead of each agent maintaining its own fleeting context window, they all read from and write to a common graph database (like Neo4j). This matters because it eliminates the need to manually re-explain project history, code decisions, or dependencies every time you switch agents. For example, if Claude Code identifies a bug and records the fix in the memory store, Cursor can later pull that information when working on related code. The outcome is a seamless, collaborative experience where agents build on each other's work, drastically reducing redundancy and errors. This persistent memory transforms isolated tools into a cohesive, intelligent system.

Source: towardsdatascience.com

# How do hooks enable persistent memory across different AI harnesses?

Hooks are lightweight callbacks or interception points that integrate directly into the execution pipeline of AI coding assistants. They allow you to inject custom logic before or after key events—like code generation, file analysis, or user queries. In the context of unified memory, hooks are programmed to interact with a Neo4j database: on receiving a response from the agent, a hook extracts relevant entities, relationships, and context, then stores them as nodes and edges in the graph. Conversely, before the agent processes a new prompt, another hook retrieves pertinent past memories from Neo4j and injects them into the prompt. This two-way flow ensures that every agent benefits from collective knowledge, all without modifying the agent's core code. The hook architecture is flexible enough to work with disparate APIs, meaning it can be adapted to Claude Code, Codex, Cursor, or any similar tool.
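As a concrete sketch, the two hooks described above might look like the following. Everything here is illustrative: the regexes, the function names, and the `InMemoryStore` stand-in (used so the sketch runs without a database; a real setup would wrap the official `neo4j` Python driver instead).

```python
import re

class InMemoryStore:
    """Stand-in for a Neo4j-backed store, so the sketch runs without a DB."""
    def __init__(self):
        self.names = set()

    def write(self, name: str) -> None:
        self.names.add(name)

    def read(self, topic: str) -> list:
        return sorted(n for n in self.names if topic.lower() in n.lower())

def extract_entities(agent_output: str) -> list:
    """Pull file paths and function names from agent output with simple
    regexes; a production hook might use an LLM-based extractor instead."""
    files = re.findall(r"[\w./-]+\.(?:py|js|ts)", agent_output)
    funcs = re.findall(r"(?:def|function)\s+(\w+)", agent_output)
    return ([{"type": "File", "name": f} for f in files]
            + [{"type": "Function", "name": fn} for fn in funcs])

def post_response_hook(agent_output: str, store: InMemoryStore) -> None:
    """Runs after the agent responds: persist extracted entities."""
    for entity in extract_entities(agent_output):
        store.write(entity["name"])

def pre_prompt_hook(prompt: str, store: InMemoryStore) -> str:
    """Runs before the agent is called: prepend relevant stored context."""
    memories = store.read(prompt.split()[0])
    if not memories:
        return prompt
    context = "\n".join(f"- known entity: {m}" for m in memories)
    return f"[shared memory]\n{context}\n\n{prompt}"
```

For example, after `post_response_hook("Fixed a bug: def tokenize in utils/parser.py", store)`, a later `pre_prompt_hook("tokenize edge cases", store)` returns the prompt prefixed with a `[shared memory]` block listing the stored entities, while an unrelated prompt passes through unchanged.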

# Why choose Neo4j as the underlying graph database for this memory architecture?

Neo4j is a natural fit because memory for coding agents is inherently graph-like: code modules, functions, variables, bugs, and documentation are all interconnected. A graph database lets you model these relationships explicitly; for example, a bug (node) is fixed by a commit (node) that modifies a function (node). This structure makes retrieval semantically rich: instead of simple keyword lookups, you can traverse relationships to find contextually relevant information. Neo4j's property graph model also supports flexible schemas, so you can evolve your memory schema as your project grows. Additionally, its support for vector search (native vector indexes, plus similarity algorithms in the Graph Data Science library) enables fuzzy retrieval of past conversations or code snippets. Its robustness and transactional guarantees ensure that memory writes are consistent, even when multiple agents access the store concurrently.
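The bug/commit/function example can be expressed directly in Cypher. The labels, relationship types, and property keys below are illustrative assumptions, not a prescribed schema; `record_fix` would be called with a session from the official `neo4j` Python driver.

```python
# Illustrative Cypher; labels and relationship types are assumptions.
RECORD_FIX = """
MERGE (b:Bug {id: $bug_id})
MERGE (c:Commit {sha: $sha})
MERGE (f:Function {name: $fn})
MERGE (c)-[:FIXES]->(b)
MERGE (c)-[:MODIFIES]->(f)
"""

# Traverse relationships instead of keyword-matching: given a function,
# find every bug fixed by a commit that touched it.
RELATED_BUGS = """
MATCH (f:Function {name: $fn})<-[:MODIFIES]-(c:Commit)-[:FIXES]->(b:Bug)
RETURN b.id AS bug, c.sha AS commit
"""

def fix_params(bug_id: str, sha: str, fn: str) -> dict:
    """Bundle query parameters for session.run(RECORD_FIX, **params)."""
    return {"bug_id": bug_id, "sha": sha, "fn": fn}

def record_fix(session, bug_id: str, sha: str, fn: str) -> None:
    """Write one bug-fix fact into the shared graph."""
    session.run(RECORD_FIX, **fix_params(bug_id, sha, fn))
```

Because `MERGE` is idempotent, two agents recording the same fix concurrently converge on one set of nodes rather than duplicating them.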

# How does this approach prevent vendor lock-in while still leveraging proprietary AI tools?

The key is decoupling memory from the agents themselves. By using hooks as intermediaries, the memory store (Neo4j) becomes independent of which AI assistant you use. You are free to switch from Claude Code to Codex or Cursor, or even run multiple at the same time, because all agents interact with the same graph through standardized hook interfaces. The hooks abstract away the specifics of each tool's API and prompt format. For instance, a hook for Claude Code will parse its JSON response into a canonical memory representation, while a hook for Codex might handle a different response structure—but both write to the same Neo4j schema. This means if tomorrow a new superior agent appears, you simply write a new hook for it, and it immediately gains access to all existing project memory. Your investment is in the memory architecture, not in any single tool, reducing risk and increasing long-term flexibility.
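One way to sketch that abstraction: tool-specific adapters that normalize each agent's response into a single canonical record before it is written to Neo4j. The payload field names below are invented for illustration and do not reflect any tool's real API.

```python
from dataclasses import dataclass

@dataclass
class MemoryRecord:
    """Canonical representation every hook writes to the shared graph."""
    agent: str
    kind: str      # e.g. "decision", "fix", "snippet"
    content: str

def from_claude_code(payload: dict) -> MemoryRecord:
    # Hypothetical response shape, for illustration only.
    return MemoryRecord("claude-code", payload.get("type", "note"),
                        payload["text"])

def from_codex(payload: dict) -> MemoryRecord:
    # A different hypothetical shape; the canonical output is the same.
    return MemoryRecord("codex", payload.get("category", "note"),
                        payload["message"])

ADAPTERS = {"claude-code": from_claude_code, "codex": from_codex}

def normalize(agent: str, payload: dict) -> MemoryRecord:
    """Supporting a new agent means adding one entry to ADAPTERS."""
    return ADAPTERS[agent](payload)
```

The design choice is that only the adapters know about tool-specific formats; everything downstream (the Neo4j schema, retrieval queries, context injection) sees `MemoryRecord` alone, which is what makes swapping agents cheap.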


# What are the practical steps to implement hooks for Claude Code, Codex, and Cursor?

Implementation typically involves three phases:

1. Set up Neo4j: deploy a local or cloud instance and define a memory schema (e.g., nodes for concept, code snippet, and decision; edges for refers_to and implements).
2. Build hooks: for each agent, create a small service (e.g., a Python script or Node.js function) that listens to the agent's output via its API or webhook endpoint. On receiving output, the hook extracts key entities (using simple regex or an LLM-based extractor) and writes them to Neo4j using a driver.
3. Inject context: before sending a prompt to the agent, the hook queries Neo4j for relevant context based on the current task (e.g., related files or past fixes) and prepends that as system messages.

Tools like LangChain or custom middleware can streamline this. For detailed guidance, refer to the official Neo4j documentation and the examples in the original post.
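The context-injection phase can be sketched as a small helper that turns retrieved memories into a system message. The retrieval query, the `:Memory` label, and the chat-style message format are assumptions for illustration.

```python
# Assumed retrieval query; the :Memory label and properties are illustrative.
RETRIEVE = """
MATCH (m:Memory)
WHERE m.topic IN $topics
RETURN m.text AS text
ORDER BY m.created DESC
LIMIT $limit
"""

def build_messages(task_prompt: str, memories: list) -> list:
    """Prepend retrieved memories as a system message, if any exist."""
    messages = []
    if memories:
        context = "\n".join(f"- {m}" for m in memories)
        messages.append({"role": "system",
                         "content": f"Shared project memory:\n{context}"})
    messages.append({"role": "user", "content": task_prompt})
    return messages
```

When the graph returns nothing relevant, the helper degrades gracefully to a plain user message, so the hook never blocks the agent.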

# What real-world benefits can teams expect from adopting unified agentic memory?

Teams using this approach report several concrete advantages:

- Reduced context-switching overhead: no need to re-explain the project to each new agent session.
- Improved code consistency: all agents share the same memory of architecture decisions and coding conventions.
- Faster debugging: when one agent discovers a root cause, others can immediately act on that insight.
- Enhanced collaboration: multiple developers, each using their preferred agent, contribute to a shared knowledge base.

Over time, the memory becomes a rich knowledge graph of the entire codebase, including undocumented rationale, that can be queried by both humans and agents. This turns a set of isolated AI tools into a synergistic system, boosting overall productivity and reducing project friction.

# Are there any limitations or best practices to consider when using hooks for memory?

Yes, a few important points:

- Security and privacy: ensure that hooks sanitize data before writing to the graph, especially if sensitive code or proprietary information is involved.
- Memory bloat: without curation, the graph can grow indefinitely; implement periodic summarization or decay (e.g., removing less important nodes).
- Latency: each hook call adds overhead; batch writes and asynchronous processing can mitigate this.
- Schema design: invest time upfront in a flexible graph schema that can evolve with your project.

Also, treat hooks as stateless where possible to simplify scaling. Teams just starting out should begin with a limited scope (e.g., storing only major decisions and bugs) and expand gradually based on observed benefits.
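The memory-bloat point can be addressed with a periodic decay pass. The `:Memory` label and the `importance` and `last_accessed` properties below are assumed, not part of any standard schema; a real deployment would run this on a schedule with a driver session.

```python
# Hypothetical pruning query: drop stale, low-importance memories.
PRUNE = """
MATCH (m:Memory)
WHERE m.importance < $min_importance
  AND m.last_accessed < datetime() - duration({days: $max_age_days})
DETACH DELETE m
"""

def prune_params(min_importance: float, max_age_days: int) -> dict:
    """Validate and bundle parameters for session.run(PRUNE, **params)."""
    if not 0.0 <= min_importance <= 1.0:
        raise ValueError("min_importance must be in [0, 1]")
    if max_age_days <= 0:
        raise ValueError("max_age_days must be positive")
    return {"min_importance": min_importance, "max_age_days": max_age_days}
```

Using `DETACH DELETE` removes a node's relationships along with it, so pruning never leaves dangling edges in the graph.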
