Paper 3 · ArgosBrain research

Zero-Cost Graph Retrieval at Compiler-Grade Depth for AI Coding Agents


Author
Aurelian Jibleanu
Affiliation
Neurogenesis
Date
April 21, 2026
arXiv
cs.SE / cs.PL
License
CC BY 4.0
Keywords
code memory, graph retrieval, SCIP, language servers, tree-sitter, MCP, AI coding agents

Abstract

We describe Neurogenesis, a graph-first code-memory engine that answers structural retrieval queries for AI coding agents without any LLM call on the read path. The engine ingests source code into a canonical-identifier graph via a tiered pipeline that selects the highest-precision indexing technology available per language — compiler-grade SCIP indexers where mature, live language-server workspaces where not, and bespoke tree-sitter semantic walkers for the long-tail remainder. The retrieval API exposes structural primitives — symbol existence, member resolution, containment enumeration, call-graph traversal, override resolution — directly as deterministic graph operations. File-hash content detection invalidates stale subgraphs on source-tree changes, making re-ingest cost linear in the number of changed files rather than the repository size. Ingest operates in isolated subprocesses with bounded lifetimes; a crashing language server cannot affect the retrieval hot path, which is in-process Rust reading from local bincode-serialised graph storage. We report P99 retrieval latency at or below a single millisecond across 16 benchmark corpora, memory footprint in low hundreds of megabytes for repositories of several hundred thousand symbols, and zero monetary cost per thousand retrieval queries. We discuss the design-space alternatives rejected and limitations that remain.

Figure 1 — Neurogenesis component block diagram. A tiered ingest pipeline (running inside bounded-lifetime subprocesses) builds a canonical-identifier graph; file-hash invalidation keeps it current. The in-process retrieval API serves any MCP-compatible agent over stdio, with no LLM on the read path.

Introduction

AI coding agents that persist knowledge between sessions need a memory layer whose cost, latency, and accuracy match the expectations of interactive developer work. Three cost dimensions matter: dollars per query (charged by embedding or LLM API calls on the retrieval path), milliseconds per query (P99 matters more than P50 for interactive use), and staleness after source-tree changes (a refactor that renames several hundred symbols should not require re-ingesting the entire repository). Existing general-purpose memory systems for agents typically optimise one dimension at the expense of the others: dollar-cheap retrieval at the cost of LLM calls on writes; fast retrieval at the cost of accuracy on structural code queries; accurate retrieval at the cost of expensive re-ingestion.

This paper describes Neurogenesis, a memory engine specifically designed for the code-memory workload identified in companion work [Jibleanu, 2026a; Jibleanu, 2026b]. Neurogenesis optimises for the structural-query-dominated distribution coding agents actually issue, and accepts the corresponding design constraints: a graph-first storage layer, compiler-grade ingest where possible, a zero-LLM hot path, and content-hash-based incremental updates. The engine serves as the reference adapter in the LongMemCode benchmark [Jibleanu, 2026a] and is the subject of the measurements reported there.

The contributions of this paper are: (1) a high-level architecture description of a tiered, graph-first, in-process code-memory engine; (2) a justification of each design choice against alternatives, grounded in the structural-versus-semantic taxonomy [Jibleanu, 2026b]; (3) measured operational properties — latency, footprint, re-ingest cost — for the engine running against real open-source corpora; and (4) an explicit discussion of design-space limits and open problems.

Related Work

4.1 Knowledge-graph memory for agents

Graphiti [Rasmy et al., 2025] and MemGPT / Letta [Packer et al., 2023] are the dominant graph-based agent-memory systems in production use. Both treat memory as a temporal knowledge graph of entities and labelled relations, extracted via LLM from conversational or documentary input. Graphiti requires an external graph database (Neo4j, FalkorDB, Kuzu, or Neptune); Letta maintains a tiered core/archival/recall structure edited by agent self-calls. Both pay LLM cost on write and, in Letta’s case, on read as well. Neither ingests source code as canonical-identifier graphs, and neither exposes structural-code-query primitives.

4.2 Retrieval-augmented code completion

Continue’s @codebase [Continue, 2025] parses source with tree-sitter, embeds top-level function and class bodies, and retrieves top-k chunks on demand. The chunks are text; the retrieval is semantic. Aider’s repository map [Aider, 2023] extracts tree-sitter symbols and ranks files by PageRank over reference edges, injecting the top-ranked identifiers into every prompt. Neither system builds a traversable graph of canonical identifiers, and neither supports queries such as “who overrides method m” or “who calls function f” without falling back to text search.

4.3 Industrial code indexers

SCIP [Sourcegraph, 2023] is an open-source protocol for representing source-code indexing data. SCIP indexers exist for Rust (via rust-analyzer), Python (via a patched pyright), Go (via scip-go), TypeScript / JavaScript (scip-typescript), Java and Scala (via semanticdb and scip-java), PHP (scip-php), Ruby (scip-ruby), C# (scip-dotnet), and Dart (scip_dart). Sourcegraph uses SCIP to power cross-repository code search across billions of lines of code. SCIP is an ingestion format; it is not a memory engine, nor does it expose retrieval APIs designed for agent consumption. Neurogenesis consumes SCIP as one of its ingest backends, alongside others.

4.4 Language-server protocol indices

The Language Server Protocol [Microsoft, 2016] provides textDocument/documentSymbol and workspace/symbol as primitives that can be used to enumerate symbols in a workspace. Some language ecosystems (Kotlin, Swift) have mature language servers but no production-ready SCIP indexer. We use live LSP ingest opportunistically in those cases.

4.5 Tree-sitter-based semantic extraction

Tree-sitter [Brunsfeld, 2018] is an incremental parser-generator framework with grammars for over 100 languages. It produces concrete syntax trees; it does not perform cross-file symbol resolution, type inference, or import resolution. Using tree-sitter for semantic extraction requires per-language walker logic that maps CST nodes to canonical identifiers — a substantial engineering effort per language but the only option for languages without mature SCIP or LSP support.

Design Goals

Neurogenesis is designed against four explicit goals.

G1. Structural correctness at compiler-grade depth. For every language in the target set, structural queries — does this symbol exist, list methods of a class, enumerate overrides — must return exact, reproducible answers. This rules out approximate retrieval on the structural path.

G2. Sub-millisecond P99 retrieval at laptop resource budget. Interactive coding UX lives at the tail. A memory layer that serves an agent mid-task cannot pause the user. Retrieval must be graph-local and in-process; it cannot call out to external services or spawn subprocesses per query.

G3. Zero monetary cost on the retrieval path, forever. The read path must never call an LLM, never call an embedding API, never make a network request. This constrains the storage model (all structure must be pre-computed at ingest time) but removes an entire class of operational failure modes.

G4. Re-ingest cost linear in the diff, not in the repository. Developer workflows issue branch switches, rebases, and partial edits constantly. A memory engine whose ingest cost is proportional to the repository size creates back-pressure on normal git operation. Re-ingest must be O(changed files).

These four goals constrain the design space severely. Most commercial agent-memory products satisfy two or three; we argue Neurogenesis is among the first to satisfy all four on the code-memory workload, at the cost of narrowing the target domain from general memory to code specifically.

Architecture

6.1 Components

Neurogenesis consists of three components, connected in a pipeline:

  • Ingest pipeline: consumes a source-tree commit SHA and produces a canonical-identifier graph persisted to local on-disk storage.
  • Graph store: on-disk bincode-serialised graph, with an in-memory working set for query serving.
  • Retrieval API: exposes the structural query primitives over a stable protocol (MCP stdio for the production deployment, but the API surface is transport-independent).

A persistent file-watcher component is optional and handles the O(changed files) incremental update path.

Figure 1 in this paper shows these components as a block diagram at a level of detail that illustrates the architecture without revealing internal types.

6.2 Tiered ingest pipeline

The ingest pipeline selects one of three backend strategies per language, chosen to maximise structural precision given the tooling available for that language.

Tier 1 — Compiler-grade SCIP indexing. For languages with a mature SCIP indexer, ingest drives the indexer against the source tree. The indexer runs the language’s compiler frontend and produces a SCIP index containing canonical symbol IDs, cross-file references, containment relations, and type information. Neurogenesis parses the SCIP index and inserts its nodes and edges directly into the graph. The indexer subprocess terminates at the end of ingest; no long-lived process is required.

Tier 2 — Live language-server ingest. For languages with a mature language server but no production SCIP indexer, ingest drives the language server over the LSP protocol. The workspace is opened, documentSymbol and workspace/symbol queries enumerate the symbols, and per-language post-processing maps LSP symbol kinds back to our canonical schema. Ingest is guarded by per-file and per-session timeouts, and the language server runs as an isolated subprocess whose lifetime is bounded by the ingest run.

Tier 3 — Bespoke tree-sitter semantic walkers. For languages without either a SCIP indexer or a mature language server, ingest uses tree-sitter grammars augmented with per-language semantic hooks that extract canonical identifiers from the concrete syntax tree. These walkers encode language-specific structural patterns — for example, languages where functions are defined by assignment rather than declaration require recognising assignment-to-function as a function-declaration event; statement-based grammars require per-statement parsing with context preservation; languages with block-label semantics require label-aware walkers. The walkers do not perform cross-file type inference, so their output is structurally shallower than Tier 1 but considerably richer than a generic tree-sitter surface extraction.

The tier is selected at build time based on the source tree’s detected languages. A single ingest run may use all three tiers in parallel across different file subsets of the same repository.
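The tier routing can be sketched as a pure function from detected language to backend. The language lists below are illustrative examples drawn from Sections 4.3 and 4.4, not the engine's actual routing table:

```rust
/// Ingest backend tiers, in descending order of structural precision.
#[derive(Debug, PartialEq)]
enum Tier {
    Scip,       // Tier 1: compiler-grade SCIP indexer
    Lsp,        // Tier 2: live language-server workspace
    TreeSitter, // Tier 3: bespoke tree-sitter semantic walker
}

/// Illustrative tier selection; the language sets are examples from the
/// paper's related-work discussion, not the production routing table.
fn select_tier(language: &str) -> Tier {
    match language {
        // Languages with mature SCIP indexers (Section 4.3).
        "rust" | "python" | "go" | "typescript" | "java" => Tier::Scip,
        // Mature language server, no production SCIP indexer (Section 4.4).
        "kotlin" | "swift" => Tier::Lsp,
        // Long tail: per-language tree-sitter walker.
        _ => Tier::TreeSitter,
    }
}

fn main() {
    assert_eq!(select_tier("rust"), Tier::Scip);
    assert_eq!(select_tier("kotlin"), Tier::Lsp);
    assert_eq!(select_tier("lua"), Tier::TreeSitter);
}
```

Because the routing is per-language rather than per-repository, a polyglot repository naturally exercises several tiers in one ingest run.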

6.3 Graph storage

The graph is a set of nodes representing canonical identifiers and a set of labelled edges representing structural relations between them. Edges carry a label from a fixed schema — containment, reference, inheritance, override, and similar — derived from the tier’s source index.

The graph is persisted on disk in a compact binary serialisation. The hot working set is mapped into memory at retrieval-server startup; cold portions are spilled to disk with an LRU-like policy. Retrieval does not allocate on the typical path: a query walks pre-materialised edges in memory and returns a set of canonical-identifier strings.
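A minimal sketch of the graph shape, assuming an adjacency-list layout and an illustrative subset of the edge schema (the production store is bincode-serialised and memory-mapped, which this sketch does not model):

```rust
use std::collections::HashMap;

/// Edge labels from the fixed schema (illustrative subset).
#[derive(Debug, Clone, Copy, PartialEq)]
enum EdgeLabel {
    Contains,
    References,
    Inherits,
    Overrides,
}

/// Adjacency-list graph over canonical-identifier strings. Sketch only.
struct CodeGraph {
    edges: HashMap<String, Vec<(EdgeLabel, String)>>,
}

impl CodeGraph {
    fn new() -> Self {
        Self { edges: HashMap::new() }
    }

    fn add_edge(&mut self, from: &str, label: EdgeLabel, to: &str) {
        self.edges
            .entry(from.to_string())
            .or_default()
            .push((label, to.to_string()));
    }

    /// Follow all out-edges with the given label: the primitive behind
    /// "list members of", "who overrides", and similar queries.
    fn neighbours(&self, from: &str, label: EdgeLabel) -> Vec<&str> {
        self.edges
            .get(from)
            .map(|es| {
                es.iter()
                    .filter(|(l, _)| *l == label)
                    .map(|(_, to)| to.as_str())
                    .collect()
            })
            .unwrap_or_default()
    }
}

fn main() {
    let mut g = CodeGraph::new();
    g.add_edge("pkg/Shape", EdgeLabel::Contains, "pkg/Shape#area().");
    g.add_edge("pkg/Circle#area().", EdgeLabel::Overrides, "pkg/Shape#area().");

    assert_eq!(g.neighbours("pkg/Shape", EdgeLabel::Contains), vec!["pkg/Shape#area()."]);
    // An unknown identifier deterministically yields the empty set.
    assert!(g.neighbours("pkg/Nonexistent", EdgeLabel::Contains).is_empty());
}
```

The empty-set behaviour on unknown identifiers is the property that lets the engine give exact negative answers for hallucinated symbols, which approximate retrieval cannot.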

6.4 Retrieval API

The retrieval API exposes structural primitives as named operations over the graph. The operation surface in the production deployment includes symbol existence checks, member resolution, containment enumeration, caller enumeration, override enumeration, and a small number of convenience operations for common agent workflows. Each operation translates to a deterministic graph query with predictable latency profile.

The API is transport-independent: the same operations are exposed over MCP stdio for IDE integration and over an in-process Rust interface for embedded use.
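The operation surface can be sketched as a transport-independent request/response pair. The operation names and shapes below are illustrative, not the shipped MCP schema, and the non-existence operations are stubbed to keep the sketch self-contained:

```rust
/// Illustrative retrieval operations; not the shipped MCP schema.
#[derive(Debug)]
enum Query<'a> {
    SymbolExists { id: &'a str },
    MembersOf { id: &'a str },
    CallersOf { id: &'a str },
    OverridesOf { id: &'a str },
}

/// Every operation resolves to a deterministic graph lookup; the response
/// is a boolean or a set of canonical-identifier strings.
#[derive(Debug, PartialEq)]
enum Response {
    Exists(bool),
    Identifiers(Vec<String>),
}

fn serve(query: Query, symbols: &[&str]) -> Response {
    match query {
        Query::SymbolExists { id } => Response::Exists(symbols.contains(&id)),
        // The remaining operations would walk pre-materialised edges in the
        // graph store; stubbed here so the sketch stands alone.
        Query::MembersOf { .. } | Query::CallersOf { .. } | Query::OverridesOf { .. } => {
            Response::Identifiers(Vec::new())
        }
    }
}

fn main() {
    let symbols = ["pkg/Shape", "pkg/Circle"];
    assert_eq!(
        serve(Query::SymbolExists { id: "pkg/Shape" }, &symbols),
        Response::Exists(true)
    );
    assert_eq!(
        serve(Query::SymbolExists { id: "pkg/Square" }, &symbols),
        Response::Exists(false)
    );
}
```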

6.5 Staleness and incremental update

On ingest, every file carries a content hash (a collision-resistant hash of the file bytes) stored alongside its canonical-identifier nodes. On re-ingest, each source file’s current hash is compared against the stored hash; files whose hashes match are skipped entirely without parsing. Files whose hashes differ have their existing subgraph removed and rebuilt. The cost of re-ingest is therefore proportional to the number of changed files, not the repository size.
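The skip logic can be sketched as follows. `DefaultHasher` stands in for the engine's collision-resistant content hash (a real implementation would use something like BLAKE3), and all paths and names are illustrative:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for a collision-resistant content hash (illustrative only).
fn content_hash(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// Compare current file contents against stored hashes and return only
/// the paths whose subgraphs must be removed and rebuilt.
fn changed_files<'a>(
    stored: &HashMap<&'a str, u64>,
    current: &[(&'a str, &'a [u8])],
) -> Vec<&'a str> {
    current
        .iter()
        .filter(|(path, bytes)| stored.get(path) != Some(&content_hash(bytes)))
        .map(|(path, _)| *path)
        .collect()
}

fn main() {
    let mut stored = HashMap::new();
    stored.insert("src/lib.rs", content_hash(b"fn a() {}"));
    stored.insert("src/main.rs", content_hash(b"fn main() {}"));

    // One file edited, one untouched, one newly added.
    let current: Vec<(&str, &[u8])> = vec![
        ("src/lib.rs", b"fn a() { /* edited */ }".as_slice()),
        ("src/main.rs", b"fn main() {}".as_slice()),
        ("src/new.rs", b"fn b() {}".as_slice()),
    ];

    let dirty = changed_files(&stored, &current);
    // Re-ingest cost is proportional to dirty.len(), not current.len().
    assert_eq!(dirty, vec!["src/lib.rs", "src/new.rs"]);
}
```

Note that a newly added file hashes to "no stored entry" and is therefore treated as changed, which is the behaviour the O(changed files) claim requires.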

An optional file-watcher component observes the source tree between ingest runs and updates the graph incrementally on save events. The watcher is guarded by directory skip-lists (excluding build output and dependency folders), debouncing (to fold rapid sequences of save events from editors using atomic-save patterns), and per-subtree rate limits (to prevent runaway processes from wedging the host). Watcher operation is opt-in; the pull-based ingest path remains the correctness path.
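The debounce guard can be sketched as a leading-edge debouncer: fire on the first save, then suppress until a quiet window has elapsed since the most recent event. The 200 ms window below is an assumed value, not the engine's:

```rust
use std::time::{Duration, Instant};

/// Fold rapid save-event bursts (editor atomic-save patterns) into one
/// re-ingest trigger per quiet window. Sketch with an assumed window.
struct Debouncer {
    window: Duration,
    last_event: Option<Instant>,
}

impl Debouncer {
    fn new(window: Duration) -> Self {
        Self { window, last_event: None }
    }

    /// Record an event at `now`; returns true when the event should fire,
    /// i.e. the previous event lies outside the quiet window.
    fn should_fire(&mut self, now: Instant) -> bool {
        let fire = match self.last_event {
            Some(prev) => now.duration_since(prev) >= self.window,
            None => true,
        };
        self.last_event = Some(now);
        fire
    }
}

fn main() {
    let mut d = Debouncer::new(Duration::from_millis(200));
    let t0 = Instant::now();
    assert!(d.should_fire(t0)); // first save fires
    assert!(!d.should_fire(t0 + Duration::from_millis(50))); // burst folded
    assert!(d.should_fire(t0 + Duration::from_millis(300))); // quiet window elapsed
}
```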

6.6 Subprocess isolation and zero-panic guarantees

All external processes — SCIP indexers, language servers, tree-sitter walker invocations — run as operating-system subprocesses with explicit lifetime bounds. When the ingest run ends, subprocesses are killed. A subprocess crash surfaces as a Result::Err in the Rust parent; it cannot propagate as a panic into the retrieval path.

The retrieval hot path — the MCP stdio loop that serves the agent — is written without unwrap() in library code. Every fallible operation returns Result. The retrieval path never spawns subprocesses, never performs I/O beyond reading from the local graph store, and holds no locks that an ingest path holds. Ingest and retrieval are independent execution domains that share the graph through a controlled write-snapshot protocol.
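The bounded-lifetime pattern can be sketched as follows, assuming a polling wait; the Unix commands `true` and `sleep` stand in for a real indexer or language-server invocation:

```rust
use std::process::{Command, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Run an ingest backend as an isolated subprocess with a hard lifetime
/// bound. A crash or hang surfaces as an Err in the parent; it cannot
/// propagate as a panic into the retrieval path.
fn run_bounded(cmd: &mut Command, limit: Duration) -> Result<(), String> {
    let mut child = cmd
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()
        .map_err(|e| format!("spawn failed: {e}"))?;

    let deadline = Instant::now() + limit;
    loop {
        match child.try_wait() {
            Ok(Some(status)) if status.success() => return Ok(()),
            Ok(Some(status)) => return Err(format!("backend exited with {status}")),
            Ok(None) if Instant::now() >= deadline => {
                let _ = child.kill(); // bounded lifetime: kill on timeout
                let _ = child.wait(); // reap the process
                return Err("backend exceeded ingest timeout".into());
            }
            Ok(None) => thread::sleep(Duration::from_millis(10)),
            Err(e) => return Err(format!("wait failed: {e}")),
        }
    }
}

fn main() {
    // A well-behaved backend finishes within the bound …
    assert!(run_bounded(&mut Command::new("true"), Duration::from_secs(5)).is_ok());
    // … and a hung backend is killed instead of wedging the ingest run.
    assert!(run_bounded(Command::new("sleep").arg("60"), Duration::from_millis(100)).is_err());
}
```

Returning `Result` here is the whole point: the parent observes backend failure as data, never as a panic.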

6.7 Block diagram

Figure 1 — Component block diagram.

A simple block diagram. Three rows. Top row: source tree on the left, tiered ingest pipeline (three boxes labelled Tier 1 SCIP, Tier 2 LSP, Tier 3 tree-sitter) in the middle, arrow to the right. Middle row: graph store as a single cylinder, in-memory working set above it. Bottom row: retrieval API as a box at right, MCP stdio as the transport on the far right, agent symbol on the right edge. No internal types, no parameters, no specific languages labelled against tiers. What this figure shows: how the three components connect. What it deliberately does not show: internal storage layout, specific parameter values, per-language tier assignments, or any detail that would enable implementation replication.

Engineering Properties

7.1 Measured latency

Retrieval P99 latency across the 16 corpora of LongMemCode is at or below 0.82 milliseconds in the worst case, and below 0.1 milliseconds for the majority of categories. Latency is dominated by the cost of edge traversal plus result serialisation; there is no component of the retrieval path that scales with repository size given a bounded result set. Figure 2 shows the full cumulative distribution function of per-query latency across the benchmark.

Figure 2 — Per-query latency CDF across LongMemCode.

A cumulative distribution function chart. X-axis: per-query latency in milliseconds, log scale. Y-axis: fraction of queries at or below that latency, from 0 to 1. A single curve representing the flat union of per-query timings across 16 corpora and all nine categories. Source data: LongMemCode run JSONL files at the submission commit. What this figure shows: the latency distribution has no long tail — the curve reaches the top within two orders of magnitude of the median. What it deliberately does not show: per-corpus breakdown, or any architectural attribution for why the tail is short.

7.2 Measured memory footprint

Memory footprint, measured as resident-set size during steady-state query serving, is in the low hundreds of megabytes for repositories of several hundred thousand symbols. Footprint scales approximately linearly with the number of stored nodes and edges, with a constant factor set by the serialisation format and the in-memory index structures.

Limits at extreme scale. The measurements in this paper cover repositories up to the scale of the largest corpora in LongMemCode (several hundred thousand symbols). We have not benchmarked repositories in the Linux-kernel or Chromium class (on the order of several million symbols). At that scale an all-in-memory graph would cross the tens-of-gigabytes threshold and become impractical on laptop-class hardware. The architecture anticipates this by leaving room for a tiered-storage layer: hot subgraphs remain in process memory, cold subgraphs spill to a local key-value store (SQLite, RocksDB, or LMDB are the obvious candidates). The retrieval API does not change — a cold-tier fetch becomes a hidden I/O inside a traversal step, with a latency tax that can be measured and reported per query class. We flag the tiered-storage extension here as a deliberate scope boundary rather than an oversight; every latency and footprint claim in the present paper is bounded to the measured scale.

7.3 Measured cost

Retrieval has no monetary cost per query. There is no LLM call, no embedding call, no external API call on the read path. The ingest cost is one-time per changed file: running the tier’s backend on the file, parsing its output, and inserting into the graph. Compilation or tree-sitter parsing cost is the dominant term.

Figure 3 — Cost per thousand retrieval queries, comparative.

A horizontal bar chart. Y-axis: systems (Neurogenesis / structural reference, plus placeholder bars for any other adapter present in LongMemCode at submission time). X-axis: cost in US dollars per 1 000 retrieval queries, log scale. Source data: Neurogenesis at $0 (measured, no LLM on read path); other systems inferred from their publicly documented pricing and the prompt tokens they inject per query (exact method described in the caption). What this figure shows: the architectural choice of zero-LLM retrieval produces an order-of-magnitude cost gap versus any system that injects retrieved content into an LLM prompt. What it deliberately does not show: internal explanation of how zero-LLM retrieval is achieved — that is the architecture itself.

7.4 Re-ingest cost

Re-ingest on a zero-diff source tree (no file content changes) completes in under five seconds for a large repository. Re-ingest after a three-hundred-file diff completes in a few seconds for compiler-grade-ingested languages and sub-second for tree-sitter-ingested languages. The cost is linear in the number of changed files.

7.5 Zero-panic property

The retrieval hot path has no unwrap() in library code; every fallible operation is threaded through Result. Ingest subprocesses cannot propagate panics into the retrieval path because they are separated by operating-system process boundaries. A malformed input file fails ingest for that file, logs a warning, and does not block the ingest run from completing or the retrieval server from serving previously-ingested queries.

Design-Space Alternatives

8.1 Vector-only storage

A vector-only memory engine would embed each code chunk and retrieve via similarity. This is the default paradigm in the LLM application layer. We reject it for Neurogenesis because the structural-query distribution [Jibleanu, 2026b] penalises it: vector retrieval cannot natively return empty sets for hallucinated identifiers, cannot enumerate overrides, cannot follow inheritance edges. A vector component is complementary to the graph — Neurogenesis can coexist with one — but it cannot replace the graph for structural workloads.

8.2 External graph database

An external graph database (Neo4j, FalkorDB, Kuzu, Neptune) is the path taken by Graphiti and Zep. We reject it because it violates Goal G2 (sub-millisecond P99 at laptop resource budget): network round-trip costs dominate graph-local traversal costs. In-process Rust storage gives us deterministic latency; database-backed storage does not.

8.3 LLM-in-the-loop on the retrieval path

Letta’s read path calls an LLM tool, as did MemGPT’s before it. We reject the pattern because it violates Goal G3 (zero monetary cost per query). A memory engine that charges per read scales its cost with agent usage; a memory engine that pre-computes its structure at ingest and serves from that structure does not.

8.4 Incremental indexer running against the source

Some industrial code-intelligence products operate an incremental indexer that continuously maintains an up-to-date index against the source tree. We reject the continuous path in favour of a content-hash pull model for two reasons: it simplifies operation (there is no daemon to monitor), and it makes the cost of re-ingest explicitly attributable rather than amortised into background CPU use.

8.5 Full tree-sitter everywhere

A simpler design would use tree-sitter for every language rather than a tiered pipeline. We reject it because tree-sitter produces surface syntax and does not perform cross-file symbol resolution. The head of the language distribution — where most real code is written — has compiler-grade indexing available, and using it produces substantially richer graphs. The tiered approach pays extra engineering cost upfront to hit the richer indexing when available.

Limitations

9.1 Tier coverage is uneven

Tier 1 (compiler-grade SCIP) covers a subset of the languages we target. Tier 2 (live LSP) covers languages with mature language servers but no SCIP indexer. Tier 3 (tree-sitter walkers) covers the remainder. The structural richness of the resulting graph is correspondingly uneven: a refactor-audit query on a Tier 1 language is backed by cross-file type resolution; the same query on a Tier 3 language is backed by syntactic inference with known gaps. Users working primarily in Tier 3 languages will see a larger residual gap between Neurogenesis and a hypothetical perfect indexer than users working in Tier 1 languages.

9.2 Ingest is not instant

The content-hash skip makes re-ingest O(changed files), but the first-time ingest of a repository pays the full cost of running every file’s tier backend. On large repositories, first-time ingest can take minutes. We consider this acceptable — it amortises across sessions — but we name it explicitly.

9.3 Tier 2 inherits language-server variance

Live LSP ingest is gated by language-server quality. Language servers are notorious for memory leaks, crashes, and workspace-load latency variance. The subprocess-isolation and timeout model [Section 6.6] bounds the blast radius, but does not eliminate it: an ingest run against an uncooperative language server takes longer or fails on that language specifically, without affecting retrieval availability for already-ingested data.

9.4 Semantic queries require additional infrastructure

As argued in companion work [Jibleanu, 2026b], structural and semantic queries are distinct. Neurogenesis in its currently described form handles structural queries. A complete production memory layer for coding agents benefits from a companion semantic-retrieval component, which can share the same ingest pass but uses an embedding index alongside the graph. We do not describe such a companion in this paper.

9.5 Team sync is unimplemented

Neurogenesis is local-first by design: ingest, storage, and retrieval all happen in-process. Multi-user team memory with shared indices and synchronisation across user accounts is not implemented. Enterprises with these requirements today should use conversational-memory products with team-sync support; we discuss this gap as future work.

We conclude by situating Neurogenesis against adjacent systems along the four design goals. Table 1 summarises the comparison.

Table 1 — Neurogenesis versus adjacent systems along the four design goals.

| System | G1 Structural correctness | G2 Sub-ms P99 | G3 Zero-cost read | G4 O(diff) re-ingest |
|---|---|---|---|---|
| Neurogenesis | Yes (tiered ingest) | Yes | Yes | Yes (content-hash) |
| Graphiti / Zep | Partial (entity extraction via LLM) | No (external DB) | Yes (free traversal) | No |
| Mem0 | No (semantic only) | Yes | Partial | No |
| Letta | No (agent-driven text memory) | No (LLM in read) | No (LLM in read) | No |
| Cursor Memories | No (prompt injection) | N/A (not a retrieval query) | No (prompt tokens) | No |
| Continue @codebase | Partial (tree-sitter chunks) | No (embedding lookup) | No (prompt tokens) | No |
| Aider repo map | Partial (surface names + PageRank) | N/A (re-computed per request) | No (1 000 tokens / req) | N/A (stateless) |

Conclusion and Future Work

We have described Neurogenesis, a graph-first code-memory engine designed for AI coding agents on inner-loop workloads. The engine satisfies four design goals simultaneously — structural correctness at compiler-grade depth, sub-millisecond P99 retrieval, zero monetary cost per read, and O(changed files) re-ingest — by choosing a tiered ingest pipeline, in-process Rust graph storage, and a retrieval API that exposes structural graph primitives directly. Measured latency on the LongMemCode benchmark is sub-millisecond P99 across all corpora tested; memory footprint is bounded by the graph size and fits in laptop resource budgets for realistic repositories.

Future work falls into three branches. First, expanding tier coverage, particularly Tier 1 (SCIP) support for additional languages as upstream indexers mature. Second, companion semantic-retrieval infrastructure that shares ingest with the graph and addresses the non-structural portion of the coding-agent query distribution. Third, team-sync and multi-user deployment patterns that preserve the local-first operational model while allowing shared indices for collaborative workflows.

References

@misc{aider2023repomap,
  title={Repository Map: Scaling to Large Codebases with Tree-sitter and PageRank},
  author={{Aider Team}},
  year={2023},
  url={https://aider.chat/2023/10/22/repomap.html}
}

@misc{brunsfeld2018treesitter,
  title={Tree-sitter: An Incremental Parsing System for Programming Tools},
  author={Brunsfeld, Max},
  year={2018},
  url={https://tree-sitter.github.io/tree-sitter/}
}

@misc{chhikara2024mem0,
  title={Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory},
  author={Chhikara, Prateek and others},
  year={2025},
  note={arXiv preprint arXiv:2504.19413}
}

@misc{continue2025codebase,
  title={@codebase Retrieval Architecture},
  author={{Continue Dev Team}},
  year={2025},
  url={https://docs.continue.dev/customize/deep-dives/codebase}
}

@misc{jibleanu2026longmemcode,
  title={LongMemCode: A Deterministic Benchmark for Code-Memory in AI Agents},
  author={Jibleanu, Aurelian},
  year={2026},
  note={Companion paper and MIT-licensed benchmark repository}
}

@misc{jibleanu2026taxonomy,
  title={Structural vs Semantic Retrieval in Code-Memory: A Query-Type Taxonomy},
  author={Jibleanu, Aurelian},
  year={2026},
  note={Companion paper}
}

@misc{microsoft2016lsp,
  title={Language Server Protocol Specification},
  author={{Microsoft}},
  year={2016},
  url={https://microsoft.github.io/language-server-protocol/}
}

@misc{packer2023memgpt,
  title={MemGPT: Towards LLMs as Operating Systems},
  author={Packer, Charles and others},
  year={2023},
  note={arXiv preprint arXiv:2310.08560}
}

@misc{rasmy2025zep,
  title={Zep: A Temporal Knowledge Graph Architecture for Agent Memory},
  author={Rasmy, Preston and others},
  year={2025},
  note={arXiv preprint arXiv:2501.13956}
}

@misc{sourcegraph2023scip,
  title={SCIP: The Source Code Intelligence Protocol},
  author={{Sourcegraph}},
  year={2023},
  url={https://github.com/sourcegraph/scip}
}

Appendices

13.1 Appendix A — Protocol: ingest backend abstraction

Pseudocode interface for the abstract ingest backend (not the Rust trait definition — a simplified pseudocode that conveys the shape without disclosing the trait’s internals). One page. Signatures only. No implementation bodies.
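As a hedged illustration of the shape such an interface might take (Rust-flavoured, with hypothetical type and method names — not the engine's actual trait definition):

```rust
use std::path::Path;

/// Opaque fragment of nodes and edges produced by one backend run.
/// Hypothetical type for illustration.
#[derive(Default)]
struct GraphFragment {
    node_count: usize,
}

/// Backend failure: crash, timeout, or malformed output.
#[derive(Debug)]
struct IngestError(String);

/// Abstract ingest backend, one implementation per tier.
/// Illustrative signatures only.
trait IngestBackend {
    /// Language identifiers this backend can index.
    fn languages(&self) -> Vec<&'static str>;

    /// Index `files` under `root`, yielding a fragment to merge into
    /// the canonical-identifier graph.
    fn ingest(&self, root: &Path, files: &[&Path]) -> Result<GraphFragment, IngestError>;
}

/// Stub Tier 3 backend showing how the trait would be satisfied.
struct TreeSitterBackend;

impl IngestBackend for TreeSitterBackend {
    fn languages(&self) -> Vec<&'static str> {
        vec!["lua", "zig"]
    }

    fn ingest(&self, _root: &Path, files: &[&Path]) -> Result<GraphFragment, IngestError> {
        // A real walker would parse each file; the stub just counts them.
        Ok(GraphFragment { node_count: files.len() })
    }
}

fn main() {
    let backend = TreeSitterBackend;
    assert!(backend.languages().contains(&"lua"));
    let frag = backend.ingest(Path::new("."), &[Path::new("a.lua")]).unwrap();
    assert_eq!(frag.node_count, 1);
}
```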

13.2 Appendix B — Protocol: retrieval API surface

The MCP-exposed retrieval operations, with expected input and output shapes. This is already public in the MCP schema we ship, so it is safe to reproduce here. One page.
