
Introduction

TL;DR

  • Deterministic protocol and architecture for reasoning agents.
  • Fixed, taxonomized graph + bounded vector acceleration.
  • Strong focus on long-term memory that can improve without breaking reliability.
  • Online: fast, deterministic inference on an immutable snapshot.
  • Offline: governed consolidation (memory) and structural evolution (graph), then publish.
  • Target: enterprise agents that must remain reliable as they learn.

The problem we are solving

Agents that rely solely on LLM generation and/or vector-only RAG typically fail in four ways:

  • non-governable decisions (implicit logic, hard to audit),
  • unstable memory (redundant or contradictory facts),
  • structure drift over time (behavior changes without controlled rollout),
  • false quality signals driven by user silence.

The main risk is building a system that "often works" but is not reliable, not auditable, and not safe to evolve.

What ARG is (in plain terms)

This project defines an architecture and protocol for agents with a strong focus on:

  • long-term memory,
  • adaptive, evolving memory (able to refine and consolidate over time),
  • robust decision-making for automation agents.

The goal is not to build yet another enhanced RAG pipeline, but to provide a coherent protocol that allows agents to reason, act, remember AND evolve without sacrificing reliability, governance, or auditability.

A few terms used throughout this document:

  • Ontology: the operational “things that exist” in your agent world (concepts, actions, constraints) and how they are represented.
  • Taxonomy: a hierarchical classification (clusters/labels) used to constrain routing and keep decisions coherent.
  • Graph: a network of nodes (terminal units of info/decision/action) connected by edges (explicit relationships that define allowed movement).

Core idea

We combine two complementary strengths:

  • the stability and predictable scalability of explicit graph structure,
  • with the flexibility and speed of vector retrieval.

In this design:

  • taxonomy + graph provide durable structure and deterministic navigation,
  • vector search accelerates detection, classification, and local retrieval.

Crucially, vector systems are treated as approximators, not as truth-makers.

During online execution, the reasoning structure is treated as fixed:
no nodes, labels, or edges are created or modified in the hot path.
Adaptation happens offline: episodic writes are consolidated, then governed changes are published into a new immutable snapshot.
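The split above can be sketched in a few lines of Python. This is an illustrative sketch only: names like `GraphSnapshot` and `EpisodicLog` are hypothetical, not part of the ARG specification; the point is that the online structure is frozen while episodic writes are append-only.

```python
from dataclasses import dataclass
from types import MappingProxyType

@dataclass(frozen=True)
class GraphSnapshot:
    """Immutable, versioned view of the reasoning graph used online."""
    version: str
    nodes: MappingProxyType      # node_id -> payload (read-only)
    edges: MappingProxyType      # node_id -> tuple of allowed successors

    @staticmethod
    def publish(version, nodes, edges):
        # Offline step: freeze curated structure into a new snapshot.
        return GraphSnapshot(
            version,
            MappingProxyType(dict(nodes)),
            MappingProxyType({k: tuple(v) for k, v in edges.items()}),
        )

class EpisodicLog:
    """Append-only episodic writes emitted during online execution."""
    def __init__(self):
        self._events = []

    def append(self, event: dict):
        self._events.append(dict(event))   # no update/delete in the hot path

    def drain(self):
        # Consumed offline by consolidation; online code never mutates history.
        events, self._events = self._events, []
        return events

snap = GraphSnapshot.publish("v1", {"greet": {"kind": "action"}},
                             {"greet": ["escalate"]})
log = EpisodicLog()
log.append({"node": "greet", "outcome": "ok"})
```

Publishing a new version produces a fresh snapshot object; the old one is never edited in place, which is what makes online behavior reproducible per version.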

ARG: Online protocol (immutable snapshot) and offline evolution (consolidation + publish)

Figure 1 – Global overview: online runtime uses an immutable snapshot (read-only) and emits append-only episodic writes; offline consolidates and curates, then publishes a new snapshot back to online.

Scope

This protocol covers:

  1. An operational ontology

    • structured through a taxonomy (clusters/labels),
    • and explicit relationships (edges),
    • that constrain navigation and decision-making.
  2. A “leaf-oriented” reasoning graph

    • where nodes are terminal units of information, decision or action,
    • and edges form the grammar of movement between these units.
  3. A two-speed memory system

    • online episodic memory (capturing events and weak signals),
    • offline semantic consolidation (controlled promotion into stable knowledge).
  4. Fast classification and routing

    • driven by lexical and vector signals,
    • but strictly bounded by the taxonomy.

This framework targets enterprise-grade agents that require traceability and safe evolution across support, operations, compliance, and workflow automation.
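Point 4 above (fast classification bounded by the taxonomy) can be sketched as follows. The taxonomy, labels, and scorer are placeholders: in practice the scores would come from lexical and vector signals, but the filter against the allowed label space is deterministic either way.

```python
# Hypothetical two-level taxonomy; real deployments define their own.
TAXONOMY = {
    "billing": {"billing.refund", "billing.invoice"},
    "support": {"support.bug", "support.howto"},
}
ALLOWED = set().union(*TAXONOMY.values())

def route(proposals: dict, threshold: float = 0.5) -> list:
    """Keep only in-taxonomy labels above threshold, in deterministic order."""
    kept = [(label, score) for label, score in proposals.items()
            if label in ALLOWED and score >= threshold]
    # Deterministic tie-break: score descending, then label ascending.
    kept.sort(key=lambda ls: (-ls[1], ls[0]))
    return [label for label, _ in kept]
```

An out-of-taxonomy label is dropped no matter how high its score, which is the "strictly bounded by the taxonomy" guarantee.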

Why not rely on a vector database alone?

Vector retrieval is excellent for fast semantic matching, but it is not a safe long-term substitute for structure.

As the number of indexed items grows inside a shared latent space:

  • semantic neighborhoods become noisier,
  • disambiguation becomes harder,
  • the system accumulates near-duplicates,
  • and memory quality degrades unless strong constraints exist.

Therefore, the architecture intentionally avoids infinite, unconstrained vector growth.
Instead, it uses:

  • a bounded taxonomy-aware label space enforced by the Context Weaver,
  • deterministic validation of allowed labels and paths,
  • and offline lifecycle rules for structural evolution.
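One way to picture bounded vector growth is a per-label index with a capacity cap and a near-duplicate radius. The thresholds and rejection codes below are assumptions for illustration, not ARG's actual lifecycle rules.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class BoundedIndex:
    """Per-label vector store that refuses unconstrained growth."""
    def __init__(self, capacity_per_label=100, dup_threshold=0.98):
        self.capacity = capacity_per_label
        self.dup = dup_threshold
        self.items = {}   # label -> list of vectors

    def add(self, label, vec):
        bucket = self.items.setdefault(label, [])
        if len(bucket) >= self.capacity:
            return "REJECT_CAPACITY"      # growth must go through offline rules
        if any(cosine(vec, v) >= self.dup for v in bucket):
            return "REJECT_DUPLICATE"     # near-duplicates degrade neighborhoods
        bucket.append(vec)
        return "ACCEPT"
```

Rejections would surface as episodic signals for offline curation rather than silently inflating the latent space.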

Why not rely on a knowledge graph alone?

A knowledge graph provides durable structure, explicit relationships and strong auditability.
However, a graph-only approach is often too slow and too rigid for real-world agent routing and intent capture at scale.

Typical limitations of a graph-only stack include:

  • slower or more expensive early-stage intent detection without a semantic shortcut layer,
  • higher manual modeling cost to cover long-tail phrasing and evolving user language,
  • weaker performance on fuzzy matching when the taxonomy is still young,
  • more friction for cold-start scenarios.

This is why the architecture combines explicit structure with a bounded, taxonomy-aware semantic accelerator (see Context Weaver). The vector layer remains an approximator, not a structure-definer, and its growth is governed by taxonomy constraints and offline evolution rules.

Design principles

This project is built on a few non-negotiable principles:

  • Adaptability
    The environment evolves; the agent must keep improving as reality changes.

  • Safety & reliability
    The system evolves but never “at random”: changes are governed, reviewable, and aligned with a stable direction.

  • Reasoning grounded in a Knowledge Core
    ARG defines a stable core of knowledge that reflects current values and the current environment, so the agent can create real value through actions and information.

  • Relevance & granularity
    Better classification increases precision over time. Memory is curated: the system saves only what is relevant, rather than accumulating everything.

  • Memory is a controlled system, not a dump
    Online writes are conservative and append-only; semantic promotion happens offline under governance.

  • Evolution is staged and versioned
    Changes follow lifecycle rules (ACTIVE → DEPRECATED → REMOVED), with redirects/aliases to protect both reasoning and memory.
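The staged lifecycle can be sketched as a small state machine with redirects. The transition table (including whether a deprecation can be reverted) is an assumption here; the key property is that old identifiers keep resolving to something valid.

```python
# Assumed transition table; ARG's actual rules may differ.
VALID_TRANSITIONS = {
    ("ACTIVE", "DEPRECATED"),
    ("DEPRECATED", "REMOVED"),
    ("DEPRECATED", "ACTIVE"),   # assumed: a deprecation can be reverted
}

class Lifecycle:
    def __init__(self):
        self.state = {}      # node_id -> lifecycle state
        self.redirects = {}  # old node_id -> replacement node_id

    def create(self, node_id):
        self.state[node_id] = "ACTIVE"

    def transition(self, node_id, new_state, redirect_to=None):
        old = self.state[node_id]
        if (old, new_state) not in VALID_TRANSITIONS:
            raise ValueError(f"illegal transition {old} -> {new_state}")
        self.state[node_id] = new_state
        if redirect_to is not None:
            self.redirects[node_id] = redirect_to

    def resolve(self, node_id):
        # Follow redirects so old memories and edges still land on a live node.
        seen = set()
        while node_id in self.redirects and node_id not in seen:
            seen.add(node_id)
            node_id = self.redirects[node_id]
        return node_id
```

Because removal is only reachable through deprecation, references always have a window in which redirects can be installed before a node disappears.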

What you will find here

  • The ARG protocol (online inference + offline refinement).
  • The Context Weaver architecture for taxonomy-safe classification under tight latency budgets.
  • The Policy Manager kernel for pre/post governance.
  • A structured approach to memory write, deduplication and consolidation.
  • Guides for ontology construction, auditing, and safe lifecycle evolution.

If you are building agents that must remain reliable as domains, products, and user behavior evolve, this framework is designed to offer a practical middle path between:

  • purely probabilistic LLM reasoning,
  • and fully hand-engineered decision trees.

It aims to deliver fast online behavior with deterministic structure and safe long-term learning.

ARG Action space (agent execution)

In the rest of this document we will present ARG through three primary agent use cases: Action, Retrieval and Long-term Memory.

This figure is the reference layout for Action-capable agents in ARG, from policy pre-check to taxonomy-safe action routing, to episodic ActionWrites that drive offline evolution. In practice, product agents map their concrete tools / workflows onto Action nodes (or cold-start domain agents) and let ARG handle routing, safety, and logging. For implementation details and synchronization patterns between an agent runtime and this Action space, see Using ARG for action agents.

ARG: Action space & offline graph evolution

Figure 2 – Online Action path (structured vs cold-start domain actions) and offline consolidation into the ARG Core.

Figure 2 illustrates the Action path: an online, request-time agent loop where an incoming request is validated by the Policy Manager, classified by the Context Weaver into an in-domain ActionType and label set, then routed either to a structured Action node or to a cold-start domain agent when no dedicated node exists. In both cases, outcomes are recorded as episodic ActionWrites, which later feed the offline consolidation and graph-evolution loop that creates or refines Action nodes in the ARG Core.
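The Action path in Figure 2 can be condensed into a short sketch. The policy check, classifier, and node names below are placeholders standing in for the Policy Manager and Context Weaver.

```python
# Structured Action nodes assumed to exist in the current snapshot.
ACTION_NODES = {"refund_order"}

def handle_action(request: dict, action_writes: list) -> str:
    # 1. Policy Manager pre-check (stand-in rule).
    if request.get("blocked"):
        return "POLICY_REJECT"
    # 2. Context Weaver: classify into an in-domain ActionType (stubbed).
    action_type = request["intent"]
    # 3. Route to a structured Action node, or fall back to the
    #    cold-start domain agent when no dedicated node exists.
    route = action_type if action_type in ACTION_NODES else "cold_start_agent"
    # 4. Record the outcome as an episodic ActionWrite for offline evolution.
    action_writes.append({"type": action_type, "route": route})
    return route
```

Repeated cold-start hits for the same ActionType are exactly the episodic signal that offline consolidation uses to propose a new structured Action node.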

ARG Retrieval space (context for reasoning)

Figure 3 illustrates the Retrieval space in ARG-Info. The same online/offline split applies: online, the graph is fixed and retrieval is strictly read-only; offline, episodic reads and external bundles are consolidated into long-term structure. At request time, the Context Weaver classifies the query into three retrieval modes: user memory (facts already saved in the graph), domain / project knowledge (business- and platform-specific information), and external / fresh knowledge (news, new laws, new papers, etc.). User and domain retrieval are served from graph-based Info nodes, while external retrieval is routed via typed ExternalSource / LiveConnector nodes that describe how to access live systems without putting those facts directly into the graph. All three streams are merged into a bounded, taxonomy-aligned context for the LLM, and only governed, episodic writes can later feed offline consolidation. For concrete patterns on wiring agents to this Retrieval space, see Using ARG for retrieval agents.

ARG: Retrieval space & offline knowledge evolution

Figure 3 – Online Retrieval space (user, domain, external) and offline consolidation into long-term knowledge in the ARG Core.
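The three-stream merge can be sketched as follows. The in-memory stores, the connector stub, and the item budget are all illustrative; what matters is that external results flow into the context without being written into the graph.

```python
# Stand-ins for graph-based Info nodes (read-only at request time).
USER_NODES = {"u1": "prefers email contact"}
DOMAIN_NODES = {"d1": "refunds allowed within 30 days"}

def fetch_external(query):
    # Typed connector stand-in: facts are fetched live, never stored
    # in the graph during the online phase.
    return [f"fresh result for {query!r}"]

def retrieve(query, modes, max_items=3):
    streams = []
    if "USER" in modes:
        streams += list(USER_NODES.values())
    if "DOMAIN" in modes:
        streams += list(DOMAIN_NODES.values())
    if "EXTERNAL" in modes:
        streams += fetch_external(query)
    return streams[:max_items]   # bounded context handed to the LLM
```

The truncation at `max_items` is a crude stand-in for the bounded, taxonomy-aligned context assembly described above.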

Effective retrieval in the Retrieval space (high level)

Figure 4 summarizes how retrieval behaves at request time inside the Retrieval space. After policy gating, the Context Weaver selects a retrieval mode (USER / DOMAIN / EXTERNAL) and returns a taxonomy-coherent label set L_final. USER and DOMAIN retrieval are served from graph-based Info nodes (read-only at request time), while EXTERNAL retrieval is routed through typed ExternalSource / LiveConnector nodes under strict policy limits (freshness, budget, allowed sources). The resulting internal and optional external streams are merged into a bounded, taxonomy-aligned context for the LLM, and the final answer is post-checked by policy. For concrete implementation patterns and agent wiring, see Using ARG for retrieval agents.

ARG: Effective retrieval process

Figure 4 – High-level effective retrieval: policy-gated routing across user, domain, and external streams, merged into a bounded context for reasoning.
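The policy limits on EXTERNAL retrieval (freshness, budget, allowed sources) reduce to a simple conjunctive gate. The concrete limit values and field names below are assumptions for illustration.

```python
# Hypothetical policy limits for EXTERNAL retrieval.
POLICY = {
    "allowed_sources": {"news_api", "law_feed"},
    "max_external_calls": 2,
    "max_age_seconds": 3600,
}

def gate_external(source: str, age_seconds: int, calls_made: int) -> bool:
    """Allow an external call only if every policy limit is satisfied."""
    return (source in POLICY["allowed_sources"]
            and calls_made < POLICY["max_external_calls"]
            and age_seconds <= POLICY["max_age_seconds"])
```

Because the gate is a pure predicate over explicit limits, every allow/deny decision is auditable after the fact.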

Short-term and long-term memory (high level)

Figure 5 summarizes how memory behaves in ARG across the same online/offline split. Online, the system may maintain an ephemeral Working Set (IDs, bundles, L_final/typologies, cache, structured summary) to stay consistent even when a conversation exceeds the LLM context window. If a write-worthy delta is detected, a Policy-governed Memory Guard may allow an episodic Unit Memory Write (node + refined chunk with provenance and confidence). External results are logged as an External Bundle Log for audit, with no promotion online. Offline, episodic writes and external logs feed consolidation (deduplication, contradiction handling), controlled promotion to stable memory (SHADOW → ACTIVE), and versioning/migration (redirects, split mappings).

For concrete patterns and implementation guidance, see Using ARG for memory agents.

ARG: Short-term + Long-term Memory (High-level flow)

Figure 5 – Short-term + long-term memory: ephemeral working set and governed episodic writes online, consolidation and promotion offline.
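The Memory Guard and the SHADOW → ACTIVE promotion can be sketched together. `MemoryGuard`, the confidence floors, and the record fields are illustrative names, not ARG's exact schema; the sketch shows the write-worthy-delta check online and promotion happening only offline.

```python
class MemoryGuard:
    """Policy-governed gate for episodic Unit Memory Writes."""
    def __init__(self, min_confidence=0.7):
        self.min_confidence = min_confidence
        self.episodic = []   # append-only online

    def maybe_write(self, fact: str, known: set, confidence: float, source: str):
        # Write-worthy delta: a new fact above the confidence floor.
        if fact in known or confidence < self.min_confidence:
            return False
        self.episodic.append({
            "fact": fact, "confidence": confidence,
            "provenance": source,
            "state": "SHADOW",   # promotion to ACTIVE happens offline only
        })
        return True

def promote(episodic, floor=0.9):
    # Offline consolidation stand-in: only high-confidence SHADOW entries
    # are promoted to ACTIVE stable memory.
    for entry in episodic:
        if entry["state"] == "SHADOW" and entry["confidence"] >= floor:
            entry["state"] = "ACTIVE"
    return episodic
```

Deduplication and contradiction handling would run in the same offline pass, before promotion, against the consolidated episodic log.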

Released under the Apache License 2.0