Using ARG for retrieval

ARG: Retrieval space & offline knowledge evolution

0 Objective of this guide

Audience

This guide is written for:

product builders designing agents that must fetch reliable context, not just generate text
platform and infrastructure teams wiring agents into enterprise knowledge sources
agent engineers who need predictable, auditable retrieval behavior

It assumes you already understand basic agent concepts and want a production-safe way to let agents retrieve and ground information beyond the LLM’s parameters.

What you are building

You are building an agent that knows how to ask for the right information before answering.

Retrieval is not free-form.
It is executed through a structured Retrieval Space governed by:

policy constraints,
taxonomy validity,
deterministic routing rules.

Online behavior is read-only with respect to structure:

the agent retrieves from a fixed graph and typed connectors,
it never creates nodes, edges, labels, or clusters at request time,
external information is pulled through explicit ExternalSource / LiveConnector nodes,
external facts never become graph structure online.

Policy always comes first.
Taxonomy coherence is enforced before any retrieval decision.

This design is what allows ARG-based retrieval agents to ground their answers without corrupting long-term structure.

What you will be able to do by the end

After reading this guide you will know how to:

map user memory and domain/project knowledge to explicit Info nodes in the graph
route incoming queries into three retrieval modes:
- user memory already in the graph
- domain / project / platform knowledge
- external / fresh knowledge (news, laws, new papers, etc.)
use typed ExternalSource / LiveConnector nodes to access live systems safely
emit governed MemoryWrites and external bundles that can be reused by the offline evolution loop

The goal is not just retrieval.
The goal is retrieval that remains bounded, auditable, and safe to evolve.

Out of scope

This guide does not re-explain the full protocol.
Those details already exist and are referenced when needed.

Specifically out of scope:

the full inference and traversal mechanics
- see ARG Core
the internal design of taxonomy arbitration
- see Context Weaver
the complete governance kernel
- see Policy Manager

It also does not cover deep design of general domain taxonomies outside the retrieval use case.

1 Essential concepts

1.1 Retrieval agents in ARG

ARG keeps the same hard separation between what happens online and what evolves offline.

Online hot path:

the reasoning graph is treated as fixed during a request
no nodes, edges, labels, or clusters are created or modified
the agent may:
- read from Info nodes,
- call ExternalSource / LiveConnector nodes,
- write episodic logs and external bundles
this guarantees low latency, determinism, and auditability

This online contract is defined in the ARG Core online fixed graph contract.

Offline evolution loop:

episodic reads and external bundles are analyzed after the fact
new Info nodes, labels, edges, or connector refinements are proposed
changes are applied only through lifecycle rules and versioning

Offline refinement mechanics are defined in the ARG Core Offline Loop.

1.2 Retrieval primitives

To reason about retrieval safely, ARG uses explicit primitives.

RetrievalType

a top-level retrieval family:
- USER: user-level memory in the graph
- DOMAIN: business / platform / project knowledge
- EXTERNAL: fresh knowledge beyond the LLM’s parameters
selected early to separate retrieval behavior paths

Labels

taxonomy-coherent labels describing the information need more precisely
produced and validated by the Context Weaver
labels must always respect taxonomy structure
enforcement is handled by the Context Weaver validator (see OW 5 Validator)

Info node

a structured, retrievable unit in the graph
binds a domain concept to normalized content:
- documentation snippets
- process descriptions
- user memory facts
- domain configurations (summarized)
exposes a read contract for retrieval behavior
routing and traversal rules are defined in:
- ARG Core Step 7 Traversal
- ARG Core Step 9 Info

ExternalSource / LiveConnector node

used only for EXTERNAL retrieval
describes how to access live or out-of-graph knowledge:
- web search
- APIs
- regulatory databases
- research feeds
never stores external facts as graph content
always constrained by policy rules from the Policy Manager

External context bundle

an ephemeral grounding object built from external retrieval:
- { source, timestamp, excerpt, score, url/id }
used only in the LLM context window
may be logged as an episode or audit artifact
cannot create nodes, labels, or edges online

ARG MemoryWrite (Info / Retrieval)

an episodic record emitted after a retrieval-heavy interaction
captures what was retrieved, from where, under which constraints
may include user-level or domain-level facts (when allowed)
never mutates the active graph online
later consumed by the offline consolidation loop:
- ARG Core Offline Loop

1.3 Vectors are approximators (for retrieval)

In ARG, vectors are used to make retrieval fast and scalable.
They are never allowed to define truth, authorization, or structure.

Vectors help with:

early classification during retrieval-type detection
- see ARG Core Step 2 Initial Classification
shortlisting candidate labels or Info nodes
local retrieval of relevant Info chunks once routing is constrained
ranking candidate external results inside an ExternalSource connector

Vectors never decide:

whether a retrieval is allowed (policy decides)
whether a fact becomes part of the graph structure
taxonomy validity or label legality
connector configuration or evolution

The vector layer is an accelerator.
It narrows the space of choice.
It never defines structure or policy.

Boundaries of vector usage inside taxonomy arbitration are defined in the Context Weaver guide.

Policy gates that override any vector signal are defined in:

Policy Manager Pre-Check

Concrete examples

Bad pattern:
An embedding is close to “latest EU AI Act”, so the agent answers from its own parameters.

What is missing:

explicit external retrieval decision
- see ARG Core Step 3 Context Weaver
binding to an ExternalSource / LiveConnector node
- see ARG Core Step 9 Info
policy constraints on jurisdiction, freshness, and sources
- see Policy inputs and outputs

Bad pattern:
An embedding is close to a config doc, so the agent assumes the config and never checks the live system.

What is missing:

retrieval from current domain Info nodes and/or runtime state
policy awareness for production-critical domains
explicit separation between “reference doc” and “current live value”

Correct pattern:

embeddings shortlist candidate labels or Info / ExternalSource nodes only
- see ARG Core Step 2 Initial Classification
Policy Manager gates whether external or sensitive retrieval is allowed
- see ARG Core Step 1 Policy Pre-Check
Context Weaver produces a taxonomy-coherent RetrievalType and labels
- see ARG Core Step 3 Context Weaver
landing point selection and scoring apply hard gates and thresholds
- see ARG Core Step 4 Landing Point
internal and external retrieval run under constraints and logging
- see ARG Core Step 9 Info
only governed MemoryWrites and offline loops can change long-term structure
- see ARG Core Offline Loop

2 Retrieval pipeline overview

This section summarizes how an information request flows through ARG.
Each step corresponds to a specific contract in the protocol.

2.1 Online request-time loop

1 Policy pre-check
The request is validated for scope, safety, and permissions before any retrieval decision.
This step can block, refocus, or constrain the query.
See ARG Core Step 1 Policy Pre-Check.

2 Context Weaver – retrieval classification
The request is classified into:

a RetrievalType: USER / DOMAIN / EXTERNAL
a taxonomy-coherent label set L_final

Invalid or incoherent labels are rejected.
See ARG Core Step 3 Context Weaver and Context Weaver (online pipeline).

3 Binding (choose the retrieval subspace)
Based on RetrievalType, validated labels, and constraints, the system binds retrieval to:

UserMemory Info nodes (user scope), or
Domain / Project Info nodes (business scope), or
a typed ExternalSource / LiveConnector node (external scope), or
a clarify / abstain path when retrieval would be unsafe.

For USER/DOMAIN, binding first restricts the search space with a set intersection over taxonomy attachments:

N_eligible = N_visible ∩ N_tax(L_final)

where N_visible is the policy/tenant-visible node set and N_tax(L_final) is the node set attached to the taxonomy region implied by L_final.

This keeps retrieval fast and taxonomy-aligned before any optional graph heuristics are applied.
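The restriction above is a plain set intersection. A minimal sketch, assuming node ids are strings and that both input sets were already computed by the policy layer and the taxonomy index; the function and variable names are illustrative and not part of the ARG protocol:

```python
def eligible_nodes(visible_nodes: set, taxonomy_nodes: set) -> set:
    """Nodes that are policy/tenant-visible AND attached to the
    taxonomy region implied by L_final (hypothetical helper)."""
    return visible_nodes & taxonomy_nodes


# Illustrative node ids only.
visible = {"info:billing:refunds", "info:billing:invoices", "info:hr:leave"}
in_region = {"info:billing:refunds", "info:billing:invoices", "info:billing:tax"}

eligible = eligible_nodes(visible, in_region)
print(sorted(eligible))
```

Any later scoring or traversal then only ever sees `eligible`, which is what keeps the optional graph heuristics from leaking outside policy or taxonomy bounds.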

4 Effective retrieval (under constraints)
Internal retrieval (USER / DOMAIN):

landing point selection within the eligible node set
neighbor scoring + bounded traversal (typically 2–3 hops)
stop + Info selection (bounded chunk extraction)

External retrieval (EXTERNAL):

executed only via declared connector tools
enforced policy limits:
- freshness window
- locale, tenant, and jurisdiction
- allowlists / blocklists
- max queries and max context size

5 Merge retrieval contexts
Retrieved context is merged into a bounded, taxonomy-aligned bundle:

user memory
domain / project Info
optional external bundle

This is the context window passed to the LLM.
No graph structure is modified in this step.
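A bounded merge can be sketched as below. The priority order (user, then domain, then external) and the character budget are assumptions made for illustration; the guide only requires that the merged bundle is bounded and that nothing is written to the graph.

```python
def merge_contexts(user, domain, external=None, max_chars=2000):
    """Concatenate chunks in a fixed priority order until the budget is spent.

    Hypothetical helper: inputs are lists of text chunks, output is a list
    of (source, text) pairs. Inputs are never mutated.
    """
    merged, used = [], 0
    ordered = (("user", user), ("domain", domain), ("external", external or []))
    for source, chunks in ordered:
        for text in chunks:
            if used + len(text) > max_chars:
                return merged  # budget exhausted: stop, never cut a chunk in half
            merged.append((source, text))
            used += len(text)
    return merged


bundle = merge_contexts(["a" * 10], ["b" * 10], ["c" * 10], max_chars=25)
print([source for source, _ in bundle])
```

Stopping at the first chunk that does not fit keeps the result deterministic for a given input order.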

Figure 3 – ARG: Effective retrieval process

6 ARG-Info answer (LLM)
The LLM produces an answer grounded in the merged context.
Policy post-check may filter or redact the final answer.
7 Optional MemoryWrite (Step 10)
The interaction may produce:

episodic user-level or domain-level facts
an external context bundle log for audit

Online rules:

only episodic writes are allowed
no semantic promotion of external facts into nodes or labels

2.2 Offline consolidation and knowledge evolution loop

Offline processing improves the Retrieval Space without affecting online determinism.

1 Aggregate episodes and external bundles
MemoryWrites and external bundles are clustered to detect:

missing Info nodes
under-modeled taxonomic regions
unstable or noisy external connectors

Aggregation inputs are defined in:
ARG Core Offline Loop and ARG Core Offline Collection.

2 Taxonomy-guided candidates
Proposed nodes, labels, edges, and connector refinements are checked for taxonomy plausibility.
The Weaver qualifies but does not decide structure.
3 Scoring, policy, and risk checks
Replay-based scoring separates strong from weak evidence:

stable recurring queries
high-value external sources
noisy patterns that should not be promoted

Policy and security constraints gate publishing.
4 Publish to the ARG Core
Approved changes are:

versioned and released
applied via lifecycle rules (ACTIVE → DEPRECATED → REMOVED)

This is the only place where graph structure and connectors evolve.
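The lifecycle rule above can be enforced as a small state machine. This sketch models only the three states named in the guide; how a candidate first becomes ACTIVE is left out, and the names `Lifecycle`, `ALLOWED`, and `transition` are illustrative.

```python
from enum import Enum


class Lifecycle(Enum):
    ACTIVE = "ACTIVE"
    DEPRECATED = "DEPRECATED"
    REMOVED = "REMOVED"


# Assumption: transitions are strictly forward, one step at a time.
ALLOWED = {
    Lifecycle.ACTIVE: {Lifecycle.DEPRECATED},
    Lifecycle.DEPRECATED: {Lifecycle.REMOVED},
    Lifecycle.REMOVED: set(),
}


def transition(current: Lifecycle, target: Lifecycle) -> Lifecycle:
    """Return the new state, or raise if the move skips or reverses a stage."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal lifecycle transition: {current.name} -> {target.name}")
    return target


state = transition(Lifecycle.ACTIVE, Lifecycle.DEPRECATED)
print(state.name)
```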
3 Conceptual prerequisites

3.1 Define the retrieval scope

Define clearly what kind of information the agent is allowed to retrieve.

Include:

domain boundaries (what is in scope vs out of scope)
forbidden categories (no personal data from X, no competitive intel from Y)
refocus behavior when the query is out of scope
jurisdictions (legal or regulatory domains)

Scope enforcement lives in the Policy Manager:

see Policy role
see PM 1 Pre-Check

Online scope gating is part of the request envelope:

see ARG Core Step 1 Policy Pre-Check

3.2 Define the knowledge taxonomy

The knowledge taxonomy is the backbone of retrieval.

Define:

label naming conventions for:
- user memory
- domain / project knowledge
- external domains (news, laws, research)
granularity rules:
- avoid over-fragmentation of Info nodes
- avoid “giant buckets” where everything collides
coherence rules for parent/child and sibling relationships
synonym and exclusion rules that prevent label collisions

Taxonomy coherence and validation are enforced by the Weaver:

see Context Weaver Online Pipeline
see OW 5 Validator

Taxonomy evolution must be offline.

3.3 Define Policy Manager governance for retrieval

Policy is the authorization layer for retrieval.

Define:

decisions:
- ALLOW
- ALLOW_WITH_REFOCUS
- RESTRICT
- BLOCK
RBAC/ABAC rules for:
- user-level memory access
- domain knowledge access
- external connector usage
budgets:
- number of retrieval calls
- external queries
- latency and cost caps
refusal and clarify rules when constraints prevent safe retrieval
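The decisions and budgets above can be represented as below. This is a sketch under assumptions: the guide does not define the exact semantics of ALLOW_WITH_REFOCUS or RESTRICT, so here only BLOCK denies retrieval outright, and the `Budget` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "ALLOW"
    ALLOW_WITH_REFOCUS = "ALLOW_WITH_REFOCUS"
    RESTRICT = "RESTRICT"
    BLOCK = "BLOCK"


@dataclass
class Budget:
    max_retrieval_calls: int
    max_external_queries: int
    used_retrieval_calls: int = 0
    used_external_queries: int = 0

    def try_spend(self, external: bool = False) -> bool:
        """Consume one call if the budget allows it; spend nothing otherwise."""
        if self.used_retrieval_calls >= self.max_retrieval_calls:
            return False
        if external and self.used_external_queries >= self.max_external_queries:
            return False
        self.used_retrieval_calls += 1
        if external:
            self.used_external_queries += 1
        return True


def may_retrieve(decision: Decision, budget: Budget, external: bool = False) -> bool:
    # Assumption: BLOCK is the only decision that forbids retrieval entirely.
    if decision is Decision.BLOCK:
        return False
    return budget.try_spend(external)
```

Checking the decision before touching the budget means a blocked request never consumes quota.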

Policy outputs and constraints are defined in Policy inputs and outputs and in PM 2 constraints.

4 Build the Retrieval Space

4.1 Info node model

An Info node is a retrieval contract.
It turns a labeled concept into a bounded, reusable chunk of knowledge.

Include in every Info node:

intent and description
RetrievalType and attached labels
content schema:
- normalized text or structured fields
freshness properties:
- how often it should be revalidated or regenerated
visibility and permissions:
- which roles/tenants can see it
observability:
- how often it is retrieved
- where it is used
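One way to hold the fields listed above is an immutable record. This is a sketch, not the ARG schema: every field name is an assumption, and observability counters are left out because they would normally live in a metrics store rather than on the read contract itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class InfoNode:
    node_id: str
    description: str              # intent and description
    retrieval_type: str           # "USER" or "DOMAIN"
    labels: frozenset             # taxonomy labels attached to the node
    content: str                  # normalized text (structured fields omitted)
    revalidate_every: timedelta   # freshness property
    last_validated: datetime
    visible_to_roles: frozenset = frozenset()

    def is_stale(self, now: datetime) -> bool:
        """True when the node is overdue for revalidation."""
        return now - self.last_validated > self.revalidate_every

    def readable_by(self, role: str) -> bool:
        return role in self.visible_to_roles


# Illustrative values only.
node = InfoNode(
    node_id="info:billing:refunds",
    description="Refund policy summary",
    retrieval_type="DOMAIN",
    labels=frozenset({"Billing/Refunds"}),
    content="(normalized refund policy text)",
    revalidate_every=timedelta(days=30),
    last_validated=datetime(2025, 1, 1, tzinfo=timezone.utc),
    visible_to_roles=frozenset({"support"}),
)
print(node.is_stale(datetime(2025, 3, 1, tzinfo=timezone.utc)))
```

Freezing the dataclass mirrors the online rule that nodes are never modified at request time.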

Retrieval-time behavior for Info nodes is defined in ARG Core Step 9 Info.

4.2 Retrieval graph structure

The retrieval graph should make context building predictable and bounded.

Common edge patterns:

parent and child (concept hierarchies)
similar topic (for fallback retrieval)
prerequisite (what must be read first)
“see also” and “do not mix” relationships

Recommended structural patterns:

domain-oriented trees matching label hierarchies
hubs for important domains (e.g., compliance, billing, infrastructure)
explicit separation between:
- user-level memory
- domain docs
- external connectors

Traversal behavior and guardrails are defined in:

ARG Core Step 7 Traversal

4.3 ExternalSource / LiveConnector modeling

ExternalSource / LiveConnector nodes define how to access live knowledge without embedding it into the graph.

Define per connector:

domain and scope:
- e.g. Legal/EU/AIAct, AI/Research/arXiv
allowed tools and channels:
- search APIs
- internal data feeds
policy constraints:
- jurisdictions
- freshness windows
- allowlists/blocklists
- max documents and max context sizes
stability and ranking rules:
- how results are scored
- how they compete with internal Info nodes
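A connector definition and its policy constraints might be expressed as follows. This is a sketch under assumptions: the field names, the host-based allowlist check, and the shape of a result (`url`, `published`) are all illustrative rather than the ARG connector contract.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse


@dataclass(frozen=True)
class ConnectorConfig:
    connector_id: str
    scope: str                  # e.g. "Legal/EU/AIAct"
    allowed_hosts: frozenset    # allowlist
    blocked_hosts: frozenset    # blocklist
    max_age: timedelta          # freshness window
    max_documents: int


def admit(config: ConnectorConfig, results: list, now: datetime) -> list:
    """Keep results that pass blocklist, allowlist, and freshness checks,
    then cap the list at max_documents."""
    kept = []
    for result in results:
        host = urlparse(result["url"]).hostname or ""
        if host in config.blocked_hosts or host not in config.allowed_hosts:
            continue
        if now - result["published"] > config.max_age:
            continue
        kept.append(result)
    return kept[: config.max_documents]


utc = timezone.utc
config = ConnectorConfig(
    connector_id="legal-eu",
    scope="Legal/EU/AIAct",
    allowed_hosts=frozenset({"eur-lex.europa.eu"}),
    blocked_hosts=frozenset({"spam.example"}),
    max_age=timedelta(days=30),
    max_documents=2,
)
results = [
    {"url": "https://eur-lex.europa.eu/a", "published": datetime(2025, 1, 5, tzinfo=utc)},
    {"url": "https://spam.example/b", "published": datetime(2025, 1, 9, tzinfo=utc)},
    {"url": "https://eur-lex.europa.eu/c", "published": datetime(2024, 11, 1, tzinfo=utc)},
]
admitted = admit(config, results, now=datetime(2025, 1, 10, tzinfo=utc))
print([r["url"] for r in admitted])
```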

Connector behavior is specified in External retrieval mode via ExternalSource node.

Lifecycle and versioning follow:

ARG Core Lifecycle

5 Online routing decisions

5.1 Policy pre-check gates

Policy is the first hard gate for retrieval.

Policy pre-check covers:

scope
safety and content rules
permissions for user, domain, and external data
budget constraints

5.2 Context Weaver outputs for retrieval routing

The Weaver produces taxonomy-coherent labels and confidence.

Key behaviors:

fast path for most queries
escalation path only when ambiguity persists
strict validator as the final arbiter

Expected outputs:

RetrievalType (USER / DOMAIN / EXTERNAL)
L_final label set
confidence_global
flags for uncertainty and missing coverage

5.3 Binding to User and Domain Info nodes

Binding selects the read targets inside the graph (USER or DOMAIN scope) after the Context Weaver has produced a taxonomy-coherent label set L_final.

Core rules:

restrict the candidate space by scope + taxonomy attachments (set-theory first)
apply policy constraints before any ranking or traversal
score and traverse only inside the bounded, label-coherent subgraph
select top candidates using thresholds and deterministic tie-breakers
prefer lower-risk Info nodes when ambiguity remains

A minimal binding filter can be expressed as:

N_eligible = N_visible ∩ N_tax(L_final)

where N_visible is the policy/tenant-visible node set and N_tax(L_final) is the node set attached to the taxonomy region implied by L_final.

Optional (stable if enabled):

community or similarity refiners (e.g., Louvain/Leiden partitions, Jaccard/SimRank) MAY be used to improve landing-point precision and reduce traversal, but MUST operate only on the eligible node set and MUST NOT override taxonomy or policy gates.
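The core rule "select top candidates using thresholds and deterministic tie-breakers" can be sketched as below. The scores are assumed to come from an upstream scoring step, and breaking ties on node id is one possible deterministic rule, not necessarily the one ARG Core uses.

```python
def select_candidates(scores: dict, threshold: float, top_k: int) -> list:
    """Return up to top_k node ids whose score meets the threshold.

    Hypothetical helper: ties are broken by node id so the result does not
    depend on dictionary or input order.
    """
    passing = [(node_id, score) for node_id, score in scores.items() if score >= threshold]
    # Sort by descending score, then ascending node id.
    passing.sort(key=lambda item: (-item[1], item[0]))
    return [node_id for node_id, _ in passing[:top_k]]


scores = {"b": 0.9, "a": 0.9, "c": 0.4, "d": 0.7}
print(select_candidates(scores, threshold=0.5, top_k=2))
```

An empty result is a valid outcome and is a natural trigger for the clarify or abstain path.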

See ARG Core Step 4 Landing Point • ARG Core Step 5 Neighbor Scoring • ARG Core Step 7 Traversal.

5.4 External retrieval mode

External retrieval is used when the query requires fresh knowledge that is not safely captured in the graph.

When to use it:

no suitable internal Info node covers the query
the label region is explicitly marked as “external-driven”
risk remains low enough under policy constraints

How it works:

bind to an ExternalSource / LiveConnector node rather than an Info node
execute retrieval via the connector’s tools/APIs
build an external context bundle with:
- source
- timestamp
- excerpt
- score
- url/id

Hard rules:

no online creation or modification of nodes, edges, labels, or clusters
no direct promotion of external text into Info nodes
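An external context bundle can be built as an immutable, bounded object. The sketch below assumes raw connector results arrive as dictionaries with the five fields listed above (`ref` stands in for url/id); the caps on item count and excerpt length are illustrative policy limits.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class BundleItem:
    source: str
    timestamp: datetime
    excerpt: str
    score: float
    ref: str  # url or id


def build_bundle(raw_results: list, max_items: int, max_excerpt_chars: int) -> tuple:
    """Return the highest-scoring items as a tuple; nothing here touches the graph."""
    items = [
        BundleItem(
            source=r["source"],
            timestamp=r["timestamp"],
            excerpt=r["excerpt"][:max_excerpt_chars],
            score=r["score"],
            ref=r["ref"],
        )
        for r in raw_results
    ]
    items.sort(key=lambda item: item.score, reverse=True)
    return tuple(items[:max_items])


ts = datetime(2025, 1, 5, tzinfo=timezone.utc)
raw = [
    {"source": "eur-lex", "timestamp": ts, "excerpt": "x" * 500, "score": 0.7, "ref": "doc-1"},
    {"source": "arxiv", "timestamp": ts, "excerpt": "short", "score": 0.9, "ref": "doc-2"},
    {"source": "news", "timestamp": ts, "excerpt": "n", "score": 0.2, "ref": "doc-3"},
]
bundle = build_bundle(raw, max_items=2, max_excerpt_chars=100)
print([item.ref for item in bundle])
```

The bundle is a plain value: it can be placed in the context window or logged, but it carries no handle through which graph structure could be changed.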

5.5 Clarify instead of answer

Clarify is an explicit branch.
It triggers when answering would be unsafe, underspecified, or overconfident.

Trigger conditions:

low score or low margin during candidate selection
- see ARG Core Step 5 Neighbor Scoring
policy restricts or blocks access to needed information
- see PM 1 Pre-Check
- see PM 2 constraints
conflicting labels or unstable taxonomy region
- see OW 7 confidence and flags
external retrieval required but disallowed by policy

Clarification rules:

ask the minimum number of questions needed to select a safe retrieval path
keep questions scoped and policy-safe
avoid presenting high-risk or disallowed sources as defaults
when risk is high, default to a non-answer or a very conservative answer
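The trigger conditions above can be combined into a single routing check. This sketch assumes the policy result, label-conflict flag, and the two best candidate scores are already available; the thresholds, reason strings, and function name are illustrative.

```python
def next_action(top_score, runner_up_score, min_score, min_margin,
                policy_allows, labels_conflict):
    """Return (action, reason); policy and taxonomy checks run before scores."""
    if not policy_allows:
        return ("clarify", "policy_restricted")
    if labels_conflict:
        return ("clarify", "conflicting_labels")
    if top_score < min_score:
        return ("clarify", "low_score")
    if top_score - runner_up_score < min_margin:
        return ("clarify", "low_margin")
    return ("answer", "ok")


print(next_action(0.9, 0.5, min_score=0.6, min_margin=0.1,
                  policy_allows=True, labels_conflict=False))
```

Returning the reason alongside the action makes the decision easy to log and to aggregate offline.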

6 Retrieval execution and synchronization

This section explains how an agent runtime performs retrieval without breaking the online fixed-graph contract.

6.1 Pattern A – User and domain graph retrieval

Preconditions:

Policy decision allows reading from user and domain Info nodes
RetrievalType ∈ {USER, DOMAIN} and labels have been validated by the Weaver (L_final)

Strategy 1 (mandatory): taxonomy-first retrieval (set-theory + bounded traversal)

1) Restrict the search space (before any traversal)

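build the eligible node set from scope + taxonomy attachments: N_eligible = N_visible ∩ N_tax(L_final), where N_visible is the policy/tenant-visible node set and N_tax(L_final) is the node set attached to the taxonomy region implied by L_final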

2) Landing point (Step 4)

select a start node inside the eligible node set that best matches L_final
output: landing_point_id

3) Neighbor scoring + traversal (Steps 5 + 7)

score neighbors using edge semantics + label/cluster orbit coherence
traverse a bounded number of hops (typically 2–3) within the eligible node set

4) Stop + Info selection (Step 9)

stop on coverage or budget (latency/tokens/hops)
select a bounded set of nodes and extract their chunks
output: InternalContextBundle (bounded, taxonomy-aligned)
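The bounded traversal in steps 1 to 4 can be sketched as a breadth-first walk that never leaves the eligible set. This is a simplification: neighbor scoring from Step 5 is replaced by a sorted visit order, and the graph is assumed to be an adjacency mapping of node ids.

```python
from collections import deque


def bounded_traversal(graph: dict, start: str, eligible: set, max_hops: int) -> list:
    """Visit nodes reachable from start within max_hops, staying inside eligible.

    Hypothetical helper: neighbors are visited in sorted id order so the
    result is deterministic; real scoring would replace that ordering.
    """
    if start not in eligible:
        return []
    visited = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # hop budget reached: do not expand further
        for neighbor in sorted(graph.get(node, ())):
            if neighbor in eligible and neighbor not in visited:
                visited.add(neighbor)
                order.append(neighbor)
                queue.append((neighbor, depth + 1))
    return order


graph = {"a": ["b", "x"], "b": ["c"], "c": ["d"], "x": ["c"]}
eligible = {"a", "b", "c", "d"}
print(bounded_traversal(graph, "a", eligible, max_hops=2))
```

Here `x` is skipped because it is outside the eligible set, and `d` is skipped because it is three hops away.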

Strategy 2 (optional): graph refiners (precision/perf boosts, no structure override)

These refiners MAY be applied only inside the eligible node set to improve landing-point precision and reduce traversal depth. They MUST NOT override taxonomy or policy gates.

Community / partition refiners

Louvain / Leiden: reduce the search region to a local community compatible with L_final

Neighborhood similarity refiners

Jaccard: fast overlap-based neighbor similarity
Adamic–Adar / Resource Allocation: Jaccard-like but less hub-biased

Structural similarity refiners

SimRank: similarity by “neighbors of neighbors” structure (higher cost; use selectively)
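The two neighborhood measures can be computed directly from adjacency sets. The sketch below uses the standard definitions of the Jaccard coefficient and the Adamic–Adar index on an undirected graph; the example graph and function names are illustrative, and the adjacency is assumed to be already restricted to the eligible node set.

```python
import math


def jaccard(adj: dict, u: str, v: str) -> float:
    """Shared neighbors divided by all neighbors of u and v."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj: dict, u: str, v: str) -> float:
    """Sum of 1 / log(degree) over shared neighbors, so hubs count for less.

    Neighbors with degree 1 are skipped because log(1) is zero.
    """
    return sum(
        1.0 / math.log(len(adj[w]))
        for w in adj[u] & adj[v]
        if len(adj[w]) > 1
    )


# Undirected example: "h" is a hub, "c" is a low-degree shared neighbor.
adj = {
    "a": {"h", "c"},
    "b": {"h", "c"},
    "c": {"a", "b"},
    "h": {"a", "b", "d", "e"},
    "d": {"h"},
    "e": {"h"},
}
print(jaccard(adj, "a", "b"), round(adamic_adar(adj, "a", "b"), 4))
```

In this example the low-degree neighbor `c` contributes twice as much to the Adamic–Adar score as the hub `h`, which is the hub-damping effect described above.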

Note: exact subgraph isomorphism (e.g., VF2) is typically offline/debug-only. Online retrieval should prefer bounded neighborhoods, partitions, and similarity scoring.

Execution (common constraints):

query internal indexes/stores using:
- graph neighborhood inside the eligible node set
- labels/clusters (L_final)
- optional vector accelerators constrained by taxonomy (restrictor only)
fetch only:
- a bounded number of chunks
- within size limits

Outcome capture:

record which Info nodes were actually used
record basic metrics (latency, number of chunks, cache hit)

See ARG Core Step 4 Landing Point • ARG Core Step 5 Neighbor Scoring • ARG Core Step 7 Traversal • ARG Core Step 9 Info.

For the high-level request-time flow, see Figure 3 – Effective retrieval process in 2.1 Online request-time loop.

6.2 Pattern B – External retrieval via connectors

External retrieval is allowed only when Policy and Context Weaver decide it is needed and safe.

Bounded retrieval:

local reasoning is allowed only inside explicit connector constraints
- see PM 2 constraints
tools and endpoints must be allowlisted
- see Policy access control

Freshness:

use connector configs to enforce:
- max age of documents
- per-domain filters
- safety filters on content types

Sandboxing:

external calls must remain auditable in the same request envelope
see ARG Core Step 1 Policy Pre-Check

Hard rule:

no online creation or modification of graph structure from external content
see ARG Core online fixed graph contract

See:

External retrieval mode via ExternalSource node

6.3 Caching, freshness, and rate limits

Caching:

cache at the connector level, not inside the LLM
associate cache entries with:
- connector id
- normalized query signature
- label region
- time-to-live

Freshness:

TTLs must match domain needs:
- e.g. minutes for incident status
- days or weeks for research literature
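A connector-level cache keyed as described above might look like this. It is a minimal in-memory sketch: the class name is hypothetical, eviction is lazy (on read), and the clock is injectable so expiry can be tested without waiting.

```python
import time


class ConnectorCache:
    """In-memory TTL cache keyed by (connector id, query signature, label region)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def put(self, connector_id, query_signature, label_region, value, ttl_seconds):
        key = (connector_id, query_signature, label_region)
        self._entries[key] = (self._clock() + ttl_seconds, value)

    def get(self, connector_id, query_signature, label_region):
        key = (connector_id, query_signature, label_region)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: evict and report a miss
            return None
        return value


cache = ConnectorCache()
cache.put("incident-feed", "sig-123", "Infra/Incidents", {"status": "open"}, ttl_seconds=120)
print(cache.get("incident-feed", "sig-123", "Infra/Incidents"))
```

Including the label region in the key prevents a result cached for one taxonomy region from being served to a query routed to another.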

Rate limits:

policy budgets must bound external calls
see PM 2 constraints

6.4 Secrets and access control

Secrets separation:

keep credentials out of prompts and Info chunks
use scoped credentials per connector and per tenant

Controlled escalation:

if higher privilege is required, use:
- explicit escalation flows
- or refusal, rather than silent elevation

7 ARG RetrievalWrites and MemoryWrite Info

RetrievalWrites / MemoryWrite Info are governed episodic logs.
They capture retrieval behavior without mutating the online graph.

Recommended wording:

ARG MemoryWrite Info
ARG RetrievalWrite (if you specialize)

7.1 Recommended RetrievalWrite schema

Required fields:

trace id or request id
timestamp
actor or role when applicable
RetrievalType and labels
- see ARG Core Step 3 Context Weaver
Info node ids used (user + domain)
connector ids used (for EXTERNAL)
normalized query signature
external bundle metadata (source ids, timestamps, scores)
outcome status:
- answered
- partial
- abstain
standardized error codes, if any
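The required fields above map naturally onto an immutable record with basic validation. This is a sketch, not the ARG JSON contract: field names and the validation rules are assumptions derived from the list above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

OUTCOMES = {"answered", "partial", "abstain"}
RETRIEVAL_TYPES = {"USER", "DOMAIN", "EXTERNAL"}


@dataclass(frozen=True)
class RetrievalWrite:
    trace_id: str
    timestamp: str            # ISO 8601 string
    retrieval_type: str
    labels: tuple
    info_node_ids: tuple      # user + domain nodes actually used
    connector_ids: tuple      # only populated for EXTERNAL
    query_signature: str      # normalized, never the raw query
    outcome: str
    actor: Optional[str] = None
    external_bundle_meta: tuple = ()
    error_codes: tuple = ()

    def __post_init__(self):
        if self.retrieval_type not in RETRIEVAL_TYPES:
            raise ValueError(f"unknown RetrievalType: {self.retrieval_type}")
        if self.outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {self.outcome}")


def to_json(write: RetrievalWrite) -> str:
    """Serialize with sorted keys so logs are stable for offline aggregation."""
    return json.dumps(asdict(write), sort_keys=True)


write = RetrievalWrite(
    trace_id="trace-001",
    timestamp="2025-01-10T09:00:00+00:00",
    retrieval_type="DOMAIN",
    labels=("Billing/Refunds",),
    info_node_ids=("info:billing:refunds",),
    connector_ids=(),
    query_signature="sig-123",
    outcome="answered",
)
print(to_json(write))
```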

Why this schema matters:

offline aggregation depends on stable fields
see ARG Core offline collection
retrieval patterns and coverage gaps are identified from these logs
see ARG Core offline candidates

7.2 Memory guard and PII

Memory protection:

filter, redact, or hash sensitive fields before writing
see Policy memory guard
enforce scope rules for stored artifacts
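A memory guard that drops, hashes, or redacts fields before a write can be sketched as follows. The field lists, the email pattern, and the salted SHA-256 scheme are assumptions for illustration; salted hashing is pseudonymization only, and a production guard would follow the Policy Manager's rules.

```python
import hashlib
import re

# Simple illustrative pattern; real PII detection needs a broader rule set.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def guard(record: dict, hash_fields: set, drop_fields: set, salt: str) -> dict:
    """Return a sanitized copy of record; the input is left untouched."""
    out = {}
    for key, value in record.items():
        if key in drop_fields:
            continue  # never persisted
        if key in hash_fields:
            digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
            out[key] = digest
        elif isinstance(value, str):
            out[key] = EMAIL.sub("[REDACTED_EMAIL]", value)
        else:
            out[key] = value
    return out


record = {
    "user_id": "u-42",
    "note": "contact jane@example.com",
    "token": "secret",
    "chunks": 3,
}
safe = guard(record, hash_fields={"user_id"}, drop_fields={"token"}, salt="per-tenant-salt")
print(safe["note"])
```

Hashing the identifier rather than dropping it keeps episodes groupable by user during offline aggregation without storing the raw id.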

Retention and minimization:

keep only what is needed for:
- auditability
- offline refinement

Write path governance:

memory writes are guarded during the request
see ARG Core Step 10 MemoryWrite
see PM 3 Post-Check

7.3 Operational observability

Track at minimum:

retrieval success rate per label region
coverage gaps:
- queries frequently answered with low confidence or clarify
external retrieval rate and cost
connector-specific error and timeout rates

Offline consumption:

metrics feed offline candidate generation and scoring
see ARG Core offline candidates
see ARG Core offline scoring

8 Offline consolidation and knowledge evolution

This section matches the offline half of Figure 2.
Online retrieval produces episodic RetrievalWrites and external bundles.
Offline consolidation turns those episodes into controlled graph improvements.

8.1 Episode aggregation

Aggregate RetrievalWrites into stable groups before proposing any structural change.

Group by:

RetrievalType
validated labels
- see ARG Core Step 3 Context Weaver
Info node ids and connector ids
outcome and error codes
- see ARG Core offline outcomes

Detect coverage gaps:

frequent “no suitable Info node” bindings
repeated reliance on ExternalSource in the same region
high clarify or abstain rates for specific label regions
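Grouping and gap detection can be sketched with simple counters. This example groups by RetrievalType and label set only, and treats `partial` and `abstain` outcomes as gap signals; both choices, along with the thresholds and the dictionary shape of a RetrievalWrite, are assumptions for illustration.

```python
from collections import Counter


def coverage_gaps(writes, min_episodes, max_gap_rate,
                  gap_outcomes=frozenset({"partial", "abstain"})):
    """Return label regions whose share of gap outcomes exceeds max_gap_rate.

    Regions with fewer than min_episodes episodes are ignored so that a
    single failed request does not trigger a structural proposal.
    """
    totals = Counter()
    gaps = Counter()
    for write in writes:
        region = (write["retrieval_type"], tuple(sorted(write["labels"])))
        totals[region] += 1
        if write["outcome"] in gap_outcomes:
            gaps[region] += 1
    return sorted(
        region
        for region, count in totals.items()
        if count >= min_episodes and gaps[region] / count > max_gap_rate
    )


writes = [
    {"retrieval_type": "DOMAIN", "labels": ["Billing/Refunds"], "outcome": "abstain"},
    {"retrieval_type": "DOMAIN", "labels": ["Billing/Refunds"], "outcome": "abstain"},
    {"retrieval_type": "DOMAIN", "labels": ["Billing/Refunds"], "outcome": "answered"},
    {"retrieval_type": "USER", "labels": ["Profile"], "outcome": "answered"},
    {"retrieval_type": "USER", "labels": ["Profile"], "outcome": "answered"},
    {"retrieval_type": "USER", "labels": ["Profile"], "outcome": "abstain"},
]
print(coverage_gaps(writes, min_episodes=3, max_gap_rate=0.5))
```

The output is a list of candidate regions only; whether any of them leads to a new Info node is decided by the later scoring and policy steps.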

Aggregation inputs and logging requirements are defined in:

ARG Core offline collection

8.2 Taxonomy-guided candidates

Offline candidate generation must remain taxonomy-guided.

Candidate types:

new Info nodes for recurring queries
refinements of labels and edges
new or updated connectors for recurring external use
splits/merges when nodes are too broad or too granular

Candidate generation is defined in:

ARG Core offline candidates

Weaver offline role:

Context Weaver offline role

Taxonomy maintenance chain:

Context Weaver label maintenance

8.3 Scoring, policy checks, and publishing

Each candidate must be scored with strong vs weak evidence separation.

Replay and scoring:

replay-based evaluation where possible
penalties for complexity and noisy sources
separation of strong from weak signals
see ARG Core offline scoring

Validation and publishing gates:

scalable review bins
conditional auto-commit rules
see ARG Core offline validation

Policy and risk checks:

enforce policy constraints before promotion
verify no guardrail bypass is introduced
see Policy integration summary

Lifecycle and versioning:

promotion to ACTIVE
staged deprecation and removal
alias tables when needed
see ARG Core lifecycle

Publish to ARG Core:

versioned artifacts
refreshed indexes
stable online contract remains unchanged
see ARG Core offline loop

9 End-to-end examples

Each example maps to one branch in Figure 2.
Each includes inputs, outputs, decisions, and logs.

9.1 Pure user memory retrieval

Flow:

Policy pre-check
- see ARG Core Step 1 Policy Pre-Check
Context Weaver labels and RetrievalType=USER
- see ARG Core Step 3 Context Weaver
bind to UserMemory Info nodes
internal retrieval in user-specific subgraph
ARG-Info answer grounded in user memory
MemoryWrite Info if new stable facts are discovered
- see ARG Core Step 10 MemoryWrite

9.2 Mixed user + domain retrieval

Flow:

Policy pre-check
Context Weaver labels and RetrievalType=DOMAIN (with user-specific labels)
bind to both:
- UserMemory Info nodes
- Domain/Project Info nodes
internal retrieval and context merge
ARG-Info answer grounded in both user and domain context
RetrievalWrite with node ids and labels used

9.3 External / fresh knowledge retrieval

Flow:

Policy pre-check authorizes external retrieval
Context Weaver labels and RetrievalType=EXTERNAL
bind to appropriate ExternalSource / LiveConnector node
external retrieval with policy-constrained tools
external bundle + any internal Info merged into context
grounded answer with explicit external sources
external bundle logged via MemoryWrite Info for offline analysis
- see External context bundle write rules

10 Production checklist and anti-patterns

10.1 Go-live checklist

Routing:

thresholds for binding vs clarify vs abstain
see ARG Core Step 5 Neighbor Scoring

Budgets and timeouts:

enforce hard budgets from policy constraints
see PM 2 constraints

Audit and redaction:

memory guard enforced
see Policy memory guard
MemoryWrite governance enforced
see ARG Core Step 10 MemoryWrite

Connector stability:

monitoring for:
- error spikes
- latency spikes
- cost anomalies

Rollback plan:

lifecycle and deprecation path defined for Info nodes and connectors
see ARG Core lifecycle

10.2 Anti-patterns

Never do these.

Create or modify nodes online from retrieved content:

hot path must treat the graph as fixed
see ARG Core online fixed graph contract

Let vector similarity decide truth:

vectors are candidates, not gates
policy and taxonomy validity must win
see PM 1 Pre-Check
see OW 5 strict validation

Treat external content as if it were internal structure:

do not push external snippets directly as Info nodes online
use offline consolidation and lifecycle instead

Over-fragment Info nodes:

avoid tiny Info fragments that drift and collide
prefer well-structured, stable chunks with clear labels

Log too much or log without governance:

PII leakage
missing retention controls
see Policy memory guard

11 Annexes

This section is implementation-facing.
It provides stable contracts and reference material.

11.1 JSON contracts

Contracts:

PolicyDecision
- see Policy inputs and outputs
WeaverOutput
- see Context Weaver inputs and outputs
BindResult for Info and ExternalSource nodes
- (same binding principles as actions)
- see ARG Core Step 4 Landing Point
ARG MemoryWrite Info / ARG RetrievalWrite
- see ARG Core Step 10 MemoryWrite

11.2 Standard flags

Uncertainty and safety flags:

Weaver flags
- see OW 7 confidence and flags
policy states and response modes
- see Policy inputs and outputs

11.3 Errors and codes

Retrieval errors:

standardize error codes and retry eligibility
aggregate in offline outcomes
- see ARG Core offline outcomes

11.4 Runtime synchronization patterns

Patterns:

runtime → Info node retrieval contract
runtime → ExternalSource / LiveConnector contract
caching and TTLs per connector
bounded external retrieval rules
- see External retrieval mode via ExternalSource node
