Architecture
System Overview
```
┌─────────────────────────────────────────────────────────────┐
│                         MCP Client                          │
│               (Claude Code / Robot Controller)              │
└──────────────────────────┬──────────────────────────────────┘
                           │ MCP Protocol (stdio)
┌──────────────────────────▼──────────────────────────────────┐
│                      MCP Server Layer                       │
│                                                             │
│  learn  recall  save_perception  forget  update  session    │
│    │       │          │             │       │       │       │
│    └───────┴──────────┴─────────────┴───────┴───────┘       │
│                           │                                 │
│                   ┌───────▼──────┐                          │
│                   │  Validators  │  Pydantic L1             │
│                   └───────┬──────┘                          │
│                           │                                 │
│     ┌─────────────────────┼─────────────────────┐           │
│     │                     │                     │           │
│     ▼                     ▼                     ▼           │
│  auto_classify          dedup               search.py       │
│  (category/tags/      (exact →            (BM25 + Vec       │
│   confidence/scope)    jaccard →           → RRF merge)     │
│                        cosine)                              │
│     └─────────────────────┼─────────────────────┘           │
│                           │                                 │
│                   ┌───────▼──────┐                          │
│                   │  ops layer   │                          │
│                   │  memories.py │  insert/update/touch     │
│                   │  sessions.py │  create/end/summarize    │
│                   │  search.py   │  fts_search/vec_search   │
│                   │  tags.py     │  add/remove/normalize    │
│                   └───────┬──────┘                          │
│                           │                                 │
│         ┌─────────────────┼─────────────────┐               │
│         ▼                 ▼                 ▼               │
│    ┌─────────┐      ┌───────────┐      ┌──────────┐         │
│    │  FTS5   │      │ memories  │      │   vec0   │         │
│    │ (BM25)  │      │ (SQLite)  │      │ (vector) │         │
│    └─────────┘      └───────────┘      └──────────┘         │
│                           │                                 │
│                   ┌───────▼──────┐                          │
│                   │  memory.db   │  ~/.robotmem/            │
│                   └──────────────┘                          │
└─────────────────────────────────────────────────────────────┘
                           │
              ┌────────────┼────────────┐
              ▼            ▼            ▼
         FastEmbed       Ollama    OpenAI-compat
          (ONNX)         (HTTP)       (HTTP)
```
Search Pipeline
The recall search pipeline combines two ranking signals through Reciprocal Rank Fusion:
```
Query: "how to grasp a cup"
  │
  ├──→ BM25 (FTS5) ──→ ranked list A
  │      tokenize → jieba (CJK) → FTS5 MATCH
  │      ORDER BY bm25() score
  │
  └──→ Vector (vec0) ──→ ranked list B
         embed_one(query) → float[384]
         WHERE embedding MATCH blob AND k=N
         ORDER BY cosine distance
  │
  ├──→ RRF Merge (k=60)
  │      score(d) = Σ 1/(k + rank_i + 1) for each list
  │
  ├──→ Source Weighting
  │      real-world data × 1.5 boost
  │
  ├──→ Confidence Filter
  │      confidence >= min_confidence (default 0.3)
  │
  ├──→ context_filter (structured)
  │      dot-path matching on parsed context JSON
  │      operators: $lt, $lte, $gt, $gte, $ne, equality
  │
  ├──→ spatial_sort (nearest neighbor)
  │      Euclidean distance on coordinate arrays
  │      optional max_distance cutoff
  │
  ├──→ Top-K truncation
  │
  └──→ MaxScore normalization (best = 1.0)
```
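The context_filter stage can be illustrated with a minimal matcher. This is a sketch of the idea only: the operator names come from the pipeline above, but the function names (`get_path`, `matches`) and the exact parsing rules are illustrative, not the actual server code.

```python
import json

# Comparison operators supported by context_filter (per the pipeline above)
OPS = {
    "$lt":  lambda a, b: a < b,
    "$lte": lambda a, b: a <= b,
    "$gt":  lambda a, b: a > b,
    "$gte": lambda a, b: a >= b,
    "$ne":  lambda a, b: a != b,
}

def get_path(obj, dotted):
    """Resolve a dot-path like 'params.grip_force' in nested dicts."""
    for key in dotted.split("."):
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj

def matches(context_json, context_filter):
    """True if the memory's parsed context satisfies every filter clause."""
    ctx = json.loads(context_json or "{}")
    for path, cond in context_filter.items():
        value = get_path(ctx, path)
        if isinstance(cond, dict):  # operator clause, e.g. {"$lt": 15}
            if value is None or not all(OPS[op](value, ref) for op, ref in cond.items()):
                return False
        elif value != cond:         # bare value means equality
            return False
    return True

# Example: keep memories where grip force < 15 N on robot "arm_a"
f = {"params.grip_force": {"$lt": 15}, "robot.id": "arm_a"}
print(matches('{"params": {"grip_force": 12.5}, "robot": {"id": "arm_a"}}', f))  # True
```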
RRF Formula
```
score(document) = Σ 1 / (k + rank + 1)
```
Where k=60 (configurable via rrf_k). A higher k flattens per-rank score differences, so appearing in multiple lists counts for more than ranking highly in just one.
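The merge can be sketched in a few lines of Python (illustrative names; the real merge lives in search.py):

```python
def rrf_merge(ranked_lists, k=60):
    """Reciprocal Rank Fusion over several ranked lists of memory IDs."""
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked):
            # Each list contributes 1/(k + rank + 1) for every doc it contains
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = [101, 202, 303]   # ranked list A
vec_hits  = [202, 404, 101]   # ranked list B
print(rrf_merge([bm25_hits, vec_hits]))  # [202, 101, 404, 303]
```

Note how 202 wins overall despite topping only one list: it appears in both, and with k=60 presence in two lists outweighs a single first-place rank.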
Database Schema
memories (core table)
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment ID |
| session_id | TEXT | Linked session (external_id) |
| collection | TEXT | Logical namespace |
| type | TEXT | "fact" or "perception" |
| content | TEXT | Memory text (max 300 chars) |
| human_summary | TEXT | Short summary (max 200 chars) |
| context | TEXT | JSON context (params/spatial/robot/task) |
| perception_type | TEXT | visual/tactile/auditory/proprioceptive/procedural |
| perception_data | BLOB | Raw sensor data |
| perception_metadata | TEXT | Format/units metadata |
| category | TEXT | Auto-classified category |
| confidence | REAL | 0.0–1.0 (default 0.9) |
| decay_rate | REAL | Per-day decay (default 0.01) |
| status | TEXT | active / superseded / invalidated |
| superseded_by | INTEGER | ID of replacing memory |
| content_hash | TEXT | SHA-256 prefix for dedup |
| embedding | BLOB | Float vector (384d or 768d) |
| access_count | INTEGER | Recall hit counter |
| return_count | INTEGER | Times returned to user |
| last_accessed | TEXT | ISO timestamp of last recall hit |
| created_at | TEXT | ISO timestamp |
| updated_at | TEXT | ISO timestamp |
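The content_hash column backs the exact-match stage of dedup. A minimal sketch of the idea, assuming whitespace/case normalization and an illustrative prefix length (the actual normalization and prefix length are defined by the implementation):

```python
import hashlib

def content_hash(text: str, prefix_len: int = 16) -> str:
    """SHA-256 prefix of normalized content, used to detect exact duplicates.

    prefix_len and the normalization rule are assumptions for illustration.
    """
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:prefix_len]

# Whitespace/case variants collapse to the same hash, so a simple
# indexed equality lookup on content_hash catches exact duplicates:
a = content_hash("Grip force 12.5N   works best")
b = content_hash("grip force 12.5n works best")
print(a == b)  # True
```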
sessions
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment |
| external_id | TEXT UNIQUE | UUID session identifier |
| collection | TEXT | Associated collection |
| context | TEXT | Session context JSON (max 64KB) |
| session_count | INTEGER | Reuse counter |
| status | TEXT | active / ended |
| client_type | TEXT | "mcp_direct" |
session_outcomes
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment |
| session_id | TEXT | Session external_id |
| score | REAL | Episode success score (0.0–1.0) |
memory_tags
| Column | Type | Description |
|---|---|---|
| memory_id | INTEGER | FK to memories.id |
| tag | TEXT | Tag from controlled vocabulary |
| source | TEXT | "auto" or "user" |
PK: (memory_id, tag)
tag_meta
| Column | Type | Description |
|---|---|---|
| tag | TEXT PK | Tag identifier |
| parent | TEXT | Parent tag (NULL = root dimension) |
| display_name | TEXT | Human-readable name |
Virtual Tables
| Table | Engine | Purpose |
|---|---|---|
| memories_fts | FTS5 | Full-text search (content, human_summary, scope_files, scope_entities) |
| memories_vec | vec0 | Vector similarity search (float[384]) |
Indexes
```
idx_mem_collection   ON memories(collection)
idx_mem_status       ON memories(status)
idx_mem_session      ON memories(session_id)
idx_mem_type         ON memories(type)
idx_mem_hash         ON memories(content_hash) WHERE content_hash IS NOT NULL
idx_mem_no_embed     ON memories(collection) WHERE embedding IS NULL AND status='active'
idx_memory_tags_tag  ON memory_tags(tag)
```
Tag Taxonomy
The tag system uses a 9-dimension tree with 50+ tags:
```
metacognition  ← reasoning, cognitive_bias, decision_framework, ...
capability     ← build, debug, design, review, architecture, ...
domain         ← cs_fundamentals, ai_ml, finance, ...
technique      ← patterns, anti_patterns, recipes, ...
timing         ← when_to_start, when_to_stop, when_to_switch
boundary       ← tradeoff, constraint, not_applicable, ...
experience     ← war_story, postmortem, gotcha, root_cause, ...
self_defect    ← hallucination, sycophancy, overengineering, ...
reflection     ← accuracy_calibration, behavior_rule, preference, ...
```
Tags are auto-inferred by regex rules (auto_classify.py) and stored in memory_tags with source="auto".
Auto-Classification Pipeline
```
learn(insight="grip_force=12.5N works best because sensor was calibrated")
  │
  ├── classify_category()    → "root_cause" (matched "because")
  ├── estimate_confidence()  → 0.90 (file path +0.05, causal +0.05)
  ├── extract_scope()        → {scope_files: [], scope_entities: ["grip_force"]}
  ├── classify_tags()        → ["root_cause", "observation"]
  └── build_context_json()   → merge user context + source marker
```
All classifiers are pure regex — no LLM dependency, sub-millisecond execution.
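A minimal sketch of the regex approach. The pattern list and ordering here are illustrative inventions, not the actual rules in auto_classify.py; only the first-match-wins shape and the "because" → root_cause example come from the pipeline above.

```python
import re

# Illustrative category rules, checked in order; first match wins
CATEGORY_RULES = [
    ("root_cause", re.compile(r"\b(because|caused by|due to)\b", re.I)),
    ("gotcha",     re.compile(r"\b(gotcha|surprisingly|watch out)\b", re.I)),
    ("constraint", re.compile(r"\b(must not|never|limit)\b", re.I)),
]

def classify_category(text: str, default: str = "observation") -> str:
    """Pure-regex classification: no LLM call, runs in microseconds."""
    for category, pattern in CATEGORY_RULES:
        if pattern.search(text):
            return category
    return default

print(classify_category("grip_force=12.5N works best because sensor was calibrated"))
# root_cause
```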
Consolidation Algorithm
end_session consolidation merges redundant memories:
1. Query consolidatable memories:
- Same session + collection
- status = active
- category NOT IN (constraint, postmortem, gotcha) ← protected
- confidence < 0.95 ← high-confidence preserved
- perception_type IS NULL ← perceptions never consolidated
2. Skip if < 3 memories
3. Group by category
4. Within each group: pairwise Jaccard similarity
- > 0.50 threshold → greedy clustering
- Cluster constraint: ALL pairs within cluster must exceed threshold
5. Per cluster: select representative
- Priority: confidence DESC → access_count DESC → created_at DESC
6. Non-representatives → status = 'superseded', superseded_by = representative.id
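The Jaccard clustering in steps 4–6 can be sketched as follows. Function names and the token-set definition of Jaccard are illustrative assumptions; the all-pairs cluster constraint and the 0.50 threshold come from the algorithm above.

```python
def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity of two memory contents (assumed tokenization)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def greedy_clusters(texts, threshold=0.50):
    """Greedy clustering: a memory joins a cluster only if it exceeds the
    threshold against EVERY existing member (the all-pairs constraint)."""
    clusters = []
    for i, text in enumerate(texts):
        for cluster in clusters:
            if all(jaccard(text, texts[j]) > threshold for j in cluster):
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no compatible cluster: start a new one
    return clusters

mems = [
    "grip force 12.5 N works for ceramic cup",
    "grip force 12.5 N works for ceramic mug",
    "camera exposure must stay under 10 ms",
]
print(greedy_clusters(mems))  # [[0, 1], [2]]
```

Within each resulting cluster, one representative survives (confidence, then access_count, then recency) and the rest are marked superseded.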
Resilience Patterns
Three-Layer Defense
Every MCP tool follows a consistent defense pattern:
| Layer | Phase | Mechanism |
|---|---|---|
| L1 | Before | Pydantic validation, type checking, range enforcement |
| L2 | During | try-except per operation, graceful degradation, safe_db_transaction |
| L3 | After | Structured response, logging, access counter updates |
Safe DB Primitives
| Primitive | Use | Behavior on Failure |
|---|---|---|
| safe_db_write | Single SQL write | Returns None (lock timeout / disk full / corrupt) |
| safe_db_transaction | Multi-SQL atomic batch | Returns (False, None) and rolls back |
| mcp_error_boundary | MCP tool decorator | Catches all exceptions, returns {"error": "..."} |
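The mcp_error_boundary pattern is a standard catch-all decorator; a sketch (the `forget` body here is a made-up example, and the real decorator's log format and response shape may differ):

```python
import functools
import logging

def mcp_error_boundary(func):
    """Convert any exception from an MCP tool into a structured error dict,
    so a single failing tool call never crashes the server process."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            logging.exception("tool %s failed", func.__name__)
            return {"error": str(exc)}
    return wrapper

@mcp_error_boundary
def forget(memory_id: int):
    raise ValueError(f"memory {memory_id} not found")  # simulated failure

print(forget(42))  # {'error': 'memory 42 not found'}
```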
Service Cooldown
When Ollama/external embedding service fails:
- Exponential backoff: 60s → 120s → 240s → 300s (max)
- During cooldown: embedder.available = False, search degrades to BM25-only
- Success resets cooldown counter
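The backoff schedule above can be sketched as a small state machine (class and method names are illustrative; the 60s base, doubling, and 300s cap come from the list above):

```python
class Cooldown:
    """Exponential backoff with a cap: 60s -> 120s -> 240s -> 300s (max)."""

    def __init__(self, base: float = 60.0, cap: float = 300.0):
        self.base, self.cap = base, cap
        self.failures = 0

    def on_failure(self) -> float:
        """Record a failure; return how long to treat the embedder as down."""
        delay = min(self.base * (2 ** self.failures), self.cap)
        self.failures += 1
        return delay

    def on_success(self) -> None:
        self.failures = 0  # any success resets the counter

cd = Cooldown()
print([cd.on_failure() for _ in range(4)])  # [60.0, 120.0, 240.0, 300.0]
```

While the cooldown is active the server sets embedder.available = False and recall silently falls back to BM25-only ranking.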
Embedding Pipeline
```
           ┌─────────────────────┐
           │  create_embedder()  │ ← config.embed_backend
           └────┬───────────┬────┘
                │           │
      ┌─────────▼─┐     ┌───▼───────────┐
      │  "onnx"   │     │  "ollama"     │
      │           │     │               │
      │ FastEmbed │     │ OllamaEmbed   │
      │ ONNX CPU  │     │ HTTP API      │
      │ ~5ms/q    │     │ ~20-50ms/q    │
      │ 384d      │     │ 768d          │
      └─────┬─────┘     └──────┬────────┘
            │                  │
            │ Embedder Protocol│
            │ ├── embed_one()  │
            │ ├── embed_batch()│
            │ ├── available    │
            │ └── close()      │
            └────────┬─────────┘
                     │
             ┌───────▼──────┐
             │  float[dim]  │
             │  → vec0 BLOB │
             └──────────────┘
```
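The shared Embedder protocol can be written down with `typing.Protocol`; the member names come from the diagram above, but the exact signatures are assumptions, and `FakeEmbedder` is a toy stand-in (not FastEmbed or OllamaEmbed):

```python
from typing import Protocol, Sequence

class Embedder(Protocol):
    """Structural interface every backend must satisfy (signatures assumed)."""
    available: bool

    def embed_one(self, text: str) -> list[float]: ...
    def embed_batch(self, texts: Sequence[str]) -> list[list[float]]: ...
    def close(self) -> None: ...

class FakeEmbedder:
    """Toy implementation for testing: a 2-d dummy vector per text."""
    available = True

    def embed_one(self, text: str) -> list[float]:
        return [float(len(text)), 0.0]

    def embed_batch(self, texts: Sequence[str]) -> list[list[float]]:
        return [self.embed_one(t) for t in texts]

    def close(self) -> None:
        self.available = False

def make_embedder() -> Embedder:
    """Stand-in for create_embedder(): callers only see the protocol."""
    return FakeEmbedder()
```

Because the protocol is structural, search code depends only on embed_one/embed_batch/available/close and backends can be swapped via config without any inheritance.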
Web UI Architecture
```
python -m robotmem web
     │
┌────▼────────────────────────┐
│  Flask App Factory          │
│  create_app()               │
│                             │
│  ┌────────────────────────┐ │
│  │ api_bp (Blueprint)     │ │
│  │  /api/doctor           │ │
│  │  /api/stats            │ │
│  │  /api/memories         │ │
│  │  /api/search           │ │
│  │  /api/memory/<id>      │ │
│  │  /api/sessions         │ │
│  │  /api/collections      │ │
│  │  /api/categories       │ │
│  │  /api/recent-failures  │ │
│  │  /api/sessions/<id>/   │ │
│  │    memories            │ │
│  └────────────────────────┘ │
│                             │
│  GET / → index.html         │
│  CogDatabase (shared conn)  │
└─────────────────────────────┘
```
REST API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/doctor | Health check: FTS5/vec0 sync, zero-hit rate, DB size |
| GET | /api/stats | Total counts, type/category distribution, collections |
| GET | /api/memories | Paginated list with filters (collection, type, category, confidence, days) |
| GET | /api/search?q= | FTS5 full-text search across collections |
| GET | /api/memory/&lt;id&gt; | Single memory detail |
| DELETE | /api/memory/&lt;id&gt; | Soft-delete (with reason) |
| PUT | /api/memory/&lt;id&gt; | Update memory fields |
| GET | /api/sessions | Paginated session list with memory counts |
| GET | /api/collections | Collection list with counts |
| GET | /api/categories | Category list with counts |
| GET | /api/recent-failures | Recent postmortem/gotcha memories |
| GET | /api/sessions/&lt;id&gt;/memories | Memories within a session (timeline) |