Architecture
System Overview
```
┌─────────────────────────────────────────────────────────────┐
│                         MCP Client                          │
│               (Claude Code / Robot Controller)              │
└──────────────────────────┬──────────────────────────────────┘
                           │ MCP Protocol (stdio)
┌──────────────────────────▼──────────────────────────────────┐
│                      MCP Server Layer                       │
│                                                             │
│  learn  recall  save_perception  forget  update  session    │
│    │       │          │             │       │       │       │
│    └───────┴──────────┴─────────────┴───────┴───────┘       │
│                           │                                 │
│                   ┌───────▼──────┐                          │
│                   │  Validators  │  Pydantic L1             │
│                   └───────┬──────┘                          │
│                           │                                 │
│     ┌─────────────────────┼─────────────────────┐           │
│     │                     │                     │           │
│     ▼                     ▼                     ▼           │
│  auto_classify          dedup               search.py       │
│  (category/tags/      (exact →            (BM25 + Vec       │
│   confidence/scope)    jaccard →           → RRF merge)     │
│                        cosine)                              │
│     └─────────────────────┼─────────────────────┘           │
│                           │                                 │
│                   ┌───────▼──────┐                          │
│                   │  ops layer   │                          │
│                   │  memories.py │  insert/update/touch     │
│                   │  sessions.py │  create/end/summarize    │
│                   │  search.py   │  fts_search/vec_search   │
│                   │  tags.py     │  add/remove/normalize    │
│                   └───────┬──────┘                          │
│                           │                                 │
│         ┌─────────────────┼─────────────────┐               │
│         ▼                 ▼                 ▼               │
│    ┌─────────┐      ┌───────────┐      ┌──────────┐         │
│    │  FTS5   │      │ memories  │      │   vec0   │         │
│    │ (BM25)  │      │ (SQLite)  │      │ (vector) │         │
│    └─────────┘      └───────────┘      └──────────┘         │
│                           │                                 │
│                   ┌───────▼──────┐                          │
│                   │  memory.db   │  ~/.robotmem/            │
│                   └──────────────┘                          │
└─────────────────────────────────────────────────────────────┘
                           │
              ┌────────────┼────────────┐
              ▼            ▼            ▼
         FastEmbed       Ollama    OpenAI-compat
          (ONNX)         (HTTP)       (HTTP)
```
Search Pipeline
The recall search pipeline combines two ranking signals through Reciprocal Rank Fusion:
```
Query: "how to grasp a cup"
  │
  ├──→ BM25 (FTS5) ──→ ranked list A
  │      tokenize → jieba (CJK) → FTS5 MATCH
  │      ORDER BY bm25() score
  │
  └──→ Vector (vec0) ──→ ranked list B
         embed_one(query) → float[384]
         WHERE embedding MATCH blob AND k=N
         ORDER BY cosine distance
  │
  ├──→ RRF Merge (k=60)
  │      score(d) = Σ 1/(k + rank_i + 1) for each list
  │
  ├──→ Source Weighting
  │      real-world data × 1.5 boost
  │
  ├──→ Confidence Filter
  │      confidence >= min_confidence (default 0.3)
  │
  ├──→ context_filter (structured)
  │      dot-path matching on parsed context JSON
  │      operators: $lt, $lte, $gt, $gte, $ne, equality
  │
  ├──→ spatial_sort (nearest neighbor)
  │      Euclidean distance on coordinate arrays
  │      optional max_distance cutoff
  │
  ├──→ Top-K truncation
  │
  └──→ MaxScore normalization (best = 1.0)
```
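The context_filter stage can be illustrated with a minimal matcher. This is a sketch of the idea only: the operator names come from the pipeline above, but the function names (`get_path`, `matches`) and the exact parsing rules are illustrative, not the actual server code.

```python
import json

# Comparison operators supported by context_filter (per the pipeline above)
OPS = {
    "$lt":  lambda a, b: a < b,
    "$lte": lambda a, b: a <= b,
    "$gt":  lambda a, b: a > b,
    "$gte": lambda a, b: a >= b,
    "$ne":  lambda a, b: a != b,
}

def get_path(obj, dotted):
    """Resolve a dot-path like 'params.grip_force' in nested dicts."""
    for key in dotted.split("."):
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj

def matches(context_json, context_filter):
    """True if the memory's parsed context satisfies every filter clause."""
    ctx = json.loads(context_json or "{}")
    for path, cond in context_filter.items():
        value = get_path(ctx, path)
        if isinstance(cond, dict):  # operator clause, e.g. {"$lt": 15}
            if value is None or not all(OPS[op](value, ref) for op, ref in cond.items()):
                return False
        elif value != cond:         # bare value means equality
            return False
    return True

# Example: keep memories where grip force < 15 N on robot "arm_a"
f = {"params.grip_force": {"$lt": 15}, "robot.id": "arm_a"}
print(matches('{"params": {"grip_force": 12.5}, "robot": {"id": "arm_a"}}', f))  # True
```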
RRF Formula
```
score(document) = Σ 1 / (k + rank + 1)
```
Where k=60 (configurable via rrf_k). A higher k flattens per-rank score differences, so appearing in multiple lists counts for more than ranking highly in just one.
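The merge can be sketched in a few lines of Python (illustrative names; the real merge lives in search.py):

```python
def rrf_merge(ranked_lists, k=60):
    """Reciprocal Rank Fusion over several ranked lists of memory IDs."""
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked):
            # Each list contributes 1/(k + rank + 1) for every doc it contains
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = [101, 202, 303]   # ranked list A
vec_hits  = [202, 404, 101]   # ranked list B
print(rrf_merge([bm25_hits, vec_hits]))  # [202, 101, 404, 303]
```

Note how 202 wins overall despite topping only one list: it appears in both, and with k=60 presence in two lists outweighs a single first-place rank.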
Database Schema
memories (core table)
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment ID |
| session_id | TEXT | Linked session (external_id) |
| collection | TEXT | Logical namespace |
| type | TEXT | "fact" or "perception" |
| content | TEXT | Memory text (max 300 chars) |
| human_summary | TEXT | Short summary (max 200 chars) |
| context | TEXT | JSON context (params/spatial/robot/task) |
| perception_type | TEXT | visual/tactile/auditory/proprioceptive/procedural |
| perception_data | BLOB | Raw sensor data |
| perception_metadata | TEXT | Format/units metadata |
| category | TEXT | Auto-classified category |
| confidence | REAL | 0.0–1.0 (default 0.9) |
| decay_rate | REAL | Per-day decay (default 0.01) |
| status | TEXT | active / superseded / invalidated |
| superseded_by | INTEGER | ID of replacing memory |
| content_hash | TEXT | SHA-256 prefix for dedup |
| embedding | BLOB | Float vector (384d or 768d) |
| access_count | INTEGER | Recall hit counter |
| return_count | INTEGER | Times returned to user |
| last_accessed | TEXT | ISO timestamp of last recall hit |
| created_at | TEXT | ISO timestamp |
| updated_at | TEXT | ISO timestamp |
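The content_hash column backs the exact-match stage of dedup. A minimal sketch of the idea, assuming whitespace/case normalization and an illustrative prefix length (the actual normalization and prefix length are defined by the implementation):

```python
import hashlib

def content_hash(text: str, prefix_len: int = 16) -> str:
    """SHA-256 prefix of normalized content, used to detect exact duplicates.

    prefix_len and the normalization rule are assumptions for illustration.
    """
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:prefix_len]

# Whitespace/case variants collapse to the same hash, so a simple
# indexed equality lookup on content_hash catches exact duplicates:
a = content_hash("Grip force 12.5N   works best")
b = content_hash("grip force 12.5n works best")
print(a == b)  # True
```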
sessions
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment |
| external_id | TEXT UNIQUE | UUID session identifier |
| collection | TEXT | Associated collection |
| context | TEXT | Session context JSON (max 64KB) |
| session_count | INTEGER | Reuse counter |
| status | TEXT | active / ended |
| client_type | TEXT | "mcp_direct" |
session_outcomes
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment |
| session_id | TEXT | Session external_id |
| score | REAL | Episode success score (0.0–1.0) |
memory_tags
| Column | Type | Description |
|---|---|---|
| memory_id | INTEGER | FK to memories.id |
| tag | TEXT | Tag from controlled vocabulary |
| source | TEXT | "auto" or "user" |
PK: (memory_id, tag)
tag_meta
| Column | Type | Description |
|---|---|---|
| tag | TEXT PK | Tag identifier |
| parent | TEXT | Parent tag (NULL = root dimension) |
| display_name | TEXT | Human-readable name |
Virtual Tables
| Table | Engine | Purpose |
|---|---|---|
| memories_fts | FTS5 | Full-text search (content, human_summary, scope_files, scope_entities) |
| memories_vec | vec0 | Vector similarity search (float[384]) |
Indexes
```
idx_mem_collection   ON memories(collection)
idx_mem_status       ON memories(status)
idx_mem_session      ON memories(session_id)
idx_mem_type         ON memories(type)
idx_mem_hash         ON memories(content_hash) WHERE content_hash IS NOT NULL
idx_mem_no_embed     ON memories(collection) WHERE embedding IS NULL AND status='active'
idx_memory_tags_tag  ON memory_tags(tag)
```
Tag Taxonomy
The tag system uses a 9-dimension tree with 50+ tags:
```
metacognition  ← reasoning, cognitive_bias, decision_framework, ...
capability     ← build, debug, design, review, architecture, ...
domain         ← cs_fundamentals, ai_ml, finance, ...
technique      ← patterns, anti_patterns, recipes, ...
timing         ← when_to_start, when_to_stop, when_to_switch
boundary       ← tradeoff, constraint, not_applicable, ...
experience     ← war_story, postmortem, gotcha, root_cause, ...
self_defect    ← hallucination, sycophancy, overengineering, ...
reflection     ← accuracy_calibration, behavior_rule, preference, ...
```
Tags are auto-inferred by regex rules (auto_classify.py) and stored in memory_tags with source="auto".
Auto-Classification Pipeline
```
learn(insight="grip_force=12.5N works best because sensor was calibrated")
  │
  ├── classify_category()    → "root_cause" (matched "because")
  ├── estimate_confidence()  → 0.90 (file path +0.05, causal +0.05)
  ├── extract_scope()        → {scope_files: [], scope_entities: ["grip_force"]}
  ├── classify_tags()        → ["root_cause", "observation"]
  └── build_context_json()   → merge user context + source marker
```
All classifiers are pure regex — no LLM dependency, sub-millisecond execution.
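A minimal sketch of the regex approach. The pattern list and ordering here are illustrative inventions, not the actual rules in auto_classify.py; only the first-match-wins shape and the "because" → root_cause example come from the pipeline above.

```python
import re

# Illustrative category rules, checked in order; first match wins
CATEGORY_RULES = [
    ("root_cause", re.compile(r"\b(because|caused by|due to)\b", re.I)),
    ("gotcha",     re.compile(r"\b(gotcha|surprisingly|watch out)\b", re.I)),
    ("constraint", re.compile(r"\b(must not|never|limit)\b", re.I)),
]

def classify_category(text: str, default: str = "observation") -> str:
    """Pure-regex classification: no LLM call, runs in microseconds."""
    for category, pattern in CATEGORY_RULES:
        if pattern.search(text):
            return category
    return default

print(classify_category("grip_force=12.5N works best because sensor was calibrated"))
# root_cause
```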
Consolidation Algorithm
end_session consolidation merges redundant memories:
1. Query consolidatable memories:
- Same session + collection
- status = active
- category NOT IN (constraint, postmortem, gotcha) ← protected
- confidence < 0.95 ← high-confidence preserved
- perception_type IS NULL ← perceptions never consolidated
2. Skip if < 3 memories
3. Group by category
4. Within each group: pairwise Jaccard similarity
- > 0.50 threshold → greedy clustering
- Cluster constraint: ALL pairs within cluster must exceed threshold
5. Per cluster: select representative
- Priority: confidence DESC → access_count DESC → created_at DESC
6. Non-representatives → status = 'superseded', superseded_by = representative.id
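The Jaccard clustering in steps 4–6 can be sketched as follows. Function names and the token-set definition of Jaccard are illustrative assumptions; the all-pairs cluster constraint and the 0.50 threshold come from the algorithm above.

```python
def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity of two memory contents (assumed tokenization)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def greedy_clusters(texts, threshold=0.50):
    """Greedy clustering: a memory joins a cluster only if it exceeds the
    threshold against EVERY existing member (the all-pairs constraint)."""
    clusters = []
    for i, text in enumerate(texts):
        for cluster in clusters:
            if all(jaccard(text, texts[j]) > threshold for j in cluster):
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no compatible cluster: start a new one
    return clusters

mems = [
    "grip force 12.5 N works for ceramic cup",
    "grip force 12.5 N works for ceramic mug",
    "camera exposure must stay under 10 ms",
]
print(greedy_clusters(mems))  # [[0, 1], [2]]
```

Within each resulting cluster, one representative survives (confidence, then access_count, then recency) and the rest are marked superseded.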
Resilience Patterns
Three-Layer Defense
Every MCP tool follows a consistent defense pattern:
| Layer | Phase | Mechanism |
|---|---|---|
| L1 | Before | Pydantic validation, type checking, range enforcement |
| L2 | During | try-except per operation, graceful degradation, safe_db_transaction |
| L3 | After | Structured response, logging, access counter updates |
Safe DB Primitives
| Primitive | Use | Behavior on Failure |
|---|---|---|
| safe_db_write | Single SQL write | Returns None (lock timeout / disk full / corrupt) |
| safe_db_transaction | Multi-SQL atomic batch | Returns (False, None) and rolls back |
| mcp_error_boundary | MCP tool decorator | Catches all exceptions, returns {"error": "..."} |
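The mcp_error_boundary pattern is a standard catch-all decorator; a sketch (the `forget` body here is a made-up example, and the real decorator's log format and response shape may differ):

```python
import functools
import logging

def mcp_error_boundary(func):
    """Convert any exception from an MCP tool into a structured error dict,
    so a single failing tool call never crashes the server process."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            logging.exception("tool %s failed", func.__name__)
            return {"error": str(exc)}
    return wrapper

@mcp_error_boundary
def forget(memory_id: int):
    raise ValueError(f"memory {memory_id} not found")  # simulated failure

print(forget(42))  # {'error': 'memory 42 not found'}
```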
Service Cooldown
When Ollama/external embedding service fails:
- Exponential backoff: 60s → 120s → 240s → 300s (max)
- During cooldown: embedder.available = False, search degrades to BM25-only
- Success resets cooldown counter
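The backoff schedule above can be sketched as a small state machine (class and method names are illustrative; the 60s base, doubling, and 300s cap come from the list above):

```python
class Cooldown:
    """Exponential backoff with a cap: 60s -> 120s -> 240s -> 300s (max)."""

    def __init__(self, base: float = 60.0, cap: float = 300.0):
        self.base, self.cap = base, cap
        self.failures = 0

    def on_failure(self) -> float:
        """Record a failure; return how long to treat the embedder as down."""
        delay = min(self.base * (2 ** self.failures), self.cap)
        self.failures += 1
        return delay

    def on_success(self) -> None:
        self.failures = 0  # any success resets the counter

cd = Cooldown()
print([cd.on_failure() for _ in range(4)])  # [60.0, 120.0, 240.0, 300.0]
```

While the cooldown is active the server sets embedder.available = False and recall silently falls back to BM25-only ranking.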
Embedding Pipeline
```
           ┌─────────────────────┐
           │  create_embedder()  │ ← config.embed_backend
           └────┬───────────┬────┘
                │           │
      ┌─────────▼─┐     ┌───▼───────────┐
      │  "onnx"   │     │  "ollama"     │
      │           │     │               │
      │ FastEmbed │     │ OllamaEmbed   │
      │ ONNX CPU  │     │ HTTP API      │
      │ ~5ms/q    │     │ ~20-50ms/q    │
      │ 384d      │     │ 768d          │
      └─────┬─────┘     └──────┬────────┘
            │                  │
            │ Embedder Protocol│
            │ ├── embed_one()  │
            │ ├── embed_batch()│
            │ ├── available    │
            │ └── close()      │
            └────────┬─────────┘
                     │
             ┌───────▼──────┐
             │  float[dim]  │
             │  → vec0 BLOB │
             └──────────────┘
```
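The shared Embedder protocol can be written down with `typing.Protocol`; the member names come from the diagram above, but the exact signatures are assumptions, and `FakeEmbedder` is a toy stand-in (not FastEmbed or OllamaEmbed):

```python
from typing import Protocol, Sequence

class Embedder(Protocol):
    """Structural interface every backend must satisfy (signatures assumed)."""
    available: bool

    def embed_one(self, text: str) -> list[float]: ...
    def embed_batch(self, texts: Sequence[str]) -> list[list[float]]: ...
    def close(self) -> None: ...

class FakeEmbedder:
    """Toy implementation for testing: a 2-d dummy vector per text."""
    available = True

    def embed_one(self, text: str) -> list[float]:
        return [float(len(text)), 0.0]

    def embed_batch(self, texts: Sequence[str]) -> list[list[float]]:
        return [self.embed_one(t) for t in texts]

    def close(self) -> None:
        self.available = False

def make_embedder() -> Embedder:
    """Stand-in for create_embedder(): callers only see the protocol."""
    return FakeEmbedder()
```

Because the protocol is structural, search code depends only on embed_one/embed_batch/available/close and backends can be swapped via config without any inheritance.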
Web UI Architecture
```
python -m robotmem web
     │
┌────▼────────────────────────┐
│  Flask App Factory          │
│  create_app()               │
│                             │
│  ┌────────────────────────┐ │
│  │ api_bp (Blueprint)     │ │
│  │  /api/doctor           │ │
│  │  /api/stats            │ │
│  │  /api/memories         │ │
│  │  /api/search           │ │
│  │  /api/memory/<id>      │ │
│  │  /api/sessions         │ │
│  │  /api/collections      │ │
│  │  /api/categories       │ │
│  │  /api/recent-failures  │ │
│  │  /api/sessions/<id>/   │ │
│  │    memories            │ │
│  └────────────────────────┘ │
│                             │
│  GET / → index.html         │
│  CogDatabase (shared conn)  │
└─────────────────────────────┘
```
REST API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/doctor | Health check: FTS5/vec0 sync, zero-hit rate, DB size |
| GET | /api/stats | Total counts, type/category distribution, collections |
| GET | /api/memories | Paginated list with filters (collection, type, category, confidence, days) |
| GET | /api/search?q= | FTS5 full-text search across collections |
| GET | /api/memory/&lt;id&gt; | Single memory detail |
| DELETE | /api/memory/&lt;id&gt; | Soft-delete (with reason) |
| PUT | /api/memory/&lt;id&gt; | Update memory fields |
| GET | /api/sessions | Paginated session list with memory counts |
| GET | /api/collections | Collection list with counts |
| GET | /api/categories | Category list with counts |
| GET | /api/recent-failures | Recent postmortem/gotcha memories |
| GET | /api/sessions/&lt;id&gt;/memories | Memories within a session (timeline) |