Deep Dive: Search Pipeline

Search Architecture

robotmem's recall tool uses a hybrid search pipeline combining BM25 full-text search with vector similarity search, merged through Reciprocal Rank Fusion (RRF).

Pipeline Overview

recall(query="how to grasp a cup", n=5, min_confidence=0.3)
         │
         ├── L1: Input Validation
         │   query non-empty, top_k clamped to [1, 100]
         │   fetch_mul = 4× if filters active, else 2×
         │
    ┌────┴────────────────────────────┐
    │                                 │
    ▼                                 ▼
 BM25 Search                    Vector Search
 (FTS5 engine)                  (vec0 engine)
    │                                 │
    ├── tokenize_for_fts5()          ├── embedder.embed_one(query)
    │   CJK → jieba segmentation    │   → float[384] or float[768]
    │   Remove FTS5 operators        │
    │   Filter 1-char non-CJK       ├── vec_search_memories()
    │                                │   WHERE embedding MATCH blob
    ├── fts_search_memories()        │   AND k = fetch_limit
    │   JOIN memories ON rowid       │   ORDER BY cosine distance
    │   WHERE collection + active    │
    │   ORDER BY bm25() score        │
    │                                │
    ▼                                ▼
 ranked list A               ranked list B
    │                                │
    └────────────┬───────────────────┘
                 │
          ┌──────▼───────┐
          │  RRF Merge   │
          │  k=60        │
          └──────┬───────┘
                 │
          ┌──────▼───────────────┐
           │  Source Weighting    │
          │  real × 1.5          │
          └──────┬───────────────┘
                 │
          ┌──────▼───────────────┐
          │  Confidence Filter   │
          │  >= min_confidence   │
          └──────┬───────────────┘
                 │
          ┌──────▼───────────────┐
          │  Session Filter      │
          │  session_id match    │
          └──────┬───────────────┘
                 │
          ┌──────▼───────────────┐
          │  Context Extraction  │
          │  params/spatial/     │
          │  robot/task          │
          └──────┬───────────────┘
                 │
          ┌──────▼───────────────┐
          │  context_filter      │
          │  (structured)        │
          └──────┬───────────────┘
                 │
          ┌──────▼───────────────┐
          │  spatial_sort        │
          │  (nearest neighbor)  │
          └──────┬───────────────┘
                 │
          ┌──────▼───────────────┐
          │  Top-K Truncation    │
          └──────┬───────────────┘
                 │
          ┌──────▼───────────────┐
          │  MaxScore Normalize  │
          │  best = 1.0          │
          └──────┬───────────────┘
                 │
          ┌──────▼───────────────┐
          │  Touch (L3)          │
          │  access_count++      │
          └──────────────────────┘

BM25 Search (FTS5)

BM25 is a term-frequency ranking algorithm built into SQLite's FTS5 extension. robotmem uses it for keyword-level matching.

Tokenization

Before querying FTS5, the query text goes through tokenize_for_fts5():

  1. CJK Detection: If the text contains CJK characters (\u4e00-\u9fff), jieba segmentation is applied
  2. Syntax Cleaning: Remove FTS5 operators (AND, OR, NOT, NEAR) and special characters
  3. Token Filtering: Single-character non-CJK tokens are removed (too noisy)
  4. Query Construction: Remaining tokens are joined with OR and quoted: "token1" OR "token2"
# Example: "how to grasp a cup"
# → tokens: ["how", "to", "grasp", "cup"]
# → FTS5: '"how" OR "to" OR "grasp" OR "cup"'

# Example: "如何抓取杯子"
# → jieba: ["如何", "抓取", "杯子"]
# → FTS5: '"如何" OR "抓取" OR "杯子"'
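
A minimal sketch of the English branch of these steps (the jieba/CJK path is omitted, and the cleaning regex and function name here are illustrative, not robotmem's exact implementation):

```python
import re

FTS5_OPERATORS = {"AND", "OR", "NOT", "NEAR"}  # FTS5 query operators (step 2)
CJK_RE = re.compile(r"[\u4e00-\u9fff]")

def tokenize_for_fts5_sketch(text: str) -> str:
    """Sketch of steps 2-4 above for non-CJK input."""
    # Strip FTS5 special characters so user input can't inject query syntax
    cleaned = re.sub(r'["^*():{}]', " ", text)
    tokens = []
    for tok in cleaned.split():
        if tok in FTS5_OPERATORS:
            continue  # step 2: remove FTS5 operators
        if len(tok) == 1 and not CJK_RE.search(tok):
            continue  # step 3: drop 1-char non-CJK tokens ("a", "I", ...)
        tokens.append(tok)
    # step 4: quote each token and join with OR
    return " OR ".join(f'"{t}"' for t in tokens)

print(tokenize_for_fts5_sketch("how to grasp a cup"))
# "how" OR "to" OR "grasp" OR "cup"
```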

FTS5 Query

SELECT m.id, m.content, ..., bm25(memories_fts) as bm25_score
FROM memories_fts
JOIN memories m ON m.id = memories_fts.rowid
WHERE memories_fts MATCH ?
  AND m.collection = ?
  AND m.status = 'active'
ORDER BY bm25(memories_fts)
LIMIT ?

The bm25() function returns negative scores (lower = more relevant), and results are ordered by this score.
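
This query shape can be exercised standalone with Python's built-in sqlite3 module (which bundles FTS5 in default CPython builds); the table and rows below are illustrative, not robotmem's actual schema:

```python
import sqlite3

# Minimal demo of bm25() ordering: scores are negative, best match first.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE demo_fts USING fts5(content)")
conn.executemany(
    "INSERT INTO demo_fts(content) VALUES (?)",
    [("grasp the cup by the handle",), ("navigate to the charging dock",)],
)
rows = conn.execute(
    "SELECT content, bm25(demo_fts) FROM demo_fts "
    "WHERE demo_fts MATCH ? ORDER BY bm25(demo_fts)",
    ('"grasp" OR "cup"',),
).fetchall()
for content, score in rows:
    print(round(score, 3), content)  # only the matching row comes back
```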

FTS5 Indexed Fields

Field           Source                           Purpose
content         Memory text (jieba tokenized)    Primary search target
human_summary   Short summary (jieba tokenized)  Additional search surface
scope_files     File paths JSON                  Code file matching
scope_entities  Entity names JSON                Symbol matching

Vector Search (vec0)

Vector search uses sqlite-vec's vec0 virtual table for K-nearest-neighbor (KNN) search based on embedding similarity.

Embedding Flow

query text
    │
    ▼
embedder.embed_one(query)
    │
    ├── ONNX: FastEmbed local inference (~5ms)
    │   → float[384] (BAAI/bge-small-en-v1.5)
    │
    ├── Ollama: HTTP API call (~20-50ms)
    │   → float[768] (nomic-embed-text)
    │
    ▼
floats_to_blob()
    │
    ▼
struct.pack('f' * dim, *embedding)
    → bytes (raw float32 array)
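
The packing step is a direct wrapper around the struct.pack call shown in the diagram:

```python
import struct

def floats_to_blob(embedding: list[float]) -> bytes:
    """Pack an embedding into a raw little-endian float32 byte array."""
    return struct.pack("f" * len(embedding), *embedding)

blob = floats_to_blob([0.0, 1.0, -1.0])
print(len(blob))  # 12: three float32 values, 4 bytes each
```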

vec0 Query

SELECT m.id, m.content, ..., v.distance
FROM memories_vec v
JOIN memories m ON m.id = v.rowid
WHERE v.embedding MATCH ?    -- blob of float32 array
  AND m.collection = ?
  AND m.status = 'active'
  AND k = ?                  -- KNN parameter

The distance field represents cosine distance (0 = identical, 2 = opposite).

RRF Merge

Reciprocal Rank Fusion combines multiple ranked lists into a single ranking without requiring score normalization.

Formula

score(document) = Σ 1 / (k + rank_i + 1)

Where:

  - k = 60 (configurable via rrf_k in config)
  - rank_i = 0-based position in each list
  - the sum runs over all lists that contain the document

Example

Document  BM25 Rank  Vec Rank  RRF Score
doc_A     0          2         1/61 + 1/63 = 0.032
doc_B     1          0         1/62 + 1/61 = 0.033
doc_C     2          -         1/63        = 0.016
doc_D     -          1         1/62        = 0.016

doc_B ranks highest because it appears in both lists with good positions.
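
Applying the formula above to the example's two ranked lists can be sketched as:

```python
def rrf_merge(lists, k=60):
    """Reciprocal Rank Fusion over several ranked lists of document ids."""
    scores = {}
    for ranked in lists:
        for rank, doc_id in enumerate(ranked):  # rank is 0-based
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Highest fused score first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

bm25 = ["doc_A", "doc_B", "doc_C"]
vec  = ["doc_B", "doc_D", "doc_A"]
for doc_id, score in rrf_merge([bm25, vec]):
    print(f"{doc_id}: {score:.3f}")  # doc_B first, as in the table
```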

Why RRF?

BM25 scores (negative, unbounded) and cosine distances (0 to 2) live on incomparable scales, so averaging raw scores would require careful normalization. Fusing by rank position sidesteps this entirely, and the k=60 constant keeps any single list's top ranks from dominating the merged ordering.

Search Modes

The pipeline automatically selects the best mode based on available results:

Mode       Condition                                  Behavior
hybrid     Both BM25 and Vec return results           RRF merge of both lists
bm25_only  Embedder unavailable or Vec returns empty  BM25 results with synthetic RRF scores
vec_only   BM25 returns empty (rare)                  Vec results with synthetic RRF scores

When running in degraded mode (bm25_only), synthetic RRF scores are assigned:

score = 1.0 / (60 + rank + 1)  # Same formula, single list
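
A sketch of the selection and fallback logic, with assumed helper names (the real code may structure this differently):

```python
def synthetic_scores(hits, k=60):
    """Degraded mode: same RRF formula applied to a single ranked list."""
    return [(doc, 1.0 / (k + rank + 1)) for rank, doc in enumerate(hits)]

def select_mode(bm25_hits, vec_hits):
    """Pick the search mode per the table above (both-empty case omitted)."""
    if bm25_hits and vec_hits:
        return "hybrid"
    return "bm25_only" if bm25_hits else "vec_only"

print(select_mode(["m1"], ["m2"]))  # hybrid
print(select_mode(["m1"], []))      # bm25_only
```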

Source Weighting

Memories from real-world data receive a 1.5x boost over simulation data:

# Detection: context.env.sim_or_real == "real"
if parsed.get("env", {}).get("sim_or_real") == "real":
    m["_rrf_score"] = m["_rrf_score"] * 1.5

After weighting, results are re-sorted by _rrf_score descending.

Post-Merge Filters

Confidence Filter

if m.get("confidence", 0) < min_confidence:
    continue  # Skip low-confidence memories

Default min_confidence = 0.3. Memories that have decayed below this threshold are filtered out.

Session Filter

if session_id and m.get("session_id") != session_id:
    continue  # Only return memories from this episode

Used for episode replay: recall(query="*", session_id="abc-123").

Context Filter (Structured)

Dot-path matching on parsed context JSON:

# Equality
{"task.success": True}

# Range operators
{"params.force.value": {"$lt": 15.0}}
{"params.force.value": {"$gte": 10.0, "$lte": 20.0}}

# Combined (AND logic)
{"task.success": True, "robot.type": "UR5e"}

Implementation uses _resolve_dotpath() to traverse nested dicts:

# "task.success" → d["task"]["success"]
def _resolve_dotpath(d, path):
    current = d
    for key in path.split("."):
        if key not in current:
            return _MISSING
        current = current[key]
    return current

Supported operators:

Operator Meaning
(none) Exact equality
$lt Less than
$lte Less than or equal
$gt Greater than
$gte Greater than or equal
$ne Not equal

Type mismatches return False (no crash).
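
A self-contained sketch combining the resolver with the operator table (matches_context and _OPS are assumed names; the resolver is repeated so the snippet runs standalone):

```python
_MISSING = object()  # sentinel: key absent (distinct from a stored None)

_OPS = {
    "$lt":  lambda a, b: a < b,
    "$lte": lambda a, b: a <= b,
    "$gt":  lambda a, b: a > b,
    "$gte": lambda a, b: a >= b,
    "$ne":  lambda a, b: a != b,
}

def _resolve_dotpath(d, path):
    current = d
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current

def matches_context(context, context_filter):
    """AND over all filter entries; operator dicts like {"$lt": 15.0} supported."""
    for path, expected in context_filter.items():
        actual = _resolve_dotpath(context, path)
        if actual is _MISSING:
            return False
        if isinstance(expected, dict):
            for op, bound in expected.items():
                try:
                    if op not in _OPS or not _OPS[op](actual, bound):
                        return False
                except TypeError:  # type mismatch: no match, no crash
                    return False
        elif actual != expected:
            return False
    return True

ctx = {"task": {"success": True}, "params": {"force": {"value": 12.0}}}
print(matches_context(ctx, {"task.success": True,
                            "params.force.value": {"$gte": 10.0, "$lte": 20.0}}))
# True
```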

Spatial Sort (Nearest Neighbor)

Euclidean distance sorting on coordinate arrays:

spatial_sort = {
    "field": "spatial.object_position",
    "target": [1.3, 0.7, 0.42],
    "max_distance": 0.1  # optional cutoff
}

Distance formula:

distance = sqrt(Σ (actual_i - target_i)²)
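
A sketch of the sort under the parameters above (the function and helper names are assumed, not robotmem's actual implementation):

```python
import math

def spatial_sort_sketch(memories, field, target, max_distance=None):
    """Sort memories by Euclidean distance from a dot-path coordinate field
    to `target`; memories without usable coordinates are skipped."""
    def get_coords(memory):
        current = memory
        for key in field.split("."):
            if not isinstance(current, dict) or key not in current:
                return None
            current = current[key]
        return current if isinstance(current, list) else None

    scored = []
    for memory in memories:
        coords = get_coords(memory)
        if coords is None or len(coords) != len(target):
            continue  # missing or mismatched-dimension coordinates
        dist = math.sqrt(sum((a - t) ** 2 for a, t in zip(coords, target)))
        if max_distance is None or dist <= max_distance:
            scored.append((dist, memory))
    scored.sort(key=lambda pair: pair[0])  # nearest first
    return [memory for _, memory in scored]

mems = [
    {"id": 1, "spatial": {"object_position": [1.3, 0.7, 0.42]}},
    {"id": 2, "spatial": {"object_position": [2.0, 0.0, 0.50]}},
]
nearest = spatial_sort_sketch(mems, "spatial.object_position", [1.3, 0.7, 0.42])
print([m["id"] for m in nearest])  # [1, 2]
```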

MaxScore Normalization

After all filters, the top result's _rrf_score is set to 1.0, and all other scores are normalized relative to it:

if filtered:  # guard: every candidate may have been filtered out
    max_score = filtered[0]["_rrf_score"]
    for m in filtered:
        m["_rrf_score"] = m["_rrf_score"] / max_score

This ensures the best result always has a score of 1.0, making scores comparable across different queries.

Access Counting (L3)

After returning results, the pipeline updates access statistics for all returned memories:

batch_touch_memories(db.conn, hit_ids)
# Each memory: access_count++, return_count++, last_accessed = now

This serves two purposes:

  1. Time decay protection: Frequently recalled memories maintain higher confidence
  2. Analytics: Track which memories are most useful
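
The touch update can be sketched against an illustrative schema (the column names follow the description above; the real table has more columns):

```python
import sqlite3
import time

def batch_touch_memories(conn, hit_ids):
    """Increment access counters and stamp last_accessed for returned hits."""
    now = time.time()
    conn.executemany(
        "UPDATE memories SET access_count = access_count + 1, "
        "return_count = return_count + 1, last_accessed = ? WHERE id = ?",
        [(now, mid) for mid in hit_ids],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (id TEXT PRIMARY KEY, "
    "access_count INTEGER DEFAULT 0, return_count INTEGER DEFAULT 0, "
    "last_accessed REAL)"
)
conn.execute("INSERT INTO memories (id) VALUES ('m1'), ('m2')")
batch_touch_memories(conn, ["m1"])
print(conn.execute("SELECT access_count FROM memories WHERE id='m1'").fetchone())
# (1,)
```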

Fetch Multiplier

When structured filters or spatial sorting are active, the pipeline fetches more candidates to compensate for filter losses:

Scenario                               Fetch Limit
No filters                             top_k × 2
context_filter or spatial_sort active  top_k × 4

This ensures enough candidates survive filtering to fill the requested top_k results.
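
Combined with the L1 clamping from the pipeline overview, the rule reduces to a few lines (a sketch with assumed parameter names):

```python
def fetch_limit(top_k, context_filter=None, spatial_sort=None, clamp=(1, 100)):
    """Over-fetch multiplier: 4x when lossy filters are active, else 2x."""
    lo, hi = clamp
    top_k = max(lo, min(hi, top_k))  # L1: clamp top_k to [1, 100]
    mul = 4 if (context_filter or spatial_sort) else 2
    return top_k * mul

print(fetch_limit(5))                                         # 10
print(fetch_limit(5, context_filter={"task.success": True}))  # 20
```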

Three-Layer Defense

The search pipeline follows robotmem's consistent defense pattern:

Layer  Phase   Mechanism
L1     Before  Query non-empty check, top_k clamping, FTS5 syntax cleaning
L2     During  try-except around BM25 and Vec searches independently
L3     After   batch_touch access counters, logging, structured response

Because each search engine is wrapped independently, a failure in one does not crash the other: the pipeline degrades gracefully to bm25_only or vec_only mode.