# 5 Senses, One Database
Visual. Tactile. Auditory. Proprioceptive. Procedural.
Five types of perception. One API. One database.
Most robot systems store different perception types in different places — images in one folder, force readings in a CSV, joint angles in ROS bags, action sequences in JSON logs. When you need to recall "what happened when I grasped that cup," you're querying five different systems.
We put them all in one table.
## The Problem: Fragmented Perception
A typical robot manipulation pipeline generates:
- Visual — camera images, bounding boxes, object detection results
- Tactile — force/torque sensor readings, contact events
- Auditory — motor sounds, collision noises, environmental audio
- Proprioceptive — joint angles, end-effector pose, velocity
- Procedural — action sequences, trajectories, skill parameters
Each modality typically gets its own storage system, its own query interface, and its own retrieval logic. Cross-modal queries — "find the force reading from the same grasp where I saw the red cup" — require manual joins across systems.
This is fragmentation. And it gets worse as you add sensors.
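To make the pain concrete, here is a toy sketch of the manual cross-system join described above. The timestamps, field names, and log layouts are invented for illustration; real pipelines vary (ROS bags, CSVs, JSON logs, image folders), but the matching logic is always yours to write.

```python
import json

# Force readings, as they might come out of a CSV log: (timestamp_s, force_N)
force_log = [(11.8, 0.4), (12.1, 12.5), (12.9, 0.2)]

# Object detections, as they might come out of a separate JSON log
detections = json.loads('[{"t": 12.0, "label": "red cup", "bbox": [120, 200, 50, 50]}]')

# Manual join: for each detection, find the force reading closest in time
for det in detections:
    nearest = min(force_log, key=lambda row: abs(row[0] - det["t"]))
    print(f'{det["label"]} at t={det["t"]}: force {nearest[1]}N (t={nearest[0]})')
    # prints: red cup at t=12.0: force 12.5N (t=12.1)
```

Every new sensor adds another store to parse and another join to maintain.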
## The Solution: One API for All Senses
In robotmem, every perception goes through the same function:
```python
from robotmem import save_perception

# Visual — saw a red cup
save_perception("saw red cup at [120, 200]",
                perception_type="visual",
                data='{"bbox": [120, 200, 50, 50], "confidence": 0.94}')

# Tactile — felt contact force
save_perception("felt 12.5N contact force",
                perception_type="tactile",
                data='{"force_N": 12.5, "contact_area_mm2": 45}')

# Auditory — heard a click
save_perception("heard click during insertion",
                perception_type="auditory",
                data='{"frequency_hz": 440, "duration_ms": 12}')

# Proprioceptive — arm position
save_perception("arm at joint angles [0.1, 0.8, -0.3, 1.2, 0.0, -0.5, 0.2]",
                perception_type="proprioceptive",
                data='{"joint_angles": [0.1, 0.8, -0.3, 1.2, 0.0, -0.5, 0.2]}')

# Procedural — action sequence
save_perception("push then lift: approach → push → lift",
                perception_type="procedural",
                data='{"steps": ["approach", "push", "lift"], "duration_s": 3.2}')
```
All five go into the same `memories` table. Same schema. Same search index. Same API.
## Why One Table Works
The key design decision: separate the description from the data.
| Field | Purpose | Searchable |
|---|---|---|
| `description` | Human-readable text | BM25 + Vector |
| `perception_type` | Modality label | Filter |
| `data` | Raw sensor JSON | JSON path query |
| `context` | Task/spatial metadata | Context filter + Spatial sort |
The `description` is always text — searchable by keyword and by semantic similarity. The `data` field holds the raw perception as JSON, typed by `perception_type`. This means you can search across modalities semantically ("what happened during grasping?") and then access the type-specific data from the results.
No schema changes needed for new sensor types. Add a LiDAR? Just use `perception_type="lidar"`. The database doesn't care — it's a new label, not a new table.
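robotmem's storage layer isn't shown in this post, but the one-table design can be sketched with plain SQLite and its JSON functions. The schema and values below are invented for illustration — note how the "lidar" row needs no schema change, just a new label:

```python
import sqlite3

# One table for every modality: description is searchable text,
# perception_type is a plain label, data is a raw sensor JSON blob.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE memories (
    description     TEXT,
    perception_type TEXT,
    data            TEXT  -- JSON; shape depends on perception_type
)""")

rows = [
    ("saw red cup at [120, 200]", "visual",
     '{"bbox": [120, 200, 50, 50], "confidence": 0.94}'),
    ("felt 12.5N contact force", "tactile",
     '{"force_N": 12.5, "contact_area_mm2": 45}'),
    # A new sensor is just a new label — no ALTER TABLE, no new table:
    ("lidar sweep, nearest obstacle 0.8m", "lidar",
     '{"min_range_m": 0.8, "points": 1024}'),
]
db.executemany("INSERT INTO memories VALUES (?, ?, ?)", rows)

# Type-specific data stays queryable via a JSON path
force = db.execute(
    "SELECT json_extract(data, '$.force_N') FROM memories"
    " WHERE perception_type = 'tactile'").fetchone()[0]
print(force)  # 12.5
```

The separation does the work: generic fields (`description`, `perception_type`) are shared across all rows, while modality-specific detail lives inside `data`.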
## Cross-Modal Recall
Because all perceptions share the same search index, you can query across modalities naturally:
```python
from robotmem import recall

# Find all perceptions related to "grasp" — visual, tactile, procedural
result = recall("grasp red cup")
for m in result["memories"]:
    print(f"[{m['perception_type']}] {m['content']}")

# Output:
# [visual] saw red cup at [120, 200]
# [tactile] felt 12.5N contact force
# [procedural] push then lift: approach → push → lift
```
One query returns the visual detection, the force reading, and the action sequence — all from the same grasp event. No joins. No cross-system queries.
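A toy stand-in shows why this works: because every record shares one text field, a single search over descriptions returns hits from every modality. Naive substring matching here replaces the BM25 + vector ranking the post describes, and the records are invented:

```python
# Unified store: every modality in one list of records
memories = [
    {"perception_type": "visual",     "content": "saw red cup at [120, 200]"},
    {"perception_type": "tactile",    "content": "felt 12.5N contact force during grasp"},
    {"perception_type": "auditory",   "content": "heard click during insertion"},
    {"perception_type": "procedural", "content": "grasp red cup: approach, close gripper, lift"},
]

def recall(query):
    # Keyword match over descriptions; robotmem is described as
    # ranking with BM25 + vector similarity instead.
    terms = query.lower().split()
    return [m for m in memories if any(t in m["content"].lower() for t in terms)]

for m in recall("grasp red cup"):
    print(f"[{m['perception_type']}] {m['content']}")
# [visual] saw red cup at [120, 200]
# [tactile] felt 12.5N contact force during grasp
# [procedural] grasp red cup: approach, close gripper, lift
```

The auditory record drops out only because its description doesn't match — the index itself never cared which modality a record came from.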
## Try It
```shell
pip install robotmem
python -c "
from robotmem import save_perception, recall
save_perception('felt 12.5N force', perception_type='tactile', data='{\"force\": 12.5}')
save_perception('saw red cup', perception_type='visual', data='{\"bbox\": [100,200,50,50]}')
result = recall('grasp')
for m in result['memories']:
    print(f\"[{m['perception_type']}] {m['content']}\")
"
```
## All Senses, One Memory
Visual, tactile, auditory, proprioceptive, procedural — unified.