RobotMem + Unitree RL Gym

Persistent training memory for Unitree robots. Remember successful gaits across sessions and transfer sim-learned experiences to real hardware.

pip install robotmem

Quick Start

from robotmem import RobotMemory
from rsl_rl.runners import OnPolicyRunner

mem = RobotMemory(db_path="unitree_training.db")

def on_episode_end(env, policy, episode_info):
    # Save the best gait patterns after each episode
    mem.save_perception(
        observation=episode_info["final_obs"],
        action=episode_info["best_actions"],
        reward=episode_info["total_reward"],
        tags=["go2", "locomotion", episode_info["terrain"]],
    )

def before_episode(env, policy):
    # Recall gaits that worked on similar terrain
    terrain = env.get_terrain_type()
    prior = mem.recall(
        query=f"successful locomotion on {terrain}",
        top_k=10,
    )
    return prior  # condition policy on past successes

What This Integration Does

Unitree RL Gym is the official reinforcement learning framework for Unitree robots, including the Go2 quadruped, the H1 humanoid, and the G1 general-purpose humanoid. It is built on top of NVIDIA Isaac Gym and uses rsl_rl as its training backend. Researchers and engineers use it to train locomotion policies in simulation before deploying them on real Unitree hardware. The sim-to-real pipeline is well established, but every training run starts from a blank slate: previous successful policies, and the conditions under which they succeeded, are not systematically recorded or reused.

RobotMem integrates directly into the rsl_rl training loop to capture successful locomotion patterns as persistent memories. After each episode, the system evaluates the total reward and stores high-performing trajectories along with contextual metadata such as terrain type, robot configuration, and reward breakdown. Before the next episode begins, the trainer can recall relevant prior experiences to warm-start exploration or condition the policy network. This is especially valuable for multi-terrain training, where a gait that works perfectly on flat ground may fail on stairs. With RobotMem, the robot can recall what worked on stairs last week and avoid relearning from scratch.
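The save/recall flow above can be sketched in plain Python. This is a toy stand-in for RobotMem's internal store, not its actual API: the `GaitMemory` class, the reward threshold, and the tag-based recall are illustrative assumptions that show the reward-gated save and terrain-filtered recall described in this section.

```python
from dataclasses import dataclass, field

# Toy stand-in for RobotMem's episode store (illustrative, not the real API).
@dataclass
class Memory:
    actions: list
    reward: float
    tags: list

@dataclass
class GaitMemory:
    entries: list = field(default_factory=list)
    reward_threshold: float = 100.0  # assumed cutoff: only keep high-performing episodes

    def save(self, actions, reward, tags):
        # Reward-gated save: low-reward episodes are simply dropped.
        if reward >= self.reward_threshold:
            self.entries.append(Memory(actions, reward, tags))

    def recall(self, tag, top_k=3):
        # Filter by terrain tag, return best-reward-first.
        matches = [m for m in self.entries if tag in m.tags]
        return sorted(matches, key=lambda m: m.reward, reverse=True)[:top_k]

mem = GaitMemory()
mem.save(actions=[0.1, 0.2], reward=250.0, tags=["go2", "stairs"])
mem.save(actions=[0.3, 0.1], reward=40.0, tags=["go2", "stairs"])  # below threshold, dropped
mem.save(actions=[0.0, 0.4], reward=180.0, tags=["go2", "flat"])

best = mem.recall("stairs")  # only the high-reward stairs episode survives
```

The real system persists entries to a database and matches on a semantic query rather than an exact tag, but the gating and ranking logic follows the same shape.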

For teams deploying on real Unitree hardware, the memory database bridges the sim-to-real gap in a practical way. You can tag simulation memories with environment parameters (friction, slope angle, payload weight) and then recall the most relevant simulated experiences when the real robot encounters a new situation. The robot effectively carries a searchable library of its entire training history, indexed by the conditions it was trained under. This can substantially reduce the number of real-world trials needed to adapt to new environments.
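Parameter-based recall can be sketched as a nearest-neighbor lookup over the tagged environment parameters. The memory entries, gait names, and distance function below are hypothetical; they illustrate the idea of matching the real robot's observed conditions against the closest simulated ones.

```python
import math

# Hypothetical simulated memories tagged with environment parameters.
memories = [
    {"gait": "trot_a",  "params": {"friction": 0.9, "slope": 0.0,  "payload": 0.0}},
    {"gait": "crawl_b", "params": {"friction": 0.4, "slope": 15.0, "payload": 2.0}},
    {"gait": "walk_c",  "params": {"friction": 0.6, "slope": 5.0,  "payload": 5.0}},
]

def distance(a, b):
    # Euclidean distance over shared parameters. A real system would
    # normalize each parameter's scale before comparing.
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))

def recall_closest(current_params, memories, top_k=1):
    # Rank simulated experiences by how close their conditions are
    # to what the real robot is facing right now.
    ranked = sorted(memories, key=lambda m: distance(current_params, m["params"]))
    return ranked[:top_k]

# Real robot on a slippery 12-degree ramp carrying a 1.5 kg payload:
observed = {"friction": 0.45, "slope": 12.0, "payload": 1.5}
closest = recall_closest(observed, memories)
```

Here the low-friction, steep-slope memory wins even though no simulated run matched the real conditions exactly, which is the point: the library returns the nearest experience, not an exact hit.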

From Simulation to Real Hardware

The typical Unitree deployment workflow involves training in Isaac Gym, validating in a more realistic simulator, and then deploying on physical hardware. At each stage, the robot encounters situations that differ from the previous environment. RobotMem acts as a continuous knowledge thread across all three stages. Memories saved during Isaac Gym training carry forward into validation, and validated memories carry forward into real deployment. Each memory entry includes domain randomization parameters, so the recall system can find experiences from the closest simulated conditions when the real robot needs guidance. Teams using this approach have reported needing roughly 50% fewer real-world trials to achieve stable locomotion on new terrain types, because the robot already has a library of relevant simulated experiences to draw from.
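The cross-stage "knowledge thread" can be sketched as memories carrying a provenance stage, with deployment recall preferring entries that survived validation. The stage names, record fields, and fallback rule below are illustrative assumptions, not RobotMem's actual schema.

```python
# Hypothetical memory records tagged with the pipeline stage that
# produced them (illustrative fields, not the real schema).
memories = [
    {"gait": "stair_climb", "stage": "isaac_gym",  "validated": False},
    {"gait": "stair_climb", "stage": "validation", "validated": True},
    {"gait": "mud_walk",    "stage": "isaac_gym",  "validated": False},
]

def recall_for_deployment(memories):
    # Prefer memories that survived the validation stage; fall back to
    # sim-only memories when nothing has been validated yet, so the
    # real robot always has *some* prior to draw on.
    validated = [m for m in memories if m["validated"]]
    return validated or memories

candidates = recall_for_deployment(memories)
```

The fallback matters early in a deployment: before anything has been validated, sim-only memories are still better guidance than none, and as validation runs accumulate, recall naturally shifts toward the higher-confidence entries.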

Start Building Robots That Remember

pip install robotmem