Module 10: Simulation to Real

Introduction

Training in simulation is typically faster, safer, and cheaper than training on physical hardware, but the "reality gap" between simulation and the real world can cause policies to fail once deployed. This module covers techniques for successful sim-to-real transfer.

Section 1: The Reality Gap

1.1 Sources of Discrepancy

Reality Gap: The difference between simulated and real-world dynamics and sensing that causes policies trained in simulation to underperform or fail on physical systems.

Sources include:

Inaccurate physics parameters
Unmodeled friction and contact
Sensor noise differences
Actuator delays and dynamics
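
Two of the sources above, actuator delay and sensor noise, can be imitated in simulation with a thin wrapper. The following is a minimal sketch and not part of any particular simulator: the class name DelayedNoisyChannel, the fixed-step delay, and the Gaussian noise model are all assumptions made for illustration.

```python
from collections import deque

import numpy as np


class DelayedNoisyChannel:
    """Delays commands by a fixed number of steps and adds Gaussian noise to readings."""

    def __init__(self, delay_steps, noise_std, seed=0):
        # Pre-fill the buffer with zero commands so the first outputs are defined.
        self.buffer = deque([0.0] * delay_steps)
        self.noise_std = noise_std
        self.rng = np.random.default_rng(seed)

    def actuate(self, command):
        """Return the command that actually reaches the actuator at this step."""
        self.buffer.append(command)
        return self.buffer.popleft()

    def sense(self, true_value):
        """Return a noisy reading of the true value."""
        return true_value + self.rng.normal(0.0, self.noise_std)


# Hypothetical usage: a two-step delay and no sensor noise.
channel = DelayedNoisyChannel(delay_steps=2, noise_std=0.0)
applied = [channel.actuate(c) for c in [1.0, 2.0, 3.0, 4.0]]
```

With a two-step delay the actuator sees each command two steps late, so a policy that assumes instant actuation in simulation may overshoot on hardware.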

1.2 Quantifying the Gap

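The gap can be quantified with system identification and validation experiments: apply the same commands to the real system and to the simulator, then compare the resulting trajectories.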
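
One simple way to put a number on the gap is the root-mean-square error between a logged real trajectory and the simulated trajectory for the same commands. This is a sketch under stated assumptions: the function name trajectory_rmse is hypothetical, both trajectories are assumed to be sampled at the same time steps, and the arrays below are synthetic stand-ins for logged data.

```python
import numpy as np


def trajectory_rmse(real, sim):
    """Per-dimension RMSE between two trajectories of shape (T, D)."""
    real = np.asarray(real, dtype=float)
    sim = np.asarray(sim, dtype=float)
    if real.shape != sim.shape:
        raise ValueError("trajectories must have the same shape")
    return np.sqrt(np.mean((real - sim) ** 2, axis=0))


# Synthetic stand-ins for a logged real trajectory and its simulated counterpart.
t = np.linspace(0.0, 1.0, 50)
real_traj = np.stack([np.sin(t), np.cos(t)], axis=1)
sim_traj = real_traj + np.array([0.1, 0.0])  # simulator is biased in dimension 0 only

gap = trajectory_rmse(real_traj, sim_traj)
```

Reporting the error per state dimension, instead of as one scalar, helps show which part of the model is responsible for the mismatch.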

Section 2: Domain Randomization

2.1 Randomizing Dynamics

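import numpy as np

def randomize_simulation(sim, nominal_mass):
    """Resample physical and visual properties of `sim` in place and return it."""
    # Randomize physical properties
    sim.mass = nominal_mass * np.random.uniform(0.8, 1.2)  # +/-20% around nominal
    sim.friction = np.random.uniform(0.3, 1.0)
    sim.motor_delay = np.random.uniform(0.01, 0.05)

    # Randomize visual properties (random_lighting and random_textures are
    # helpers assumed to be defined elsewhere in the simulation setup)
    sim.lighting = random_lighting()
    sim.textures = random_textures()

    return sim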

2.2 Curriculum Learning

Curriculum learning progressively increases the difficulty of the randomization. Too much randomization at the start can make learning impossible, so begin with narrow ranges and expand them gradually.
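
The idea of widening ranges over time can be sketched as follows. This is one possible schedule, not a prescribed one: the function name curriculum_range, the linear growth with training progress, and the nominal mass of 2.0 are assumptions for illustration. In practice, progress could instead be driven by the policy's success rate.

```python
import numpy as np


def curriculum_range(nominal, max_fraction, progress):
    """Return (low, high) sampling bounds that widen linearly with progress in [0, 1]."""
    progress = float(np.clip(progress, 0.0, 1.0))
    half_width = nominal * max_fraction * progress
    return nominal - half_width, nominal + half_width


rng = np.random.default_rng(0)
nominal_mass = 2.0   # hypothetical nominal value
total_steps = 10_000

sampled = []
for step in (0, 5_000, 10_000):
    low, high = curriculum_range(nominal_mass, max_fraction=0.2, progress=step / total_steps)
    # At step 0 the range collapses to the nominal value; by the end it spans +/-20%.
    sampled.append(rng.uniform(low, high))
```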

Section 3: System Identification

3.1 Parameter Estimation

Fitting simulation parameters to real data:

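$\boldsymbol{\theta}^* = \arg\min_{\boldsymbol{\theta}} \|\mathbf{y}_{real} - \mathbf{y}_{sim}(\boldsymbol{\theta})\|^2$

Here $\mathbf{y}_{real}$ denotes the measured real-world outputs, $\mathbf{y}_{sim}(\boldsymbol{\theta})$ the simulator's outputs under parameters $\boldsymbol{\theta}$, and $\boldsymbol{\theta}^*$ the best-fit parameters.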
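
When the model output is linear in the parameters, this objective reduces to ordinary least squares. The sketch below assumes a hypothetical one-dimensional model in which the applied force equals mass times acceleration plus damping times velocity, so that $\boldsymbol{\theta}$ = (mass, damping). The "real" data is generated synthetically here and the true values are made up. Models that are nonlinear in $\boldsymbol{\theta}$ would need an iterative optimizer instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "real" data; in practice these would be logged measurements.
true_mass, true_damping = 1.5, 0.4   # made-up values, unknown to the fit
accel = rng.normal(size=200)
vel = rng.normal(size=200)
force_real = true_mass * accel + true_damping * vel + 0.01 * rng.normal(size=200)

# y_sim(theta) = A @ theta, with theta = (mass, damping).
A = np.column_stack([accel, vel])

# Solve argmin_theta ||force_real - A @ theta||^2.
theta_hat, *_ = np.linalg.lstsq(A, force_real, rcond=None)
mass_hat, damping_hat = theta_hat
```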

3.2 Model Validation

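Validate the identified model by cross-validation with held-out trajectories: fit the parameters on some trajectories and measure the prediction error on trajectories that were not used for fitting.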
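
A leave-one-trajectory-out check might look like the sketch below. It reuses a hypothetical linear-in-parameters model fitted by least squares; the helper names fit and rmse, the dictionary layout of each trajectory, and the synthetic data are assumptions for illustration.

```python
import numpy as np


def fit(trajs):
    """Least-squares parameter fit over a list of trajectories."""
    A = np.vstack([t["A"] for t in trajs])
    y = np.concatenate([t["y"] for t in trajs])
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta


def rmse(theta, traj):
    """Prediction error of the fitted model on one trajectory."""
    return float(np.sqrt(np.mean((traj["A"] @ theta - traj["y"]) ** 2)))


# Five synthetic trajectories from the same (made-up) true parameters.
rng = np.random.default_rng(1)
true_theta = np.array([1.5, 0.4])
trajectories = []
for _ in range(5):
    A = rng.normal(size=(40, 2))
    y = A @ true_theta + 0.01 * rng.normal(size=40)
    trajectories.append({"A": A, "y": y})

# Hold out each trajectory in turn and fit on the remaining ones.
held_out_errors = []
for i, test_traj in enumerate(trajectories):
    train = trajectories[:i] + trajectories[i + 1:]
    held_out_errors.append(rmse(fit(train), test_traj))

mean_error = float(np.mean(held_out_errors))
```

Splitting by whole trajectory, not by individual sample, keeps correlated neighboring samples from leaking between the fitting and validation sets.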

Section 4: Deployment

4.1 Safety Considerations

Gradual testing protocol
Emergency stops
Monitoring for anomalies
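
The monitoring and emergency-stop items above can be combined in a small software check, sketched here. The class name SafetyMonitor, the two monitored signals, the limit values, and the choice of a zero command as the "stop" action are all assumptions for illustration. A check like this complements a hardware emergency stop and does not replace it.

```python
from dataclasses import dataclass


@dataclass
class SafetyMonitor:
    max_speed: float            # hypothetical speed limit
    max_tracking_error: float   # hypothetical tracking-error limit
    stopped: bool = False

    def check(self, speed, tracking_error):
        """Latch a stop if any limit is exceeded; return True while still safe."""
        if abs(speed) > self.max_speed or abs(tracking_error) > self.max_tracking_error:
            self.stopped = True
        return not self.stopped


def safe_command(monitor, policy_command, speed, tracking_error):
    """Pass the policy command through only while the monitor reports safe."""
    return policy_command if monitor.check(speed, tracking_error) else 0.0


monitor = SafetyMonitor(max_speed=1.0, max_tracking_error=0.2)
readings = [(0.3, 0.05), (1.4, 0.05), (0.2, 0.01)]  # (speed, tracking error) per step
commands = [safe_command(monitor, 0.5, s, e) for s, e in readings]
```

The stop is latched: once the speed limit is violated at the second step, the third command is also suppressed even though its readings are back within limits, so a human has to inspect and reset the system.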

4.2 Online Adaptation

After initial deployment, the policy or model can be fine-tuned with real-world data to close the remaining gap.
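
As a toy illustration of fine-tuning, the sketch below starts from weights obtained in simulation and applies small gradient steps on real-world samples. A linear model stands in for the policy or dynamics model; the function name fine_tune, the learning rate, and all numeric values are assumptions, and the "real" data is synthetic.

```python
import numpy as np


def fine_tune(weights, inputs, targets, learning_rate=0.05, epochs=20):
    """Fine-tune a linear model with per-sample gradient steps on squared error."""
    w = np.array(weights, dtype=float)  # copy so the simulation weights are kept
    for _ in range(epochs):
        for x, y in zip(inputs, targets):
            error = x @ w - y
            w -= learning_rate * error * x
    return w


rng = np.random.default_rng(2)
w_sim = np.array([1.0, 0.5])    # weights obtained in simulation (made up)
w_real = np.array([1.2, 0.3])   # what the real system requires (unknown in practice)

# Synthetic stand-in for data collected after deployment.
X = rng.normal(size=(100, 2))
y = X @ w_real

w_adapted = fine_tune(w_sim, X, y)
error_before = np.linalg.norm(w_sim - w_real)
error_after = np.linalg.norm(w_adapted - w_real)
```

Starting from the simulation weights instead of from scratch means only the residual mismatch has to be learned from the scarce real-world data.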

Summary

Key takeaways:

The reality gap can cause simulation-trained policies to underperform or fail on physical systems
Domain randomization builds robustness
System identification improves simulation accuracy
Careful deployment protocols reduce risk during real-world testing

Key Concepts

Reality Gap: Sim-real discrepancy
Domain Randomization: Training with varied parameters
System Identification: Estimating model parameters
Transfer Learning: Applying simulation knowledge to reality
