Benchmark Guides

This section provides end-to-end evaluation workflows organized by benchmark. Each guide covers environment setup, example configs, step-by-step commands, and advanced usage.

| Benchmark | Description | Guide |
| --- | --- | --- |
| RealWorld | Franka real-robot evaluation and deployment | Real-World Evaluation |
| BEHAVIOR-1K | Large-scale household scene simulation | BEHAVIOR-1K Evaluation |
| LIBERO | Robotic manipulation benchmark with Spatial / Object / Goal / Long / 90 suites | LIBERO Evaluation |
| ManiSkill OOD | ManiSkill out-of-distribution generalization evaluation | ManiSkill OOD Evaluation |
| PolaRiS | Tabletop manipulation simulation platform | PolaRiS Evaluation |
| RoboTwin | Bimanual manipulation simulation with multiple tasks | RoboTwin Evaluation |

Note

Benchmarks such as IsaacLab and MetaWorld do not yet have example configs under evaluations/. For evaluation, refer to the corresponding training docs in RL with Embodied Simulators and use the config fallback mechanism with YAMLs under examples/embodiment/config/.
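
The fallback idea in the note can be pictured as an ordered directory lookup. The sketch below is only an illustration under stated assumptions: the two directory names come from the note, but the function `resolve_config`, the flat `<dir>/<name>.yaml` layout, and the search order are hypothetical and may not match how the project actually resolves configs.

```python
from pathlib import Path
from typing import Optional, Sequence

# Directory names are taken from the note above. The search order and the
# flat "<dir>/<name>.yaml" layout are assumptions made for illustration.
DEFAULT_SEARCH_DIRS = ("evaluations", "examples/embodiment/config")


def resolve_config(
    name: str,
    root: Path,
    search_dirs: Sequence[str] = DEFAULT_SEARCH_DIRS,
) -> Optional[Path]:
    """Return the first existing `<dir>/<name>.yaml` under `root`, or None.

    Hypothetical helper: checks each directory in order, so a config under
    evaluations/ wins, and the training config directory acts as the fallback.
    """
    for rel_dir in search_dirs:
        candidate = root / rel_dir / f"{name}.yaml"
        if candidate.is_file():
            return candidate
    return None
```

Under these assumptions, a benchmark that has no YAML under evaluations/ would resolve to its training config under examples/embodiment/config/, and `None` signals that neither location has a matching file.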