Benchmark Guides
This section provides end-to-end evaluation workflows organized by benchmark. Each guide covers environment setup, example configs, step-by-step commands, and advanced usage.
| Benchmark | Description | Guide |
|---|---|---|
| RealWorld | Franka real-robot evaluation and deployment | |
| BEHAVIOR-1K | Large-scale household scene simulation | |
| LIBERO | Robotic manipulation benchmark with Spatial / Object / Goal / Long / 90 suites | |
| ManiSkill OOD | ManiSkill out-of-distribution generalization evaluation | |
| PolaRiS | Tabletop manipulation simulation platform | |
| RoboTwin | Bimanual manipulation simulation with multiple tasks | |
Note
Some benchmarks, such as IsaacLab and MetaWorld, do not yet have example configs under `evaluations/`. To evaluate them, refer to the corresponding training docs in RL with Embodied Simulators and use the config fallback mechanism with the YAML files under `examples/embodiment/config/`.
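
The fallback mechanism itself is not described on this page, so the following is only a minimal sketch of the general idea, assuming that a config is looked up by name in an ordered list of directories and the first match is used. The function `resolve_config` and the config name `my_benchmark_eval` are hypothetical and are not part of the project's API.

```python
from pathlib import Path
from typing import Optional, Sequence


def resolve_config(name: str, search_dirs: Sequence[Path]) -> Optional[Path]:
    """Return the first existing `<dir>/<name>.yaml`, or None if none exists.

    Hypothetical helper: directories are tried in the order given, so a
    config in an earlier directory shadows one in a later directory.
    """
    for directory in search_dirs:
        candidate = Path(directory) / f"{name}.yaml"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Assumed search order: evaluation configs first, then training configs.
    search_dirs = [Path("evaluations"), Path("examples/embodiment/config")]
    # "my_benchmark_eval" is a placeholder name, not a real config.
    print(resolve_config("my_benchmark_eval", search_dirs))
```

With this ordering, a benchmark that has no YAML under `evaluations/` would resolve to its file under `examples/embodiment/config/`, which matches the behavior the note describes at a high level.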