Benchmark Guides

This section provides end-to-end evaluation workflows organized by benchmark. Each guide covers environment setup, example configs, step-by-step commands, and advanced usage.

Benchmark	Description	Guide
RealWorld	Franka real-robot evaluation and deployment	Real-World Evaluation
BEHAVIOR-1K	Large-scale household scene simulation	BEHAVIOR-1K Evaluation
LIBERO	Robotic manipulation benchmark with Spatial / Object / Goal / Long / 90 suites	LIBERO Evaluation
ManiSkill OOD	ManiSkill out-of-distribution generalization evaluation	ManiSkill OOD Evaluation
PolaRiS	Tabletop manipulation simulation platform	PolaRiS Evaluation
RoboTwin	Bimanual manipulation simulation with multiple tasks	RoboTwin Evaluation

Note

Benchmarks such as IsaacLab and MetaWorld do not yet have example configs under evaluations/. For evaluation, refer to the corresponding training docs in RL with Embodied Simulators and use the config fallback mechanism with YAMLs under examples/embodiment/config/.
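The fallback idea in the note can be pictured as an ordered directory lookup. The sketch below is only an illustration: the function name `resolve_config`, the flat `<dir>/<name>.yaml` layout, and the search order are assumptions, not the project's actual implementation. Only the two directory names come from the note.

```python
from pathlib import Path
from typing import Optional, Sequence

# Directory names are taken from the note; the search order and the flat
# <dir>/<name>.yaml layout are assumptions made for this sketch.
DEFAULT_SEARCH_DIRS = ("evaluations", "examples/embodiment/config")


def resolve_config(
    name: str,
    repo_root: Path,
    search_dirs: Sequence[str] = DEFAULT_SEARCH_DIRS,
) -> Optional[Path]:
    """Return the first existing <dir>/<name>.yaml under repo_root, or None.

    Hypothetical helper: directories are tried in order, so a config in
    evaluations/ wins over one in examples/embodiment/config/.
    """
    for rel_dir in search_dirs:
        candidate = repo_root / rel_dir / f"{name}.yaml"
        if candidate.is_file():
            return candidate
    return None
```

With this sketch, a benchmark that has no YAML under evaluations/ would resolve to its training config under examples/embodiment/config/, and a name found in neither directory yields `None` so the caller can report a clear error.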