RL with Embodied Simulators

This category groups examples in which the simulator (or benchmark) is the headline. They show how to bring up RLinf on a specific simulation platform — environment installation, asset paths, observation/action spaces, and a reference RL recipe (typically PPO or GRPO with a VLA policy).

If you are starting from “I want to train on benchmark X”, this is the right entry point. For model-centric examples (pi₀, GR00T, …) see RL on Embodied Models. For real-robot setups, including Franka, see RL with Real-World Robots. For LIBERO setup on AMD ROCm or Ascend CANN accelerators, see the Supported Accelerators tutorial.

RL with ManiSkill Benchmark
ManiSkill + OpenVLA + PPO/GRPO training, reaching state-of-the-art performance

RL with LIBERO Benchmarks
OpenVLA-OFT + PPO/GRPO on LIBERO (99% success) and on the harder LIBERO-Pro / LIBERO-Plus suites

RL with Behavior Benchmark
Supports BEHAVIOR + OpenVLA-OFT / π₀ / π₀.₅ + PPO training

RL with MetaWorld Benchmark
Supports MetaWorld + π₀ / π₀.₅ + PPO / GRPO training

RL with IsaacLab Benchmark
Supports IsaacLab + GR00T + PPO training

RL with CALVIN Benchmark
Supports CALVIN + π₀ / π₀.₅ + PPO / GRPO training

RL with RoboCasa Benchmark
Supports RoboCasa + π₀ + GRPO training

RL with RoboTwin Benchmark
Supports RoboTwin + OpenVLA-OFT / π₀ / π₀.₅ + PPO / GRPO training

RL with RoboVerse Benchmark
Supports RoboVerse + π₀.₅ + PPO training

RL with Franka-Sim Benchmark
Supports Franka-Sim + MLP/CNN + PPO/SAC training

RL with EmbodiChain
MLP + PPO on EmbodiChain gym tasks

RL with PolaRiS Benchmark
PolaRiS + OpenPI + PPO training

RL with GSEnv for Real2Sim2Real
Supports GSEnv + π₀.₅ + PPO training

RL with Genesis Benchmark
MLP policy training on the Genesis simulation platform