RL with RoboCasa Benchmark

Figure: RoboCasa (image: https://raw.githubusercontent.com/RLinf/misc/main/pic/robocasa.jpeg)

RoboCasa is a robosuite-based kitchen manipulation benchmark with diverse layouts, objects, and atomic tasks. You’ll use RLinf to fine-tune an OpenPI π₀ policy with PPO on the RoboCasa CloseDrawer task.

Overview

Fine-tune OpenPI π₀ on a mobile-manipulation kitchen task in RoboCasa.

Models	π₀
Algorithms	PPO
Tasks	CloseDrawer
Hardware	1 node · 8 GPUs

You’ll do: install → download kitchen assets + model → launch run_embodiment.sh → watch env/success_once.

Prerequisites: Installation · RoboCasa kitchen assets · an SFT checkpoint.

Tasks

Task	Description
`CloseDrawer`	Close a kitchen drawer with the PandaOmron mobile manipulator.

Observation and Action

Field	Specification
Observation	Two RGB views by default (`base_image` and `wrist_image` at 224×224) plus a 25-dim proprioceptive state.
Action	12-dim continuous action: arm position delta, arm rotation delta, gripper, base control, and mode selection.
Reward	Sparse task-completion reward.
Prompt	Natural-language instruction generated by the RoboCasa task.
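
The specification above can be turned into a quick shape check before wiring up a custom wrapper or policy head. The sketch below is illustrative only: the `state` key, the height-width-channel image layout, and the names `RoboCasaSpec` and `find_shape_mismatches` are assumptions made for this example, not RLinf or RoboCasa identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoboCasaSpec:
    """Dimensions from the table above; layout details are assumptions."""
    image_keys: tuple = ("base_image", "wrist_image")
    image_shape: tuple = (224, 224, 3)  # assumption: HWC with 3 RGB channels
    state_key: str = "state"            # hypothetical key name
    state_dim: int = 25
    action_dim: int = 12


def find_shape_mismatches(obs_shapes, action_len, spec=RoboCasaSpec()):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key in spec.image_keys:
        if obs_shapes.get(key) != spec.image_shape:
            problems.append(f"{key}: expected {spec.image_shape}, got {obs_shapes.get(key)}")
    if obs_shapes.get(spec.state_key) != (spec.state_dim,):
        problems.append(
            f"{spec.state_key}: expected ({spec.state_dim},), got {obs_shapes.get(spec.state_key)}"
        )
    if action_len != spec.action_dim:
        problems.append(f"action: expected {spec.action_dim} dims, got {action_len}")
    return problems


shapes = {"base_image": (224, 224, 3), "wrist_image": (224, 224, 3), "state": (25,)}
print(find_shape_mismatches(shapes, action_len=12))  # prints []
```

Returning a list of problems instead of raising on the first one makes it easier to spot several mismatches in a single pass.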

Note

RoboCasa includes more atomic tasks, but the public RLinf recipe currently targets CloseDrawer with examples/embodiment/config/robocasa_closedrawer_ppo_openpi.yaml.

Installation

First, clone the RLinf repository:

# Mainland China users can use a mirror for faster cloning:
# git clone https://ghfast.top/github.com/RLinf/RLinf.git
git clone https://github.com/RLinf/RLinf.git
cd RLinf

Then set up the dependencies with one of the two methods below: a prebuilt Docker image (recommended) or a custom environment. The general setup (prerequisites, GPU drivers, the in-image switch_env helper, mirrors, and troubleshooting) is documented once in Installation; the commands in this recipe differ only in the Docker image tag and the --env value.

Docker image

docker run -it --rm --gpus all \
   --shm-size 32g \
   --network host \
   --name rlinf \
   -v "$(pwd)":/workspace/RLinf \
   rlinf/rlinf:agentic-rlinf0.3-robocasa

# For mainland China users:
# docker.1ms.run/rlinf/rlinf:agentic-rlinf0.3-robocasa

Switch to the OpenPI virtual environment inside the image:

source switch_env openpi

Custom environment

Install RoboCasa with the OpenPI dependencies:

# Mainland China users can add --use-mirror.
bash requirements/install.sh embodied --model openpi --env robocasa
source .venv/bin/activate

Download the kitchen assets after installing RoboCasa:

python -m robocasa.scripts.download_kitchen_assets

Warning

The RoboCasa kitchen assets are about 5 GB. Download them once before launching training.
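
Because the download is large, it can help to confirm that the target disk has room before starting. The following is a minimal sketch that uses only the Python standard library; the helper name `has_free_space` and the 1 GB safety margin are assumptions for illustration, and the path you pass should be the filesystem where the assets will actually be stored.

```python
import shutil


def has_free_space(path=".", required_gb=5.0, margin_gb=1.0):
    """Return True if the filesystem holding `path` has room for the download."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= required_gb + margin_gb


if not has_free_space("."):
    print("Less than about 6 GB free here; free up space before downloading the assets.")
```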

Download the Model

Download the OpenPI π₀ checkpoint:

cd /path/to/save/model

git lfs install
git clone https://huggingface.co/RLinf/RLinf-Pi0-RoboCasa

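# Alternatively, download with the huggingface-hub CLI instead of git clone.
# Mainland China users can set a mirror endpoint first:
# export HF_ENDPOINT=https://hf-mirror.com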
pip install huggingface-hub
hf download RLinf/RLinf-Pi0-RoboCasa --local-dir RLinf-Pi0-RoboCasa

After downloading, point your config YAML at the checkpoint. Set the same path for both the rollout and the actor model:

rollout:
   model:
      model_path: /path/to/downloaded-checkpoint
actor:
   model:
      model_path: /path/to/downloaded-checkpoint
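
A mismatch between the two paths is easy to miss, so a small pre-launch check can be useful. The sketch below works on a plain dictionary that mirrors the YAML keys shown above; `check_model_paths` is a hypothetical helper written for this example and is not part of RLinf.

```python
import os


def check_model_paths(cfg):
    """Return problems with the rollout/actor checkpoint paths in a config dict."""
    rollout = cfg["rollout"]["model"]["model_path"]
    actor = cfg["actor"]["model"]["model_path"]
    problems = []
    if rollout != actor:
        problems.append(f"rollout and actor paths differ: {rollout!r} vs {actor!r}")
    for name, path in (("rollout", rollout), ("actor", actor)):
        if not os.path.isdir(path):
            problems.append(f"{name} model_path is not a directory: {path!r}")
    return problems


# Placeholder path from the YAML above; replace it with your real checkpoint directory.
cfg = {
    "rollout": {"model": {"model_path": "/path/to/downloaded-checkpoint"}},
    "actor": {"model": {"model_path": "/path/to/downloaded-checkpoint"}},
}
for problem in check_model_paths(cfg):
    print(problem)
```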

Run It

Launch the CloseDrawer recipe:

Recipe	Config	Command suffix
OpenPI π₀ + PPO	`examples/embodiment/config/robocasa_closedrawer_ppo_openpi.yaml`	`robocasa_closedrawer_ppo_openpi`

bash examples/embodiment/run_embodiment.sh robocasa_closedrawer_ppo_openpi

What this does:

Starts the embodied training entrypoint with the RoboCasa Hydra config.
Creates Ray workers for the actor, rollout, and RoboCasa env components.
Runs PPO rollouts, computes sparse task rewards, and updates the OpenPI policy.
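
To make the last step more concrete, here is a textbook sketch of how a sparse terminal reward becomes per-step advantages (via generalized advantage estimation) and how PPO clips the policy update. It is not RLinf's implementation: the function names, the `gamma`, `lam`, and `clip_eps` values, and the all-zero critic are assumptions chosen for illustration.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation for one episode that ends at the last step."""
    advantages = [0.0] * len(rewards)
    next_value, running = 0.0, 0.0  # the value after the terminal step is 0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]  # TD residual
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def ppo_clip_objective(ratio, advantage, clip_eps=0.2):
    """Clipped surrogate objective for a single sample (to be maximized)."""
    clipped = max(min(ratio, 1.0 + clip_eps), 1.0 - clip_eps)
    return min(ratio * advantage, clipped * advantage)


# Sparse reward: only the final step of a successful episode pays out.
rewards = [0.0, 0.0, 0.0, 1.0]
values = [0.0, 0.0, 0.0, 0.0]  # an untrained critic, for illustration
adv = gae_advantages(rewards, values)
print([round(a, 4) for a in adv])
print(ppo_clip_objective(ratio=1.5, advantage=adv[-1]))
```

With a sparse reward, earlier steps receive credit only through the discounted `running` term, which is why the advantages shrink toward the start of the episode.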

For standalone evaluation, use the unified Evaluation CLI with config fallback and the same suffix, robocasa_closedrawer_ppo_openpi.

Note

The default config uses actor,env,rollout: all. Tune env.train.total_num_envs, env.eval.total_num_envs, and actor.global_batch_size for your GPU memory budget.
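
One way to sanity-check candidate values before a run is to confirm that they divide evenly across the GPUs. The sketch below assumes that environments and the global batch are split evenly across the 8 GPUs listed in the overview. That splitting rule, the helper name `per_gpu_shares`, and the example numbers are all assumptions for illustration, so verify the actual constraints against the RLinf configuration documentation.

```python
def per_gpu_shares(num_gpus, **totals):
    """Split each total evenly across GPUs, raising if it does not divide."""
    shares = {}
    for name, total in totals.items():
        if total % num_gpus != 0:
            raise ValueError(f"{name}={total} is not divisible by {num_gpus} GPUs")
        shares[name] = total // num_gpus
    return shares


# Hypothetical values, not the defaults from the recipe.
print(per_gpu_shares(8, train_total_num_envs=64, eval_total_num_envs=16, global_batch_size=256))
```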

Visualization and Results

Launch TensorBoard from the RLinf repo root:

tensorboard --logdir ../results --port 6006

The key signal is env/success_once. For every logged metric, see Training metrics.
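
As a rough mental model, a "success once" metric counts an episode as successful if the success flag is raised at any step of that episode. The sketch below assumes that reading of env/success_once; the function name and the per-step boolean input format are hypothetical and are not taken from RLinf.

```python
def success_once_rate(episodes):
    """Fraction of episodes whose per-step success flags contain at least one True."""
    if not episodes:
        return 0.0
    return sum(any(flags) for flags in episodes) / len(episodes)


episodes = [
    [False, False, True],   # succeeded on the last step
    [False, False, False],  # never succeeded
    [False, True, False],   # succeeded mid-episode, then the flag dropped
    [False, False, False],
]
print(success_once_rate(episodes))  # prints 0.5
```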

Enable video in the env config when you want rollout videos:

video_cfg:
  save_video: True
  info_on_video: True
  video_base_dir: ${runner.logger.log_path}/video/train
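
The `${runner.logger.log_path}` reference in `video_base_dir` is a config interpolation: it is replaced by the value stored under `runner.logger.log_path`. In a real run the config system resolves this for you. The sketch below is only a stand-in that mimics the expansion with the standard library so you can predict where videos will land; the `resolve` helper and the `/tmp/results` value are assumptions for illustration.

```python
import re


def resolve(template, cfg):
    """Expand ${a.b.c} references by walking nested dictionary keys."""
    def lookup(match):
        node = cfg
        for key in match.group(1).split("."):
            node = node[key]
        return str(node)

    return re.sub(r"\$\{([^}]+)\}", lookup, template)


cfg = {"runner": {"logger": {"log_path": "/tmp/results"}}}  # hypothetical value
video_dir = resolve("${runner.logger.log_path}/video/train", cfg)
print(video_dir)  # prints /tmp/results/video/train
```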

Enable W&B or SwanLab by adding logger backends:

runner:
  logger:
    logger_backends: ["tensorboard", "wandb"]  # or swanlab

Note

This page does not publish a fixed RoboCasa success-rate table yet. Use env/success_once and evaluation videos to compare your runs.