Quickstart

Welcome to the RLinf Quickstart Guide. This section will walk you through launching RLinf for the first time. We present three concise examples to demonstrate the framework’s workflow and help you get started quickly.

SOTA RL Training Reproduction

RLinf provides end-to-end recipes that reproduce or match state-of-the-art (SOTA) RL results out of the box: you can run the provided configs and scripts directly to obtain the published numbers, with no custom engineering.

For embodied tasks, RLinf reaches or matches SOTA success rates on benchmarks such as LIBERO, ManiSkill, and RoboTwin with OpenVLA, OpenVLA-OFT, π₀/π₀.₅, GR00T, and other vision-language-action models (VLAs) (see the Embodied Scenarios gallery and Supported RL Algorithms for algorithm details).

For agentic tasks (including math reasoning), RLinf achieves SOTA performance on AIME24/AIME25/GPQA-diamond benchmarks with DeepSeek-R1-Distill-Qwen models, and supports single-agent and multi-agent training tasks such as Search-R1 and Coding-Online-RL (see Agentic Scenarios).