Advanced Features#

This chapter provides a step-by-step deep dive into how RLinf achieves highly efficient execution, offering practical guidance to help you fully optimize your RL post-training workflows.

5D Parallelism Configuration
Explains how RLinf supports Megatron-style 5D parallelism, including: Tensor Parallelism (TP), Data Parallelism (DP), Pipeline Parallelism (PP), Sequence Parallelism (SP), and Context Parallelism (CP). Learn how to configure and combine these dimensions to scale large models efficiently.
LoRA Integration
Demonstrates how to integrate Low-Rank Adaptation (LoRA) into RLinf, enabling parameter-efficient fine-tuning for large-scale models with minimal compute overhead.
Switch SGLang Versions
Describes how to dynamically switch between different SGLang versions to accommodate varying compatibility needs or experimental requirements.
Checkpoint Resume
Covers how to resume training from saved checkpoints, ensuring fault tolerance and seamless continuation for long-running or interrupted training jobs.
Checkpoint Convertor
Describes how to convert a saved checkpoint file into HuggingFace safetensors format, which can be used for checkpoint evaluation or uploading to the HuggingFace Hub.
Heterogenous Software and Hardware Setup
Introduces how to configure and utilize heterogeneous software and hardware clusters, to fully leverage different types of computing resources and hardware devices.
Cloud-Edge Training Setup
Shows how to build a cloud-edge training setup with EasyTier, connect cloud and edge nodes into one overlay network, and run RLinf on top of that topology.
Training Visualisation
Introduces how to visualize and track key metrics during your training process. Currently, we support three backends for experiment tracking and visualization: TensorBoard, Weights & Biases (wandb), and SwanLab.