Advanced Features
This chapter explains, step by step, how RLinf achieves efficient execution and gives practical guidance for optimizing your RL post-training workflows.
- 5D Parallelism Configuration
Explains how RLinf supports Megatron-style 5D parallelism: Tensor Parallelism (TP), Data Parallelism (DP), Pipeline Parallelism (PP), Sequence Parallelism (SP), and Context Parallelism (CP). Learn how to configure and combine these dimensions to scale large models efficiently.
- LoRA Integration
Demonstrates how to integrate Low-Rank Adaptation (LoRA) into RLinf, enabling parameter-efficient fine-tuning for large-scale models with minimal compute overhead.
- Switch SGLang Versions
Describes how to dynamically switch between different SGLang versions to accommodate varying compatibility needs or experimental requirements.
- Checkpoint Resume
Covers how to resume training from saved checkpoints, ensuring fault tolerance and seamless continuation for long-running or interrupted training jobs.
- Checkpoint Convertor
Describes how to convert a saved checkpoint file into HuggingFace safetensors format, which can be used for checkpoint evaluation or uploading to the HuggingFace Hub.
- Heterogeneous Software and Hardware Setup
Shows how to configure and use clusters with heterogeneous software and hardware, so that different types of computing resources and devices are fully utilized.
- Cloud-Edge Training Setup
Shows how to build a cloud-edge training setup with EasyTier, connect cloud and edge nodes into one overlay network, and run RLinf on top of that topology.
- Training Visualization
Shows how to visualize and track key metrics during training. Currently, we support three backends for experiment tracking and visualization: TensorBoard, Weights & Biases (wandb), and SwanLab.
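As a quick orientation for the 5D parallelism topic above, the following sketch shows how the parallel dimensions relate to the total GPU count. It assumes the Megatron-LM convention, in which sequence parallelism reuses the tensor-parallel group and therefore does not multiply the GPU requirement, so the data-parallel size is whatever remains after dividing by TP, PP, and CP. The function name `data_parallel_size` is hypothetical and is not part of RLinf's API; whether RLinf derives DP in exactly this way is an assumption.

```python
def data_parallel_size(world_size: int, tp: int, pp: int, cp: int = 1) -> int:
    """Return the data-parallel size implied by a GPU count and model-parallel sizes.

    Assumption (Megatron-LM convention): sequence parallelism shares the
    tensor-parallel group, so only TP, PP, and CP divide the world size.
    This helper is illustrative and not an RLinf function.
    """
    if min(world_size, tp, pp, cp) < 1:
        raise ValueError("all sizes must be positive integers")

    # Number of GPUs needed to hold one model replica.
    model_parallel = tp * pp * cp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by TP*PP*CP={model_parallel}"
        )

    # The remaining factor is the number of model replicas (DP).
    return world_size // model_parallel


if __name__ == "__main__":
    # 64 GPUs with TP=4, PP=2, CP=2 leave room for 4 data-parallel replicas.
    print(data_parallel_size(64, tp=4, pp=2, cp=2))
```

A check like this is useful before launching a job: if the product of the model-parallel sizes does not divide the GPU count, the configuration cannot be laid out evenly.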