Advanced Features

This chapter explains, step by step, how RLinf achieves efficient execution, and gives practical guidance for optimizing your RL post-training workflows.

  • 5D Parallelism Configuration

    Explains how RLinf supports Megatron-style 5D parallelism: Tensor Parallelism (TP), Data Parallelism (DP), Pipeline Parallelism (PP), Sequence Parallelism (SP), and Context Parallelism (CP). Learn how to configure and combine these dimensions to scale large models efficiently.

  • LoRA Integration

    Demonstrates how to integrate Low-Rank Adaptation (LoRA) into RLinf, enabling parameter-efficient fine-tuning for large-scale models with minimal compute overhead.

  • Switch SGLang Versions

    Describes how to dynamically switch between different SGLang versions to accommodate varying compatibility needs or experimental requirements.

  • Checkpoint Resume

    Covers how to resume training from saved checkpoints, ensuring fault tolerance and seamless continuation for long-running or interrupted training jobs.

  • Checkpoint Convertor

    Describes how to convert a saved checkpoint into HuggingFace safetensors format, which can then be used for evaluation or uploaded to the HuggingFace Hub.

  • Heterogeneous Software and Hardware Setup

    Explains how to configure and use clusters with heterogeneous software and hardware, so that different types of computing resources and devices can be fully utilized.

  • Cloud-Edge Training Setup

    Shows how to build a cloud-edge training setup with EasyTier, connect cloud and edge nodes into one overlay network, and run RLinf on top of that topology.

  • Training Visualization

    Shows how to visualize and track key metrics during training. RLinf currently supports three backends for experiment tracking and visualization: TensorBoard, Weights & Biases (wandb), and SwanLab.
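
As a quick illustration of how the parallelism dimensions listed under 5D Parallelism Configuration combine, the sketch below derives the data-parallel degree from the total GPU count. It assumes the usual Megatron-LM convention that the world size equals TP × PP × CP × DP, and that sequence parallelism reuses the tensor-parallel group rather than consuming extra GPUs. The function name `data_parallel_size` and its parameters are hypothetical and are not part of RLinf's configuration API; consult the 5D Parallelism Configuration page for the actual option names.

```python
def data_parallel_size(world_size: int, tp: int = 1, pp: int = 1, cp: int = 1) -> int:
    """Return the DP degree implied by the other parallel sizes.

    Hypothetical helper (not an RLinf API). Assumes the Megatron-LM
    convention world_size = tp * pp * cp * dp, with sequence parallelism
    sharing the tensor-parallel group.
    """
    for name, value in (("world_size", world_size), ("tp", tp), ("pp", pp), ("cp", cp)):
        if value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value}")

    # GPUs consumed by one model replica.
    model_parallel = tp * pp * cp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by tp*pp*cp={model_parallel}"
        )
    # The remaining factor is the number of model replicas (DP degree).
    return world_size // model_parallel


if __name__ == "__main__":
    # 64 GPUs with TP=4, PP=2, CP=2: each replica spans 16 GPUs.
    print(data_parallel_size(64, tp=4, pp=2, cp=2))
```

A check like this is useful before launching a job: a combination whose product does not divide the GPU count cannot be laid out evenly and fails fast here instead of at startup.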