System-level Optimizations

RLinf’s overall design is simple and modular. Workers abstract the components of RL and agent workloads, and a flexible, efficient communication library handles the interaction between them. Because the design is decoupled, workers can be scheduled flexibly and dynamically onto computing resources or assigned to the most suitable accelerators.
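
The decoupling described above can be illustrated with a minimal Python sketch. It is not RLinf's API: the `Worker` class, its `send` method, the `place` function, and the device names `gpu_a` and `gpu_b` are all hypothetical, chosen only to show components that interact through messages and are placed on accelerators independently of their logic.

```python
from dataclasses import dataclass, field
from queue import Queue


@dataclass
class Worker:
    """Hypothetical stand-in for one component (e.g. rollout or training)."""
    name: str
    preferred: str  # accelerator kind this component runs best on (assumed label)
    inbox: Queue = field(default_factory=Queue)

    def send(self, other: "Worker", message: object) -> None:
        # Components interact only through messages, never direct calls.
        other.inbox.put((self.name, message))


def place(workers, free_devices):
    """Give each worker a device of its preferred kind if one is free,
    otherwise fall back to any device that is still available."""
    pool = {kind: list(ids) for kind, ids in free_devices.items()}
    placement = {}
    for w in workers:
        if pool.get(w.preferred):
            kind = w.preferred
        else:
            kind = next((k for k, ids in pool.items() if ids), None)
        if kind is None:
            raise RuntimeError(f"no device left for {w.name}")
        placement[w.name] = (kind, pool[kind].pop(0))
    return placement


rollout = Worker("rollout", preferred="gpu_a")
trainer = Worker("trainer", preferred="gpu_b")

# Placement is decided outside the workers, so it can change without
# touching component code.
placement = place([rollout, trainer], {"gpu_a": [0], "gpu_b": [0, 1]})
rollout.send(trainer, {"trajectories": 8})
```

Because placement lives outside the workers in this sketch, calling `place` again with a different device pool moves components without changing their code, which is the property the scheduling features below rely on.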

[Ongoing] Hot Scaling/Switching of Workers (Components)
Hot switching reduces training time by more than 50%.

[Ongoing] Hybrid Training on Heterogeneous Accelerators
Flexible interoperability between components running on different accelerators, so that training workflows can be built across them.