System-level Optimizations
RLinf's overall design is simple and modular. Workers abstract the components of RL and agent workflows, and a flexible, efficient communication library handles the interaction between components. Because of this decoupled design, workers can be flexibly and dynamically scheduled onto computing resources or assigned to the most suitable accelerators.
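The decoupled design described above can be illustrated with a minimal sketch. This is not RLinf's API: `Channel`, `Worker`, `RolloutWorker`, `TrainerWorker`, `reschedule`, and `run_pipeline` are hypothetical names invented for illustration, the "communication library" is stood in for by an in-process queue, and device names are plain labels rather than real accelerators. The point is only that workers which interact solely through channels can be moved to a different placement without touching one another.

```python
from abc import ABC, abstractmethod
from queue import Queue


class Channel:
    """Hypothetical stand-in for a communication library (in-process queue)."""

    def __init__(self):
        self._q = Queue()

    def put(self, item):
        self._q.put(item)

    def get(self):
        return self._q.get()


class Worker(ABC):
    """Hypothetical component abstraction: logic plus a placement label."""

    def __init__(self, name, device):
        self.name = name
        self.device = device  # label only; no real accelerator is used here

    @abstractmethod
    def step(self, inbox, outbox):
        """Consume one item from inbox and publish one item to outbox."""


class RolloutWorker(Worker):
    def step(self, inbox, outbox):
        prompt = inbox.get()
        # Toy "generation": upper-case the prompt.
        outbox.put({"response": prompt.upper(), "produced_on": self.device})


class TrainerWorker(Worker):
    def __init__(self, name, device):
        super().__init__(name, device)
        self.samples_seen = 0

    def step(self, inbox, outbox):
        sample = inbox.get()
        self.samples_seen += 1  # toy "training": count the samples
        outbox.put({"sample": sample, "trained_on": self.device})


def reschedule(worker, new_device):
    """Change a worker's placement; its peers need no changes."""
    worker.device = new_device


def run_pipeline(prompts, rollout, trainer):
    """Drive rollout -> trainer through channels, one prompt at a time."""
    prompts_ch, samples_ch, results_ch = Channel(), Channel(), Channel()
    results = []
    for prompt in prompts:
        prompts_ch.put(prompt)
        rollout.step(prompts_ch, samples_ch)
        trainer.step(samples_ch, results_ch)
        results.append(results_ch.get())
    return results


rollout = RolloutWorker("rollout", device="accelerator-A")
trainer = TrainerWorker("trainer", device="accelerator-B")
first = run_pipeline(["hello"], rollout, trainer)

# Move the rollout worker; the trainer and the pipeline code are unchanged.
reschedule(rollout, "accelerator-B")
second = run_pipeline(["world"], rollout, trainer)
```

Because neither worker holds a reference to the other, changing `rollout.device` between runs affects only where that component is labeled as running. A real system would additionally have to migrate state and re-establish communication endpoints, which this sketch deliberately leaves out.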
[Ongoing] Hot Scaling/Switching of Workers (Components)
Hot switching reduces training time by 50% or more.
[Ongoing] Hybrid Training on Heterogeneous Accelerators
Flexible interoperability between components running on different accelerators, so that training workflows can be built across them.