Real-World Robot Training Launch#
This page explains how to connect multiple Franka robots and a GPU training node to one Ray cluster, configure YAML, and start real-world RL training.
For full hardware setup, dependencies, data collection, and task details, see:
Real-World RL with Franka — standard RealSense + Franka gripper workflow
Using ZED + Robotiq with Franka — ZED camera + Robotiq gripper on split nodes
For RLINF_NODE_RANK, ray start, and code sync, see Multi-Node Ray Cluster Setup.
For node_groups / component_placement semantics, see Heterogenous Hardware Setup.
Typical topology#
Real-world training usually follows one GPU training node + N robot control nodes:
|
Role |
Notes |
|---|---|---|
0 (head) |
GPU train / rollout |
Runs |
1 … N |
Robot control |
Runs |
All nodes must be on the same LAN (or overlay network; see Cloud-Edge Training Setup), and
cluster.num_nodes must match the number of nodes joined to Ray.
Important
Control nodes need Franka dependencies (ROS, libfranka, etc.); see Real-World RL with Franka.
Ray freezes Python and env vars at
ray start; install dependencies on each node first.Use
ray_utils/realworld/setup_before_ray.shto align per-node env beforeray start.
Connect robot nodes to the Ray cluster#
Step 1: Prepare the environment on each node#
On every node, before ray start:
# Recommended: source and edit the repo script on each machine
source ray_utils/realworld/setup_before_ray.sh
export RLINF_NODE_RANK=<0..N-1> # unique; GPU node is usually 0
# If multiple NICs, pin the reachable interface, e.g.:
# export RLINF_COMM_NET_DEVICES=eth0
On control nodes, also source the ROS / Franka workspace if not in setup_before_ray.sh:
source <your_catkin_ws>/devel/setup.bash
Step 2: Start Ray#
Let <head_ip> be the GPU head’s reachable IP.
GPU node (rank 0, head):
export RLINF_NODE_RANK=0
ray start --head --port=6379 --node-ip-address=<head_ip>
Robot control nodes (rank 1, 2, …):
export RLINF_NODE_RANK=1 # use 2 for the second robot, etc.
ray start --address='<head_ip>:6379'
Run ray status on any node and confirm the node count matches cluster.num_nodes.
YAML configuration#
Real-world training is driven by the cluster section: node_groups separate GPU
and Franka hardware; component_placement maps actor / rollout / env to
those resources.
Before running, update:
robot_ip,target_ee_pose(task goal pose)actor.model.model_path(pretrained ResNet, etc.)algorithm.demo_buffer/data.path(for RLPD and other demo-based algorithms)
Single robot (1 GPU + 1 arm)#
See examples/embodiment/config/realworld_peginsertion_rlpd_cnn_async.yaml:
cluster:
num_nodes: 2
component_placement:
actor:
node_group: "4090"
placement: 0
env:
node_group: franka
placement: 0 # first robot in the franka group
rollout:
node_group: "4090"
placement: 0
node_groups:
- label: "4090"
node_ranks: 0 # GPU training node
- label: franka
node_ranks: 1 # robot control node
hardware:
type: Franka
configs:
- robot_ip: <ROBOT_IP>
node_rank: 1 # must match the control node's rank in node_ranks
If your GPU node is not rank 0 or the control node is not rank 1, update node_ranks
and hardware.configs[].node_rank together.
Multiple robots (1 GPU + 2 arms)#
See examples/embodiment/config/realworld_peginsertion_rlpd_cnn_async_2arms.yaml:
cluster:
num_nodes: 3
component_placement:
actor:
node_group: "4090"
placement: 0
env:
node_group: franka
placement: 0-1 # one env worker per robot
rollout:
node_group: "4090"
placement: 0:0-1 # two rollout processes on the same GPU
node_groups:
- label: "4090"
node_ranks: 0
- label: franka
node_ranks: 1-2
hardware:
type: Franka
configs:
- robot_ip: <ROBOT_IP_1>
node_rank: 1
- robot_ip: <ROBOT_IP_2>
node_rank: 2
Extend num_nodes, node_ranks, placement, and hardware.configs the same
way for more arms.
Split camera and arm (ZED + Robotiq)#
When the camera is on the GPU server and the arm and gripper are on a NUC,
set camera/gripper types and controller_node_rank in hardware.configs.
Field details and collection examples are in Using ZED + Robotiq with Franka.
For training, place env on the GPU node group (camera capture) and pin
FrankaController to the NUC:
cluster:
num_nodes: 2
component_placement:
actor:
node_group: gpu
placement: 0
env:
node_group: gpu
placement: 0
rollout:
node_group: gpu
placement: 0
node_groups:
- label: gpu
node_ranks: 0
- label: franka
node_ranks: 0-1
hardware:
type: Franka
configs:
- robot_ip: <ROBOT_IP>
node_rank: 0
camera_serials:
- "<ZED_SERIAL>"
camera_type: zed
gripper_type: robotiq
gripper_connection: "/dev/ttyUSB0"
controller_node_rank: 1 # controller on rank 1 (NUC)
Note
Install the ZED SDK on the GPU node into the same venv Ray uses, before
ray start. Configure Robotiq serial permissions on the NUC. See
Using ZED + Robotiq with Franka.
Auto-configuration from environment variables#
Per-machine hardware values (robot_ip, camera_serials,
gripper_connection, …) do not have to be hard-coded in YAML. Any
hardware.configs field left unset is auto-filled during enumeration from
an environment variable named after the field in upper case
(robot_ip → ROBOT_IP, camera_serials → CAMERA_SERIALS), which
keeps IPs and serial numbers out of committed configs.
Rules:
Values written in YAML always win; environment variables only fill unset fields.
The variable is read on the node that owns the config (its
node_rank), so export it there beforeray start(Ray freezes env vars at start).node_rankandcontroller_node_rankare never taken from the environment.
One robot per node. The whole value is used; list fields are split on commas:
- label: franka
node_ranks: 1
hardware:
type: Franka
configs:
- node_rank: 1 # robot_ip / cameras come from the environment
# on the rank-1 control node, before `ray start`
export ROBOT_IP=<ROBOT_IP>
export CAMERA_SERIALS=<serial_1>,<serial_2>
Several robots on one node. Give one comma-separated value per robot, assigned to the configs in order (so a list field takes one item per robot). The value count must match the number of robot configs on that node:
# two arms on one control node
export ROBOT_IP=<ip_arm0>,<ip_arm1>
export CAMERA_SERIALS=<serial_arm0>,<serial_arm1>
Create robots entirely from the environment. Leave configs empty and the
robots are created from the environment — the robot count is the comma count of
the identifier variable ROBOT_IP, and every field is read from its variable:
- label: franka
node_ranks: 1
hardware:
type: Franka
configs: [] # robots created from ROBOT_IP, CAMERA_SERIALS, …
export ROBOT_IP=<ip_arm0>,<ip_arm1> # two robots
export DISABLE_VALIDATE=true,true # one value per robot
In this mode every variable you set must provide the same number of
comma-separated values as ROBOT_IP; a mismatch aborts enumeration with an
explicit error. Other robot types use their own identifier field
(GimArm → CAN_INTERFACE, DOSW1 → ROBOT_URL).
Launch real-world training#
After the Ray cluster is up, YAML is updated, and (for RLPD) demo data is ready, on the
head node (usually rank 0 on the GPU machine) cd to the RLinf repo and run.
If the GPU and control nodes do not share the same RLinf checkout, run
export RLINF_CODE_WORKING_DIR=auto in the same terminal before the commands
below (see Multi-Node Ray Cluster Setup, Step 3: Enable code sync).
Single-arm training (peg insertion, RLPD + async SAC):
bash examples/embodiment/run_realworld_async.sh realworld_peginsertion_rlpd_cnn_async
Dual-arm parallel training:
bash examples/embodiment/run_realworld_async.sh realworld_peginsertion_rlpd_cnn_async_2arms
Other tasks (replace the config name as needed):
# Charger task
bash examples/embodiment/run_realworld_async.sh realworld_charger_sac_cnn_async
# Async PPO
bash examples/embodiment/run_realworld_async.sh realworld_peginsertion_async_ppo_cnn
<config_name> maps to examples/embodiment/config/<config_name>.yaml; you can pass a custom name.
Optional: dummy config to validate the cluster (single arm):
bash examples/embodiment/run_realworld_async.sh realworld_dummy_franka_sac_cnn
Verify cameras on control nodes and Ray/placement on the head with the dummy config before full training; see Real-World RL with Franka for the full checklist.