RollArc: Scaling Agentic RL Training via Disaggregated Infrastructure

arXiv.org · December 30, 2025 · ✓ verified

Wei Gao and co-authors present RollArc, a distributed system to speed up agentic RL training by disaggregating workloads onto specialized hardware.

  • Main announcement: The paper introduces RollArc, built on hardware-affinity workload mapping, fine-grained asynchrony, and statefulness-aware computation to route compute-bound and bandwidth-bound tasks to best-fit GPUs, mitigate resource bubbles at the trajectory level, and offload stateless components (e.g., reward models) to serverless infrastructure. The authors report 1.35–2.05× end-to-end training time reduction compared to monolithic/synchronous baselines.
  • Background and evaluation details: The system is evaluated by training a hundreds-of-billions-parameter MoE model for the Qoder product on an Alibaba cluster with more than 3,000 GPUs; the paper is 17 pages with 17 figures, submitted to arXiv on 27 Dec 2025 (arXiv:2512.22560). Code available:https://github.com/alibaba/ROLL.
Keep reading
JUPITER exascale powers brain mapping, climate, 6G and quantum NVIDIA · Jun 22 NAIRR pilot accelerates scientific AI research with NVIDIA DGX NVIDIA · Jun 22 Eco Wave Power Uses NVIDIA AI To Harness Wave Energy NVIDIA · Jun 22 Nordic data centers pioneer sustainable cooling and heat reuse atNorth · Jun 22
Telborg · US Data Centers
Track the US data-center buildout — every day.

Real-time verified news and daily AI-written briefings, built from primary sources — power, grid, permits, land, financing. Start free.

Get Telborg Pro · $189/mo Get the daily briefing — free →