HetCCL Accelerates LLM Training with Heterogeneous GPU Clusters

arXiv.org · February 02, 2026

Heehoon Kim et al. introduce HetCCL, a collective communication library that enables RDMA-based communication between GPUs from different vendors without driver modifications.

  • HetCCL unifies vendor-specific backends and enables RDMA-based communication across NVIDIA and AMD GPUs while still using the vendor libraries NCCL and RCCL. It requires no modifications to existing deep learning applications or GPU drivers, and the authors present it as a solution for multi-vendor GPU clusters.
  • Publication: arXiv:2601.22585, submitted 30 Jan 2026. Artifacts: PDF, HTML, TeX source, and DOI link. License: CC BY-NC-ND 4.0.
  • Evaluation: experiments on a multi-vendor GPU cluster show that HetCCL matches NCCL and RCCL performance in homogeneous setups and scales in heterogeneous environments.
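
For readers unfamiliar with the term, a collective operation such as all-reduce leaves every participant holding the same combined result. The sketch below is a pure-Python toy model of that contract across ranks with mixed vendor labels. It is not HetCCL's API and involves no RDMA, NCCL, or RCCL; the names `Rank` and `all_reduce_sum` and the vendor labels are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rank:
    """One simulated GPU. `vendor` is only a label in this toy model."""
    vendor: str        # hypothetical label, e.g. "nvidia" or "amd"
    data: List[float]  # local buffer, e.g. a gradient shard


def all_reduce_sum(ranks: List[Rank]) -> None:
    """Replace every rank's buffer with the element-wise sum over all ranks."""
    if not ranks:
        return
    length = len(ranks[0].data)
    if any(len(r.data) != length for r in ranks):
        raise ValueError("all ranks must hold buffers of equal length")
    # Combine contributions from every rank, regardless of vendor label.
    total = [sum(r.data[i] for r in ranks) for i in range(length)]
    # Give each rank its own copy of the combined result.
    for r in ranks:
        r.data = list(total)


# A mixed "cluster": two ranks labeled nvidia, two labeled amd.
ranks = [
    Rank("nvidia", [1.0, 2.0]),
    Rank("nvidia", [3.0, 4.0]),
    Rank("amd", [5.0, 6.0]),
    Rank("amd", [7.0, 8.0]),
]
all_reduce_sum(ranks)
```

After the call, every rank holds `[16.0, 20.0]`, whatever its vendor label. A real library has to deliver the same result while moving the data between devices over the network, which is the part the paper addresses and this sketch leaves out.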