TetriServe: Efficient Step-level DiT Serving for Heterogeneous Images

arXiv.org · January 19, 2026 · ✓ verified

The authors (Runyu Lu et al.) introduce TetriServe, a DiT serving system that implements step-level sequence parallelism and a round-based scheduler to improve SLO attainment for heterogeneous image-generation workloads.

  • Main announcement: TetriServe implements step-level sequence parallelism and a round-based scheduling mechanism that (a) discretizes time into fixed rounds for tractable deadline-aware scheduling, (b) adapts parallelism at the step level to minimize GPU hour consumption, and (c) jointly packs requests to minimize late completions; evaluation shows up to 32% higher SLO attainment versus existing fixed-parallelism solutions without degrading image quality.
  • Background and details: The paper targets inefficiencies in existing serving systems that use fixed-degree sequence parallelism, which performs poorly on heterogeneous workloads (mixed resolutions and deadlines). The authors evaluate TetriServe on state-of-the-art DiT models, report GPU-utilization and SLO metrics, and provide implementation details (round-based scheduler, step-level adaptation, packing policy).
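
To make the round-based, step-level idea concrete, here is a minimal Python sketch of planning a single round. It is not the authors' algorithm: the latency table, the `Request` fields, and the helper names `min_degree` and `plan_round` are all hypothetical, and a simple earliest-deadline-first greedy pass stands in for the paper's joint packing policy. The sketch only illustrates the three ingredients named above: time cut into fixed rounds, a per-request parallelism degree chosen as the smallest GPU count that still meets the deadline, and several requests packed onto the available GPUs in one round.

```python
from dataclasses import dataclass

# Hypothetical per-step latency in milliseconds, keyed by resolution and
# then by GPU count. Illustrative values only, not measurements from the paper.
STEP_LATENCY_MS = {
    512: {1: 50, 2: 30, 4: 20},
    1024: {1: 200, 2: 110, 4: 60},
}


@dataclass
class Request:
    name: str
    resolution: int
    steps_left: int   # denoising steps still to run
    deadline_ms: int  # absolute deadline on the same clock as `now_ms`


def min_degree(req, now_ms):
    """Return the smallest GPU count that finishes before the deadline, or None."""
    budget = req.deadline_ms - now_ms
    table = STEP_LATENCY_MS[req.resolution]
    for degree in sorted(table):
        if req.steps_left * table[degree] <= budget:
            return degree
    return None


def plan_round(requests, now_ms, total_gpus, round_ms):
    """Greedy earliest-deadline-first plan for one fixed-length round.

    Returns {request name: (gpu_count, steps_to_run_this_round)}.
    Assumption: a stand-in heuristic, not the paper's packing policy.
    """
    free = total_gpus
    plan = {}
    for req in sorted(requests, key=lambda r: r.deadline_ms):
        table = STEP_LATENCY_MS[req.resolution]
        degree = min_degree(req, now_ms)
        if degree is None:
            degree = max(table)  # already late: best effort at the widest degree
        if degree > free:
            continue  # does not fit this round; reconsidered next round
        steps = min(req.steps_left, round_ms // table[degree])
        if steps == 0:
            continue  # a single step would not fit inside the round
        plan[req.name] = (degree, steps)
        free -= degree
    return plan


if __name__ == "__main__":
    queue = [
        Request("a", 1024, 20, 2500),
        Request("b", 512, 20, 3000),
        Request("c", 1024, 10, 4000),
    ]
    print(plan_round(queue, now_ms=0, total_gpus=4, round_ms=500))
```

Because the degree is recomputed every round from the steps that remain, a request in this sketch can run narrow while it has slack and widen later if it falls behind, which is the intuition behind adapting parallelism per step rather than fixing it per request.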