RepetitionCurse: Router Imbalance in MoE LLMs under DoS Stress

arXiv.org · January 01, 2026

The paper by Ruixuan Huang et al. (arXiv:2512.23995, submitted 30 Dec 2025) demonstrates a model-agnostic denial-of-service vulnerability in Mixture-of-Experts (MoE) inference routing using a technique called RepetitionCurse.

  • Main announcement: The authors present RepetitionCurse, a low-cost black-box attack that uses simple repetitive token patterns to force routing concentration in MoE models. On Mixtral-8x7B, the attack increases end-to-end inference latency by 3.063x, creating computational bottlenecks on some devices while leaving resources idle on others, and leading to violations of service-level agreements (SLAs) for time to first token.
  • Background and details: The paper documents that out-of-distribution prompts can cause tokens to be routed to the same top-k experts, which turns routing imbalance into a denial-of-service attack vector. Submission metadata: arXiv:2512.23995 (v1), submitted Tue, 30 Dec 2025, license CC BY 4.0.
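
To make the routing-concentration idea concrete, here is a minimal toy sketch in Python. It is not the authors' attack or code: `gate_scores` is a hypothetical stand-in for a learned router that scores experts from the token id alone (a real router scores context-dependent hidden states), and the 8-expert, top-2 setup is assumed to mirror Mixtral-8x7B's layout. It only illustrates why a prompt that repeats one token piles load onto the same top-k experts, while a varied prompt spreads it out.

```python
import random
from collections import Counter

NUM_EXPERTS = 8  # assumed: experts per MoE layer, as in Mixtral-8x7B
TOP_K = 2        # assumed: experts activated per token

def gate_scores(token_id, num_experts=NUM_EXPERTS):
    """Hypothetical router: deterministic pseudo-random scores per token id."""
    rng = random.Random(token_id)
    return [rng.random() for _ in range(num_experts)]

def route(token_id, k=TOP_K):
    """Return the indices of the k highest-scoring experts for this token."""
    scores = gate_scores(token_id)
    ranked = sorted(range(len(scores)), key=lambda e: scores[e], reverse=True)
    return ranked[:k]

def expert_load(prompt):
    """Count how many token slots each expert receives for a prompt."""
    load = Counter()
    for token_id in prompt:
        for expert in route(token_id):
            load[expert] += 1
    return [load[e] for e in range(NUM_EXPERTS)]

def imbalance(prompt):
    """Busiest expert's load divided by the mean load (1.0 = perfectly even)."""
    load = expert_load(prompt)
    return max(load) / (sum(load) / len(load))

diverse_prompt = list(range(1000))   # 1000 distinct token ids
repetitive_prompt = [42] * 1000      # one token id repeated 1000 times

print("diverse   :", expert_load(diverse_prompt), imbalance(diverse_prompt))
print("repetitive:", expert_load(repetitive_prompt), imbalance(repetitive_prompt))
```

In this toy model the repetitive prompt sends every token to the same two experts, so its imbalance factor is 8 / 2 = 4.0, the maximum for top-2 routing over 8 experts, while the varied prompt stays close to 1. When experts are spread across devices, the busiest expert bounds how quickly a layer can finish, which is the kind of bottleneck the paper reports; the 3.063x latency figure is the paper's measurement and is not something this sketch reproduces.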