TIDE: Temporal Incremental Draft Engine for Self-Improving LLMs

arXiv.org · February 06, 2026

Jiyoung Park and co-authors have submitted an arXiv preprint presenting TIDE (Temporal Incremental Draft Engine), a serving-engine-native framework for online draft adaptation in LLM inference (arXiv:2602.05145, submitted 5 Feb 2026).

Main announcement: The paper introduces TIDE, which reuses the target model's hidden states as training signals, enabling what the authors describe as zero-overhead draft adaptation without reloading the target model. It also adds adaptive runtime control that enables speculation and training only when beneficial, and it maps the decoupled inference and training workloads to heterogeneous GPU classes. The authors report up to a 1.15x throughput improvement over static speculative decoding and a 1.67x reduction in draft training time across diverse real-world workloads. Submission metadata: arXiv:2602.05145 [cs.LG], v1 submitted 5 Feb 2026; PDF, HTML, and TeX source available; licensed under CC BY 4.0.
Background and details: The submission is a research paper (subjects: Machine Learning, Artificial Intelligence) that provides an implementation-level design for integrating speculation into high-performance inference systems. Links provided include PDF, HTML (experimental), and TeX source. No commercial product release or pricing is announced; correspondence is via the arXiv author contact link.
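
Illustrative sketch: The idea of enabling speculation only when beneficial can be illustrated with the standard speculative decoding estimate, in which a draft that proposes `gamma` tokens with per-token acceptance rate `alpha` yields (1 - alpha^(gamma+1)) / (1 - alpha) tokens per target pass on average. The Python sketch below is not TIDE's controller, which this summary does not describe; the function names, the `cost_ratio` parameter (draft cost relative to one target pass), and the `margin` threshold are all hypothetical, and the estimate assumes independent, identically distributed acceptances.

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target verification pass.

    alpha: assumed per-token acceptance rate of the draft, in [0, 1].
    gamma: number of draft tokens proposed per step.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    if alpha == 1.0:
        # Every draft token is accepted, plus one token from the target.
        return float(gamma + 1)
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


def estimated_speedup(alpha: float, gamma: int, cost_ratio: float) -> float:
    """Estimated speedup over plain decoding.

    cost_ratio: hypothetical cost of one draft step relative to one
    target pass (for example 0.05 for a much smaller draft model).
    """
    step_cost = gamma * cost_ratio + 1.0  # gamma draft steps + 1 target pass
    return expected_tokens_per_step(alpha, gamma) / step_cost


def should_speculate(alpha: float, gamma: int, cost_ratio: float,
                     margin: float = 1.05) -> bool:
    """Hypothetical gate: speculate only if the estimate clears a margin."""
    return estimated_speedup(alpha, gamma, cost_ratio) >= margin


if __name__ == "__main__":
    # A well-adapted draft versus a poorly matched, expensive one.
    print(should_speculate(alpha=0.8, gamma=4, cost_ratio=0.05))
    print(should_speculate(alpha=0.1, gamma=4, cost_ratio=0.3))
```

Under these assumptions, a drop in the measured acceptance rate (for instance after a workload shift) lowers the estimate below the margin and would turn speculation off until the draft has been retrained.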
