Cloud and AI Infrastructure Cost Optimization: Strategies and Case Studies

arXiv.org · January 28, 2026 · ✓ verified

Saurabh Deochake has published a revised version (v2) of the arXiv paper, updated 27 Jan 2026, that expands its scope to AI/ML infrastructure and GPU cost optimization.

  • Main announcement: The author released Version 2 of the paper on 27 Jan 2026 (arXiv:2307.12479v2). The revision is significantly expanded to cover AI/ML infrastructure and GPU cost optimization, and it adds 2025 industry data and new case studies on LLM inference costs. The title was changed to reflect the broader scope.
  • Background and key findings: The paper is a comprehensive review of traditional cloud pricing models, resource allocation, and model optimization techniques (quantization, GPU instance selection, inference optimization). It reports that GPU compute represents 40-60% of technical budgets for AI-focused organizations, that LLM inference costs have fallen roughly 10x per year since 2021, and that organizations can achieve 50-90% cost savings. It includes case studies from Amazon Prime Video, Pinterest, Cloudflare, and Netflix. The arXiv listing links to PDF, HTML, and TeX versions and to the DOI.
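
To put the reported figures in perspective, here is a minimal arithmetic sketch in Python. It is not taken from the paper. It assumes a hypothetical starting cost of $100 per unit of inference work in 2021, and it assumes, for illustration only, that the 50-90% savings apply solely to the GPU share of the budget; the paper may define these figures differently. The function names are illustrative.

```python
def cost_after(initial_cost: float, annual_factor: float, years: int) -> float:
    """Cost after dividing by `annual_factor` once per year for `years` years."""
    return initial_cost / (annual_factor ** years)


def budget_reduction(gpu_share: float, savings: float) -> float:
    """Fraction of the total budget saved if `savings` applies only to the GPU share."""
    return gpu_share * savings


if __name__ == "__main__":
    # Hypothetical $100 per unit of work in 2021, falling ~10x per year.
    for year in range(2021, 2026):
        print(year, cost_after(100.0, 10.0, year - 2021))

    # Combine the low and high ends of the reported ranges (assumption: savings hit GPU spend only).
    low = budget_reduction(0.40, 0.50)
    high = budget_reduction(0.60, 0.90)
    print(f"Total budget reduction: {low:.0%} to {high:.0%}")
```

Under these assumptions, a 10x annual decline compounds to a 10,000x drop over four years, and the combined ranges would correspond to roughly a 20% to 54% reduction in the overall technical budget.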