Friday Oct 03, 2025

Solving the Cold Start Problem in AI Inference

In this episode of Inference Time Tactics, Rob, Cooper, and Byron sit down with Prashanth Velidandi, co-founder of InferX, to explore how serverless inference is tackling the AI “cold start problem.” They dig into why 90% of the model lifecycle happens at inference—not training—and how cold starts and idle GPUs are crippling efficiency. Prashanth explains InferX’s snapshot technology, what it takes to deliver sub-second cold starts, and why inference infrastructure—not just models—will define the next era of AI.



We talked about:

 

  • Why inference represents 90% of the model lifecycle, even though most of the industry focuses on training.
  • How cold starts and idle GPUs create massive inefficiencies in AI infrastructure.
  • InferX’s snapshot technology that enables sub-second model loading and higher GPU utilization.
  • The challenges of explaining and selling deeply technical infrastructure to the market.
  • Why enterprises care about inference efficiency, cost, and reliability more than model size.
  • How serverless inference abstracts away infrastructure complexity for developers.
  • The coming explosion of multi-agent systems and billions of specialized models.
  • Why sustainable innovation in AI will come from inference infrastructure.



Connect with InferX

Prashanth Velidandi

https://inferx.net 

https://x.com/pmv_inferx 

https://www.linkedin.com/in/prashanth-velidandi-98629b115



Connect with Neurometric

Website: https://www.neurometric.ai/

Substack: https://neurometric.substack.com/ 

X: https://x.com/neurometric/ 

Bluesky: https://bsky.app/profile/neurometric.bsky.social

 

Rob May

https://x.com/robmay 

https://www.linkedin.com/in/robmay

 

Calvin Cooper

https://x.com/cooper_nyc_ 

https://www.linkedin.com/in/coopernyc

 

Byron Galbraith

https://x.com/bgalbraith 

https://www.linkedin.com/in/byrongalbraith

