Inference Time Tactics
A podcast exploring the emerging field of inference-time compute, the next frontier in AI performance. Hosted by the Neurometric team, the show unpacks how models reason, make decisions, and perform at runtime. For developers, researchers, and operators building AI infrastructure.
Episodes

Tuesday Aug 12, 2025
In this episode of Inference Time Tactics, Rob and Cooper dig into the strategic trade-offs driving a major shift in AI: why some enterprises start with closed models from providers like OpenAI or Anthropic, then move to open-source stacks. The team breaks down the challenges of switching and how inference-time compute is becoming a competitive differentiator. They also unpack why pricing is shifting, how governance will evolve for this new layer, and what Rob learned from reviewing 250 research papers on reasoning algorithms.
We talked about:
Insights from reviewing 250 research papers on reasoning algorithms.
Why enterprises start with closed models from providers like OpenAI or Anthropic before moving to open-source stacks.
Challenges of switching stacks, including model fragmentation, capability gaps, and hardware choices.
Cost-performance trade-offs when choosing inference architectures.
How inference-time configuration can become a competitive differentiator.
The role of pricing shifts and vendor lock-in in AI adoption.
Emerging governance considerations for inference workflows.
The growing variety and complexity of inference-time techniques.
Benchmarking challenges for multi-step and reasoning tasks.
Why the lack of best practices makes inference optimization harder to operationalize.
Connect with Neurometric:
Website: https://www.neurometric.ai/
Substack: https://neurometric.substack.com/
X: https://x.com/neurometric/
Bluesky: https://bsky.app/profile/neurometric.bsky.social
Hosts:
Rob May
https://x.com/robmay
https://www.linkedin.com/in/robmay
Calvin Cooper
https://x.com/cooper_nyc_
https://www.linkedin.com/in/coopernyc

Friday Aug 01, 2025
Welcome to the very first episode of Inference Time Tactics — the podcast for builders, researchers, and engineers pushing the limits of AI performance.
In this kickoff conversation, hosts Rob May and Calvin Cooper (co-founders of Neurometric AI) break down why inference-time compute is emerging as the third scaling law of AI, and why it matters more than ever.
They unpack:
What “inference-time compute” really means (and how it differs from training and fine-tuning)
Why reasoning algorithms like best-of-N, chain of thought, and beam search are reshaping performance (see the sketch after this list)
How recent research — and OpenAI’s 2024 reasoning model — sparked an explosion of interest
The challenge of reliability (“three nines” and beyond) in multi-step agent workflows
Why open-source models may win big, and where inference fits at the edge
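For listeners new to these algorithms, here is a minimal, hypothetical sketch of best-of-N sampling, the simplest of the techniques named above. The generate and score functions are illustrative placeholders (not any specific library's API), standing in for a model's sampling call and a reward or verifier model:

```python
import random

def best_of_n(prompt, generate, score, n=8):
    # Best-of-N sampling: draw n candidate completions for the same
    # prompt, then return the one the scorer ranks highest. `generate`
    # and `score` are hypothetical placeholders for a model call and a
    # verifier/reward model.
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)

# Toy demo: "generation" guesses a random number and the "verifier"
# scores closeness to the known answer, so a larger n tends to win.
if __name__ == "__main__":
    target = 42
    guess = lambda _prompt: random.randint(0, 100)
    closeness = lambda x: -abs(x - target)
    print(best_of_n("What is 6 * 7?", guess, closeness, n=16))
```

The same scaffold generalizes: chain of thought samples full reasoning traces rather than bare answers, and beam search keeps the top-k partial candidates at each step instead of scoring only finished completions.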
This is a technical, tactical deep-dive — but without the heavy math. If you’re building the next generation of AI systems, or just want to understand where the field is really headed, this episode is your starting point.
🔗 Learn more at neurometric.ai



