Somewhere out there, right now, a Fortune 500 company is running production AI workloads on Volta GPUs. V100s from 2017, now pushing a decade old. Ancient in GPU terms.
This isn't incompetence. It's enterprise reality.
The Refresh Problem
Enterprise hardware refresh cycles run five to seven years. GPU architecture generations arrive roughly every two. The math doesn't work.
By the time a large organization specs, budgets, procures, deploys, and stabilizes a GPU cluster, the hardware is already two generations behind. By the time it's paid off, it's four generations behind. By the time it's "end of life" in the asset system, Nvidia has shipped architectures the original buyers never imagined.
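That timeline can be sketched as back-of-envelope arithmetic, assuming a rough two-year Nvidia architecture cadence (an approximation, not an exact schedule):

```python
GPU_CADENCE_YEARS = 2  # rough Nvidia architecture cadence (assumption)

def generations_behind(years_since_launch: float) -> int:
    """Whole GPU generations shipped since the deployed architecture launched."""
    return int(years_since_launch // GPU_CADENCE_YEARS)

# A cluster specced around an architecture's 2017 launch:
print(generations_behind(2))  # deployment finally stable: 1 generation behind
print(generations_behind(8))  # hardware paid off: 4 generations behind
```

The exact numbers shift with procurement speed, but the shape doesn't: the cluster is behind before it's stable and far behind before it's amortized.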
Volta → Turing → Ampere → Hopper/Ada Lovelace → Blackwell. That's the gap a 2017 deployment is staring down.
Why It Persists
The dirty secret: for many workloads, old GPUs work fine.
Inference at modest scale doesn't need cutting-edge silicon. A V100 running a RAG pipeline in 2026 produces the same answers it did in 2020. The tensor cores still tensor. The VRAM still holds models. The drivers still get security patches.
The business case for ripping it out and replacing it is often "we could go faster" — which loses to "but this works and is paid for" in most procurement meetings.
The Opportunity
Eventually the gap becomes undeniable. Software demands more VRAM. New models won't fit. Support contracts expire. Someone in leadership reads about Blackwell and asks uncomfortable questions.
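The "won't fit" part is simple arithmetic: at fp16, model weights alone take two bytes per parameter, and the common 16 GB V100 runs out fast. A minimal sketch, counting weights only and ignoring KV cache and activation memory (illustrative model sizes):

```python
def fp16_weights_gib(params: float) -> float:
    """GiB needed for model weights alone at fp16 (2 bytes per parameter)."""
    return params * 2 / 2**30

V100_VRAM_GIB = 16  # common 16 GB variant; a 32 GB variant also shipped

for params in (7e9, 13e9, 70e9):  # hypothetical model sizes for illustration
    need = fp16_weights_gib(params)
    verdict = "fits" if need <= V100_VRAM_GIB else "does not fit"
    print(f"{params / 1e9:.0f}B params -> {need:.0f} GiB ({verdict} on a 16 GiB V100)")
```

A 7B model squeezes in; anything much larger needs quantization, multi-GPU sharding, or newer hardware, which is exactly the pressure that restarts the refresh conversation.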
That's when the conversation starts. And the conversation is almost always: "What's the modern equivalent of what we have now?"
Not "what's the best" — just "what's the same, but current." That's the actual upgrade path for most enterprises. Not revolutionary. Evolutionary.
The Volta gap closes one refresh cycle at a time.