Nvidia Blackwell in 2026: B200, GB200 NVL72, and the Vera Rubin Roadmap

Blackwell finally shipped at scale through 2025. This piece covers the B200, the GB200 NVL72 rack, the Vera Rubin roadmap, and the secondary market for H100s.

Blackwell took a year longer than the original Nvidia messaging suggested to reach broad customer shipments. The design-flaw story that surfaced in August 2024, a mask defect in the Blackwell die that required a respin, pushed material volume into the back half of 2025. By the time the 2026 fiscal year closed, Blackwell-class silicon was shipping in volume to all major hyperscalers, the GB200 NVL72 rack architecture had become the reference design for new AI datacenters, and the H100 and H200 secondary market had taken on a life of its own. Nvidia’s GTC 2025 keynote layered the Vera Rubin roadmap on top of all of this, setting the next generation’s expectations.

This is the practical state of the Nvidia silicon story going into mid-2026.

What actually shipped

The Blackwell family ships in several variants. The B100 and B200 are the dual-die GPU packages — two reticle-limited dies connected by a 10 TB/s die-to-die interconnect, presented to software as a single GPU. The B200 is the higher-power variant at around 1000 watts TDP. The GB200 superchip pairs two B200 GPUs with a Grace CPU on a single board over a 900 GB/s NVLink interface, and the GB200 NVL72 rack assembles 36 of these superchips — 72 B200 GPUs in total — into a single liquid-cooled rack presented to software as one large logical accelerator.
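The rack composition described above is simple arithmetic, and a few lines make it easy to check. This is a minimal sketch that uses only the counts stated in the text; the class and field names are illustrative and do not correspond to any Nvidia API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackTopology:
    """Counts taken from the description above; names are illustrative."""
    superchips_per_rack: int = 36  # GB200 superchips in one NVL72 rack
    gpus_per_superchip: int = 2    # B200 packages per superchip
    cpus_per_superchip: int = 1    # Grace CPUs per superchip
    dies_per_gpu: int = 2          # reticle-limited dies per B200 package

    @property
    def gpus(self) -> int:
        return self.superchips_per_rack * self.gpus_per_superchip

    @property
    def cpus(self) -> int:
        return self.superchips_per_rack * self.cpus_per_superchip

    @property
    def gpu_dies(self) -> int:
        return self.gpus * self.dies_per_gpu


rack = RackTopology()
print(rack.gpus, rack.cpus, rack.gpu_dies)  # 72 36 144
```

The 72 in "NVL72" is the GPU package count; software sees 72 GPUs even though the rack holds 144 GPU dies.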

[Figure: Blackwell B200 and GB200 architecture]

The performance numbers Nvidia published for the GB200 NVL72 at GTC 2024 framed it as up to 30× faster than the equivalent H100 cluster for trillion-parameter inference workloads. The gain comes primarily from the rack-level NVLink fabric: all 72 GPUs sit in one NVLink domain, which removes the slower inter-node InfiniBand or Ethernet hops that H100 clusters had to make once a model spanned more than one eight-GPU server. The training numbers were lower but still meaningful, at roughly 4× the H100 cluster on the comparable workload.

The shipment story through 2025 ran from limited samples in Q1 through hyperscaler-scale deployments by Q3. Microsoft’s GB200 deployment for OpenAI training, Meta’s Grand Teton successor builds, Oracle’s Stargate clusters, and CoreWeave’s specialized AI-cloud expansion all consumed real volume. Figures from Nvidia’s quarterly earnings put Blackwell revenue past the H100 revenue baseline by mid-2025.

The 132-kilowatt rack

The GB200 NVL72 rack draws around 132 kilowatts at peak — roughly five times what the densest H100 racks needed. This number is not a footnote. Most existing datacenter facility floors were designed for 10-20 kilowatts per rack. Operating at 132 kilowatts requires liquid cooling at the rack level, structural floor reinforcement, and a power distribution infrastructure that most colocation providers cannot deliver today.
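To make the density gap concrete, the sketch below works through the numbers from the paragraph above: one 132-kilowatt rack against facility floors designed for 10 to 20 kilowatts per rack. The function names are hypothetical helpers, and the per-GPU figure is a rack-level average that includes CPUs, switches, and other overhead, not a GPU TDP.

```python
RACK_PEAK_KW = 132.0           # GB200 NVL72 peak draw, from the text
GPUS_PER_RACK = 72             # from the text
LEGACY_RACK_KW = (10.0, 20.0)  # typical per-rack design range, from the text


def legacy_rack_equivalents(rack_kw: float, legacy_kw: float) -> float:
    """How many legacy rack power budgets one dense rack consumes."""
    if legacy_kw <= 0:
        raise ValueError("legacy_kw must be positive")
    return rack_kw / legacy_kw


def kw_per_gpu(rack_kw: float, gpus: int) -> float:
    """Rack-level power per GPU, including CPU, switch, and other overhead."""
    if gpus <= 0:
        raise ValueError("gpus must be positive")
    return rack_kw / gpus


low, high = LEGACY_RACK_KW
print(f"{legacy_rack_equivalents(RACK_PEAK_KW, high):.1f} to "
      f"{legacy_rack_equivalents(RACK_PEAK_KW, low):.1f} legacy rack budgets")
print(f"{kw_per_gpu(RACK_PEAK_KW, GPUS_PER_RACK):.2f} kW per GPU slot")
```

On these inputs a single NVL72 rack consumes the power budget of roughly 6.6 to 13.2 conventionally provisioned racks, which is why the constraint shows up at the facility level and not only at the rack.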

The practical effect is that Blackwell deployment has been constrained as much by physical-facility readiness as by silicon supply. Hyperscalers building new datacenters specifically for GB200 have moved fastest. Enterprises with existing colocation footprints have had to either upgrade those sites at considerable cost or move workloads into AI-specialist providers like CoreWeave, Crusoe, Lambda, and Nebius whose new facilities were designed for the power envelope.

The design-flaw recovery

The Blackwell mask defect surfaced through media reporting in August 2024. The defect required a respin of the silicon, pushed volume shipments roughly six months later than the original schedule, and produced one of the few public stumbles in Nvidia’s modern manufacturing record. The recovery through 2025 was quiet and complete — Nvidia did not publicly comment on the technical root cause, the respin came back clean, and the supply chain ramped through the year.

What the episode did expose was the concentration risk in the Nvidia supply chain. TSMC fabricated the wafers, ASE and other partners did the advanced packaging, and the dependence on TSMC’s CoWoS-L advanced packaging capacity meant that even with the silicon corrected, packaging throughput was a binding constraint through 2025. The CoWoS capacity expansion at TSMC continues to be the practical limit on how fast Nvidia can ramp Blackwell volume.

The Vera Rubin roadmap

Jensen Huang’s GTC 2025 keynote laid out the post-Blackwell roadmap with the Vera Rubin generation. The name honors Vera Rubin, the astronomer who pioneered dark matter observations, and extends Nvidia’s pattern of naming GPU generations after scientists. The Rubin GPU and the Vera CPU are positioned for production shipment in late 2026 and through 2027. Initial Rubin variants are expected on TSMC N3X with a follow-up Rubin Ultra on TSMC N2 in the 2027 timeframe.

[Figure: Vera Rubin and AI silicon roadmap]

The performance positioning in the GTC messaging put Rubin at roughly 3.3× the inference performance of Blackwell on comparable workloads, with the gain coming from a combination of process node shrink, memory bandwidth improvements via HBM4, and architectural changes to the tensor cores. The rack-level NVL576 architecture that Nvidia previewed scales the GB200 NVL72 idea to a much larger logical accelerator, with the cooling and power envelope going up correspondingly.

For procurement teams the practical question is whether to wait. The honest answer is that anyone able to deploy GB200 NVL72 in 2026 should — the silicon is shipping, the performance is real, and the wait for Rubin in 2027 means deferring training and inference work by another year. The exception is teams whose facility power envelope cannot accommodate Blackwell and who would have to build new infrastructure anyway, in which case skipping a generation may be defensible.

The H100 and H200 secondary market

The Blackwell rollout produced a genuine secondary market for H100 and H200 hardware. The dynamic: as hyperscalers retired H100 capacity in favor of Blackwell, secondary buyers (smaller AI cloud providers, enterprises building private clusters, and specialized inference operators) picked up that capacity at meaningful discounts to original list price. By mid-2025 the H100 80GB SXM5 secondary price had dropped roughly 30-40% from the 2023 peak.

The economics of running an older-generation H100 cluster for inference workloads have become genuinely attractive for teams whose workload does not require Blackwell-class throughput. For RAG, classification, and most production LLM inference of models below the frontier tier, an H100 cluster bought on the secondary market and run for two to three more years can compete with the per-token economics of Blackwell on amortized total cost. Companies including Voltage Park, CoreWeave’s secondary tier, and several specialist GPU brokers have built businesses around this dynamic.
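The amortized comparison can be sketched as a small model: spread the purchase price and energy bill over every token served during the hardware's remaining life. Everything here is an assumption for illustration. The `GpuOption` fields, the prices, the throughput figures, and the electricity rate are placeholders, not market data, and the model ignores networking, staffing, facility, and resale value.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass(frozen=True)
class GpuOption:
    """All fields are hypothetical inputs, not market data."""
    name: str
    capex_usd: float          # purchase price per GPU
    lifetime_years: float     # amortization period
    power_kw: float           # average draw per GPU, including overhead
    tokens_per_second: float  # sustained inference throughput
    utilization: float        # fraction of hours spent serving tokens


def cost_per_million_tokens(opt: GpuOption, usd_per_kwh: float) -> float:
    """Amortized capex plus energy, divided by tokens served over the lifetime.

    Simplification: the GPU is assumed to draw power_kw for every hour of its
    lifetime, while tokens are only produced during utilized hours.
    """
    hours = opt.lifetime_years * HOURS_PER_YEAR
    energy_usd = opt.power_kw * hours * usd_per_kwh
    tokens = opt.tokens_per_second * 3600 * hours * opt.utilization
    if tokens <= 0:
        raise ValueError("throughput, lifetime, and utilization must be positive")
    return (opt.capex_usd + energy_usd) / tokens * 1_000_000


# Placeholder numbers chosen only to exercise the model.
options = [
    GpuOption("secondary-market H100", 15_000, 3, 1.0, 2_000, 0.6),
    GpuOption("new Blackwell GPU", 40_000, 4, 1.8, 6_000, 0.6),
]
for opt in options:
    usd = cost_per_million_tokens(opt, usd_per_kwh=0.08)
    print(f"{opt.name}: ${usd:.4f} per million tokens")
```

Which option wins depends entirely on the inputs, particularly the throughput ratio on the actual workload and the discount on the used hardware, so the placeholders should be replaced with measured throughput and quoted prices before drawing any conclusion.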

Where pdpspectra fits

Our cloud infrastructure practice helps clients evaluate the Blackwell-versus-H100, on-prem-versus-cloud, and hyperscaler-versus-specialist procurement decisions. The right answer is almost never “buy the latest Nvidia generation outright” — for most enterprise workloads, on-demand GB200 NVL72 capacity from a specialist provider for training spikes plus a steady-state H100 inference footprint produces better unit economics than owning the silicon.

Related reading: AMD MI300X adoption, TSMC N2 process ramp, and open-source LLMs in production.

Closing

Blackwell took longer to arrive than Nvidia messaged and has consumed more facility-readiness work than most enterprises anticipated. The silicon is now real, the rack architecture has matured into a reference design, and the Vera Rubin generation is twelve to eighteen months out. The H100 secondary market gives non-frontier workloads a credible alternative that did not exist in 2024.

For enterprises sizing AI infrastructure in 2026, the right question is rarely “should we buy Blackwell” and almost always “what mix of Blackwell, H100, and specialist-cloud capacity fits our workload mix and our facility envelope.” Talk to our team about your AI infrastructure plan.