AMD MI300X to MI350X: The Real State of AMD AI Adoption in 2026
Microsoft, Meta, and OpenAI all run AMD GPUs in production. The MI325X, the MI350X, the ROCm picture, and the Nvidia software moat reality.
The story of AMD’s AI silicon has shifted faster than most procurement teams realize. The MI300X launched in December 2023 with cautious commercial commitments from Microsoft and Meta. By the close of 2025 AMD had shipped meaningful volume to Microsoft, Meta, and OpenAI; the MI325X had filled the mid-cycle refresh slot; the MI350X had launched with material design improvements; and specialist AI clouds including CoreWeave and Crusoe had built out genuine MI300X-class capacity alongside their Nvidia footprints. ROCm 6 and the in-flight ROCm 7 release have closed enough of the software-stack gap that AMD GPUs are now a credible option for many production workloads rather than a permanent science project.
This is the honest reading of where AMD AI silicon sits in 2026.
What actually shipped
The product cadence through 2024 and 2025 ran cleanly. The MI300X launched in late 2023 with 192GB of HBM3, more than double the H100’s 80GB and still ahead of the H200’s 141GB. The MI325X arrived in late 2024 as a mid-cycle refresh with 256GB of HBM3E and improved bandwidth. The MI350X launched in 2025 with the CDNA 4 architecture, 288GB of HBM3E, and architectural changes, including FP4 and FP6 support, that AMD positioned as competitive with Blackwell on inference workloads.

The MI350X performance positioning at launch put it at roughly the same training throughput as the H200 on dense workloads and meaningfully ahead on memory-bound inference for large models, with the memory-per-accelerator advantage allowing single-GPU serving of models that required multi-GPU sharding on Nvidia hardware. AMD’s stated benchmark numbers — interpreted carefully — put the MI350X within roughly 30% of Blackwell B200 on the workloads where Blackwell wins and slightly ahead on the workloads where the memory advantage dominates.
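The memory-per-accelerator point is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below is a rough estimate rather than a sizing tool: it counts only model weights plus a flat overhead allowance for KV cache and activations. The 20% overhead default and the function name `min_accelerators` are our own illustrative assumptions, not vendor guidance.

```python
import math


def min_accelerators(params_billion, bytes_per_param, mem_gb, overhead=0.2):
    """Estimate how many accelerators are needed just to hold a model in memory.

    params_billion:  parameter count in billions (e.g. 70 for a 70B model)
    bytes_per_param: 2 for 16-bit weights, 1 for 8-bit weights
    mem_gb:          memory per accelerator in GB
    overhead:        ASSUMED flat allowance for KV cache and activations
    """
    # One billion parameters at one byte each is roughly one GB.
    weights_gb = params_billion * bytes_per_param
    needed_gb = weights_gb * (1 + overhead)
    return math.ceil(needed_gb / mem_gb)


if __name__ == "__main__":
    # Per-accelerator memory capacities quoted in the article.
    for name, mem in [("H100", 80), ("H200", 141), ("MI300X", 192), ("MI350X", 288)]:
        print(name, min_accelerators(70, 2, mem))
```

Under these assumptions a 70B model with 16-bit weights needs three 80GB accelerators but fits on a single 192GB or 288GB part. Real deployments also depend on context length, batch size, and the serving framework, so treat the result as a lower bound.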
The Microsoft, Meta, and OpenAI commitments
The most consequential commercial wins for AMD have been the three large hyperscaler and frontier-model customers. Microsoft Azure ND MI300X v5 instances have been generally available since 2024, with confirmed Copilot inference workloads running on them. Meta has publicly confirmed Llama training and inference deployment on MI300X clusters in its production fleet. OpenAI and AMD announced in October 2025 a multi-year agreement to deploy AMD Instinct GPUs, starting with the MI450 generation, framed by Sam Altman as part of the diversification beyond Nvidia.
The Microsoft deployment is the largest and most-watched of the three. The Azure ND MI300X v5 instance pricing put AMD silicon at a meaningful discount to comparable H100 instances, and the practical experience reported by customers running both has been that for many production inference workloads the AMD instances deliver competitive throughput per dollar. The training workloads continue to be more Nvidia-anchored — the software stack gap on training is real and ROCm has more work to do — but for inference the gap has narrowed dramatically.
The ROCm 6 and 7 status
The ROCm software stack has historically been the binding constraint on AMD AI adoption. The 2024 ROCm 6 release closed most of the production-readiness gaps for inference. PyTorch 2.x has first-class ROCm support, vLLM ships with ROCm backends, the major model-serving frameworks all work, and the porting friction for an inference workload originally developed on CUDA is now genuinely small.
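Part of why inference porting friction is small is that ROCm builds of PyTorch reuse the `torch.cuda` namespace, so device-selection code written for Nvidia hardware typically runs unchanged. A minimal sketch of how a team might check which backend a given environment is using follows; the helper name `detect_backend` and its return labels are our own, chosen for illustration.

```python
def detect_backend():
    """Report which accelerator backend the installed PyTorch build targets."""
    try:
        import torch
    except ImportError:
        return "torch-not-installed"

    # On ROCm builds the torch.cuda API is backed by HIP, so this same call
    # covers both AMD and Nvidia GPUs.
    if not torch.cuda.is_available():
        return "cpu"

    # torch.version.hip is a version string on ROCm builds and None otherwise.
    return "rocm" if getattr(torch.version, "hip", None) else "cuda"


if __name__ == "__main__":
    print(detect_backend())
```

Model code that calls `tensor.to("cuda")` needs no change on an AMD GPU. The check above is mainly useful for logging, or for gating the occasional kernel that exists for only one backend.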
ROCm 7, in active development through 2025 and 2026, focuses on training-workload parity. The training-side ergonomics have improved meaningfully but the gap with CUDA on advanced features — custom CUDA kernels in research code, the fastest fused-attention implementations, the long tail of CUDA-specific libraries — remains real. Teams running pure inference workloads can switch to ROCm with weeks of work; teams running cutting-edge training research still pay a meaningful tax to leave CUDA.
The Nvidia software moat reality
The CUDA software moat is the single most-discussed and most-misunderstood part of the Nvidia story. The honest picture in 2026: the moat is real but it is narrower than the bull case suggests for inference and broader than the bear case suggests for training. For production inference of mainline open-source and frontier models, ROCm is now genuinely good enough that AMD’s silicon advantages can win on TCO. For training novel architectures and for the research workloads that produce frontier-model improvements, CUDA’s library depth, kernel maturity, and developer-ecosystem mass still dominate.
The interesting structural development is that the model-serving frameworks (vLLM, SGLang, and the ROCm-side counterparts to TensorRT-LLM) have become the actual abstraction layer that matters for most enterprise teams. Once a workload targets vLLM rather than CUDA directly, the underlying silicon choice becomes a procurement question rather than a code-rewriting question. This abstraction trend is what gives AMD its opening.
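To make the abstraction concrete: vLLM exposes an OpenAI-compatible HTTP server, so client code talks to an endpoint rather than to a GPU. The sketch below builds a completion request using only the standard library. The host names and model identifier are hypothetical placeholders, and `build_completion_request` is our own helper rather than part of vLLM.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=64):
    """Build a POST request for an OpenAI-compatible /v1/completions endpoint."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical endpoints: one backed by AMD GPUs, one by Nvidia GPUs.
    for host in ("http://amd-pool.internal:8000", "http://nvidia-pool.internal:8000"):
        req = build_completion_request(host, "example-model", "Hello")
        print(req.get_method(), req.full_url)
        # Sending is one more call once a server is actually running:
        # with urllib.request.urlopen(req) as resp:
        #     print(json.load(resp))
```

The request is identical for both pools. Which silicon sits behind the URL is a deployment decision that the application code never sees.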
CoreWeave, Crusoe, and the specialist cloud picture
The specialist AI cloud providers have been more aggressive AMD adopters than the major hyperscalers. CoreWeave added MI300X instances to its catalog in 2024 and expanded MI325X capacity through 2025. Crusoe Cloud has built dedicated MI300X clusters at multiple sites including its flagship Iceland facility. TensorWave is an AMD-only AI cloud that has built its entire business model on MI300X and MI325X capacity. Vultr has added AMD AI instances. Oracle Cloud Infrastructure has confirmed MI300X availability.

The pricing advantage the specialists offer on AMD silicon has been the strongest commercial signal that the supply-versus-demand story is real. AMD GPUs are easier to procure than Blackwell, the per-hour list price runs meaningfully below comparable Nvidia instances, and for inference workloads the throughput-per-dollar can win convincingly. For teams whose training workloads are anchored on Nvidia but whose inference economics are getting squeezed by GPU costs, splitting the workload — Nvidia for training, AMD for steady-state inference — has become a real pattern.
The unit economics
The procurement math that matters for enterprises in 2026:
- Per-hour list price: AMD MI300X instances run roughly 25-40% below comparable H100 SXM5 instances at major providers
- Memory per accelerator: MI300X at 192GB and MI350X at 288GB versus H100 at 80GB means many workloads need fewer accelerators
- Throughput on production inference: roughly competitive with H100 for dense decoder-only models in the 70B to 405B range
- Training throughput: still meaningfully behind H100 and Blackwell on most workloads — pick Nvidia for training-heavy use cases
- Software setup cost: weeks of porting work for inference, months for training, near-zero for vLLM-based serving
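The list-price and throughput bullets above combine into a single number most teams end up tracking: cost per million generated tokens. A minimal sketch of that arithmetic follows. The hourly prices and throughput figures are illustrative placeholders, not quotes or benchmark results, and should be replaced with measured values for the actual workload.

```python
def cost_per_million_tokens(price_per_hour, tokens_per_second):
    """Serving cost in dollars per one million generated tokens."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return price_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # ILLUSTRATIVE inputs only: a hypothetical Nvidia instance price, and an
    # AMD instance priced 30% lower (inside the 25-40% range cited above).
    nvidia_price = 4.00
    amd_price = nvidia_price * 0.70
    # ASSUMED equal sustained throughput, per the "roughly competitive" bullet.
    throughput = 1000.0

    print("Nvidia:", round(cost_per_million_tokens(nvidia_price, throughput), 3))
    print("AMD:   ", round(cost_per_million_tokens(amd_price, throughput), 3))
```

The break-even point falls out directly: at a 30% price discount, the AMD instance wins on cost per token as long as its sustained throughput stays above 70% of the Nvidia instance's.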
Where pdpspectra fits
Our cloud infrastructure practice helps clients evaluate the AMD-versus-Nvidia procurement decision for their workload mix. For most production inference workloads in 2026, including the Llama 4 and Mistral Large 3 deployments we ship, AMD silicon at a specialist cloud provider produces better unit economics than Nvidia on the major hyperscalers. For training workloads we still default to Nvidia. The split-workload pattern is where most of our 2026 client engagements land.
Related reading: Nvidia Blackwell shipping reality, TSMC N2 process, and open-source LLMs in production.
Closing
AMD AI silicon has crossed the credibility threshold. The commercial commitments from Microsoft, Meta, and OpenAI are real; the ROCm software stack has matured enough to ship production inference workloads; the specialist clouds have made procurement easier than Nvidia in many cases; and the unit economics on inference workloads can be meaningfully better. The training-side gap with Nvidia remains real but is narrowing.
For enterprises sizing AI infrastructure in 2026, AMD deserves a real evaluation rather than the dismissive nod it would have gotten in 2023. Talk to our team about your AI silicon procurement.