Quantum Advantage and Benchmarks in 2026: Willow's Below-Threshold Result, Logical Qubits, and What MBQC Changes

For most of the 2010s the quantum benchmark conversation was a noisy mess. Vendors picked metrics that flattered their hardware, comparisons across architectures were nearly impossible, and the phrase “quantum supremacy” did more reputational damage to the field than any single technical setback. The benchmark landscape in 2026 is meaningfully clearer because three things finally happened: Google demonstrated error correction below threshold on Willow in December 2024, Quantinuum and Microsoft reported fault-tolerant logical qubits at multiple code distances, and the field broadly accepted that you need to publish both the algorithm and the error budget if you want the result taken seriously.

This is the engineer’s tour of what the credible benchmarks measure and what they actually mean for the next few years.

Why old benchmarks stopped working

The 2019 Google “supremacy” result on Sycamore — random circuit sampling that allegedly took 10,000 years on a classical computer — got progressively eroded by improved classical simulators through 2020-2023, to the point where the headline number was clearly wrong but the qualitative claim was still defensible. IBM’s response of pushing quantum volume — a holistic metric measuring the largest random circuit a machine could run with reasonable fidelity — was useful for a few years but ran out of headroom when machines crossed the threshold where the metric saturated.

The deeper problem with both metrics is that random circuits are not the workloads anyone actually wants to run. They are stress tests, useful for hardware characterization, not proxies for whether the machine can do chemistry simulation or factor a number. By 2024 the field had broadly accepted that the only benchmarks that matter are ones tied to actual algorithms with stated error budgets — and that the headline number anyone should care about is logical qubit count.

Willow and error correction below threshold

The Google Willow result published in Nature in December 2024 is the most important quantum benchmark of the last several years. The experimental setup: take the surface code, encode a single logical qubit at code distances 3, 5, and 7 on the same physical chip, and measure how the logical error rate scales. The theoretical prediction below threshold is exponential suppression: every increase in code distance by 2 should divide the logical error rate by a factor Λ (lambda). Any Λ above 1 means the hardware is below threshold, and Λ grows as the physical error rate falls further below it; a Λ of around 2 is what hardware operating modestly below threshold delivers.

What Willow showed: a Λ of approximately 2.14 across the d=3, 5, 7 progression. The distance-7 logical error rate was about 0.143 percent per error-correction cycle, and each step down in distance roughly doubles it, which puts distance 5 at around 0.3 percent and distance 3 at around 0.65 percent. That is the textbook signature of being below threshold, demonstrated for a surface code on real hardware for the first time. The implication is that scaling up the code distance and adding more physical qubits will continue to drive logical error rates down, which is the whole engineering thesis of fault-tolerant quantum computing.
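The suppression rule is easy to turn into a back-of-the-envelope projection. The sketch below assumes the simple model in which every distance step of 2 divides the logical error rate by a constant Λ. The function names and the 1 percent reference rate are illustrative placeholders, not Willow measurements, and real hardware is not guaranteed to hold Λ constant at large distances.

```python
def logical_error(eps_ref: float, d_ref: int, d: int, lam: float) -> float:
    """Project the logical error rate at distance d from a reference distance.

    Assumes constant suppression: each +2 in distance divides the rate by lam.
    """
    if (d - d_ref) % 2 != 0:
        raise ValueError("surface-code distances are compared in steps of 2")
    steps = (d - d_ref) // 2
    return eps_ref / lam ** steps


def distance_for_target(eps_ref: float, d_ref: int, target: float, lam: float) -> int:
    """Smallest distance (stepping by 2 from d_ref) with projected error <= target."""
    if lam <= 1:
        raise ValueError("lam <= 1 means no suppression: the hardware is not below threshold")
    d = d_ref
    while logical_error(eps_ref, d_ref, d, lam) > target:
        d += 2
    return d


if __name__ == "__main__":
    LAMBDA = 2.14     # suppression factor quoted in the text
    EPS_AT_D3 = 0.01  # hypothetical reference rate, for illustration only
    for d in (3, 5, 7, 9, 11):
        print(d, logical_error(EPS_AT_D3, 3, d, LAMBDA))
    print("distance for 1e-6:", distance_for_target(EPS_AT_D3, 3, 1e-6, LAMBDA))
```

The useful takeaway is the shape: with Λ near 2, every additional order of magnitude in logical error rate costs several more distance steps, which is why improving Λ matters as much as adding qubits.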

[Figure: Quantum processor close-up showing a surface code lattice with code distance highlighted]

The careful caveats: this was a single logical qubit, not a logical gate between two of them, and the physical error rate on Willow is still close to the threshold rather than far below it. Going from one logical qubit at distance 7 to a hundred logical qubits running a useful algorithm is still a journey measured in years. But the demonstration that the curve bends in the right direction is what the field needed.
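A rough qubit count shows how far one logical qubit is from a hundred. The sketch below assumes the rotated surface code layout, which uses d² data qubits plus d² − 1 measure qubits per logical qubit. It ignores routing space, magic-state factories, and any extra qubits a particular chip dedicates to leakage removal, so treat the output as a floor, not a design figure. The function names are hypothetical.

```python
def surface_code_qubits(d: int) -> int:
    """Physical qubits for one rotated-surface-code logical qubit at odd distance d."""
    if d < 3 or d % 2 == 0:
        raise ValueError("expected an odd code distance of at least 3")
    data = d * d         # data qubits on a d x d grid
    measure = d * d - 1  # one measure qubit per stabilizer
    return data + measure


def machine_qubits(logical_qubits: int, d: int) -> int:
    """Lower bound on physical qubits: memory patches only, no routing or factories."""
    return logical_qubits * surface_code_qubits(d)


if __name__ == "__main__":
    for d in (3, 5, 7):
        print(f"d={d}: {surface_code_qubits(d)} physical qubits per logical qubit")
    print("100 logical qubits at d=7:", machine_qubits(100, 7))
```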

IBM and the quantum volume trade-off

IBM’s published quantum-volume numbers (512 on the Heron family, which is 2 to the power of 9 and means random square circuits 9 qubits wide and 9 layers deep passed the heavy-output test) became less central to IBM’s own messaging through 2024 and 2025 as the company pivoted toward logical qubit demonstrations and the qLDPC roadmap. The honest read is that quantum volume served its purpose: it forced vendors to disclose error rates and circuit depths in a comparable way. Its successor as the headline number is logical qubit count.
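For readers who have not worked with the metric, quantum volume is reported as 2 to the power n, where n is the largest width (with equal depth) at which random square circuits still produce "heavy" outputs more than two thirds of the time. The sketch below is a simplified illustration of that bookkeeping: the function name and the sample probabilities are made up, and the real protocol also requires a statistical confidence bound on each heavy-output estimate, which is omitted here.

```python
HEAVY_OUTPUT_THRESHOLD = 2 / 3  # pass mark used by the quantum-volume test


def quantum_volume(heavy_output_prob: dict[int, float]) -> int:
    """Return 2**n for the largest circuit width n whose heavy-output probability passes.

    heavy_output_prob maps circuit width (= depth) to the measured
    heavy-output probability. Returns 1 if no width passes.
    """
    passing = [n for n, p in heavy_output_prob.items() if p > HEAVY_OUTPUT_THRESHOLD]
    return 2 ** max(passing) if passing else 1


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    measured = {2: 0.80, 3: 0.72, 4: 0.61}
    print(quantum_volume(measured))
```

Because the score is exponential in n, a jump from 256 to 512 corresponds to passing at just one more qubit of width and one more layer of depth, which is part of why the metric ran out of headroom as a headline number.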

The IBM result that does matter is the runtime versus fidelity trade-off. IBM’s “dynamic circuits” support — mid-circuit measurements with conditional logic — enables real-time error syndrome extraction and is the prerequisite for error correction at scale. The 2024-2025 work demonstrating real-time qLDPC decoding on Heron is the kind of engineering result that does not make headlines but underwrites the path to fault tolerance more than any benchmark number.

Quantinuum, ion-trap logical qubits, and the carbon code

Quantinuum has been the most aggressive about publishing logical-qubit results. The April 2024 work with Microsoft on the H2 hardware demonstrated 4 logical qubits, including experiments with the carbon code, with logical error rates up to roughly 800 times lower than the physical error rate. A September 2024 follow-up on H2 reached 12 logical qubits, and further work through 2025 pushed to higher logical-qubit counts and demonstrated logical entanglement and basic logical gates.

The ion-trap advantage that shows up in these benchmarks is all-to-all connectivity and very high physical fidelity. The disadvantage is gate time — ion-trap circuits run at millisecond rather than microsecond pace, which limits the wall-clock shot count and therefore the statistical confidence per published result. Microsoft and Quantinuum’s joint roadmap targets hundreds of reliable logical qubits by 2027-2028.
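The link between gate time and statistical confidence is simple arithmetic. The sketch below assumes each shot is limited only by circuit duration (no reset, readout, or queueing overhead) and uses the binomial standard error for an estimated error probability. The 1 ms and 1 µs shot durations are round numbers standing in for "millisecond" and "microsecond" pace, not measured figures for any machine.

```python
import math


def shots_in(wall_clock_s: int, shot_duration_us: int) -> int:
    """Number of whole shots that fit in a wall-clock budget (integer arithmetic)."""
    return wall_clock_s * 1_000_000 // shot_duration_us


def binomial_std_error(p: float, shots: int) -> float:
    """Standard error of an estimated probability p after the given number of shots."""
    return math.sqrt(p * (1 - p) / shots)


if __name__ == "__main__":
    HOUR = 3600
    P_LOGICAL = 1e-3  # hypothetical logical error probability being estimated
    for label, shot_us in (("millisecond-pace", 1000), ("microsecond-pace", 1)):
        n = shots_in(HOUR, shot_us)
        err = binomial_std_error(P_LOGICAL, n)
        print(f"{label}: {n} shots/hour, std error {err:.2e}")
```

A thousandfold difference in shot rate only buys about a thirtyfold tighter error bar, since the standard error shrinks with the square root of the shot count. The slower platform can still publish solid numbers, but each data point costs more machine time.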

Atom Computing, neutral atoms, and the count race

The Atom Computing 1180-qubit array announced in late 2023 was, at the time, the largest announced qubit count on any gate-based platform. Neutral atoms, held in optical tweezer traps and manipulated with Rydberg interactions, scale far more cheaply in physical qubit count than superconducting or ion-trap hardware, with the trade-off being that two-qubit gate fidelities have historically been the weakest of the three leading platforms. The Atom Computing and Microsoft 2024 partnership demonstrated 24 logical qubits, which at the time was the highest count of any platform.

The follow-up question is whether neutral-atom fidelities will catch up to ion-trap and superconducting through 2026-2027. The published results from Atom Computing, QuEra, Pasqal, and Infleqtion all point to that being plausible. If it lands, the neutral-atom platform’s economics — much cheaper per physical qubit, much easier to scale to thousands of qubits — become the leading path to large fault-tolerant machines.

MBQC and the architectural alternative

Measurement-Based Quantum Computing is the architectural model that PsiQuantum and a handful of others are pursuing. The contrast with the gate-based model: in gate-based computing, you prepare qubits, apply a sequence of gates, and measure at the end. In MBQC, you prepare a large entangled cluster state up front, then measure qubits one by one in adaptively-chosen bases, and the pattern of measurements implements the computation.

[Figure: Cluster state and measurement-based quantum computing motif]

The advantage MBQC gives is that the hard part — preparing the entanglement — can be done offline and in parallel, and the on-the-fly computation is just measurements and feed-forward classical control. For photonic platforms where you can mass-produce entangled photon pairs but cannot reliably store them for long, this maps to the physics. For superconducting or ion-trap, the gate-based model is the better fit. The two models are theoretically equivalent — any MBQC computation can be expressed as a gate-based circuit and vice versa — but the engineering trade-offs are very different.
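The measure-then-correct pattern can be shown with the smallest possible case: one input qubit entangled with one fresh |+> qubit by a CZ gate, followed by a measurement of the input qubit in a rotated basis. This is a plain-Python sketch of the standard textbook identity, not any vendor's implementation. Measuring in the basis (|0> ± e^{iθ}|1>)/√2 with outcome s leaves the second qubit in X^s · H · P(−θ) applied to the input, where P(φ) = diag(1, e^{iφ}). The function names are illustrative.

```python
import cmath
import math
import random

R = 1 / math.sqrt(2)


def mbqc_step(psi, theta, rng=random):
    """Entangle psi with |+> via CZ, measure qubit 1 at angle theta.

    Returns (outcome s, normalised state of qubit 2).
    """
    a, b = psi
    # Amplitudes indexed [q1][q2] after CZ on |psi>|+>; CZ flips the sign of |11>.
    state = [[a * R, a * R], [b * R, -b * R]]
    e = cmath.exp(-1j * theta)  # conjugated phase from the measurement bra
    branches = []
    for sign in (+1, -1):  # outcome 0 uses +, outcome 1 uses -
        branches.append([R * (state[0][q] + sign * e * state[1][q]) for q in (0, 1)])
    probs = [sum(abs(x) ** 2 for x in v) for v in branches]
    s = 0 if rng.random() < probs[0] else 1
    norm = math.sqrt(probs[s])
    return s, [x / norm for x in branches[s]]


def expected_output(psi, theta, s):
    """X^s * H * P(-theta) applied to psi, the gate the measurement implements."""
    a, b = psi
    pb = cmath.exp(-1j * theta) * b       # P(-theta)
    out = [R * (a + pb), R * (a - pb)]    # Hadamard
    return out[::-1] if s else out        # byproduct X when the outcome is 1


def fidelity(u, v):
    """Squared overlap between two normalised single-qubit states."""
    return abs(sum(x.conjugate() * y for x, y in zip(u, v))) ** 2


if __name__ == "__main__":
    psi = [0.6, 0.8j]
    s, out = mbqc_step(psi, theta=math.pi / 3)
    print("outcome:", s, "fidelity:", fidelity(out, expected_output(psi, math.pi / 3, s)))
```

The outcome s is random, but the classical controller knows it and can either apply the X correction or fold it into the choice of later measurement angles. That bookkeeping is the feed-forward classical control the text refers to, and it is the only thing that has to happen in real time.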

PsiQuantum’s bet is essentially that MBQC on silicon photonics is the path to a million-qubit fault-tolerant machine. Whether that bet pays in 2027-2028 is one of the most-watched questions in the field.

What “advantage” should actually mean in 2026

The phrase quantum advantage is most credible when constrained to specific problem classes with concrete error budgets. The honest 2026 framing:

Random circuit sampling. Demonstrated quantum advantage in the sense that the best classical simulators are slower than Willow on the same task. Almost certainly not useful for any practical workload.

Quantum simulation of small molecules. Demonstrated in the 1-100 qubit regime; logical qubits would extend this to much larger systems. Real path to chemistry and materials applications, but on a fault-tolerant timeline.

Quantum machine learning. The “exponential speedup” claims from 2019-2021 have largely not held up under classical-algorithm improvements. The defensible 2026 position is that quantum ML is interesting research but not a near-term advantage candidate.

Factoring and cryptanalysis. Breaking 2048-bit RSA requires many thousands of logical qubits and many hours of runtime. That is roadmap-credible by the mid-2030s, not before, which is why the post-quantum migration is the only piece of the quantum story that has a 2026 deadline.

Optimization. The benchmarks have been the messiest here. QAOA and quantum annealing results have not consistently beaten the best classical heuristics. The 2026 read is that quantum optimization is research-stage.

How a platform team should use this in 2026

For most enterprise platform teams the practical conclusion is narrow. Pay attention to the quantum computing landscape generally. Track Google, IBM, Quantinuum, IonQ, PsiQuantum, Atom Computing, and the next logical qubit count milestones. Treat the benchmarks as evidence of engineering progress, not as buy signals for hardware. The post-quantum migration is the only urgent action.

For our cloud infrastructure work, we tell platform leaders that the right knowledge investment in 2026 is having one person on the team who can read the Nature papers and translate the result to leadership. That is much cheaper than building a quantum team and gives the org the option to act when the curve bends. The curve, finally, looks like it might.

Benchmarks are finally honest about what they measure. If your organization needs help separating real quantum progress from press releases for executive briefings or roadmap planning, our cloud infrastructure team tracks the field for clients. Tell us what you are evaluating.
