TiDB Distributed SQL in 2026: HTAP, Serverless, and When It Beats Aurora

TiDB is the distributed SQL database with HTAP built in. The architecture, the multi-region patterns, and when MySQL wire compatibility plus TiFlash beats Aurora or AlloyDB.

If you have outgrown a single MySQL or Postgres primary, the next step is not always sharding the way you were taught in 2018. TiDB has spent six years quietly turning into the credible distributed SQL story for teams that want horizontal scale without rewriting applications or running a separate analytics database for dashboards. In 2026, with TiDB Serverless on AWS Marketplace and TiFlash maturing into a real HTAP column store, the conversation has shifted from “interesting Chinese database” to “what we ship when Aurora hits a ceiling.”

What TiDB actually is

TiDB is three components pretending to be one MySQL-compatible database:

  • TiDB: the stateless SQL layer. Parses the MySQL wire protocol, plans queries, and pushes work down to storage. Scale by adding pods.
  • TiKV: the distributed key-value store. Range-partitioned into regions (96 MiB by default through v8.3, 256 MiB from v8.4), Raft-replicated, written in Rust. This is your row store.
  • TiFlash: the columnar replica that hangs off TiKV via the Raft log. Same data, column layout, vectorized execution. This is your real-time analytics store.

Plus PD (Placement Driver) for metadata and region scheduling. The whole thing speaks MySQL 8.0 wire protocol, so most apps written against MySQL move with config changes only.

The trick worth understanding: TiFlash is not a separate ETL pipeline. It is a learner replica of TiKV regions. Writes hit TiKV, replicate via Raft, and TiFlash applies them to columnar storage. Sub-second freshness for analytics over operational data, without Debezium or Fivetran in the middle.
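
The learner relationship is easier to see in a toy model. The sketch below is not TiKV or TiFlash code: `RegionGroup`, `Voter`, and `ColumnarLearner` are hypothetical names, and it assumes a single Raft group in which only voters count toward a commit while the learner receives the same log on a separate path.

```python
from dataclasses import dataclass, field


@dataclass
class Voter:
    """Row-store replica that votes on commits (stands in for TiKV)."""
    rows: dict = field(default_factory=dict)  # key -> full row
    applied: int = 0  # number of log entries applied so far

    def apply(self, log):
        for key, row in log[self.applied:]:
            self.rows[key] = row
        self.applied = len(log)


@dataclass
class ColumnarLearner:
    """Non-voting replica storing the same data by column (stands in for TiFlash)."""
    columns: dict = field(default_factory=dict)  # column -> {key: value}
    applied: int = 0

    def apply(self, log):
        for key, row in log[self.applied:]:
            for col, value in row.items():
                self.columns.setdefault(col, {})[key] = value
        self.applied = len(log)

    def column_sum(self, col):
        return sum(self.columns.get(col, {}).values())


class RegionGroup:
    """One toy Raft group: a shared log, N voters, one learner."""

    def __init__(self, voters=3):
        self.log = []
        self.voters = [Voter() for _ in range(voters)]
        self.learner = ColumnarLearner()

    def write(self, key, row, acks):
        """Commit only if a majority of voters acknowledge; the learner never counts."""
        if acks * 2 <= len(self.voters):
            return False
        self.log.append((key, row))
        for voter in self.voters[:acks]:
            voter.apply(self.log)
        return True

    def replicate_to_learner(self):
        """Ship committed entries to the learner, independent of the commit path."""
        self.learner.apply(self.log)


group = RegionGroup()
group.write("order:1", {"amount": 40, "qty": 2}, acks=2)
group.write("order:2", {"amount": 60, "qty": 1}, acks=3)
before = group.learner.column_sum("amount")  # learner has not received the log yet
group.replicate_to_learner()
after = group.learner.column_sum("amount")
print(before, after)
```

The point of the model is that write commits never wait on the columnar copy. One simplification to keep in mind: as far as we understand it, a real TiFlash read first confirms it has caught up to the required log position, so the stale `before` value is something this toy exposes only for illustration.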

[Figure: TiDB HTAP layout]

Internals you will actually hit in production

The Raft groups matter. Each TiKV region is its own Raft group with three (or five) replicas. PD constantly rebalances regions across stores to keep hot keys spread. The implications:

  • Hotspots are physical. A sequentially increasing primary key (auto-increment, timestamp) parks every write on one region leader. Use AUTO_RANDOM or shard the key. We have debugged enough P99 spikes traced to a single auto-increment ID to consider this the first slide of any TiDB onboarding.
  • Transactions are two-phase. TiDB uses Percolator-style two-phase commit with a timestamp oracle (TSO) in PD. Optimistic mode was the default in early versions; new clusters have defaulted to pessimistic mode since v3.0.8. Read the transaction isolation docs once and stop guessing.
  • Queries reach the right engine automatically. For each table a query touches, the optimizer decides on cost whether to read from TiKV or TiFlash. You can hint, but the default is reasonable.
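
The hotspot bullet can be made concrete with a small simulation. This is a toy model and not TiDB's allocator: it assumes the key space is pre-split into 32 equal ranges (`NUM_REGIONS`, hypothetical) and that the randomized scheme puts 5 random bits (`SHARD_BITS`, an assumption) at the high end of the key, loosely in the spirit of AUTO_RANDOM.

```python
import random
from collections import Counter

KEY_BITS = 63     # toy key space: non-negative 64-bit integers
SHARD_BITS = 5    # assumed number of random high bits
NUM_REGIONS = 32  # hypothetical: key space pre-split into equal ranges


def region_of(key):
    """Index of the range-partitioned region that owns this key."""
    return (key * NUM_REGIONS) >> KEY_BITS


def sequential_ids(n):
    """Auto-increment style keys: 1, 2, 3, ..."""
    return range(1, n + 1)


def shard_prefixed_ids(n, rng):
    """Same sequence, with random bits placed in the high end of each key."""
    shift = KEY_BITS - SHARD_BITS
    for seq in range(1, n + 1):
        yield (rng.getrandbits(SHARD_BITS) << shift) | seq


def writes_per_region(keys):
    """Count how many writes land on each region."""
    return Counter(region_of(k) for k in keys)


n = 10_000
seq = writes_per_region(sequential_ids(n))
rnd = writes_per_region(shard_prefixed_ids(n, random.Random(7)))
print("sequential: regions hit =", len(seq), "busiest share =", max(seq.values()) / n)
print("randomized: regions hit =", len(rnd), "busiest share =", round(max(rnd.values()) / n, 3))
```

In this model every sequential insert lands on the same region, while the shard-prefixed keys spread across all ranges. Real clusters split regions by size and load rather than evenly up front, so treat the numbers as directional only.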

The serverless offering (TiDB Cloud Serverless, formerly Serverless Tier) collapses this whole stack behind a connection string. You pay per Request Unit (RU) consumed plus storage, scale to zero, and never see TiKV. It runs on AWS in several regions including Singapore and Tokyo, which matters for international teams shipping across APAC.

Performance characteristics

On a properly sized cluster (three PD, three TiKV, two TiDB, two TiFlash on 8-core nodes), expect:

  • Point reads at low single-digit milliseconds. Same shape as a sharded MySQL.
  • Point writes in the 5 to 15 ms range, dominated by Raft commit latency. Clusters on NVMe storage with tuned Raft batching do better.
  • Analytical queries on TiFlash — full-table aggregations on tens of millions of rows in well under a second. Not ClickHouse fast, but on the same operational data with sub-second freshness.
  • Throughput scales near-linearly with TiKV nodes until you hit network or PD bottlenecks. Real production clusters at PingCAP customers run hundreds of TB and millions of QPS.

The honest tail-latency story: TiDB has more moving parts than Aurora, so P99 is harder to flatten. Watch for region splits, hot regions, and leader transfers. The PD dashboard tells you all of this if you bother to read it.

Ops realities

TiDB self-hosted is not light. You operate PD, TiKV, TiDB pods, TiFlash, plus TiUP (the deployment tool) or the TiDB Operator on Kubernetes. Backups via BR (Backup and Restore) to S3. CDC via TiCDC. Monitoring through the bundled Grafana plus the in-app dashboards.

For most teams under ten engineers, TiDB Cloud (Dedicated or Serverless) is the right call. Dedicated connects to your VPC through peering and gives you most of the knobs. Serverless is the right starting point for new builds where you do not yet know the workload shape, and the AWS Marketplace listing makes procurement straightforward for US and EU enterprises.

Upgrades are the boring part that bites you. Minor version upgrades are rolling and smooth. Major versions (6 to 7, 7 to 8) need a real staging dry run and a backup. Schema changes are online but DDL on huge tables can take hours.

When TiDB beats Aurora or AlloyDB

There is no point pretending TiDB wins everywhere. The honest cases where we pick it:

  • Multi-region active-active. Aurora Global Database is one-region-writes. TiDB with placement rules can do regional-leader writes with cross-region replicas. Real for global SaaS with regulatory data-locality and cross-region read latency budgets.
  • HTAP without a second pipeline. When the same data needs to serve transactions and dashboards and your team does not want to operate Kafka plus CDC plus Snowflake. TiFlash collapses that.
  • Horizontal scale on MySQL workloads beyond the single-writer Aurora ceiling. We have migrated a few APAC fintech workloads from Aurora MySQL to TiDB Cloud when write QPS pushed past 60k and sharding-as-a-service was the only alternative.
  • Open source posture. Apache 2.0, no BSL drama, vendor neutral. Same engine self-hosted or on TiDB Cloud.

Cases where Aurora or AlloyDB still wins:

  • Single-region OLTP with no analytics need. Aurora is simpler. AlloyDB is faster on pure Postgres workloads.
  • Postgres-heavy apps. TiDB is MySQL-compatible, not Postgres. YugabyteDB is the closer comparison there.
  • Teams already deep in AWS RDS tooling and IAM. The integration density is real.
  • Workloads under 1 TB with predictable load. The distributed overhead is not free.

[Figure: Multi-region deployment]

Multi-region patterns

The three deployments we ship most often:

  1. Single-region, three-AZ. Default. TiKV replicas spread across AZs. PD quorum tolerates one AZ loss. This is 80 percent of deployments.
  2. Two-region with a witness. Primary region has two replicas, secondary has one, plus a small PD-only witness region for quorum. Survives single-region loss with RPO near zero. Used for regulated workloads.
  3. Global with placement rules. Tables and partitions pinned to specific regions based on data residency rules. EU customer rows in eu-west, APAC in ap-southeast. Cross-region reads via follower reads or TiFlash replicas. Complex but real.
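
Replica placement drives write latency because a Raft leader must hear back from enough followers to form a majority. The sketch below estimates only that network wait; it ignores disk flushes and the timestamp fetch from PD. The round-trip figures are hypothetical placeholders, so substitute measurements from your own regions.

```python
def quorum_commit_rtt(follower_rtts_ms):
    """Network wait for one Raft commit, given leader-to-follower round trips.

    The leader counts as one vote, so it needs acks from (voters // 2) followers
    and waits for the slowest of the fastest such followers.
    """
    voters = len(follower_rtts_ms) + 1
    acks_needed = voters // 2
    if acks_needed == 0:
        return 0.0  # single-replica group: nothing to wait for
    return sorted(follower_rtts_ms)[acks_needed - 1]


CROSS_AZ_MS = 1.0       # hypothetical round trip between AZs in one region
CROSS_REGION_MS = 70.0  # hypothetical round trip between two regions

topologies = {
    "three-AZ, leader in any AZ": [CROSS_AZ_MS, CROSS_AZ_MS],
    "two-region 2+1, leader in primary": [CROSS_AZ_MS, CROSS_REGION_MS],
    "two-region 2+1, leader in secondary": [CROSS_REGION_MS, CROSS_REGION_MS],
}

for name, rtts in topologies.items():
    print(f"{name}: {quorum_commit_rtt(rtts)} ms")
```

Under these assumptions a 2+1 layout commits at local speed only while leaders stay in the primary region, which is one reason to pin leaders deliberately rather than leave them wherever PD last moved them.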

Placement Rules in SQL (ALTER TABLE ... PLACEMENT POLICY) makes this declarative since TiDB 6.0. Worth learning before you commit to a topology.
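
As a rough illustration of what the declarative form looks like, the helper below builds the two statements involved. The statement shapes follow our reading of the TiDB placement documentation and should be verified against your version. The policy, table, partition, and region names are hypothetical, and region names must match the labels configured on your TiKV stores.

```python
import re

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_ident(name):
    """Accept only plain identifiers so they are safe to splice into SQL text."""
    if not _IDENT.match(name):
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def placement_policy_sql(policy, primary_region, regions):
    """Build a CREATE PLACEMENT POLICY statement for the given regions."""
    if primary_region not in regions:
        raise ValueError("primary_region must be one of regions")
    region_list = ",".join(regions)
    return (
        f"CREATE PLACEMENT POLICY {_check_ident(policy)} "
        f'PRIMARY_REGION="{primary_region}" REGIONS="{region_list}"'
    )


def attach_policy_sql(table, policy, partition=None):
    """Attach a policy to a whole table, or to a single partition of it."""
    target = f"ALTER TABLE {_check_ident(table)}"
    if partition is not None:
        target += f" PARTITION {_check_ident(partition)}"
    return f"{target} PLACEMENT POLICY={_check_ident(policy)}"


# Hypothetical residency setup: EU rows stay in one EU region.
statements = [
    placement_policy_sql("eu_only", "eu-west-1", ["eu-west-1"]),
    attach_policy_sql("customers", "eu_only", partition="p_eu"),
]
for stmt in statements:
    print(stmt + ";")
```

Generating the statements from one place keeps residency rules reviewable in code, but the SQL itself is the interface; running it by hand works just as well.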

The cost story

TiDB Cloud Serverless starts at a generous free tier and scales linearly with RUs. For a workload doing 100M read ops and 10M write ops per month with 100 GB stored, you land in the low hundreds of dollars. Aurora Serverless v2 at the same shape is similar or slightly less.

TiDB Cloud Dedicated is meaningfully more expensive than Aurora at small scale (three-node minimums) but flips at high write throughput where Aurora would force you into sharding. Self-hosted TiDB is the cheapest at large scale and the most expensive in engineering time. We tell clients to run the spreadsheet honestly.

Where pdpspectra fits

We have stood up TiDB on TiDB Cloud Dedicated for a Southeast Asian payments platform that outgrew Aurora MySQL, and self-hosted on Kubernetes for an enterprise client with strict on-prem requirements. Both shipped with operational runbooks for region splits, hot-region remediation, and TiFlash sizing.

If you are deciding between Aurora, AlloyDB, and TiDB for a workload that is starting to feel single-primary-shaped, our data engineering team will run the comparison against your real query patterns and tell you which one we would ship. We do not have a preferred answer — we have a preferred way of arriving at one.

Distributed SQL is no longer experimental. The teams winning with it are the ones who picked it for the right reason. Tell us about your workload and we will tell you whether TiDB is the right fit.