Feast vs Tecton: Picking a Feature Store (or Skipping One)
Feature stores solve a real problem and add real ops cost. When to deploy Feast, when to pay for Tecton, and when you don't need either.
Feature stores are one of those MLOps components that sound critical until you actually try to deploy one. The pitch is real: train/serve skew is among the most common causes of silent ML production failures, and feature stores address it. The reality is that they’re heavy infrastructure that pays off only for specific workloads.
We’ve deployed Feast for hospital ML pipelines and evaluated Tecton for banking fraud detection. Most projects we work on never need either — Postgres + good engineering does the job. Here’s how the decision actually plays out.
What problem feature stores solve
In ML systems, you compute features in two places:
- Training: batch over historical data (Snowflake, Spark, dbt) to compute features for training examples.
- Serving: online compute per-prediction (read user_id → fetch their recent activity from Postgres → compute a feature vector → pass to model).
Train/serve skew means the features computed in training don’t match the features computed in serving, because the code lives in two places and drifts apart. Say a model is trained on “average order value last 30 days” computed with one bucketing logic, while serving computes it with slightly different logic. Nothing errors out; model performance silently degrades.
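To see how small the drift can be, here is a minimal sketch of the “average order value last 30 days” case. The order data, dates, and function names are hypothetical; the only difference between the two paths is whether the window start is inclusive.

```python
from datetime import date, timedelta

# Hypothetical order history for one user: (order_date, amount).
ORDERS = [
    (date(2024, 5, 1), 100.0),
    (date(2024, 5, 15), 50.0),
    (date(2024, 5, 31), 30.0),
]
AS_OF = date(2024, 5, 31)


def aov_30d_training(orders, as_of):
    """Batch-side logic: the window includes the day exactly 30 days back."""
    start = as_of - timedelta(days=30)
    amounts = [amt for day, amt in orders if start <= day <= as_of]
    return sum(amounts) / len(amounts) if amounts else 0.0


def aov_30d_serving(orders, as_of):
    """Serving-side rewrite: same intent, but the window start is exclusive."""
    start = as_of - timedelta(days=30)
    amounts = [amt for day, amt in orders if start < day <= as_of]
    return sum(amounts) / len(amounts) if amounts else 0.0


print(aov_30d_training(ORDERS, AS_OF))  # 60.0
print(aov_30d_serving(ORDERS, AS_OF))   # 40.0
```

Both functions look correct in review, and neither raises an error. The model simply sees 40.0 in production where it was trained on 60.0.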
Feature stores solve this by being the single source of truth for feature definitions. You write the transformation once; the store materializes it both for training (offline store) and serving (online store).
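The “write it once” idea can be sketched without any feature store at all. The following is plain Python, not the Feast or Tecton API, and every name in it is hypothetical. A real store would also materialize the values into offline and online storage rather than compute them on the fly; the sketch only shows both paths calling one definition.

```python
from datetime import date, timedelta


def avg_order_value_30d(orders, as_of):
    """The one definition: mean order amount in the 30 days before as_of (start exclusive)."""
    start = as_of - timedelta(days=30)
    amounts = [amt for day, amt in orders if start < day <= as_of]
    return sum(amounts) / len(amounts) if amounts else 0.0


def build_training_rows(history, label_events):
    """Offline path: one feature row per (user_id, label date) from historical data."""
    return [
        {"user_id": uid, "as_of": ts, "aov_30d": avg_order_value_30d(history[uid], ts)}
        for uid, ts in label_events
    ]


def build_serving_vector(history, user_id, today):
    """Online path: the same function, evaluated at request time."""
    return {"user_id": user_id, "aov_30d": avg_order_value_30d(history[user_id], today)}


# Hypothetical data: the same user and date must yield the same feature value.
history = {"u1": [(date(2024, 5, 15), 50.0), (date(2024, 5, 31), 30.0)]}
train_row = build_training_rows(history, [("u1", date(2024, 5, 31))])[0]
serve_vec = build_serving_vector(history, "u1", date(2024, 5, 31))
assert train_row["aov_30d"] == serve_vec["aov_30d"]
```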
What’s actually different
| Dimension | Feast | Tecton |
|---|---|---|
| License | Apache 2.0 (OSS) | Proprietary (commercial) |
| Hosting | Self-hosted | SaaS only |
| Pricing | Infrastructure cost | Per-seat + per-feature pricing |
| Online store | Redis, DynamoDB, Postgres, Bigtable, MS SQL, others | Built-in |
| Offline store | BigQuery, Snowflake, Redshift, Postgres, Spark, others | Built-in (Spark-based) |
| Streaming features | Possible (via integrations) | First-class |
| Feature transformations | Pandas/PySpark/SQL | Their DSL + Pandas/Spark |
| Point-in-time joins | Yes | Yes (more mature) |
| Monitoring | DIY | Built-in |
| Compute | Bring your own (Spark, etc.) | Managed |
| Ops surface | Real (you operate stores + transforms) | Minimal (managed) |
| Production maturity | Growing | Mature (used at fintechs, retail) |
Where Feast wins
Open source. No vendor lock-in. Apache 2.0. Your features, your code, your infrastructure.
Modular architecture. Pick your online store (Redis is a common production choice), offline store (your existing warehouse), and compute engine. It doesn’t impose a stack.
Cheap at scale. Once running, marginal cost is just your infrastructure. No per-feature licensing.
Self-hostable in regulated environments. Healthcare, finance, government workloads where data can’t leave your network — Feast self-hosted works.
Where Feast hurts:
- Operational surface is real. You run the registry, the online store, the offline store, the materialization jobs.
- Streaming features are harder than batch. Most production Feast deployments are batch-only.
- Documentation and tutorials are decent but not as polished as Tecton’s enterprise materials.
Where Tecton wins
Managed. No infrastructure to operate. Feature engineers can define features and ship them without involving the platform team.
Streaming features are first-class. Real-time feature computation from Kafka/Kinesis with sub-second freshness. Genuinely hard to build yourself.
Point-in-time correctness is robust. Tecton handles backfills, late-arriving data, and time-window features more reliably than rolling your own.
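To make “point-in-time correctness” concrete: for each training example, the feature value must be the one that was known at the example’s timestamp, never a later one. The following is a minimal standard-library sketch of that lookup, not Tecton’s or Feast’s implementation; the feature log and function name are hypothetical. In pandas, `merge_asof` performs the same backward-looking join over whole tables.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature log for one user, sorted by timestamp:
# (when the value became known, transaction count so far).
FEATURE_LOG = [
    (datetime(2024, 1, 1), 3),
    (datetime(2024, 1, 10), 7),
    (datetime(2024, 1, 20), 12),
]


def feature_as_of(log, event_time):
    """Return the latest value known at or before event_time, or None if none exists."""
    timestamps = [ts for ts, _ in log]
    idx = bisect_right(timestamps, event_time)
    return log[idx - 1][1] if idx else None


label_time = datetime(2024, 1, 15)
print(feature_as_of(FEATURE_LOG, label_time))  # 7
print(FEATURE_LOG[-1][1])                      # 12: the naive "latest value" leaks the future
```

Backfills and late-arriving data are what make this hard in practice: the log itself changes after the fact, which this sketch does not model.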
Built-in monitoring. Feature drift, freshness SLAs, materialization health — all included in the product.
Enterprise support. Real SLAs, real support engineers, real documentation for compliance reviews.
Where Tecton hurts:
- Cost. Per-feature pricing at scale gets expensive fast. We’ve seen six-figure annual bills for moderate-sized deployments.
- SaaS-only. Data leaves your network. Some compliance regimes won’t allow this.
- Vendor lock-in. Your feature definitions live in their DSL.
- Stack assumptions. Integration is easier if your stack matches what Tecton expects.
When you actually need a feature store
Honest assessment: most ML workloads don’t need one.
You probably need a feature store if:
- You have multiple teams building models that share feature definitions (“logged-in user behavior” is computed the same way for fraud detection and for recommendations).
- You have real-time features (current user session activity, current transaction velocity, etc.) — train/serve skew here is essentially guaranteed without a store.
- You have hundreds of features being computed for many models. A spreadsheet of “who owns which feature” becomes a real maintenance burden.
- You have strict feature governance requirements (PII features need access control, audit logging).
You probably DON’T need a feature store if:
- You have one model or a handful of models with simple, well-defined features.
- Your features are mostly batch-computed from your warehouse.
- Your team is small and the same engineers own training and serving code.
- You can express your features as SQL views in your warehouse and serve them via a simple cache.
For most workloads we deploy — a single fraud detection model for one bank, an HMS risk-scoring model for one hospital — we don’t deploy a feature store. We use:
- dbt models in the warehouse for feature definitions (see our dbt advanced patterns piece).
- Materialized views or scheduled jobs to push the latest features to a serving database (Postgres usually).
- The serving code reads from Postgres at inference time.
- A reconciliation test in CI: hash the training feature values vs serving feature values for a sample; alarm on mismatch.
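The reconciliation test in the last bullet can be sketched roughly as follows. This is one possible shape, not our exact CI code: the function names, the row layout (entity ID mapped to a dict of feature values), and the rounding precision are all assumptions. Floats are rounded before hashing so harmless floating-point noise doesn’t trip the alarm.

```python
import hashlib
import json


def feature_fingerprint(row, precision=6):
    """Stable hash of one feature row, with floats rounded to `precision` places."""
    normalized = {
        name: round(value, precision) if isinstance(value, float) else value
        for name, value in row.items()
    }
    payload = json.dumps(normalized, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def find_mismatches(training_rows, serving_rows):
    """Return sorted entity IDs whose rows differ or exist on only one side."""
    mismatched = []
    for entity_id in sorted(set(training_rows) | set(serving_rows)):
        train = training_rows.get(entity_id)
        serve = serving_rows.get(entity_id)
        if train is None or serve is None:
            mismatched.append(entity_id)
        elif feature_fingerprint(train) != feature_fingerprint(serve):
            mismatched.append(entity_id)
    return mismatched


# Hypothetical sample: u1 differs only by float noise, u2 genuinely drifted.
training = {
    "u1": {"aov_30d": 40.0, "orders_30d": 2},
    "u2": {"aov_30d": 60.0, "orders_30d": 3},
}
serving = {
    "u1": {"aov_30d": 40.0000000001, "orders_30d": 2},
    "u2": {"aov_30d": 40.0, "orders_30d": 2},
}
print(find_mismatches(training, serving))  # ['u2']
```

In CI, the sample would come from the warehouse and the serving database, and the job would fail when the returned list is non-empty. Note that a type difference (an integer on one side, a float on the other) also counts as a mismatch here, which is usually what you want.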
In our experience, this handles roughly 80% of the train/serve skew problem with about 5% of the operational complexity.
When we deploy what
For new client work:
- Most projects: no feature store. Postgres + dbt + reconciliation tests.
- Multi-team ML platform: Feast self-hosted with Redis (online) + the client’s existing warehouse (offline). Adopt when you actually have 2+ teams building models.
- Real-time-heavy workloads with high feature complexity and budget: Tecton. Justify the bill with the eng-time savings.
We don’t usually start with a feature store. We add one when the pain of not having one becomes real — usually around the 5th model or the 2nd team.
Patterns that bite
A few patterns we’ve seen go wrong:
Adopting Feast before you have multiple models. “We might need it later” is how you end up operating Redis + transformation jobs for one model that could have been a single SQL view. Add tooling when you have the problem it solves, not when you imagine you might.
Feature transformations in Feast that depend on data outside Feast. Feast was supposed to be the source of truth; a transformation that joins to data Feast doesn’t know about defeats the purpose.
No materialization SLAs. Feature pipelines that silently fail and serve stale features for a week. Alarm on materialization age, not just success/failure.
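Alarming on age rather than on job status can be as simple as the sketch below. It assumes you can obtain, per feature view, the timestamp of the last successful materialization; where that comes from (job metadata, a max timestamp query on the serving table) depends on your setup. The view names and freshness budgets are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_feature_views(last_materialized, max_age, now=None):
    """Return sorted names of views whose last materialization is older than allowed.

    last_materialized: view name -> timezone-aware datetime of the last
        successful run, or None if the view was never materialized.
    max_age: view name -> timedelta freshness budget.
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        name
        for name, finished_at in last_materialized.items()
        if finished_at is None or now - finished_at > max_age[name]
    )


now = datetime(2024, 6, 8, 12, 0, tzinfo=timezone.utc)
last_materialized = {
    "user_orders_30d": datetime(2024, 6, 8, 6, 0, tzinfo=timezone.utc),   # 6 hours old
    "txn_velocity_1h": datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc),  # 7 days old
}
max_age = {
    "user_orders_30d": timedelta(hours=24),
    "txn_velocity_1h": timedelta(hours=2),
}
print(stale_feature_views(last_materialized, max_age, now))  # ['txn_velocity_1h']
```

A job that keeps reporting success while writing nothing new still trips this check, which a success/failure alarm would miss.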
Online store sized for “test data.” Redis with 1 GB of memory works fine for prototypes but chokes when production traffic hits real feature volumes. Capacity plan early.
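A back-of-the-envelope estimate is enough to catch the worst surprises. Every number below is hypothetical, and the overhead factor in particular is a placeholder: the real per-key overhead depends on your key names, serialization format, and store configuration, so measure it by loading a sample before trusting the result.

```python
def estimate_online_store_bytes(entities, features_per_entity, bytes_per_value, overhead_factor):
    """Rough memory estimate: raw payload size scaled by an assumed overhead factor."""
    raw_bytes = entities * features_per_entity * bytes_per_value
    return int(raw_bytes * overhead_factor)


# Hypothetical workload: 5M users, 50 features each, 8-byte values.
raw = estimate_online_store_bytes(5_000_000, 50, 8, overhead_factor=1.0)
with_overhead = estimate_online_store_bytes(5_000_000, 50, 8, overhead_factor=3.0)

print(raw / 1e9)            # 2.0 (GB of raw payload)
print(with_overhead / 1e9)  # 6.0 (GB with the assumed 3x overhead)
```

Even the raw payload here is twice what a 1 GB prototype instance holds, before any overhead or growth.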
“Infinite” cache TTLs on features that should expire. A user’s “last-30-day order count” feature is wrong by day 31. Set a TTL or refresh the value; don’t trust a cached feature forever.
What we deploy by default
For our typical ML projects:
- Small ML system (1-3 models, 1 team): features as dbt models, Postgres for serving, no feature store. Reconciliation tests in CI.
- Mid ML system (5-20 models, growing team): Feast self-hosted with Redis. Adopt when you hit the team-coordination cost.
- Large ML system (50+ features, real-time + batch, multiple regulated tenants): Tecton (if budget supports it) or Feast at scale.
For hospital ML and banking ML workloads where data residency matters, we always start with self-hosted options (no feature store or Feast self-hosted). Tecton SaaS is rarely an option for these clients.
The pattern of patterns
Feature stores solve a real problem, but the problem is acutely felt by a subset of teams. For most teams, the simpler solution (dbt + Postgres + reconciliation) is the right answer.
The teams that ship ML reliably aren’t the ones with the most sophisticated MLOps stack. They’re the ones who deployed the minimum infrastructure that solves their actual problems and added complexity only when the cost of NOT having it became real.
Feature stores are over-recommended. Add one when the pain is real, not when the tutorial suggests it. If you’re sizing MLOps infrastructure for a real workload, our ML & MLOps team has shipped with and without feature stores. Tell us about the workload.