Data Stack Index
Updated 2026-04-25

Data quality & testing

Tools for catching bad data — before it hits a dashboard, an ML model, or an executive.

Tools indexed: 3 primary · 0 strong secondary
Open-source options: 1
Sibling clusters: Catalog & discovery · Lineage & metadata

Data quality tooling splits cleanly along two fault lines. The first is where the tool lives: in code next to your transformations (Elementary and dbt-expectations inside the dbt project, Great Expectations as a standalone Python suite), or outside it watching the warehouse (Monte Carlo, Bigeye, Metaplane, Anomalo). The second is how it decides what’s wrong: explicit assertions you write, or ML models that learn normal and flag deviations.
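The second fault line can be illustrated with a minimal sketch. It assumes a single metric (daily row count for one table) and uses a plain z-score as a stand-in for a learned baseline; none of the tools listed here is claimed to work this way, and the function names and numbers are hypothetical.

```python
from statistics import mean, stdev


def assert_min_rows(row_count: int, minimum: int) -> bool:
    """Explicit assertion: passes if the count meets a threshold someone wrote down."""
    return row_count >= minimum


def is_anomalous(history: list[float], today: float, z_threshold: float = 3.0) -> bool:
    """Learned baseline (illustrative): flag values far from the historical mean."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today != mu  # flat history: any change is a deviation
    return abs(today - mu) / sigma > z_threshold


# Hypothetical daily row counts for one table over the past week.
history = [10_120, 9_980, 10_050, 10_210, 9_940, 10_080, 10_005]
today = 6_500

print(assert_min_rows(today, minimum=1_000))  # assertion passes: 6,500 >= 1,000
print(is_anomalous(history, today))           # baseline check flags the drop
```

The same load passes the hand-written threshold and fails the baseline check, which is the practical difference between the two approaches: an assertion only catches what someone thought to write down, while a baseline catches departures from history but cannot say whether the new value is actually wrong.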

These aren’t mutually exclusive (mature teams run both paradigms), but picking the wrong primary tool for your context wastes a quarter and a budget. A team with all its logic in dbt and no ingestion-layer problems doesn’t need Monte Carlo’s warehouse-side surveillance. A team with Fivetran-plus-Airbyte-plus-custom-Python loading data that dbt never sees will be blind to most of its incidents with Elementary alone.
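One rough way to size that blind spot is to compare what exists in the warehouse with what dbt knows about. The sketch below assumes you can list both sets yourself (for example from the warehouse's information schema and from the dbt manifest); the table names and the helper are hypothetical, and the resulting share is a conversation starter, not a decision rule.

```python
def dbt_blind_spots(warehouse_tables: set[str], dbt_nodes: set[str]) -> list[str]:
    """Tables present in the warehouse that are neither dbt models nor declared sources."""
    return sorted(warehouse_tables - dbt_nodes)


# Hypothetical inventory: everything in the warehouse vs. what dbt declares.
warehouse_tables = {
    "raw.fivetran_orders",
    "raw.airbyte_events",
    "raw.custom_py_refunds",
    "analytics.orders",
    "analytics.daily_revenue",
}
dbt_nodes = {
    "raw.fivetran_orders",
    "analytics.orders",
    "analytics.daily_revenue",
}

blind = dbt_blind_spots(warehouse_tables, dbt_nodes)
share = len(blind) / len(warehouse_tables)

print(blind)           # ['raw.airbyte_events', 'raw.custom_py_refunds']
print(f"{share:.0%}")  # 40%
```

A share near zero points toward dbt-native testing as the primary tool; a large share means a dbt-only tool cannot see much of what is loaded.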

This page is organized to make that choice legible.

§01

Questions this page answers

01  Should I use dbt-native testing, warehouse-native monitoring, or both?
02  Do I need ML anomaly detection, or are assertions enough for my data?
03  Can I get what I need from an open-source tool, or is managed worth the money?
04  Which tools actually prevent bad data from propagating, versus only alerting?
05  How do teams typically move between these tools as they scale?
06  What does "data contracts support" actually mean vendor-by-vendor?

§02

Primary tools in this cluster

Datafold
SaaS or self-hosted

Pre-merge data diffing and column-level lineage — the tool that shifts data quality left into the pull request.

Pricing: From $799/custom
Built for: analytics engineer

Elementary
open source · SaaS or self-hosted

The dbt-native observability layer — tests, anomaly detection, and lineage that live inside your dbt project.

Pricing: OSS · free
Built for: analytics engineer

Monte Carlo
SaaS

Warehouse-side data observability for teams whose problems are upstream of dbt: in ingestion, in streaming, or anywhere else across the full pipeline.

Pricing: Published, variable
Built for: data engineer

§03

Capability matrix

Columns: Tool · ML anomaly detection · dbt-native · Pre-merge diffing · Circuit breaker · Monitors at

Monitors at, by tool:
Datafold: table, column, dbt model
Elementary: dbt model, table, column
Monte Carlo: table, column, dbt model, pipeline task, BI dashboard, ML feature

§04

By specific capability

A note on coverage

Every tool listed here was verified by hand against vendor documentation and, where possible, hands-on trial. Capability claims are independent of vendor marketing language. When a capability is partial or caveated, the individual tool page explains how.

Notice something wrong? Send a correction. We log every edit and surface the last-verified date on every tool page.