Data Stack Index / v 02.06
Verified 2026·05·01
§ Capability · Data observability

ML anomaly detection.

Tools that learn statistical baselines for tables and surface deviations without pre-defined assertions.

Buyer importance · Important
Tools with this · 10 of 21
Open-source options · 2
Across clusters · 2
Last updated 2026·05·01
01
What this is

What counts as ML anomaly detection?

Manually authored tests catch only the failure modes someone thought to write a test for. ML anomaly detection learns what "normal" looks like for each table (row counts, freshness intervals, value distributions) and alerts on deviations. The trade-off: it needs a training window (typically 14–30 days) before the signal stabilizes, and highly seasonal data tends to produce false positives until the model has seen enough cycles.
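To make the trade-off concrete, here is a minimal sketch of baseline learning on daily row counts. It assumes a plain z-score against the training window, which is an illustrative stand-in and not the method of any tool listed here. The function names, the threshold of 3, the four-cycle minimum, and the row counts are all hypothetical.

```python
from statistics import mean, stdev


def learn_baseline(values, min_points=14):
    """Return (mean, standard deviation) for a training window.

    min_points=14 mirrors the lower end of the 14-30 day window above.
    """
    if len(values) < max(min_points, 2):
        raise ValueError("training window too short for a stable baseline")
    return mean(values), stdev(values)


def is_anomalous(value, baseline, z_threshold=3.0):
    """Flag a value more than z_threshold standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold


def learn_seasonal_baselines(values, period=7, min_cycles=4):
    """Learn one baseline per position in the cycle (for example, per weekday)."""
    slots = [values[i::period] for i in range(period)]
    return [learn_baseline(slot, min_points=min_cycles) for slot in slots]


# Hypothetical daily row counts: four weeks, Monday first, quiet weekends.
week = [1000, 1010, 990, 1005, 995, 200, 210]
history = [count + 4 * (n % 2) for n in range(4) for count in week]

# A model that has only ever seen weekdays treats a normal Saturday as an outage.
weekdays_only = learn_baseline([c for i, c in enumerate(history) if i % 7 < 5])
unseen_weekend_flagged = is_anomalous(200, weekdays_only)

# After four full cycles, per-weekday baselines accept the weekend dip
# and still catch a real drop on a Monday.
seasonal = learn_seasonal_baselines(history)
saturday_flagged = is_anomalous(205, seasonal[5])
monday_flagged = is_anomalous(600, seasonal[0])

print(unseen_weekend_flagged, saturday_flagged, monday_flagged)
```

The first check is the false positive described above: the Saturday count is normal, but the model has not yet seen a weekend. Production tools typically use richer models than a z-score, so treat this only as a picture of why the training window and the number of observed cycles matter.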

02
Tools with this capability

10 tools, grouped by primary cluster.

Cluster · Quality & testing 8 tools

Acceldata

Acceldata

Hybrid

Enterprise data observability with ML data quality, reconciliation, and a built-in catalog — strong on hybrid and on-prem estates.

Pricing
Contact sales
First strength
Broad single platform — ML data quality, reconciliation, catalog, governance/PII, and lineage in one product rather than a point tool

Anomalo

Anomalo

SaaS / Self-host

GUI-first ML anomaly detection at petabyte scale — pivoting in 2026 around agentic AI and unstructured-data monitoring.

Pricing
Contact sales
First strength
ML anomaly detection has a strong reviewer reputation in the cluster — Anomalo's profiling engine is purpose-built for petabyte-scale tables with minimal manual configuration

Bigeye

Bigeye

SaaS / Self-host

Enterprise data observability with Autometrics ML thresholds — repositioning in 2026 as an AI Trust Platform with runtime governance.

Pricing
Contact sales
First strength
Autometrics / Autothresholds — Bigeye's ML-based anomaly detection — has a strong reviewer reputation for low false-positive rates relative to peers in the cluster

Elementary

Elementary Data

OSS · SaaS / Self-host

The dbt-native observability layer — tests, anomaly detection, and lineage that live inside your dbt project.

Pricing
OSS · free
First strength
Fully open-source core is genuinely production-grade, not a trial ramp to a paid tier

Metaplane

Metaplane (Datadog)

SaaS

ML-powered, no-code data observability for the dbt and warehouse stack with automatic column-level lineage — now Metaplane by Datadog.

Pricing
Published
First strength
ML anomaly detection that accounts for seasonality and trend, with very fast time-to-value (about fifteen-minute setup, alerts within days)

Monte Carlo

Monte Carlo Data

SaaS

Warehouse-side data observability for teams whose problems sit upstream of dbt: in ingestion, in streaming, and across the full pipeline.

Pricing
Contact sales
First strength
Genuine breadth across the stack: ingestion, transformation, BI, and ML in one surface

Sifflet

Sifflet

SaaS / Self-host

EU-built full-stack data observability pairing ML-driven monitoring with an embedded catalog and field-level lineage.

Pricing
Contact sales
First strength
Spans all three observability clusters in one product — monitoring, an embedded catalog, and field-level lineage

Soda

Soda Data

SaaS / Self-host

YAML-first data contracts and observability — SodaCL plus Soda Cloud, with anomaly detection and a self-hosted Kubernetes runner.

Pricing
From $750/custom
First strength
SodaCL is one of the cleaner data-quality DSLs — readable, version-controllable, and expressive enough for both simple assertions and ML thresholds

How this list is built.

Inclusion here is one boolean on each tool's structured profile — if a tool you'd expect is missing, the field is recorded false or not yet verified, never an editorial call. See the methodology for how each field is sourced.
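As a rough illustration of that rule, the sketch below filters on a single tri-state field. The profile shape, the field name `ml_anomaly_detection`, and the placeholder tools are assumptions made for the example, not the index's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolProfile:
    name: str
    cluster: str
    # Hypothetical field: True, False, or None when not yet verified.
    ml_anomaly_detection: Optional[bool] = None


def tools_with_capability(profiles):
    """Keep only tools whose field is recorded true, grouped by primary cluster."""
    grouped = {}
    for profile in profiles:
        if profile.ml_anomaly_detection is True:
            grouped.setdefault(profile.cluster, []).append(profile.name)
    return grouped


# Placeholder profiles, not real entries from the index.
profiles = [
    ToolProfile("Tool A", "Quality & testing", True),
    ToolProfile("Tool B", "Quality & testing", False),
    ToolProfile("Tool C", "Quality & testing", None),
]

listed = tools_with_capability(profiles)
print(listed)
```

Under this reading, a recorded false and an unverified field both keep a tool off the list, which matches the note above that a missing tool is never an editorial call.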