Is Sifflet open source?

No. Sifflet is a proprietary product.

How much does Sifflet cost?

Sifflet does not publish list pricing — it is sales-led, so you request a quote. There is no free tier.

How is Sifflet deployed?

Sifflet can run as managed SaaS or be self-hosted.

Does Sifflet work with dbt and my warehouse?

It has a native dbt integration. Sifflet supports Snowflake, BigQuery, Redshift, Databricks, and Athena, plus four more warehouses (Synapse, Postgres, MySQL, and MSSQL).

Quality & testing (primary) · SaaS · Self-hosted · Proprietary

Sifflet.

Sifflet
Founded 2021 · Paris, France
Status · ● active
Verified · ● 40d ago

EU-built full-stack data observability pairing ML-driven monitoring with an embedded catalog and field-level lineage.

Capability profile: Quality 3/3 (primary) · Catalog 2/3 · Lineage 3/3

Annual cost

No public floor · sales-led · quote only

Deployment: SaaS · Self-hosted

License: Proprietary

Free tier: No

dbt integration: Native

Persona: data engineer · analytics engineer

Company size: scaleup → mid market → enterprise

Warehouses: snowflake · bigquery · redshift · databricks · +5 more

OpenLineage: none

Verdict

Where it fits — and where it doesn't.

● Ideal for

Mid-market and enterprise data teams — especially in Europe — that want one platform spanning quality monitoring, an embedded catalog, and column-level lineage rather than stitching point tools together, with strong compliance posture (ISO 27001, SOC 2 Type 2, GDPR, single-tenant isolation, and a self-host option).

The combination of assertion rules, ML/dynamic anomaly detection, automated root cause, and a Flow Stopper circuit breaker makes it a credible single-vendor observability suite.

○ Avoid if

You need open-source or self-serve-priced tooling, transparent published pricing, native OpenLineage, formal data contracts, code-first value-level data-diff regression testing (use Datafold), or a heavyweight standalone governance catalog with automatic PII classification (Atlan or Collibra).

Sifflet deliberately avoids storing PII, which limits it as a governance catalog.

If Sifflet isn't the fit — consider

Monte Carlo vs Sifflet → · Bigeye vs Sifflet → · Anomalo vs Sifflet →

Strengths & weaknesses

The honest scorecard.

[+] Spans all three observability clusters in one product — monitoring, an embedded catalog, and field-level lineage
[+] Both assertion-based rules and ML/dynamic anomaly detection (dynamic freshness/volume, distribution change, proprietary time-series thresholds) to cut alert fatigue
[+] Automatic field-level (column-level) lineage via SQL query-log parsing across Snowflake, BigQuery, Redshift, and Databricks, plus BI tools
[+] Flow Stopper circuit breaker and Monitors-as-Code (CLI, YAML, Terraform provider, public API) fit engineering workflows
[+] Flexible deployment including fully self-hosted on Kubernetes, with ISO 27001, SOC 2 Type 2, GDPR, and single-tenant isolation

[−] No published pricing — every tier routes to sales
[−] Proprietary and closed-source, with no community or free self-host tier
[−] No native OpenLineage support and no formal data-contracts feature
[−] No automatic PII classification (by design, it avoids storing PII), which limits it as a governance catalog
[−] Field-level lineage automation is limited to four cloud warehouses, and it is not a value-level data-diff / regression-testing tool

Editorial

What Sifflet actually is.

What Sifflet is

Sifflet is a full-stack data observability platform from a Paris-based company. It covers three things in one product: data-quality monitoring (a large library of assertion-style and ML/dynamic anomaly monitors), an embedded data catalog, and end-to-end field-level lineage from ingestion through the warehouse and dbt to BI. Around that it layers automated root-cause analysis, incident management, a Flow Stopper circuit breaker, and a set of AI agents. It runs read-only against the source as managed SaaS, a hybrid agent model, or fully self-hosted.

Where it fits

Sifflet competes most directly with Monte Carlo and Bigeye as a full-stack, ML-driven observability suite, but leans harder into catalog and field-level lineage, giving it overlap with Atlan and DataHub on discovery. Against Soda or Great Expectations it is a managed, broader platform rather than a code-first testing framework; against Datafold it does impact analysis in CI but not value-level data diffing. Its EU origin, GDPR posture, and self-host option are the clearest differentiators for European and regulated buyers.

On the three-cluster span

Spanning monitoring, catalog, and lineage in one product is genuinely unusual, and the field-level lineage (parsed from warehouse query logs across the four major cloud warehouses) is a real strength. The caveats are at the edges: the catalog is a competent embedded one, not a standalone governance platform — no automatic PII classification, by design — and lineage automation is limited to those four warehouses. Score it as a strong observability suite with a useful catalog attached, not as a catalog-first tool.
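To make "lineage parsed from warehouse query logs" concrete, here is a minimal sketch of the general idea, assuming a log of plain SQL strings. It is table-level only and regex-based, so it is far cruder than field-level lineage; the log entries, table names, and helper functions are all hypothetical and do not represent Sifflet's implementation.

```python
import re
from collections import defaultdict

# Hypothetical query-log entries; a real log would come from the
# warehouse's query history.
QUERY_LOG = [
    "INSERT INTO analytics.orders_daily "
    "SELECT order_id, customer_id, amount FROM raw.orders",
    "CREATE TABLE analytics.revenue AS SELECT o.amount, c.region "
    "FROM analytics.orders_daily o JOIN raw.customers c ON o.customer_id = c.id",
    "SELECT * FROM analytics.revenue",  # read-only, produces no edge
]

# Table being written, and tables being read (simplified patterns).
TARGET_RE = re.compile(
    r"(?:INSERT\s+INTO|CREATE\s+(?:OR\s+REPLACE\s+)?TABLE)\s+([\w.]+)",
    re.IGNORECASE,
)
SOURCE_RE = re.compile(r"(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)


def build_lineage(statements):
    """Map each written table to the set of tables it reads from."""
    lineage = defaultdict(set)
    for sql in statements:
        target = TARGET_RE.search(sql)
        if not target:
            continue  # nothing is written, so there is no lineage edge
        for source in SOURCE_RE.findall(sql):
            lineage[target.group(1)].add(source)
    return dict(lineage)


def upstream(lineage, table):
    """Return every table that feeds `table`, directly or transitively."""
    seen = set()
    stack = [table]
    while stack:
        for parent in lineage.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


LINEAGE = build_lineage(QUERY_LOG)
```

Calling `upstream(LINEAGE, "analytics.revenue")` walks the graph back to the raw tables. A production extractor would need a real SQL parser to handle CTEs, subqueries, and column resolution, which is where the engineering effort in field-level lineage actually sits.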

How to evaluate it

Connect it read-only to your warehouse and let the dynamic monitors learn before judging signal quality — the proprietary forecasting is meant to reduce alert fatigue on seasonal data, so give it a couple of weeks. Then test the two differentiators directly: trace a field end-to-end through the lineage graph (ingestion → warehouse → dbt → BI), and wire Flow Stopper into an Airflow DAG to confirm it actually halts a pipeline on a failing rule. If compliance is the driver, scope the self-host option into your security review early.
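When testing whether a circuit breaker really halts a pipeline, it helps to be clear about the behaviour you expect. The sketch below models that behaviour in plain Python, assuming a pipeline is a list of named steps and a check returns pass/fail results after each step. `RuleResult`, `CircuitOpen`, and `run_pipeline` are hypothetical names for illustration; this is not the Flow Stopper API or an Airflow operator.

```python
from dataclasses import dataclass


class CircuitOpen(Exception):
    """Raised to stop the pipeline when a blocking rule fails."""


@dataclass
class RuleResult:
    name: str
    passed: bool
    blocking: bool = True  # non-blocking rules only warn


def circuit_breaker(results):
    """Raise CircuitOpen if any blocking rule failed."""
    failed = [r.name for r in results if r.blocking and not r.passed]
    if failed:
        raise CircuitOpen("blocking rules failed: " + ", ".join(failed))


def run_pipeline(steps, check):
    """Run steps in order, checking rules after each one.

    Returns (completed_step_names, finished_ok). Steps after a tripped
    breaker are never executed.
    """
    completed = []
    for name, step in steps:
        step()
        completed.append(name)
        try:
            circuit_breaker(check(name))
        except CircuitOpen:
            return completed, False
    return completed, True
```

In an orchestrator such as Airflow the same effect usually comes from a task that raises on a failed check, so that downstream tasks with the default trigger rule do not run. The evaluation question is whether a failing rule on the load step prevents the transform and publish steps from executing at all.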

Capability spec

All capabilities by cluster.

Quality & testing

Primary · strength 3/3

01 dbt-native

02 ML anomaly detection

03 Assertion-based testing

04 Pre-merge diffing

05 Schema drift detection

06 Freshness monitoring

07 Volume monitoring

08 Custom SQL checks

09 Circuit breaker

10 Data contracts

11 Column profiling

12 Runs in CI

13 Root cause analysis

14 Incident management

Test authoring: code-first plus GUI

Paradigm: both (assertion-based rules and ML anomaly detection)

ML training window: proprietary time-series forecasting learns historical patterns to set dynamic thresholds for freshness, volume, and distribution

Monitors at: warehouse table · warehouse column · BI dashboard · pipeline task

Alerting: Slack · Teams · email · PagerDuty · Jira · webhook
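As a rough illustration of what a dynamic threshold means for a volume monitor, the sketch below derives bounds from recent history instead of a fixed number. It assumes a simple rolling mean and standard deviation band, which is a stand-in technique; Sifflet's forecasting is proprietary and handles seasonality, which this does not. The function names, `window`, and `k` are hypothetical.

```python
from statistics import mean, stdev


def dynamic_bounds(history, window=7, k=3.0):
    """Return (low, high) bounds from the last `window` observations.

    Bounds are mean +/- k standard deviations, so they widen for noisy
    series and tighten for stable ones.
    """
    recent = history[-window:]
    if len(recent) < 2:
        raise ValueError("need at least two observations")
    center = mean(recent)
    spread = stdev(recent)
    return center - k * spread, center + k * spread


def is_anomalous(history, value, window=7, k=3.0):
    """Flag `value` if it falls outside the learned bounds."""
    low, high = dynamic_bounds(history, window, k)
    return not (low <= value <= high)
```

With daily row counts hovering around 1,000, a new count of 1,003 stays inside the band while a drop to 400 is flagged. This also shows why a learning period matters during evaluation: with too little history the band is unreliable.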

Catalog & discovery

Secondary · strength 2/3

01 Business glossary

02 Glossary linked to assets

03 Natural language search

04 Ownership tracking

05 Data contracts

06 Governance workflows

07 Access request workflow

08 PII auto-classification

09 Tag propagation

10 Free self-hosted

Metadata ingestion: pull connectors

Search approach: keyword

Connectors: 32+

Asset types: tables · dashboards · pipelines

Lineage & metadata

Secondary · strength 3/3

01 Cross-system lineage

02 Upstream source lineage

03 Impact analysis

04 Reverse impact analysis

05 Historical lineage

06 Lineage API

07 Lineage diff

Granularity: both (table-level and column-level)

OpenLineage: none

Extraction: query log parsing · dbt manifest · API push

Warehouses & integrations

Where it plugs in.

Native warehouse support

snowflake · bigquery · redshift · databricks · athena · synapse · postgres · mysql · mssql

Orchestrators & pipeline tools

dbt-cloud · dbt-core · airflow · fivetran

01 dbt: Native

02 Airflow: Plugin

03 OpenLineage: none

04 API access: full

05 Terraform provider

06 Public SDK: Python

Pricing

The honest pricing breakdown.

Pricing model: per asset

Charged per: asset

Published: ○ No (contact sales required)

Free tier: ○ No

OSS self-host: ○ Not available

Sales-only tiers: Entry (up to 500 assets) / Growth (up to 1,000) / Enterprise (1,000+). All are contact-sales; a free-trial CTA exists but with no published terms.

Notable missing

What it doesn't do.

OpenLineage-Native →

Emits and consumes OpenLineage events as a first-class citizen rather than via a plugin or adapter. Signals commitment to interoperability with other metadata tooling — Marquez, OpenMetadata, Astronomer, and others can consume the same event stream. Increasingly the differentiator between "open" and "proprietary metadata model" observability platforms.

Data Contracts →

Explicit, versioned agreements between data producers and consumers specifying schema, semantics, SLAs, and breaking-change policy. Enforced in CI for producers and at consumption time for consumers. Distinct from schema validation alone — a contract captures intent, not just structure. Implementations vary wildly; many tools claiming "data contracts" offer only schema checks.
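The difference between a contract and a bare schema check is easier to see in code. The sketch below is a minimal, hypothetical contract with a version, expected column types, and a freshness SLA; the `Contract` class and `check_contract` function are illustrative and do not correspond to any particular tool's format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    name: str
    version: str
    columns: dict               # column name -> expected type name
    max_staleness_hours: float  # freshness SLA


def check_contract(contract, observed_columns, staleness_hours):
    """Return a list of violations; an empty list means the contract holds.

    Extra columns are allowed: additive changes are treated as
    non-breaking in this sketch.
    """
    violations = []
    for col, expected in contract.columns.items():
        actual = observed_columns.get(col)
        if actual is None:
            violations.append(f"missing column: {col}")
        elif actual != expected:
            violations.append(f"type change: {col} {expected} -> {actual}")
    if staleness_hours > contract.max_staleness_hours:
        violations.append(
            f"freshness SLA breached: {staleness_hours}h > "
            f"{contract.max_staleness_hours}h"
        )
    return violations
```

A producer could run this in CI against the schema a pull request would produce, and a consumer could run it before reading. A schema-only check would cover the first two violation types but not the SLA, and it would carry no version or breaking-change policy.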

PII Auto-Classification →

Automatically identifies columns likely to contain personally identifiable information — email addresses, phone numbers, national IDs — through regex, name heuristics, or ML. Required for meaningful compliance workflows at scale. Quality varies: naive implementations produce heavy false-positive rates. Worth asking vendors about their accuracy benchmarks.
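A small sketch shows both how the regex and name-heuristic approach works and why naive versions over-flag. The patterns, hint list, and `classify_column` function below are hypothetical and deliberately simple; they are not taken from any vendor's classifier.

```python
import re

# Column-name hints that suggest PII.
NAME_HINTS = re.compile(
    r"(email|e_mail|phone|mobile|ssn|passport|national_id)", re.IGNORECASE
)

# Value patterns applied to a sample of column values.
VALUE_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\s().-]{7,20}$"),
}


def classify_column(name, sample_values, min_match_ratio=0.8):
    """Return the reasons a column looks like PII (empty list if none)."""
    reasons = []
    if NAME_HINTS.search(name):
        reasons.append("name")
    values = [v for v in sample_values if v]
    for label, pattern in VALUE_PATTERNS.items():
        if not values:
            break
        matches = sum(bool(pattern.match(v)) for v in values)
        if matches / len(values) >= min_match_ratio:
            reasons.append(f"value:{label}")
    return reasons
```

A column named `contact_email` holding addresses is flagged on both name and value. A column named `invoice_number` holding 11-digit identifiers is also flagged as a phone number, which is exactly the kind of false positive worth probing when a vendor quotes accuracy figures.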

Pre-Merge Diffing →

Compares the output of a model change against production before the pull request is merged — showing row-level and aggregate differences. Shifts data quality left into the development workflow. Datafold is the category-defining tool here; dbt's own cloud offering has added similar capabilities. Requires production-scale compute on a development branch, which has cost implications.
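The row-level and aggregate comparison described above can be sketched in a few lines, assuming both versions of the model fit in memory as lists of dictionaries sharing a primary key. `diff_tables` and its output shape are hypothetical; real tools push this comparison down into the warehouse, which is where the compute cost comes from.

```python
def diff_tables(prod_rows, dev_rows, key):
    """Compare two row sets by primary key.

    Returns the row counts plus the keys that were added, removed, or
    changed between production and the development branch.
    """
    prod = {row[key]: row for row in prod_rows}
    dev = {row[key]: row for row in dev_rows}
    return {
        "row_count": (len(prod), len(dev)),
        "added": sorted(dev.keys() - prod.keys()),
        "removed": sorted(prod.keys() - dev.keys()),
        "changed": sorted(
            k for k in prod.keys() & dev.keys() if prod[k] != dev[k]
        ),
    }
```

Note that the aggregate row count can be identical while individual rows differ, which is why value-level diffing catches regressions that volume or schema monitors miss.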

Strong at

Drill into one capability.

dbt-native testing → ML anomaly detection → Circuit breaker → Business glossary → Column-level lineage →

Other key features

Warehouse-Native Monitoring · Reverse Impact Analysis

Alternatives & migrations

If not Sifflet, then what?

Common alternatives

Monte Carlo → Genuine breadth across the stack (ingestion, transformation, BI, ML) in one surface.

Bigeye → Autometrics / Autothresholds, Bigeye's ML-based anomaly detection, has a strong reviewer reputation for low false-positive rates relative to peers in the cluster.

Anomalo → ML anomaly detection has a strong reviewer reputation in the cluster; Anomalo's profiling engine is purpose-built for petabyte-scale tables with minimal manual configuration.

Soda → SodaCL is one of the cleaner data-quality DSLs: readable, version-controllable, and expressive enough for both simple assertions and ML thresholds.

More quality & testing tools

Acceldata · Anomalo · Bigeye · Datafold · dbt-expectations · Elementary · Great Expectations · Metaplane · Monte Carlo · Soda

Provenance.

Last verified 2026·05·30 against vendor documentation and, where possible, hands-on trial.

No paid placement · No vendor submissions · Rankings never for sale