Collibra.
Founded 2008 · Brussels, Belgium
Status · ● active
Enterprise data-and-AI governance incumbent: catalog, glossary, workflow stewardship, lineage, and a separate ML data-quality module.
Where it fits — and where it doesn't.
Choose it if you are a large, regulated enterprise (bank, insurer, pharma, public sector) that needs a governance-first control plane: a real CDO function, formal stewardship, a business glossary, policy enforcement, and auditable lineage for regulations like BCBS 239, GDPR, SOX, HIPAA, and the EU AI Act. Collibra is strongest where governance process and accountability matter more than developer ergonomics, and where a single vendor for catalog, governance, lineage, data quality, and AI governance is preferred over best-of-breed point tools.
Avoid it if you are a startup, scaleup, or lean modern-data-stack team. There is no free tier, no trial, and no OSS path, and base subscriptions reportedly start around USD 170k/year before modules and services, with implementations commonly cited at six months. Avoid it too if you want CI/pipeline-gating data quality (circuit breakers, pre-merge diffs, dbt-native tests): Collibra's quality module is governance-attached, not a developer-workflow tool, so pair it with or replace it by Datafold, Monte Carlo, Soda, or Anomalo for those needs.
The honest scorecard.
Strengths
- The deepest governance and stewardship tooling in the cluster: a configurable workflow engine, business glossary, policies, ownership, and audit trails purpose-built for regulated enterprises
- Broad single-vendor footprint: catalog, lineage (table and column, OpenLineage-aware), an ML data-quality module (from the OwlDQ acquisition), privacy, and AI governance under one platform
- Strong automated lineage with root-cause and downstream impact analysis at table, column, and report level, with in-line transformation context
- A mature, analyst-recognised leader with 100+ catalog integrations and a large regulated-enterprise customer base
- Active investment in 2025–2026 via acquisitions (Raito for access management, Husprey for notebooks, Deasy Labs for unstructured/AI metadata) and an AI Copilot for natural-language search
Weaknesses
- Opaque, high pricing: nothing is published, Lineage and Data Quality are licensed separately, and services and staffing add to a high TCO
- Heavy and slow to deploy: implementations are frequently cited at six months and need dedicated admins and stewards
- Business-user adoption is a recurring complaint: the UI is governance-first and technical, and competitors win deals specifically on UX and time-to-value
- No OSS or self-host path, and total metadata-graph lock-in; SaaS-centric with weaker on-prem support
- The data-quality module is a separately licensed, governance-attached add-on (from the OwlDQ acquisition), not a native testing engine
What Collibra actually is.
What Collibra is
Collibra is an enterprise data-and-AI governance platform built around a data catalog, a business glossary and semantic layer, a configurable stewardship workflow engine, automated lineage, and a separately licensed Data Quality & Observability module. Founded in 2008 by researchers from the Vrije Universiteit Brussel, it is one of the original category-defining governance incumbents. It is sold through a sales-led motion to large regulated enterprises and positioned in 2026 as “unified governance for data and AI.”
Where it fits
Collibra is the heavyweight governance incumbent, most directly cross-shopped with Atlan (modern, UX-led, lower TCO), Alation, and the OSS catalogs DataHub and OpenMetadata (open, engineer-led, free self-host). It typically wins where formal governance, regulatory auditability, and single-vendor breadth outweigh developer ergonomics and price. Its data-quality module competes with Monte Carlo, Anomalo, Bigeye, and Soda, though those remain better for CI/pipeline-gating and dbt-native workflows.
On the data-quality module
Unlike a pure catalog, Collibra ships a genuine data-quality engine — Data Quality & Observability, built on its 2021 OwlDQ acquisition — with adaptive ML rules, anomaly detection, profiling, and remediation workflows authored in standard SQL. We score the quality cluster a 2 rather than a 3 because it is governance-attached: strong on ML detection and stewardship, but without the circuit-breaker, pre-merge diffing, and dbt-native testing that define dedicated observability tools. It is licensed as a separate module, which matters for cost.
How to evaluate it
Evaluation is sales-led and module-scoped. Decide first which modules you are actually buying (Catalog alone, or Catalog plus Lineage plus Data Quality), because that drives both fit and cost. Then test the governance workflow engine against a real stewardship process you run today — approvals, issue management, policy enforcement — since that is Collibra’s deepest differentiator, and budget realistically for the implementation timeline rather than the licence alone.
All capabilities by cluster.
Quality & testing
Secondary · strength 2/3
Catalog & discovery
Primary · strength 3/3
Lineage & metadata
Secondary · strength 3/3
Where it plugs in.
Native warehouse support
The honest pricing breakdown.
Sales-only tier: Enterprise. Sales-led, with Lineage, Data Quality, Privacy, and other modules licensed separately. Nothing is published; third parties report a base of around USD 170k/year before modules and services.
What it doesn't do.
Circuit Breaker → Halts downstream execution when a test fails, preventing bad data from propagating into marts, ML features, or BI dashboards. Requires tight integration with the orchestrator (Airflow, Dagster, dbt Cloud). Distinct from alerting-only tools, which notify after the damage is done.
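To make the circuit-breaker pattern concrete, here is a minimal Python sketch. It is not tied to Collibra or to any orchestrator; `Check`, `CircuitOpenError`, and `run_with_circuit_breaker` are hypothetical names invented for illustration, and the sample rows are made up. The point is only the ordering: checks run first, and no downstream step executes if any check fails.

```python
from dataclasses import dataclass
from typing import Callable, List


class CircuitOpenError(RuntimeError):
    """Raised when a failed check blocks the downstream steps."""


@dataclass
class Check:
    name: str
    passed: Callable[[List[dict]], bool]  # returns True when the data is acceptable


def run_with_circuit_breaker(rows, checks, downstream_steps):
    """Run every check first; only call downstream steps if all of them pass."""
    failed = [check.name for check in checks if not check.passed(rows)]
    if failed:
        # Halting here is the whole point: nothing downstream ever sees the rows.
        raise CircuitOpenError(f"blocked downstream steps; failed checks: {failed}")
    return [step(rows) for step in downstream_steps]


# Illustrative data: the second order has a null amount.
orders = [{"id": 1, "amount": 20.0}, {"id": 2, "amount": None}]
checks = [
    Check("amount_not_null", lambda rows: all(r["amount"] is not None for r in rows)),
]

published = []  # stands in for a mart, feature table, or dashboard refresh
try:
    run_with_circuit_breaker(orders, checks, [published.append])
except CircuitOpenError as err:
    print(err)
```

In a real pipeline the raised exception would typically fail the orchestrator task so that dependent tasks do not run; how that is wired depends on the orchestrator.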
Pre-Merge Diffing → Compares the output of a model change against production before the pull request is merged, showing row-level and aggregate differences. Shifts data quality left into the development workflow. Datafold is the category-defining tool here; dbt's own cloud offering has added similar capabilities. Requires production-scale compute on a development branch, which has cost implications.
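As a rough illustration of what a pre-merge diff reports, the Python sketch below compares a production result set with a development-branch result set in memory. It is a toy, not how Datafold or dbt implement diffing: `diff_tables` is a hypothetical helper, the rows are invented, and it assumes the key column is unique in both tables.

```python
from typing import Dict, Iterable


def diff_tables(prod: Iterable[dict], dev: Iterable[dict], key: str, metric: str) -> Dict[str, object]:
    """Compare two result sets by primary key and by one summed numeric column."""
    # Assumes `key` is unique per table; duplicate keys would overwrite each other.
    prod_by_key = {row[key]: row for row in prod}
    dev_by_key = {row[key]: row for row in dev}

    shared = prod_by_key.keys() & dev_by_key.keys()
    return {
        # Row-level differences
        "added": sorted(dev_by_key.keys() - prod_by_key.keys()),
        "removed": sorted(prod_by_key.keys() - dev_by_key.keys()),
        "changed": sorted(k for k in shared if prod_by_key[k] != dev_by_key[k]),
        # Aggregate differences
        "row_count_prod": len(prod_by_key),
        "row_count_dev": len(dev_by_key),
        "metric_delta": sum(r[metric] for r in dev_by_key.values())
        - sum(r[metric] for r in prod_by_key.values()),
    }


# Illustrative rows: id 2 changes, id 3 disappears, id 4 is new.
prod_rows = [{"id": 1, "total": 10}, {"id": 2, "total": 25}, {"id": 3, "total": 7}]
dev_rows = [{"id": 1, "total": 10}, {"id": 2, "total": 30}, {"id": 4, "total": 5}]

report = diff_tables(prod_rows, dev_rows, key="id", metric="total")
print(report)
```

A real tool would push this comparison down into the warehouse rather than pull rows into memory, which is where the compute cost mentioned above comes from.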
dbt-Native Testing → Runs as part of the dbt execution context (as a package, post-hook, or artifact consumer) rather than monitoring the warehouse from the outside. Tests are defined in the same codebase as models, run on the same schedule, and fail the same CI pipeline. The alternative is warehouse-side monitoring (Monte Carlo-style), which catches issues dbt misses but reacts rather than prevents.
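The "artifact consumer" flavour can be sketched in a few lines of Python. dbt writes a `run_results.json` artifact whose `results` entries carry a `unique_id` and a `status`; the sketch below reads that shape and turns failures into a non-zero exit code for CI. The helper names and the sample node IDs are hypothetical, and only the two fields used here are assumed.

```python
import json
from typing import List

# Statuses treated as blocking in this sketch; "warn" and "skipped" pass through.
BLOCKING_STATUSES = {"fail", "error"}


def load_run_results(path: str = "target/run_results.json") -> dict:
    """Read the artifact dbt writes after a run (default target directory assumed)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def failing_nodes(run_results: dict) -> List[str]:
    """Return the unique_id of every result whose status should block the pipeline."""
    return [
        result["unique_id"]
        for result in run_results.get("results", [])
        if result.get("status") in BLOCKING_STATUSES
    ]


def ci_exit_code(run_results: dict) -> int:
    """Non-zero when anything failed, so the same CI job that ran dbt goes red."""
    return 1 if failing_nodes(run_results) else 0


# Made-up sample shaped like the artifact, so the sketch runs without a dbt project.
SAMPLE_RUN_RESULTS = {
    "results": [
        {"unique_id": "model.shop.orders", "status": "success"},
        {"unique_id": "test.shop.not_null_orders_id", "status": "pass"},
        {"unique_id": "test.shop.unique_orders_id", "status": "fail"},
    ]
}

print(failing_nodes(SAMPLE_RUN_RESULTS), ci_exit_code(SAMPLE_RUN_RESULTS))
```

In a CI job this would end with something like `sys.exit(ci_exit_code(load_run_results()))`.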
OpenLineage-Native → Emits and consumes OpenLineage events as a first-class citizen rather than via a plugin or adapter. Signals commitment to interoperability with other metadata tooling: Marquez, OpenMetadata, Astronomer, and others can consume the same event stream. Increasingly the differentiator between "open" and "proprietary metadata model" observability platforms.
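For readers who have not seen one, an OpenLineage run event is a small JSON document describing a job run and the datasets it read and wrote. The Python sketch below builds one with the standard library only. The field names follow the OpenLineage run-event structure as we understand it; the namespaces, job and dataset names, producer URL, and the spec version in `schemaURL` are illustrative assumptions to check against the current spec.

```python
import json
import uuid
from datetime import datetime, timezone

# Assumed spec version for illustration; confirm against the published OpenLineage spec.
SCHEMA_URL = "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent"


def build_run_event(event_type, job_namespace, job_name, inputs, outputs, producer):
    """Assemble a run event; inputs and outputs are (namespace, name) pairs."""
    return {
        "eventType": event_type,  # e.g. START, COMPLETE, FAIL
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [{"namespace": ns, "name": name} for ns, name in inputs],
        "outputs": [{"namespace": ns, "name": name} for ns, name in outputs],
        "producer": producer,
        "schemaURL": SCHEMA_URL,
    }


# Hypothetical job and datasets, invented for this example.
event = build_run_event(
    event_type="COMPLETE",
    job_namespace="shop",
    job_name="build_orders_mart",
    inputs=[("warehouse://analytics", "raw.orders")],
    outputs=[("warehouse://analytics", "marts.orders")],
    producer="https://example.com/my-pipeline",
)

payload = json.dumps(event, indent=2)
print(payload)
```

Any consumer that understands the format can read the same payload, which is the interoperability argument made above.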
Quick answers.
- Is Collibra open source?
- No. Collibra is a proprietary product.
- How much does Collibra cost?
- Collibra does not publish list pricing — it is sales-led, so you request a quote. There is no free tier.
- How is Collibra deployed?
- Collibra is a managed cloud (SaaS) product.
- Does Collibra work with dbt and my warehouse?
- It integrates with dbt via a plugin. Collibra supports Snowflake, Databricks, BigQuery, Redshift, and Synapse, plus 3 more.
Provenance.
Last verified 2026·05·30 against vendor documentation and, where possible, hands-on trial.