Catalog & discovery · primary · SaaS only · Proprietary

Collibra.

Name: Collibra
Author: Collibra

Founded 2008 · Brussels, Belgium
Status · ● active
Verified · ● 6d ago

Enterprise data-and-AI governance incumbent: catalog, glossary, workflow stewardship, lineage, and a separate ML data-quality module.

Capability profile: Quality 2/3 · Catalog 3/3 (primary) · Lineage 3/3

Annual cost

~$170k / yr floor · sales-led · estimated annual spend, based on third-party spend reports

Deployment: SaaS only

License: Proprietary

Free tier: No

dbt integration: Plugin

Persona: data steward · governance lead

Company size: mid market → enterprise

Warehouses: snowflake · databricks · bigquery · redshift +4

OpenLineage: consumer

Verdict

Where it fits — and where it doesn't.

● Ideal for

Large, regulated enterprises — banks, insurers, pharma, public sector — that need a governance-first control plane: a real CDO function, formal stewardship, a business glossary, policy enforcement, and auditable lineage for regulations like BCBS 239, GDPR, SOX, HIPAA, and the EU AI Act.

Collibra is strongest where governance process and accountability matter more than developer ergonomics, and where a single vendor for catalog plus governance plus lineage plus data quality plus AI governance is preferred over best-of-breed point tools.

○ Avoid if

You are a startup, scaleup, or lean modern-data-stack team.

There is no free tier, no trial, and no OSS path. Base subscriptions reportedly start around USD 170k/year before modules and services, and implementations are commonly cited at six months. Also avoid it if you want CI/pipeline-gating data quality (circuit breakers, pre-merge diffs, dbt-native tests): Collibra's quality module is governance-attached, not a developer-workflow tool, so pair it with, or replace it with, Datafold, Monte Carlo, Soda, or Anomalo for those needs.

If Collibra isn't the fit — consider

Atlan vs Collibra → Alation vs Collibra → DataHub vs Collibra →

Strengths & weaknesses

The honest scorecard.

[+] The deepest governance and stewardship tooling in the cluster — a configurable workflow engine, business glossary, policies, ownership, and audit trails purpose-built for regulated enterprises
[+] Broad single-vendor footprint — catalog, lineage (table and column, OpenLineage-aware), an ML data-quality module (from the OwlDQ acquisition), privacy, and AI governance under one platform
[+] Strong automated lineage with root-cause and downstream impact analysis at table, column, and report level, with in-line transformation context
[+] A mature, analyst-recognised leader with 100+ catalog integrations and a large regulated-enterprise customer base
[+] Active investment in 2025–2026 via acquisitions (Raito for access management, Husprey for notebooks, Deasy Labs for unstructured/AI metadata) and an AI Copilot for natural-language search

[−] Opaque, high pricing — nothing published; Lineage and Data Quality licensed separately, with high services/people TCO
[−] Heavy and slow to deploy — implementations frequently cited at six months, needing dedicated admins and stewards
[−] Business-user adoption is a recurring complaint — a governance-first, technical UI; competitors win deals specifically on UX and time-to-value
[−] No OSS or self-host path, and total metadata-graph lock-in; SaaS-centric with weaker on-prem support
[−] The data-quality module is a separately-licensed add-on (from the OwlDQ acquisition), not a native testing engine — it is governance-attached

Editorial

What Collibra actually is.

What Collibra is

Collibra is an enterprise data-and-AI governance platform built around a data catalog, a business glossary and semantic layer, a configurable stewardship workflow engine, automated lineage, and a separately licensed Data Quality & Observability module. Founded in 2008 by researchers from the Vrije Universiteit Brussel, it is one of the original category-defining governance incumbents, sold sales-led to large regulated enterprises and positioned in 2026 as “unified governance for data and AI.”

Where it fits

Collibra is the heavyweight governance incumbent, most directly cross-shopped with Atlan (modern, UX-led, lower TCO), Alation, and the OSS catalogs DataHub and OpenMetadata (open, engineer-led, free self-host). It typically wins where formal governance, regulatory auditability, and single-vendor breadth outweigh developer ergonomics and price. Its data-quality module competes with Monte Carlo, Anomalo, Bigeye, and Soda, though those remain better for CI/pipeline-gating and dbt-native workflows.

The metamodel, and why it locks you in

Everything in Collibra hangs off its operating model — the metamodel. Resources are organized into communities (typically business units), domains (typed containers), and assets, and every asset is an instance of an asset type descending from five out-of-the-box parents: Business, Data, Governance, Technology, and Issue. Asset types define which attributes and relations an asset can carry; scopes let communities customize types without forking the global model; statuses and workflows drive the approval lifecycle. This is Collibra’s deepest differentiator — a regulated enterprise can encode its actual accountability structure into the metadata graph — and its lock-in: years of custom asset types, relations, and workflow logic do not port to another catalog. Metamodel design is where implementations succeed or stall — budget explicit design time.
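To make the hierarchy concrete, here is a minimal Python sketch of the shape described above. It is an illustration only, not Collibra's API or storage model: the class names, the attribute-inheritance rule, and the `Candidate` default status are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

# The five out-of-the-box parent asset types named in the text.
ROOT_TYPES = ("Business", "Data", "Governance", "Technology", "Issue")


@dataclass
class AssetType:
    name: str
    parent: Optional["AssetType"] = None
    # Attribute names that assets of this type may carry.
    attributes: set = field(default_factory=set)

    def root(self) -> str:
        """Walk up the parent chain to the top-level ancestor."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node.name

    def allowed_attributes(self) -> set:
        """Own attributes plus inherited ones (inheritance is an assumption here)."""
        inherited = self.parent.allowed_attributes() if self.parent else set()
        return inherited | self.attributes


@dataclass
class Asset:
    name: str
    asset_type: AssetType
    domain: str               # typed container
    community: str            # typically a business unit
    status: str = "Candidate"  # hypothetical lifecycle status
    attributes: dict = field(default_factory=dict)

    def set_attribute(self, key: str, value: str) -> None:
        """Reject attributes the asset type does not define."""
        if key not in self.asset_type.allowed_attributes():
            raise ValueError(f"{self.asset_type.name} does not allow {key!r}")
        self.attributes[key] = value


# Hypothetical custom type descending from the "Data" parent.
data = AssetType("Data", attributes={"Description"})
table = AssetType("Table", parent=data, attributes={"Row Count"})
orders = Asset("orders", table, domain="Sales Warehouse", community="Finance")
orders.set_attribute("Row Count", "1200")
```

Even in this toy form the lock-in is visible: every custom type, allowed attribute, and status is data that lives inside the model, so a migration has to re-map all of it, not just the assets.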

Data products and the two marketplaces

In Collibra’s model a data product is itself an asset type: a reusable package combining context (purpose, ownership, quality), the data (tables, views, or reports), controls (policies and quality checks), and access. Each is backed by a data contract — a governed asset whose YAML manifest can follow the Open Data Contract Standard. Business users shop for these in Data Marketplace, the in-platform storefront searching a curated subset of catalog assets. Don’t confuse that with marketplace.collibra.com, the public listings site for connectors, JDBC drivers, and partner-built extensions.
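The four parts of a data product, plus its contract, can be pictured as a single record. The Python sketch below is a reading aid under stated assumptions: the field names and the `missing_parts` helper are hypothetical and do not reflect Collibra's schema or the Open Data Contract Standard's field names.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    name: str
    # Context: why the product exists and who answers for it.
    purpose: str = ""
    owner: str = ""
    # The data: tables, views, or reports.
    datasets: list = field(default_factory=list)
    # Controls: policies and quality checks.
    policies: list = field(default_factory=list)
    quality_checks: list = field(default_factory=list)
    # Access: how a consumer requests the product (hypothetical field).
    access_channel: str = ""
    # Identifier of the governing data contract (hypothetical field).
    contract_id: str = ""

    def missing_parts(self) -> list:
        """Return which of the described parts are still empty."""
        missing = []
        if not (self.purpose and self.owner):
            missing.append("context")
        if not self.datasets:
            missing.append("data")
        if not (self.policies and self.quality_checks):
            missing.append("controls")
        if not self.access_channel:
            missing.append("access")
        if not self.contract_id:
            missing.append("contract")
        return missing


draft = DataProduct(name="Customer 360")
ready = DataProduct(
    name="Customer 360",
    purpose="Single view of customer activity",
    owner="crm-stewards",
    datasets=["marts.customer_360"],
    policies=["pii-masking"],
    quality_checks=["no_null_customer_id"],
    access_channel="marketplace-request",
    contract_id="customer-360-v1",
)
```

A completeness check like `missing_parts` is the kind of gate a stewardship workflow might apply before a product is listed for business users.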

On the data-quality module

Unlike a pure catalog, Collibra ships a genuine data-quality engine: Data Quality & Observability, built on its 2021 OwlDQ acquisition. It combines adaptive ML rules, anomaly detection, and profiling with checks authored in standard SQL and remediation workflows. We score the quality cluster a 2 rather than a 3 because it is governance-attached: strong on ML detection and stewardship, but without the circuit-breaker, pre-merge diffing, and dbt-native testing that define dedicated observability tools. It is licensed as a separate module, which matters for cost.
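As a rough picture of what a self-adjusting threshold means in general, the sketch below flags a daily row count that falls outside a band derived from recent history. It is not Collibra's algorithm, which is not documented here; the rolling mean and standard deviation band, the 14-day window, and the function names are all assumptions chosen for simplicity.

```python
from statistics import mean, stdev


def adaptive_bounds(history, k=3.0):
    """Return (low, high) as mean +/- k sample standard deviations of history."""
    mu = mean(history)
    sigma = stdev(history)
    return mu - k * sigma, mu + k * sigma


def is_anomalous(history, today, window=14, k=3.0):
    """Flag today's value if it falls outside the band learned from recent history."""
    recent = history[-window:]
    if len(recent) < 2:
        # Too little history to estimate spread; do not flag.
        return False
    low, high = adaptive_bounds(recent, k)
    return not (low <= today <= high)


# Hypothetical daily row counts for one table.
row_counts = [1000, 1010, 990, 1005, 995, 1002, 998]
```

Because the band is recomputed from the trailing window on each run, it widens or narrows as the table's normal behaviour changes, with no hand-set limit to maintain.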

How to evaluate it

Evaluation is sales-led and module-scoped, with no self-serve trial: as of mid-2026 “try Collibra” resolves to a guided interactive product tour, and hands-on evaluation runs through a demo request or a scoped proof of concept. Decide first which modules you are buying — Catalog alone, or plus Lineage and Data Quality — because that drives both fit and cost. Then test the governance workflow engine against a real stewardship process you run today — approvals, issue management, policy enforcement — since that is where Collibra earns its price or doesn’t. Budget realistically: nothing is published, third parties consistently place the base subscription around USD 170k/year, and modules, services, and steward time land on top.

Capability spec

All capabilities by cluster.

Quality & testing

Secondary · strength 2/3

01 dbt-native

02 ML anomaly detection

03 Assertion-based testing

04 Pre-merge diffing

05 Schema drift detection

06 Freshness monitoring

07 Volume monitoring

08 Custom SQL checks

09 Circuit breaker

10 Data contracts

11 Column profiling

12 Runs in CI

13 Root cause analysis

14 Incident management

Test authoring: sql

Paradigm: both

ML training window: Adaptive rules plus ML for outliers, drift, and silent schema changes (OwlDQ heritage); thresholds self-adjust over history

Monitors at: warehouse table · warehouse column

Alerting: email

Catalog & discovery

Primary · strength 3/3

01 Business glossary

02 Glossary linked to assets

03 Natural language search

04 Ownership tracking

05 Data contracts

06 Governance workflows

07 Access request workflow

08 PII auto-classification

09 Tag propagation

10 Free self-hosted

Metadata ingestion: both

Search approach: keyword

Connectors: 100+

Asset types: tables · columns · dashboards · reports · glossary terms

Lineage & metadata

Secondary · strength 3/3

01 Cross-system lineage

02 Upstream source lineage

03 Impact analysis

04 Reverse impact analysis

05 Historical lineage

06 Lineage API

07 Lineage diff

Granularity: both

OpenLineage: consumer

Extraction: sql static analysis · openlineage events · api push
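Impact analysis and reverse impact analysis are, at heart, graph traversals over lineage edges. The sketch below shows the idea on a toy edge list; it is a generic breadth-first search, not Collibra's lineage engine or API, and the dataset names are hypothetical.

```python
from collections import deque


def reachable(edges, start, reverse=False):
    """Return every node reachable from start.

    edges   -- iterable of (source, target) lineage pairs
    reverse -- False walks downstream (impact); True walks upstream (sources)
    """
    adjacency = {}
    for src, dst in edges:
        a, b = (dst, src) if reverse else (src, dst)
        adjacency.setdefault(a, set()).add(b)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical table-level lineage.
lineage = [
    ("raw.orders", "staging.orders"),
    ("staging.orders", "marts.revenue"),
    ("raw.customers", "marts.revenue"),
    ("marts.revenue", "bi.revenue_dashboard"),
]

downstream = reachable(lineage, "staging.orders")
upstream = reachable(lineage, "marts.revenue", reverse=True)
```

The same traversal works at column granularity if the nodes are `table.column` pairs instead of tables.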

Warehouses & integrations

Where it plugs in.

Native warehouse support

snowflake · databricks · bigquery · redshift · synapse · fabric · postgres · mssql

Orchestrators & pipeline tools

airflow · dbt-cloud

01 dbt: Plugin

02 Airflow: Plugin

03 OpenLineage: consumer

04 API access: full

05 Terraform provider

06 Public SDK: python, java

Pricing

The honest pricing breakdown.

Pricing model: per seat, tiered

Charged per: seat

Published: ○ Contact sales required

Free tier: ○ No

OSS self-host: ○ Not available

Sales-only tier: Enterprise. Sales-led, with Lineage, Data Quality, Privacy, and other modules licensed separately. Nothing is published; third parties report a base around USD 170k/year before modules and services.

Full Collibra pricing breakdown — model, cost factors, alternatives by price →

Notable missing

What it doesn't do.

Circuit Breaker →

Halts downstream execution when a test fails — preventing bad data from propagating into marts, ML features, or BI dashboards. Requires tight integration with the orchestrator (Airflow, Dagster, dbt Cloud). Distinct from alerting-only tools which notify after damage is done.
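The difference between a circuit breaker and an alert can be shown in a few lines. The sketch below is a minimal in-memory illustration, not any orchestrator's API: `run_pipeline`, `CircuitOpen`, and the toy steps are hypothetical names invented for the example.

```python
class CircuitOpen(Exception):
    """Raised when a failed check blocks the remaining steps."""


def run_pipeline(steps, checks):
    """Run steps in order and stop as soon as a step's check fails.

    steps  -- list of (name, callable) pairs in execution order
    checks -- dict mapping a step name to a callable returning True for good data
    """
    completed = []
    for name, step in steps:
        step()
        completed.append(name)
        check = checks.get(name)
        if check is not None and not check():
            # Trip the breaker: nothing after this step is executed.
            raise CircuitOpen(f"check failed after {name!r}; downstream steps skipped")
    return completed


# Toy in-memory "warehouse" for the example.
state = {"staging": [], "mart": []}


def load_staging():
    state["staging"] = [{"id": 1, "amount": 10}, {"id": 2, "amount": None}]


def build_mart():
    state["mart"] = list(state["staging"])


def no_null_amounts():
    return all(row["amount"] is not None for row in state["staging"])


steps = [("load_staging", load_staging), ("build_mart", build_mart)]
checks = {"load_staging": no_null_amounts}
```

Calling `run_pipeline(steps, checks)` raises `CircuitOpen` after the staging load, so `build_mart` never runs and the null amount never reaches the mart. An alerting-only tool would let `build_mart` run and report the problem afterwards.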

Pre-Merge Diffing →

Compares the output of a model change against production before the pull request is merged — showing row-level and aggregate differences. Shifts data quality left into the development workflow. Datafold is the category-defining tool here; dbt's own cloud offering has added similar capabilities. Requires production-scale compute on a development branch, which has cost implications.
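A toy version of the comparison helps show what a pre-merge diff reports. The sketch below diffs two small in-memory tables by primary key; it is an illustration of the idea, not Datafold's or dbt's implementation, and the function name and sample rows are hypothetical. Real tools do this inside the warehouse at production scale.

```python
def diff_tables(prod, dev, key):
    """Compare two lists of row dicts by primary key.

    Returns row-level differences (added, removed, changed keys)
    and one aggregate difference (row count delta).
    """
    prod_by_key = {row[key]: row for row in prod}
    dev_by_key = {row[key]: row for row in dev}

    added = sorted(dev_by_key.keys() - prod_by_key.keys())
    removed = sorted(prod_by_key.keys() - dev_by_key.keys())
    changed = sorted(
        k for k in prod_by_key.keys() & dev_by_key.keys()
        if prod_by_key[k] != dev_by_key[k]
    )
    return {
        "added": added,
        "removed": removed,
        "changed": changed,
        "row_count_delta": len(dev) - len(prod),
    }


# Hypothetical model output: production vs. the development branch.
prod_rows = [{"id": 1, "total": 10}, {"id": 2, "total": 20}, {"id": 3, "total": 30}]
dev_rows = [{"id": 1, "total": 10}, {"id": 2, "total": 25}, {"id": 4, "total": 40}]

report = diff_tables(prod_rows, dev_rows, key="id")
```

Note that the row count delta is zero here even though one row was added, one removed, and one changed, which is why row-level diffs matter alongside aggregates.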

dbt-Native Testing →

Runs as part of the dbt execution context — as a package, post-hook, or artifact consumer — rather than monitoring the warehouse from the outside. Tests are defined in the same codebase as models, run on the same schedule, and fail the same CI pipeline. The alternative is warehouse-side monitoring (Monte Carlo-style) which catches issues dbt misses but reacts rather than prevents.

OpenLineage-Native →

Emits and consumes OpenLineage events as a first-class citizen rather than via a plugin or adapter. Signals commitment to interoperability with other metadata tooling — Marquez, OpenMetadata, Astronomer, and others can consume the same event stream. Increasingly the differentiator between "open" and "proprietary metadata model" observability platforms.
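For readers who have not seen one, the sketch below builds a simplified OpenLineage-style run event as a plain dictionary and shows what a consumer might extract from it. It covers only a subset of fields: to my understanding the specification also requires `producer` and `schemaURL`, which are omitted here, and the helper names and the `example` namespace are hypothetical.

```python
import uuid
from datetime import datetime, timezone


def make_run_event(job_name, inputs, outputs, namespace="example"):
    """Build a simplified run event with core OpenLineage fields only."""
    return {
        "eventType": "COMPLETE",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [{"namespace": namespace, "name": n} for n in inputs],
        "outputs": [{"namespace": namespace, "name": n} for n in outputs],
    }


def edges_from_event(event):
    """Consumer side: turn one event into (input, output) lineage edges."""
    return [
        (i["name"], o["name"])
        for i in event.get("inputs", [])
        for o in event.get("outputs", [])
    ]


event = make_run_event(
    "build_revenue",
    inputs=["raw.orders", "raw.customers"],
    outputs=["marts.revenue"],
)
edges = edges_from_event(event)
```

A consumer-only tool implements something like `edges_from_event`; a native one also emits events like this from its own jobs so that other metadata systems can read the same stream.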

Strong at

Drill into one capability.

ML anomaly detection → Data contracts → Business glossary → PII auto-classification → Column-level lineage → OpenLineage support →

Other key features

Reverse Impact Analysis · Transformation Lineage

Alternatives & migrations

If not Collibra, then what?

Common alternatives

Atlan → Polished UX and onboarding — consistently scores top in analyst rankings on time-to-value relative to peers ↔ Collibra vs Atlan

Alation → Category-defining catalog with behavioral, usage-ranked search and pioneering natural-language search ↔ Collibra vs Alation

DataHub → Best-in-class column-level SQL lineage parser (SQLGlot-based, benchmarked at 97–99% accuracy on standard corpora) ↔ Collibra vs DataHub

OpenMetadata → Highest connector count in the OSS catalog space (120+) — particularly strong on dashboards, ML, and pipeline systems ↔ Collibra vs OpenMetadata

See all 9 Collibra alternatives, scored and compared →

Common questions

Quick answers.

Is Collibra open source?

No. Collibra is a proprietary product.

How much does Collibra cost?

Collibra does not publish list pricing. It is sales-led, so you request a quote. There is no free tier.

How is Collibra deployed?

Collibra is a managed cloud (SaaS) product.

Does Collibra work with dbt and my warehouse?

It integrates with dbt via a plugin. Collibra supports Snowflake, Databricks, BigQuery, Redshift, and Synapse, plus three more (Fabric, Postgres, and MSSQL).

More catalog & discovery tools

Alation · Amundsen · Apache Atlas · Atlan · DataHub · OpenMetadata · Secoda · Select Star · Unity Catalog · All catalog & discovery →

Provenance.

Last verified 2026-07-03 against vendor documentation and, where possible, hands-on trial.

No paid placement · No vendor submissions · Rankings never for sale · Independence policy →