OpenLineage

L3: Unified Semantic Layer · Data Lineage · Free (OSS, standard) · Apache-2.0 · OSS

Open standard for data lineage collection, licensed under Apache-2.0 and hosted under the Linux Foundation. It defines the lineage events that tools such as Airflow, Spark, and dbt emit and that backends such as Marquez ingest. OpenLineage is the lineage protocol, not a backend itself.

AI Analysis

OpenLineage is the open standard for data lineage collection, released under Apache-2.0 and governed under the Linux Foundation. It defines the protocol by which tools such as Airflow, Spark, dbt, and Flink emit lineage events and by which backends such as Marquez, DataHub, and OpenMetadata ingest them. It is the spec, not a backend.
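To make the "lineage event" idea concrete, here is a minimal sketch of what a run event can look like when assembled by hand in Python. The top-level field names (eventType, eventTime, run, job, inputs, outputs, producer, schemaURL) follow the OpenLineage run event model; the namespace, job name, dataset names, producer URI, and the spec version inside the schema URL are placeholder assumptions for illustration and should be checked against the spec version in use.

```python
import json
import uuid
from datetime import datetime, timezone

# Placeholder values, assumed for illustration only.
PRODUCER = "https://example.com/my-pipeline"
SCHEMA_URL = "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent"


def build_run_event(event_type, run_id, job_name, inputs, outputs, namespace="example"):
    """Return a dict shaped like an OpenLineage run event."""

    def dataset(name):
        # Datasets are identified by a namespace plus a name.
        return {"namespace": namespace, "name": name}

    return {
        "eventType": event_type,  # e.g. START, COMPLETE, FAIL
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id},  # one UUID shared by all events of a run
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [dataset(n) for n in inputs],
        "outputs": [dataset(n) for n in outputs],
        "producer": PRODUCER,
        "schemaURL": SCHEMA_URL,
    }


run_id = str(uuid.uuid4())
event = build_run_event(
    "COMPLETE", run_id, "daily_orders", ["raw.orders"], ["analytics.orders"]
)
print(json.dumps(event, indent=2))
```

In practice the tool integrations build and emit these events; the sketch only shows the shape of the payload a backend would receive.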

Trust Before Intelligence

OpenLineage's positioning as a vendor-neutral standard is the trust differentiator: lineage events are portable across tools without vendor lock-in. From a Trust Before Intelligence lens, this is the canonical primitive for cross-tool lineage propagation.

INPACT Score

26/36
I — Instant
5/6

Event emission sub-100ms.

N — Natural
3/6

JSON event schema.

P — Permitted
3/6

Backend-dependent.

A — Adaptive
5/6

Vendor-neutral spec; runs anywhere.

C — Contextual
5/6

It IS lineage — strongest C.

T — Transparent
5/6

Open spec; events inspectable.

GOALS Score

19/25
G — Governance
3/6

Lineage serves as an audit trail. Score adjusted from 1/6 to 3/6.

O — Observability
4/6

Lineage IS distributed tracing for data.

A — Availability
3/6

Backend-dependent.

L — Lexicon
5/6

Canonical lineage spec.

S — Solid
4/6

Score adjusted from 5/6 to 4/6.

AI-Identified Strengths

  • + Vendor-neutral standard
  • + Apache-2.0 LF governance
  • + Adopted by Airflow/Spark/dbt/Flink/Marquez/DataHub/OpenMetadata
  • + Active spec evolution

AI-Identified Limitations

  • - Spec, not a backend — needs Marquez/DataHub/OpenMetadata
  • - All flags false (it's a spec)

Industry Fit

Best suited for

  • Cross-tool lineage
  • Vendor-neutral lineage standard

Compliance certifications

OSS spec; backend determines compliance.

Use with caution for

Deployments that lack a backend implementation alongside the spec.

AI-Suggested Alternatives

Marquez

Marquez is the reference backend implementation for OpenLineage.

DataHub

DataHub ingests OpenLineage events as one input.


Integration in 7-Layer Architecture

Role: L3 lineage event spec.

Upstream: Pipeline tools emit events.

Downstream: Backends ingest events.
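The upstream/downstream split can be sketched from the backend's side: given a stream of events, a backend can derive dataset-to-dataset edges and answer "what feeds this dataset?". The code below is a simplified illustration, not how Marquez or DataHub actually store lineage. It keys datasets by name only (real backends also use the namespace), and the function names and sample events are hypothetical.

```python
from collections import defaultdict


def lineage_edges(events):
    """Collect (input dataset, job, output dataset) triples from event dicts."""
    edges = set()
    for ev in events:
        job = ev["job"]["name"]
        for src in ev.get("inputs", []):
            for dst in ev.get("outputs", []):
                edges.add((src["name"], job, dst["name"]))
    return sorted(edges)


def upstream_of(dataset, edges):
    """Walk the edges backwards to find every dataset that feeds `dataset`."""
    parents = defaultdict(set)
    for src, _job, dst in edges:
        parents[dst].add(src)
    seen, stack = set(), [dataset]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return sorted(seen)


# Two hypothetical events emitted by different pipeline tools.
events = [
    {"job": {"name": "ingest"},
     "inputs": [{"name": "raw.orders"}],
     "outputs": [{"name": "staging.orders"}]},
    {"job": {"name": "transform"},
     "inputs": [{"name": "staging.orders"}],
     "outputs": [{"name": "analytics.orders"}]},
]

edges = lineage_edges(events)
print(upstream_of("analytics.orders", edges))
```

Because both events use the same dataset naming, the two jobs connect into one chain even though each tool emitted its event independently, which is the point of a shared spec.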

⚡ Trust Risks

High: Emission not configured in pipelines

Mitigation: Add OpenLineage emitter to all pipelines.
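As a rough picture of what an emitter does, the sketch below prepares an HTTP POST that carries one event to a backend. The base URL and the /api/v1/lineage path are assumptions matching a default local Marquez setup; other backends expose different endpoints. The function name and the sample event are hypothetical, and real pipelines would normally rely on the tool's own OpenLineage integration instead of hand-written requests.

```python
import json
import urllib.request


def build_lineage_request(event, base_url="http://localhost:5000"):
    """Prepare (but do not send) an HTTP POST carrying one lineage event."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/lineage",  # assumed Marquez-style path
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical, trimmed-down event used only to exercise the function.
event = {
    "eventType": "START",
    "job": {"namespace": "example", "name": "daily_orders"},
    "run": {"runId": "00000000-0000-0000-0000-000000000000"},
}

request = build_lineage_request(event)
print(request.get_method(), request.full_url)

# Sending is left out so the sketch runs without a backend:
# urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to check what each pipeline would emit before pointing it at a live backend.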

Use Case Scenarios

Strong: Cross-tool lineage with vendor-neutral standard

OpenLineage's purpose.

Weak: Standalone use without backend

Requires a backend such as Marquez or DataHub.

Stack Impact

L3: Vendor-neutral lineage spec.

This analysis is AI-generated using the INPACT and GOALS frameworks from "Trust Before Intelligence." Scores and assessments are algorithmic and may not reflect the vendor's complete capabilities. Always validate with your own evaluation.