Open standard for data lineage collection. Apache-2.0 under Linux Foundation. Defines lineage events captured from Airflow, Spark, dbt, and other tools, ingested into backends like Marquez. The lineage protocol — not a backend itself.
OpenLineage is the open standard for data lineage collection — Apache-2.0 under Linux Foundation. Defines the protocol for emitting lineage events from Airflow/Spark/dbt/Flink and ingesting into backends like Marquez/DataHub/OpenMetadata. The spec, not a backend.
OpenLineage's positioning as a vendor-neutral standard is the trust differentiator: lineage events portable across tools without vendor lock-in. From a Trust Before Intelligence lens, this is the canonical primitive for cross-tool lineage propagation.
Event emission sub-100ms.
JSON event schema.
Backend-dependent.
Vendor-neutral spec; runs anywhere.
It IS lineage — strongest C.
Open spec; events inspectable.
Lineage as audit. 1/6 -> 3.
Lineage IS distributed tracing for data.
Backend-dependent.
Canonical lineage spec.
5/6 -> 4.
Best suited for
Compliance certifications
OSS spec; backend determines compliance.
Use with caution for
Marquez is OpenLineage's backend.
View analysis →DataHub ingests OpenLineage events as one input.
View analysis →Role: L3 lineage event spec.
Upstream: Pipeline tools emit events.
Downstream: Backends ingest events.
Mitigation: Add OpenLineage emitter to all pipelines.
OpenLineage's purpose.
Need Marquez/DataHub.
This analysis is AI-generated using the INPACT and GOALS frameworks from "Trust Before Intelligence." Scores and assessments are algorithmic and may not reflect the vendor's complete capabilities. Always validate with your own evaluation.