International Journal of Science and Research (IJSR)

Call for Papers | Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064



United States | Data Knowledge Engineering | Volume 14 Issue 5, May 2025 | Pages: 1312 - 1315


Implementing Data Versioning and Lineage Tracking in ETL Workflows

Mounica Achanta, Dharanidhar Vuppu

Abstract: In modern data ecosystems, ensuring the integrity, traceability, and auditability of data is critical to building trust and enabling compliance. This article explores practical strategies for implementing data versioning and lineage tracking within ETL (Extract, Transform, Load) workflows. It outlines versioning techniques such as snapshotting and Change Data Capture (CDC), as well as lineage tracking methods including direct capture and log-based inference. The paper also discusses how to leverage tools like Delta Lake, Apache Atlas, OpenLineage, and Neo4j to manage metadata and visualize data flow. Through real-world examples and implementation guidance, readers will learn how to design resilient, transparent ETL pipelines that support robust data governance and operational efficiency.

Keywords: Change Data Capture, Data Observability, Data Versioning, Metadata Management, OpenLineage
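
The techniques named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: `SnapshotStore`, `run_step`, `upstream`, and `diff_snapshots` are hypothetical names, the store is in memory, and it stands in for tools such as Delta Lake or OpenLineage rather than using their APIs. Snapshots are versioned by content hash, lineage is captured directly when a step runs, and change events are approximated by diffing two snapshots, which is a simplification of log-based CDC (real CDC reads the source database's transaction log).

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class SnapshotStore:
    """Hypothetical in-memory store: immutable snapshots plus lineage edges."""
    snapshots: dict = field(default_factory=dict)  # version id -> list of rows
    lineage: list = field(default_factory=list)    # (input version, step, output version)

    def commit(self, rows):
        """Store rows as an immutable snapshot keyed by a content hash."""
        payload = json.dumps(rows, sort_keys=True).encode("utf-8")
        version = hashlib.sha256(payload).hexdigest()[:12]
        # Re-committing identical data yields the same version id.
        self.snapshots.setdefault(version, json.loads(payload))
        return version

    def run_step(self, name, input_version, transform):
        """Apply a transform and record the lineage edge at execution time."""
        output_rows = transform(self.snapshots[input_version])
        output_version = self.commit(output_rows)
        self.lineage.append((input_version, name, output_version))
        return output_version

    def upstream(self, version):
        """Walk lineage edges backwards to list every ancestor version."""
        ancestors = []
        frontier = [version]
        while frontier:
            current = frontier.pop()
            for src, _step, dst in self.lineage:
                if dst == current and src not in ancestors:
                    ancestors.append(src)
                    frontier.append(src)
        return ancestors


def diff_snapshots(old_rows, new_rows, key="id"):
    """Approximate CDC-style change events by comparing two snapshots on a key."""
    old = {r[key]: r for r in old_rows}
    new = {r[key]: r for r in new_rows}
    events = []
    for k in sorted(new.keys() - old.keys()):
        events.append(("insert", new[k]))
    for k in sorted(old.keys() - new.keys()):
        events.append(("delete", old[k]))
    for k in sorted(old.keys() & new.keys()):
        if old[k] != new[k]:
            events.append(("update", new[k]))
    return events
```

Committing a raw extract returns a version id, `run_step` produces a new version and appends one lineage edge, and `upstream` answers the audit question "which inputs produced this output?". In a production pipeline the snapshot dictionary would typically be replaced by a versioned table format and the edge list by a metadata or graph store.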

How to Cite?: Mounica Achanta, Dharanidhar Vuppu, "Implementing Data Versioning and Lineage Tracking in ETL Workflows", Volume 14 Issue 5, May 2025, International Journal of Science and Research (IJSR), Pages: 1312-1315, https://www.ijsr.net/getabstract.php?paperid=SR25520072821, DOI: https://dx.doi.org/10.21275/SR25520072821

