International Journal of Science and Research (IJSR)

Call for Papers | Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064



India | Computer Science and Information Technology | Volume 14 Issue 5, May 2025 | Pages: 1608 - 1610


A Study on Utilizing Delta Lake for Efficiently using LakeHouse

Ravi Rane, Pooja Mulik

Abstract: Delta Lake is an open-source storage layer that adds ACID transactional guarantees, scalable metadata handling, and unified batch and stream processing to data lakes on Apache Spark. It has become integral to modern data architectures by providing reliability, schema enforcement, and support for time travel. However, achieving low-latency, high-throughput query execution over large-scale Delta tables requires deliberate optimization across multiple system layers. This paper examines Delta Lake's underlying architecture, including its transaction log, snapshot isolation model, and Parquet-based file layout, and presents advanced performance tuning techniques. These include optimizing partitioning schemes for effective pruning, leveraging data skipping via file-level statistics, reducing file fragmentation through compaction, caching reused data in Spark, applying Z-order clustering for efficient multi-column filtering, and maintaining compact, query-friendly metadata.
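
As an illustration of the data skipping idea mentioned in the abstract, the following is a minimal sketch in plain Python, not the paper's implementation and not Delta Lake's actual code. It assumes each data file carries a minimum and maximum value for one column, and it keeps only the files whose range can overlap a query filter. The names FileStats and files_to_scan, and the file paths, are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FileStats:
    """Hypothetical per-file statistics for a single integer column."""
    path: str
    min_value: int
    max_value: int


def files_to_scan(files: List[FileStats], lo: int, hi: int) -> List[FileStats]:
    """Return the files whose [min, max] range overlaps the filter lo <= x <= hi.

    A file is skipped when its range lies entirely below lo or entirely above hi,
    because no row in it can satisfy the filter.
    """
    return [f for f in files if f.max_value >= lo and f.min_value <= hi]


# Illustrative file listing (assumed values, not taken from the paper).
files = [
    FileStats("part-000.parquet", 1, 100),
    FileStats("part-001.parquet", 101, 200),
    FileStats("part-002.parquet", 201, 300),
]

# A filter such as 150 <= x <= 180 only needs the middle file.
selected = files_to_scan(files, 150, 180)
print([f.path for f in selected])  # ['part-001.parquet']
```

The sketch also suggests why clustering matters: when values are spread across files so that every file's range covers most of the domain, the overlap test keeps nearly all files and little is skipped.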

Keywords: Delta Lake optimization, transactional data lakes, big data architecture, Apache Spark performance, Z-order clustering

How to Cite: Ravi Rane, Pooja Mulik, "A Study on Utilizing Delta Lake for Efficiently using LakeHouse", Volume 14 Issue 5, May 2025, International Journal of Science and Research (IJSR), Pages: 1608-1610, https://www.ijsr.net/getabstract.php?paperid=SR25525181650, DOI: https://dx.doi.org/10.21275/SR25525181650

