International Journal of Science and Research (IJSR)

Call for Papers | Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064


Informative Article | Data & Knowledge Engineering | India | Volume 8 Issue 4, April 2019 | Rating: 4.9 / 10

Enhancing Big Data Interoperability: Automating Schema Expansion from Parquet to BigQuery

Preyaa Atri

Abstract: Efficient data migration and transformation are central to data engineering. The Parquet Schema Expansion Migrator for BigQuery is a Python library that migrates column data from Parquet files to Google BigQuery tables and expands the BigQuery table schema to include columns that are present in the Parquet data but missing from the table. This paper examines the problem of schema evolution in data warehouses, introduces the library as a solution, discusses its uses and impact, and outlines future enhancements and recommendations for robust data type management.

Keywords: BigQuery, Parquet, Schema Migration, Data Engineering, Cloud Storage, Data Transformation
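
The schema-expansion step described in the abstract can be illustrated with a minimal sketch. It is not the library's actual implementation: the function name `columns_to_add`, the `ARROW_TO_BQ` mapping, and the fallback to STRING for unmapped types are all assumptions made for illustration. The sketch takes the Parquet schema as a plain mapping of column name to Arrow type string (as printed by pyarrow) and returns the columns that would need to be appended to the BigQuery table.

```python
from typing import Dict, Iterable, List, Tuple

# Illustrative mapping from Arrow type strings to BigQuery types.
# This table is an assumption for the sketch, not the library's own mapping.
ARROW_TO_BQ: Dict[str, str] = {
    "int32": "INTEGER",
    "int64": "INTEGER",
    "float": "FLOAT",
    "double": "FLOAT",
    "bool": "BOOLEAN",
    "string": "STRING",
    "binary": "BYTES",
    "date32[day]": "DATE",
    "timestamp[us]": "TIMESTAMP",
}


def columns_to_add(
    parquet_schema: Dict[str, str],
    bq_columns: Iterable[str],
) -> List[Tuple[str, str, str]]:
    """Return (name, bq_type, mode) for Parquet columns missing from BigQuery.

    Hypothetical helper: compares names case-insensitively, because BigQuery
    column names are case-insensitive, and marks every new column NULLABLE,
    since columns added to an existing table cannot be REQUIRED.
    """
    existing = {name.lower() for name in bq_columns}
    additions = []
    for name, arrow_type in parquet_schema.items():
        if name.lower() in existing:
            continue  # column already present in the table
        # Assumption: unmapped types fall back to STRING rather than failing.
        bq_type = ARROW_TO_BQ.get(arrow_type, "STRING")
        additions.append((name, bq_type, "NULLABLE"))
    return additions


if __name__ == "__main__":
    parquet_schema = {"id": "int64", "Name": "string", "score": "double"}
    print(columns_to_add(parquet_schema, ["ID", "name"]))
```

In practice, the Parquet schema could be read with `pyarrow.parquet.read_schema`, and the resulting columns could be appended to `table.schema` as `bigquery.SchemaField` objects and applied with `client.update_table(table, ["schema"])` from the google-cloud-bigquery client. Whether the library follows this route is not stated in the abstract.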

Edition: Volume 8 Issue 4, April 2019

Pages: 2000 - 2002
