International Journal of Science and Research (IJSR)

Call for Papers | Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064



India | Data Knowledge Engineering | Volume 8 Issue 4, April 2019 | Pages: 2000 - 2002


Enhancing Big Data Interoperability: Automating Schema Expansion from Parquet to BigQuery

Preyaa Atri

Abstract: In the realm of data engineering, efficient data migration and transformation are pivotal. The Parquet Schema Expansion Migrator for BigQuery is a Python library designed to streamline the process of migrating column data from Parquet files to Google BigQuery tables, while expanding the BigQuery table schema to accommodate columns present in the Parquet data but missing from the BigQuery schema. This paper explores the problem of schema evolution in data warehouses, introduces the library as a solution, discusses its uses and impact, and outlines future enhancements and recommendations for robust data type management.

Keywords: BigQuery, Parquet, Schema Migration, Data Engineering, Cloud Storage, Data Transformation
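
The schema expansion step described in the abstract can be illustrated with a minimal sketch. It is not the library's actual API: the function name `expand_schema`, the `ARROW_TO_BIGQUERY` mapping, and the fallback to `STRING` for unmapped types are all assumptions made for illustration. The sketch uses plain Python structures in place of real Parquet and BigQuery schema objects, and appends every column found in the Parquet data but missing from the table schema as a `NULLABLE` field, since BigQuery only allows added columns to be `NULLABLE` or `REPEATED`.

```python
# Hypothetical sketch of schema expansion; not the API of the library
# described in the abstract.

# Assumed mapping from Arrow/Parquet type names to BigQuery SQL types.
ARROW_TO_BIGQUERY = {
    "int32": "INT64",
    "int64": "INT64",
    "float": "FLOAT64",
    "double": "FLOAT64",
    "string": "STRING",
    "bool": "BOOL",
    "binary": "BYTES",
    "date32": "DATE",
}


def expand_schema(bq_schema, parquet_columns, default_type="STRING"):
    """Return a new BigQuery schema with Parquet-only columns appended.

    bq_schema: list of {"name", "type", "mode"} dicts (existing table schema).
    parquet_columns: list of (column_name, arrow_type_name) tuples.
    default_type: assumed fallback for Arrow types missing from the mapping.
    """
    # BigQuery column names are case-insensitive, so compare in lowercase.
    existing = {field["name"].lower() for field in bq_schema}
    # Copy the existing fields so the caller's schema is left unchanged.
    expanded = [dict(field) for field in bq_schema]

    for name, arrow_type in parquet_columns:
        if name.lower() in existing:
            continue  # column already present in the table
        expanded.append({
            "name": name,
            "type": ARROW_TO_BIGQUERY.get(arrow_type, default_type),
            "mode": "NULLABLE",  # added columns cannot be REQUIRED
        })
        existing.add(name.lower())
    return expanded


if __name__ == "__main__":
    table_schema = [{"name": "id", "type": "INT64", "mode": "REQUIRED"}]
    parquet_cols = [("id", "int64"), ("score", "double")]
    for field in expand_schema(table_schema, parquet_cols):
        print(field)
```

Falling back to `STRING` for unmapped types keeps the sketch short; a stricter variant could raise an error instead, which fits the abstract's call for robust data type management.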


