International Journal of Science and Research (IJSR)

Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064


Research Paper | Computer Technology | India | Volume 12 Issue 5, May 2023

Streamlining Enterprise Data Pipelines with an Automated DAG Factory for Airflow Orchestration in Cloud Environments using YAML Templates and JSON-Serialized Variables

Ramamurthy Valavandan | Balakrishnan Gothandapani | Savitha Ramamurthy

Abstract: Airflow is an open-source platform for creating, scheduling, and monitoring data pipelines. Its Directed Acyclic Graph (DAG) factory provides a mechanism for creating and managing DAGs programmatically. However, the current implementation of the DAG factory in Airflow requires writing Python code, which can be time-consuming and error-prone. In this paper, we propose a YAML-based DAG factory automation framework for Airflow that provides a simple and intuitive way to define DAGs in YAML format. We describe the design and implementation of the framework and give examples of how it can be used to automate the creation and management of DAGs in a cloud environment. We also evaluate the performance and scalability of the framework on real-world datasets and compare it with the existing Python-based DAG factory in Airflow. Our results demonstrate that the YAML-based framework provides a more efficient and flexible way to create and manage DAGs in Airflow, especially in large-scale data processing scenarios.

Keywords: Airflow, Directed Acyclic Graph, DAG factory, YAML, automation, Python, CLI tool, schema file, GCP, Composer, JSON, dictionary, task status, DAG tasks, template generation, variable
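
The declarative approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' framework: the specification fields (`dag_id`, `schedule`, `tasks`, `depends_on`) and the helper names `load_spec` and `task_order` are hypothetical, chosen only for illustration. The specification is written as JSON so that the example needs only the Python standard library; a YAML file with the same structure would yield an equivalent dictionary once parsed. The sketch validates the specification and derives a valid task execution order, rejecting cyclic dependencies, which is the kind of check a DAG factory would plausibly perform before generating Airflow DAG objects.

```python
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical DAG specification; the field names are illustrative assumptions,
# not the schema used in the paper.
SPEC_JSON = """
{
  "dag_id": "daily_sales_load",
  "schedule": "@daily",
  "tasks": {
    "extract":   {"operator": "bash",   "depends_on": []},
    "transform": {"operator": "python", "depends_on": ["extract"]},
    "load":      {"operator": "bash",   "depends_on": ["transform"]}
  }
}
"""


def load_spec(text):
    """Parse a specification and check that the required top-level keys exist."""
    spec = json.loads(text)
    for key in ("dag_id", "schedule", "tasks"):
        if key not in spec:
            raise ValueError(f"missing required key: {key}")
    return spec


def task_order(spec):
    """Return task names in an order that respects every declared dependency.

    Raises ValueError if a task depends on an undefined task or if the
    dependencies form a cycle (graphlib.CycleError subclasses ValueError).
    """
    tasks = spec["tasks"]
    graph = {}
    for name, cfg in tasks.items():
        deps = cfg.get("depends_on", [])
        unknown = [d for d in deps if d not in tasks]
        if unknown:
            raise ValueError(f"task {name!r} depends on undefined tasks: {unknown}")
        # TopologicalSorter expects a mapping of node -> set of predecessors.
        graph[name] = set(deps)
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    spec = load_spec(SPEC_JSON)
    print(spec["dag_id"], task_order(spec))
    # prints: daily_sales_load ['extract', 'transform', 'load']
```

In a real deployment, each entry in the returned order would be mapped to an Airflow operator and wired to its upstream tasks; that step is omitted here because it depends on the installed Airflow version and on the operator set the framework supports.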

Edition: Volume 12 Issue 5, May 2023

Pages: 656-673
