Informative Article | Computer Science and Information Technology | United States of America | Volume 13 Issue 9, September 2024
Leveraging Event-Based Architecture, AWS Step Functions, AWS Batch, and DynamoDB to Run ETL or ELT Jobs Concurrently While Allowing Granular Replay Capabilities
Akshay Prabhu
Abstract: Traditional Extract, Transform, Load (ETL) and Extract, Load, Transform (ELT) jobs are often perceived as hardware-intensive, requiring persistent EC2 instances to handle large data sets. This conventional approach presents challenges: long-running jobs must be monitored manually, and a job cannot be replayed from a specific point or stage in the ETL/ELT process. In addition, each ETL/ELT phase has its own potential failure points, which complicates the operational management of these workflows. AWS provides a suite of serverless services, such as EventBridge, S3, SNS, Lambda, Step Functions, and Batch, that can be combined into a scalable and resilient ETL/ELT architecture. This paper explores how integrating these services can transform traditional ETL/ELT processes into a more flexible, state-managed saga (1) with granular replay capabilities. The goal is to show Software Architects and Engineers how an architecture built on these AWS services can improve traditional data processing workflows, with a focus on concurrent job execution and precise error recovery.
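The idea of a state-managed saga with granular replay can be illustrated with a minimal sketch. It is not the paper's implementation: the stage names, the `JobStateStore` class (an in-memory stand-in for a DynamoDB table keyed by job and stage), and the `run_job` function are all hypothetical, and no AWS service is actually called. The sketch only shows how recording a status per stage lets a failed job resume at the stage that failed, or be replayed from a chosen stage, without repeating earlier work.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stage names; a real workflow would define its own.
STAGES: List[str] = ["extract", "transform", "load"]


@dataclass
class JobStateStore:
    """In-memory stand-in (assumption) for a table keyed by (job_id, stage)."""
    items: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def mark(self, job_id: str, stage: str, status: str) -> None:
        self.items[(job_id, stage)] = status

    def status(self, job_id: str, stage: str) -> Optional[str]:
        return self.items.get((job_id, stage))


def run_job(
    job_id: str,
    handlers: Dict[str, Callable[[str], None]],
    store: JobStateStore,
    replay_from: Optional[str] = None,
) -> List[str]:
    """Run stages in order and return the stages that succeeded in this call.

    Without replay_from, execution resumes at the first stage that has not
    yet succeeded. With replay_from, that stage and all later ones are re-run.
    """
    if replay_from is None:
        pending = [s for s in STAGES if store.status(job_id, s) != "SUCCEEDED"]
        if not pending:
            return []  # nothing left to do for this job
        replay_from = pending[0]

    executed: List[str] = []
    for stage in STAGES[STAGES.index(replay_from):]:
        try:
            handlers[stage](job_id)
        except Exception:
            # Record the failure and stop; later stages stay untouched.
            store.mark(job_id, stage, "FAILED")
            break
        store.mark(job_id, stage, "SUCCEEDED")
        executed.append(stage)
    return executed
```

Because state is keyed by job identifier, several jobs can be tracked in the same store independently of one another; in this sketch, calling `run_job` again for a failed job re-runs only the failed stage and the stages after it.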
Keywords: Event-Based Architecture, AWS Step Functions, AWS Batch, Amazon DynamoDB, ETL (Extract, Transform, Load), ELT (Extract, Load, Transform), Serverless Computing
Edition: Volume 13 Issue 9, September 2024
Pages: 25-28
DOI: https://www.doi.org/10.21275/SR24829051214