Paper Title : Creating Data Pipelines using Apache Airflow
ISSN : 2394-2231
Year of Publication : 2022
DOI : 10.5281/zenodo.6828344
MLA Style: Creating Data Pipelines using Apache Airflow "Sameer Shukla" Volume 9 - Issue 4 International Journal of Computer Techniques (IJCT), ISSN: 2394-2231, www.ijctjournal.org
APA Style: Creating Data Pipelines using Apache Airflow "Sameer Shukla" Volume 9 - Issue 4 International Journal of Computer Techniques (IJCT), ISSN: 2394-2231, www.ijctjournal.org
Abstract
This paper addresses the use of Apache Airflow for creating data pipelines. It gives an overview of what Apache Airflow is, describes its basic building blocks such as DAGs and Operators, and explains how to create a simple pipeline using a realistic ETL use case. The paper also briefly describes Cloud Composer, the fully managed Apache Airflow service offered by Google Cloud Platform. Results from this study suggest that Apache Airflow can simplify the data pipeline creation process, since the only prerequisite for using Airflow is basic Python knowledge: Operators in Airflow are written in Python 3.6 or above.
Keywords
Apache Airflow, Directed Acyclic Graphs, Python, Pandas, Cloud Composer, Google Cloud Platform, Google Cloud Storage