Why Airflow?

Nilesh Khandalkar
4 min read · Nov 26, 2022

Intro: Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows. In simple terms, Airflow is an orchestrator for building dynamic data pipelines. Airflow pipelines are dynamic, scalable, and interactive. You can interact with Airflow in three ways: the user interface (the most commonly used), the CLI (command-line interface), and the REST API (for example, to trigger a run on a button click from a frontend). Airflow is not a data processing or streaming engine like Spark: if you have a terabyte of data to process, write the job in Spark or BigQuery and then trigger it from Airflow.
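To make the REST API option a bit more concrete, here is a minimal sketch of what a "trigger on button click" call could look like from Python. It assumes an Airflow 2.x deployment with the stable REST API and basic authentication enabled. The base URL, the DAG id `spark_job_dag`, the credentials, and the helper name `build_trigger_request` are all hypothetical and only there for illustration. The sketch only builds the request; it does not send it.

```python
import base64
import json
import urllib.request


def build_trigger_request(base_url, dag_id, username, password, conf=None):
    """Build (but do not send) a POST request asking Airflow to start a DAG run.

    Assumes the Airflow 2.x stable REST API (/api/v1) with basic auth enabled.
    """
    url = f"{base_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns"
    # "conf" carries optional run-level parameters for the DAG run.
    body = json.dumps({"conf": conf or {}}).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


if __name__ == "__main__":
    # Hypothetical values: replace with your own webserver URL, DAG id and user.
    request = build_trigger_request(
        "http://localhost:8080",
        "spark_job_dag",
        "admin",
        "admin",
        conf={"run_date": "2022-11-26"},
    )
    print(request.get_method(), request.full_url)
    # To actually trigger the run, send it: urllib.request.urlopen(request)
```

The same run could be started from the CLI with `airflow dags trigger spark_job_dag`, or from the UI with the trigger button on the DAG. In each case Airflow only starts and tracks the run; the heavy processing stays in Spark or BigQuery.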

Nilesh Khandalkar

Passionate about Data and Cloud, working as Data Engineering Manager at Capgemini UK. GCP Professional Data Engineering Certified, Airflow Fundamentals Certified.