Airflow Concepts

From NovaOrdis Knowledge Base
Revision as of 01:33, 11 July 2022 by Ovidiu

External

Internal

Workflow

DAG

https://airflow.apache.org/docs/apache-airflow/stable/concepts/dags.html
Graph Concepts | Directed Acyclic Graph

SubDAG

https://airflow.apache.org/docs/apache-airflow/stable/concepts/dags.html#concepts-subdags

A DAG is made of tasks linked by dependency relations. The DAG is not concerned with what happens inside the tasks; it is only concerned with how to run them: order, retries, timeouts, etc.
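
The way an execution order follows from dependency relations can be sketched in plain Python. This is a conceptual model only, not Airflow code: the task names and the helpers run_order and is_acyclic are hypothetical, and the ordering is done with the standard library's graphlib rather than with the Airflow scheduler.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task names. Each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}


def run_order(deps):
    """Return one valid execution order for the dependency graph."""
    return list(TopologicalSorter(deps).static_order())


def is_acyclic(deps):
    """Return True if the graph has no cycles, i.e. it is a valid DAG."""
    try:
        run_order(deps)
        return True
    except CycleError:
        return False


order = run_order(dependencies)
print(order)
```

The graph says nothing about what "extract" or "load" actually do. It only constrains when each task may start, which mirrors the separation described above.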

Task

https://airflow.apache.org/docs/apache-airflow/stable/concepts/tasks.html

Tasks have dependencies on each other.

Task Dependencies

Task Types

Operator

https://airflow.apache.org/docs/apache-airflow/stable/concepts/operators.html

Sensor

https://airflow.apache.org/docs/apache-airflow/stable/concepts/sensors.html

A Sensor is a subclass of Operator.
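
The subclass relationship can be illustrated with a toy model in plain Python. It assumes the usual reading of a sensor as an operator whose only work is to wait for a condition by polling. ToyOperator, ToySensor and CountdownSensor are hypothetical names and do not reproduce the Airflow classes or their signatures.

```python
import time


class ToyOperator:
    """Template for a unit of work; subclasses override execute()."""

    def __init__(self, task_id):
        self.task_id = task_id

    def execute(self):
        raise NotImplementedError


class ToySensor(ToyOperator):
    """An operator whose work is waiting: it polls poke() until it returns True."""

    def __init__(self, task_id, poke_interval=0.01, timeout=1.0):
        super().__init__(task_id)
        self.poke_interval = poke_interval  # seconds between checks
        self.timeout = timeout              # give up after this many seconds

    def poke(self):
        raise NotImplementedError

    def execute(self):
        deadline = time.monotonic() + self.timeout
        while not self.poke():
            if time.monotonic() >= deadline:
                raise TimeoutError(f"{self.task_id} timed out")
            time.sleep(self.poke_interval)
        return True


class CountdownSensor(ToySensor):
    """Hypothetical sensor that succeeds on the Nth check."""

    def __init__(self, task_id, pokes_needed, **kwargs):
        super().__init__(task_id, **kwargs)
        self.remaining = pokes_needed

    def poke(self):
        self.remaining -= 1
        return self.remaining <= 0


sensor = CountdownSensor("wait_for_input", pokes_needed=3)
result = sensor.execute()
```

Because the sensor is itself an operator, anything that can run an operator can run a sensor without special handling.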

TaskFlow-decorated Task

https://airflow.apache.org/docs/apache-airflow/stable/concepts/taskflow.html

Passing Data between Tasks

Tasks pass data to each other using:

  • XComs, when the amount of metadata to be exchanged is small.
  • A storage service, to which large files are uploaded by one task and from which they are downloaded by another.
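
The two approaches are often combined: the large payload goes to storage, and only a small reference to it is exchanged between tasks. The sketch below models this in plain Python under stated assumptions. A dictionary stands in for the XCom store, a temporary directory stands in for the storage service, and push_small_value, pull_small_value, produce and consume are hypothetical helpers, not Airflow API.

```python
import json
import tempfile
from pathlib import Path

# Stand-in for the XCom store: small values keyed by (task_id, key).
small_value_store = {}


def push_small_value(task_id, key, value):
    small_value_store[(task_id, key)] = value


def pull_small_value(task_id, key):
    return small_value_store[(task_id, key)]


def produce(storage_dir):
    """Write a large payload to storage and share only its location and size."""
    rows = [{"id": i, "value": i * i} for i in range(1000)]
    path = Path(storage_dir) / "rows.json"
    path.write_text(json.dumps(rows))
    push_small_value("produce", "row_count", len(rows))
    push_small_value("produce", "rows_path", str(path))


def consume():
    """Look up the payload location, then read the payload from storage."""
    path = Path(pull_small_value("produce", "rows_path"))
    rows = json.loads(path.read_text())
    return sum(row["value"] for row in rows)


# The temporary directory plays the role of the storage service.
with tempfile.TemporaryDirectory() as storage_dir:
    produce(storage_dir)
    total = consume()
```

Only the row count and the file path pass through the small-value store; the thousand rows themselves never do.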

TaskGroup

https://airflow.apache.org/docs/apache-airflow/stable/concepts/dags.html#concepts-taskgroups

XComs

https://airflow.apache.org/docs/apache-airflow/stable/concepts/taskflow.html

XComs is short for "cross-communications".

Workload

Scheduler

https://airflow.apache.org/docs/apache-airflow/stable/concepts/scheduler.html

Executor

https://airflow.apache.org/docs/apache-airflow/stable/executor/index.html

Worker

Metadata Database

Connections & Hooks

https://airflow.apache.org/docs/apache-airflow/stable/concepts/connections.html

Pool

https://airflow.apache.org/docs/apache-airflow/stable/concepts/pools.html