Airflow Sensor

From NovaOrdis Knowledge Base
Revision as of 23:26, 17 July 2022 by Ovidiu

External

Internal

Overview

A Sensor is a subclass of Operator. A sensor polls for an external event: it waits, then periodically checks whether the event has happened. When the event occurs, the sensor task succeeds, so its downstream tasks can run. Because sensors are idle most of the time, they come in three run modes that offer different degrees of efficiency: poke, reschedule and smart sensors.
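The polling behavior described above can be sketched in plain Python. This is a minimal illustration, not Airflow's implementation: wait_for and its sleep and clock arguments are hypothetical names introduced here, while poke_interval and timeout mirror the parameter names used by Airflow sensors.

```python
import time
from typing import Callable


def wait_for(
    poke: Callable[[], bool],
    poke_interval: float = 60.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,      # hypothetical hook, eases testing
    clock: Callable[[], float] = time.monotonic,      # hypothetical hook, eases testing
) -> int:
    """Call `poke` repeatedly until it returns True.

    Returns the number of checks that were made.
    Raises TimeoutError if the condition is still false once `timeout`
    seconds have elapsed since the first check.
    """
    started = clock()
    checks = 0
    while True:
        checks += 1
        if poke():
            # The awaited event happened: the "task" succeeds.
            return checks
        if clock() - started >= timeout:
            raise TimeoutError(f"condition not met after {checks} checks")
        # Idle until the next check; this is where a sensor spends most of its time.
        sleep(poke_interval)
```

Most of the elapsed time is spent in the sleep call, which is why the run mode, described next, matters for resource usage.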

Also see Deferrable Operators and Triggers.

Sensor Types

Poke

poke is the default run mode for a sensor. The sensor occupies a worker slot for its entire runtime and sleeps between "pokes". A sensor that checks very frequently, for example every second, should run in poke mode.

Reschedule

The sensor takes up a worker slot only while it is checking. It then frees the slot, sleeps for a set duration, and is rescheduled onto a worker slot for the next check. reschedule trades latency for resources. A sensor that checks infrequently, for example every minute, should run in reschedule mode.

The mode is set with the mode argument when the sensor is instantiated:

S3KeySensor(task_id='something', mode='reschedule', ...)
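The trade-off between the two modes can be made concrete with a rough back-of-the-envelope model. The sketch below is an assumption-laden simplification, not a measurement of Airflow: slot_seconds, event_at and check_seconds are hypothetical names, checks are assumed to start at time 0 and repeat every poke_interval seconds, and scheduler overhead for rescheduling is ignored.

```python
import math


def slot_seconds(event_at: float, poke_interval: float,
                 check_seconds: float, mode: str) -> float:
    """Estimate the worker-slot time a sensor consumes before it succeeds.

    event_at      -- seconds after sensor start at which the event occurs
    poke_interval -- seconds the sensor waits between two checks
    check_seconds -- seconds a single check takes
    mode          -- "poke" or "reschedule"
    """
    if mode not in ("poke", "reschedule"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Number of waits before the first check that can see the event.
    sleeps = math.ceil(event_at / poke_interval)
    checks = sleeps + 1
    if mode == "poke":
        # The slot is held while checking and while sleeping.
        return sleeps * poke_interval + checks * check_seconds
    # reschedule: the slot is held only while checking.
    return checks * check_seconds


for mode in ("poke", "reschedule"):
    print(mode, slot_seconds(event_at=600, poke_interval=60, check_seconds=1, mode=mode))
```

Under these assumptions, the slot time in reschedule mode grows only with the number of checks, while in poke mode it grows with the total waiting time.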

Smart Sensor

https://airflow.apache.org/docs/apache-airflow/stable/concepts/smart-sensors.html

Instead of each sensor running on its own, a single centralized version of the sensor batches all of its executions.

⚠️ Smart sensors were a deprecated early-access feature and were removed in Airflow 2.4.0. They are superseded by deferrable operators, which offer a more flexible way to build efficient long-running sensors and allow other operators to achieve similar efficiency gains.