Airflow Sensor

From NovaOrdis Knowledge Base



Revision as of 22:29, 17 July 2022

External

Internal

Overview

A Sensor is a subclass of Operator. Sensors poll (wait and then periodically check) for an external event to happen. When the event a sensor is waiting for occurs, its task succeeds, so its downstream tasks can run. Sensors are idle most of the time, so there are three ways of running them, each with a different degree of efficiency: poke, reschedule and smart sensors.
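The polling behavior described above can be sketched in plain Python. This is only an illustration of "wait and then periodically check", not Airflow's implementation; the function name `wait_for_event` and its parameters are hypothetical and chosen for this example.

```python
import time
from typing import Callable


def wait_for_event(
    check: Callable[[], bool],
    poke_interval: float = 1.0,
    timeout: float = 10.0,
) -> int:
    """Poll `check` until it returns True and return the number of checks made.

    Raises TimeoutError if the event has not occurred within `timeout` seconds.
    Hypothetical helper for illustration; not part of Airflow.
    """
    started = time.monotonic()
    attempts = 0
    while True:
        attempts += 1
        if check():
            # The event occurred: the waiting "task" succeeds.
            return attempts
        if time.monotonic() - started >= timeout:
            raise TimeoutError(f"event not seen after {attempts} checks")
        # Idle until the next check.
        time.sleep(poke_interval)
```

The caller supplies the condition as a function, so the same loop can wait for a file, a key in a bucket, or any other external event.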

Also see Deferrable Operators and Triggers.

Sensor Types

Poke

poke is the default run mode for a sensor. The sensor takes up a worker slot for its entire runtime and sleeps between "pokes". A sensor that checks every second should be in poke mode.

Reschedule

The sensor takes up a worker slot only while it is checking. It then frees the worker slot, sleeps for a set duration, and is rescheduled onto a worker slot for the next check. reschedule trades off latency for resources. A sensor that checks every minute should be in reschedule mode.

The reschedule mode can be configured when the sensor is instantiated.

S3KeySensor(task_id='something', mode='reschedule', ...)
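To make the resource trade-off concrete, the following toy model estimates how long a worker slot is held in each mode. It is a sketch under simplifying assumptions: checks happen at fixed intervals, each check takes `check_cost` seconds, and the scheduling overhead that reschedule adds to each check is ignored. The function `slot_seconds` and its parameters are hypothetical and not part of Airflow.

```python
import math


def slot_seconds(mode: str, wait: float, poke_interval: float, check_cost: float) -> float:
    """Worker-slot time consumed while waiting `wait` seconds for an event.

    Toy model: checks happen at t = 0, poke_interval, 2 * poke_interval, ...
    and the first check at or after `wait` succeeds.
    """
    checks = math.ceil(wait / poke_interval) + 1
    if mode == "poke":
        # The slot is held from the first check through the successful one,
        # including all the sleeps in between.
        return (checks - 1) * poke_interval + check_cost
    if mode == "reschedule":
        # The slot is held only while a check is actually running.
        return checks * check_cost
    raise ValueError(f"unknown mode: {mode}")


# Waiting one hour, checking every minute, one second per check.
poke_cost = slot_seconds("poke", wait=3600, poke_interval=60, check_cost=1)
reschedule_cost = slot_seconds("reschedule", wait=3600, poke_interval=60, check_cost=1)
```

Under these assumptions the poke sensor holds a slot for about an hour, while the reschedule sensor holds one for about a minute in total.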

Smart Sensor

https://airflow.apache.org/docs/apache-airflow/stable/concepts/smart-sensors.html

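A single centralized version of the sensor batches all of its executions, instead of each sensor instance running as its own task. Smart sensors were deprecated in favor of Deferrable Operators and removed in Airflow 2.4.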
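The batching idea can be illustrated with a conceptual sketch: one central pass runs the checks of many waiting sensors, rather than each sensor holding its own slot. This is not how Airflow implemented smart sensors; the function `poke_batch` and the task ids are hypothetical.

```python
from typing import Callable, Dict, List


def poke_batch(pending: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run one centralized pass over all pending checks.

    Returns the ids whose condition was met and removes them from `pending`.
    Hypothetical helper for illustration; not part of Airflow.
    """
    done = [task_id for task_id, check in pending.items() if check()]
    for task_id in done:
        del pending[task_id]
    return done


# Two waiting sensors, checked together in a single pass.
pending = {"wait_for_a": lambda: True, "wait_for_b": lambda: False}
finished = poke_batch(pending)
```

After the pass, `finished` holds the sensors whose event occurred and `pending` holds those that still need to be checked in the next pass.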