Course Overview
Orchestrating Data Platforms With Dagster


Lesson #7

In this lesson we will:

  • Learn about Dagster sensors.

What Are Sensors?

Many data platforms run on batch schedules, for instance jobs that run every hour, day or week. These are often triggered using cron, a long-standing scheduler found on most Unix and Linux systems.

Nowadays, businesses want to be much more dynamic than this, for instance by processing data as it arrives so that insights reach end users sooner.

Sensors are Dagster's solution to this, allowing us to trigger jobs when source data has changed. If new data arrives, or if another job materialises an asset, the sensor can detect this and start the execution graph.

Sensor Process

Sensors are Python functions that are decorated with the @sensor decorator.

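By default, Dagster evaluates a sensor approximately every 30 seconds. The minimum_interval_seconds argument to the decorator sets the minimum gap between evaluations.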

Each time the sensor runs, it returns (or yields) either one or more RunRequest objects, which launch runs, or a SkipReason, which explains why no run was requested.

Asset Sensors

An Asset Sensor is a special type of sensor that is triggered when an asset is materialised.

For instance, if we have a pipeline which materialises a set of cleaned data from some source system, we may wish to then trigger another run to calculate some derived analytics.

This allows us to break up our Dagster DAGs into separate modules that do not have an explicit dependency on each other.

The sensor holds a hard-coded reference to the asset through its asset key:

@asset_sensor(asset_key=AssetKey("my_table"), job=my_job)

Continuous Delivery For Data Engineers

This site has been developed by the team behind Timeflow, an Open Source CI/CD platform designed for Data Engineers who use dbt as part of the Modern Data Stack. Our platform helps Data Engineers improve the quality, reliability and speed of their data transformation pipelines.

Timeflow Academy is the leading online, hands-on platform for learning about Data Engineering using the Modern Data Stack. Brought to you by Timeflow CI.

© 2023 Timeflow Academy. All rights reserved