Lesson Overview

In this lesson we will:

  • Learn about scheduled tasks including what they are and how to use them.

Scheduled Tasks

Tasks allow us to schedule activities to run within the Snowflake database, for instance hourly, daily or on a custom schedule defined with a cron expression.

A task could be as simple as a SQL statement, or could call into a stored procedure if we need more complex logic.
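As a minimal sketch of both styles, the statements below define one task that runs a single SQL statement and one that calls a stored procedure. All object names here (the warehouse compute_wh, the tables raw_events and event_counts_log, and the procedure refresh_reporting) are hypothetical and assumed to exist already.

```sql
-- Hypothetical names throughout: compute_wh, raw_events, event_counts_log.
-- A task whose body is a single SQL statement, run every 60 minutes.
CREATE OR REPLACE TASK log_event_count
  WAREHOUSE = compute_wh
  SCHEDULE = '60 MINUTE'
AS
  INSERT INTO event_counts_log (logged_at, total_events)
  SELECT CURRENT_TIMESTAMP(), COUNT(*)
  FROM raw_events;

-- A task whose body calls a stored procedure (assumed to exist already).
CREATE OR REPLACE TASK refresh_reporting_daily
  WAREHOUSE = compute_wh
  SCHEDULE = 'USING CRON 0 6 * * * UTC'  -- every day at 06:00 UTC
AS
  CALL refresh_reporting();

-- Tasks are created in a suspended state and only start running once resumed.
ALTER TASK log_event_count RESUME;
ALTER TASK refresh_reporting_daily RESUME;
```

The schedule can be given either as an interval in minutes or as a cron expression with a time zone.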

Use Cases

There are various use cases for tasks:

  • ETL - e.g. periodically transferring and transforming ingested data;
  • Retention - e.g. periodically deleting data which is no longer necessary.
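To illustrate the retention use case, a task along the following lines could delete old rows every night. This is only a sketch: the warehouse compute_wh, the table raw_events, its event_time column and the 90-day window are all assumptions made for the example.

```sql
-- Hypothetical retention job: remove events older than 90 days.
CREATE OR REPLACE TASK purge_old_events
  WAREHOUSE = compute_wh
  SCHEDULE = 'USING CRON 0 2 * * * UTC'  -- every day at 02:00 UTC
AS
  DELETE FROM raw_events
  WHERE event_time < DATEADD(day, -90, CURRENT_TIMESTAMP());

-- Start the schedule (tasks are suspended when first created).
ALTER TASK purge_old_events RESUME;
```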

Job Dependencies and The DAG

Snowflake tasks can be chained together and made dependent on one another.

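For instance, task B can be set to run only after task A completes, and task C only after task B. A set of tasks linked in this way forms a directed acyclic graph (DAG), in which a single root task carries the schedule and the remaining tasks run when their predecessors finish.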
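A chain of three tasks, A then B then C, might be sketched with the AFTER clause as shown here. The task names, the warehouse compute_wh and the three stored procedures are hypothetical placeholders.

```sql
-- Root task A: the only task in the chain that has a schedule.
CREATE OR REPLACE TASK task_a
  WAREHOUSE = compute_wh
  SCHEDULE = 'USING CRON 0 * * * * UTC'  -- at the top of every hour
AS
  CALL load_staging();      -- hypothetical procedure

-- Task B runs each time task A completes successfully.
CREATE OR REPLACE TASK task_b
  WAREHOUSE = compute_wh
  AFTER task_a
AS
  CALL clean_staging();     -- hypothetical procedure

-- Task C runs each time task B completes successfully.
CREATE OR REPLACE TASK task_c
  WAREHOUSE = compute_wh
  AFTER task_b
AS
  CALL build_summary();     -- hypothetical procedure

-- Resume the child tasks first, then the root task.
ALTER TASK task_c RESUME;
ALTER TASK task_b RESUME;
ALTER TASK task_a RESUME;
```

Only the root task is scheduled; the children are triggered by the completion of the task named in their AFTER clause.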

Integration With Streams

Snowflake has a feature called Streams which can be used to track when data is inserted, updated or deleted within tables.

Combining streams with tasks is powerful, because it lets us periodically act only on data which has changed.
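As a rough sketch of the combination, the task below wakes up every five minutes but only does work when the stream has recorded changes. The tables orders and orders_audit, their columns and the warehouse compute_wh are assumptions for the example.

```sql
-- Track inserts, updates and deletes on a hypothetical orders table.
CREATE OR REPLACE STREAM orders_stream ON TABLE orders;

-- The WHEN condition skips the run entirely if the stream is empty.
CREATE OR REPLACE TASK record_order_changes
  WAREHOUSE = compute_wh
  SCHEDULE = '5 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('ORDERS_STREAM')
AS
  -- Reading the stream inside a DML statement consumes the recorded changes.
  INSERT INTO orders_audit (order_id, change_type, recorded_at)
  SELECT order_id, METADATA$ACTION, CURRENT_TIMESTAMP()
  FROM orders_stream;

ALTER TASK record_order_changes RESUME;
```

METADATA$ACTION is a column that every stream exposes, holding INSERT or DELETE for each changed row.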

Streams are discussed in more detail in the next lesson.

External Tools

The advantage of Snowflake tasks is that they can be configured directly within Snowflake using SQL, and require no external tooling to learn or implement.

There are, however, more powerful and fully featured tools for job scheduling and dependency management which are more commonly used in practice.

DBT, for instance, is widely used to define transformation runs as a series of dependent steps. Airflow is also very commonly deployed as part of Modern Data Stacks; it is more powerful, but involves describing jobs in Python.

Summary

In this lesson we looked at Snowflake's scheduled task feature.

We described use cases including ETL and managing data retention processes.

We also looked into job dependencies and the DAG concept, and explained how task management often happens outside of Snowflake, using tools such as DBT or Airflow, when building Modern Data Stack platforms.

Next Lesson

In the next lesson we will look at Snowflake Streams.

© 2022 Timeflow Academy.