In this lesson we will:

  • Learn about Snowflake streams, explaining the concept and discussing how they can be used in practice.

What Are Streams

Imagine if we have a table of data in our Snowflake database:


Over time, there are updates to the data, such as a price change a new row entered.


There are many situations where it is useful to respond to or report on the fact that the data was changed. Snowflake streams allow us to capture a log of these DML statements, showing the old value, the new value and metadata about the change.

Use Cases For Streams

So now we have demonstrated how the changelog is maintained on our table, how could we potentially use this data? A few common use cases we see are:

  • Reporting - We can report directly on the stream object, using the old value, the new value and metaadata such as the change time. This means that the underlying table can always contain the latest value, whereas the stream could contain the history.
  • Tiggering Downstream Actions - Tasks can be run to periodically process the new inserts and updates, for instance sending a message to a downstream system.
  • Audit - Auditing old and new data values for compliance purposes

Creating A Streams

Let's begin by creating a new table and inserting some data:

create table ingredients( name varchar, cost integer, unit varchar )
insert into ingredients values ( 'Cheese', 1, 'lb' )
insert into ingredients values ( 'Tomato', 3, 'lb' )
insert into ingredients values ( 'Onions', 9, 'lb' )

We can then create the stream:

create stream ingredients_stream on ingredients;

Initially there are no events in the stream. It only begins capturing the DML updates post creation:

select * from ingredients_stream;

However, if we insert one more row into the underlying table:

insert into ingredients values ( 'Anchovies', 9, 'lb' )

Then select the stream again:

select * from ingredients_stream;

We can see that the stream has populated with the DML event.

insert into ingredients values ( 'Anchovies', 9, 'lb' )

As you will see, the event shows that an insert took place, what the old values were and what the new values were. We could report directly on this stream, or identify that it has happened and use it to trigger business logic.

Though it appears this way, note that the stream doesn't actually contain or duplicate the data or use significant additional storage. This is because it uses an offset or pointer back to the original record.

Updates And Deletes

If we now perform an update and a delete against our source table:

update ingredients set unit = 'kg' where name = 'Tomato';
delete from ingredients where name = 'Anchovies';

Then reselect from the stream:

select * from ingredients_stream;

The new DML statements will again be reflected in the table:

insert into ingredients values ( 'Anchovies', 9, 'lb' )
Next Lesson:

Referential Integrity In Snowflake

In this lesson we will learn about the referential integrity features in Snowflake and why Snowflake lacks features in this area.

0h 15m

Continuous Delivery For Data Engineers

This site has been developed by the team behind Timeflow, an Open Source CI/CD platform designed for Data Engineers who use dbt as part of the Modern Data Stack. Our platform helps Data Engineers improve the quality, reliability and speed of their data transformation pipelines.

Join our mailing list for our latest insights on Data Engineering:

Timeflow Academy is the leading online, hands-on platform for learning about Data Engineering using the Modern Data Stack. Bought to you by Timeflow CI

© 2023 Timeflow Academy. All rights reserved