Snowflake For Data Engineers

Snowflake Billing

Lesson #6

In this lesson we will:

  • Learn about the Snowflake billing model and credit system;
  • Explain the various cost visibility and cost control features of Snowflake;
  • Share some high-level information about how to minimise and control costs.

Consumption Based Pricing

Snowflake offers a genuinely consumption-based pricing model: you pay only for the services that you use.

There are three classes of resource which you will use and pay for:

Compute

When you execute queries or updates, you need processing power to carry out the requests. You will pay for this compute on a per-second basis, with the actual rate per second depending on the size of the virtual warehouse you use - e.g. an X-Large warehouse running for 180 seconds.
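As a rough worked example of per-second billing, the arithmetic for that X-Large warehouse can be sketched in SQL. The 16 credits per hour and the $3.90 per-credit price are assumptions taken from the tables later in this lesson; your own rates may differ.

```sql
-- Worked example: an X-Large warehouse running for 180 seconds.
-- Assumed inputs: 16 credits per hour (X-Large), $3.90 per credit (Enterprise).
select
    16 * 180 / 3600.0          as credits_used,  -- hourly rate scaled to 180 seconds
    16 * 180 / 3600.0 * 3.90   as cost_usd;      -- credits multiplied by per-credit price
```

Under these assumptions the run uses 0.8 credits, or about $3.12. Note that Snowflake bills a minimum of 60 seconds each time a warehouse starts or resumes, so very short runs cost more than this simple scaling suggests.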

In practice, the compute is usually the largest cost in running a Snowflake deployment.

Storage

The data in your Snowflake instance needs to be saved on a persistent store which exists even when you are not running any compute resources. You will pay for this storage based on the volume of data you hold - e.g. 108 GB of data.

Cloud Services

In order to work with your account, you will need to carry out various activities such as logging in, administering users and running scheduled jobs. These are billed by Snowflake as a separate line item.

Credit System

As discussed in a previous lesson, Snowflake is billed using a credit system. As you use Snowflake you will be using credits for your compute, storage and cloud service usage. These will then be billed at a per credit cost at month end.

The per credit cost is a function of the tier you are on, the cloud provider and the cloud provider region.

For instance, at the time of writing, a user with their Snowflake instance deployed in the AWS us-east region would be paying the following per credit cost, depending on their tier:

Tier               Credit Cost
Standard           $2.60
Enterprise         $3.90
Business Critical  $5.20

The same deployment in the Google Cloud Platform London region would cost:

Tier               Credit Cost
Standard           $2.70
Enterprise         $4.00
Business Critical  $5.40

This makes GCP slightly more expensive per Snowflake credit if you need to host in that environment.

Virtual Warehouse Costs

Your compute costs will likely be the largest part of your Snowflake bill. This is because storage is cheap, and because Snowflake pass along their storage costs without making a significant profit.

Snowflake offers a number of virtual warehouse sizes, structured on a T-shirt size basis, each of which has an associated credits-per-hour cost.

Warehouse Size  Credits Per Hour
X-Small         1
Small           2
Medium          4
Large           8
X-Large         16
2X-Large        32
3X-Large        64
4X-Large        128
5X-Large        256
6X-Large        512

Combining the AWS credit costs and the warehouse sizes above, we can work out the cost per hour of running a 2X-Large warehouse on AWS:

Tier               Credit Cost  Warehouse Size  Warehouse Credits Per Hour  Warehouse Cost Per Hour
Standard           $2.60        2X-Large        32                          $83.20
Enterprise         $3.90        2X-Large        32                          $124.80
Business Critical  $5.20        2X-Large        32                          $166.40

And the same for GCP:

Tier               Credit Cost  Warehouse Size  Warehouse Credits Per Hour  Warehouse Cost Per Hour
Standard           $2.70        2X-Large        32                          $86.40
Enterprise         $4.00        2X-Large        32                          $128.00
Business Critical  $5.40        2X-Large        32                          $172.80
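The hourly costs above are simply the per-credit price multiplied by the warehouse's credits per hour. A minimal sketch of that calculation in SQL follows; the CTE names credit_price and warehouse_size are illustrative, and the prices are the "at the time of writing" figures from this lesson rather than current list prices.

```sql
-- Per-credit prices copied from the tables in this lesson (illustrative only).
with credit_price (cloud, tier, price_per_credit) as (
    select * from values
        ('AWS', 'Standard',          2.60),
        ('AWS', 'Enterprise',        3.90),
        ('AWS', 'Business Critical', 5.20),
        ('GCP', 'Standard',          2.70),
        ('GCP', 'Enterprise',        4.00),
        ('GCP', 'Business Critical', 5.40)
),
-- Warehouse size and its credits per hour, from the T-shirt size table.
warehouse_size (size_name, credits_per_hour) as (
    select * from values
        ('2X-Large', 32)
)
select
    p.cloud,
    p.tier,
    w.size_name,
    w.credits_per_hour,
    p.price_per_credit * w.credits_per_hour as cost_per_hour_usd
from credit_price p
cross join warehouse_size w
order by p.cloud, p.price_per_credit;
```

Adding more rows to warehouse_size would extend the comparison to other warehouse sizes.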

This shows that your actual costs are a function of the choices you make and your actual consumption.

This can be challenging for businesses, as their final monthly bill can be unpredictable, depending on how much their users actually use Snowflake. This is a common risk of consumption-based pricing models.

Bespoke Arrangements

For larger deployments, it is possible to achieve discounts through direct negotiation with Snowflake. This will likely require pre-payment or agreements for compute and storage consumption.

Snowflake Tables For Billing

Snowflake makes a lot of information about accrued charges available through views in the ACCOUNT_USAGE schema of the shared SNOWFLAKE database. We will now take a quick tour of the main ones:

WAREHOUSE_METERING_HISTORY

select * from SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY;

The WAREHOUSE_METERING_HISTORY view shows, for each warehouse and each hour, how many credits were used for compute and cloud services in that period.

Note that this view can take up to 3 hours to be updated.
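In practice we usually aggregate this view rather than select every row. The query below is a sketch that totals credits per warehouse per day over the last 7 days; it assumes your role has been granted access to the SNOWFLAKE database, and the 7-day window is an arbitrary choice.

```sql
-- Daily credit usage per warehouse for the last 7 days.
select
    warehouse_name,
    to_date(start_time)               as usage_date,
    sum(credits_used_compute)         as compute_credits,
    sum(credits_used_cloud_services)  as cloud_services_credits,
    sum(credits_used)                 as total_credits
from snowflake.account_usage.warehouse_metering_history
where start_time >= dateadd(day, -7, current_timestamp())
group by warehouse_name, to_date(start_time)
order by usage_date, total_credits desc;
```

Multiplying total_credits by your per-credit price gives an approximate daily spend per warehouse.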

QUERY_HISTORY

select * from SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY;

The QUERY_HISTORY view can be joined with the WAREHOUSE_METERING_HISTORY view to understand which queries consume the most compute and therefore generate the highest costs.
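One possible way to do this join is sketched below. Snowflake bills for warehouse running time rather than per query, so this is only an approximation: each hour's compute credits are shared out between the queries that started in that hour, in proportion to their execution time. The attribution rule and the 7-day window are assumptions, not an official Snowflake method.

```sql
-- Approximate credits per query: split each warehouse-hour's compute credits
-- across the queries that started in that hour, weighted by execution time.
select
    q.query_id,
    q.user_name,
    q.warehouse_name,
    left(q.query_text, 80) as query_preview,
    m.credits_used_compute
        * q.execution_time
        / nullif(sum(q.execution_time) over (
              partition by q.warehouse_name, m.start_time), 0) as approx_credits
from snowflake.account_usage.query_history q
join snowflake.account_usage.warehouse_metering_history m
  on  m.warehouse_name = q.warehouse_name
  and m.start_time     = date_trunc('hour', q.start_time)
where q.start_time >= dateadd(day, -7, current_timestamp())
order by approx_credits desc nulls last
limit 20;
```

Queries that run across an hour boundary are attributed entirely to the hour in which they started, and idle warehouse time is spread across whichever queries ran, so treat the results as a ranking aid rather than an exact cost.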

DATABASE_STORAGE_USAGE_HISTORY

select * from SNOWFLAKE.ACCOUNT_USAGE.DATABASE_STORAGE_USAGE_HISTORY;

The DATABASE_STORAGE_USAGE_HISTORY view shows the average daily storage used by each database, which is what drives our storage costs.
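As a sketch, the query below converts the byte counts in this view into terabytes per database per day for the last 30 days. The 30-day window is an arbitrary choice, and no storage price is applied because the rate depends on your contract and region.

```sql
-- Average daily storage per database, in TB, for the last 30 days.
select
    database_name,
    usage_date,
    average_database_bytes / power(1024, 4) as database_tb,
    average_failsafe_bytes / power(1024, 4) as failsafe_tb
from snowflake.account_usage.database_storage_usage_history
where usage_date >= dateadd(day, -30, current_date())
order by usage_date, database_tb desc;
```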

Controlling Costs

The main lever we have for controlling costs in Snowflake is to right-size the Virtual Warehouses.
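As a minimal sketch of right-sizing, a warehouse can be resized with ALTER WAREHOUSE. The warehouse name reporting_wh and the chosen size are hypothetical; the auto-suspend settings are an additional, commonly used cost control rather than something covered in this lesson.

```sql
-- Hypothetical warehouse name; pick the smallest size that meets your workload.
alter warehouse reporting_wh set
    warehouse_size = 'SMALL'   -- 2 credits per hour instead of a larger size
    auto_suspend   = 60        -- suspend after 60 seconds of inactivity
    auto_resume    = true;     -- resume automatically when a query arrives
```

Because each size up doubles the credits per hour, stepping down even one size halves the hourly rate, provided queries still complete in an acceptable time.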

Resource Monitors are a Snowflake feature which can be used to assign credit budgets, and then receive alerts or suspend warehouses if those budgets are breached.
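A resource monitor might be set up along the following lines. The monitor name monthly_budget, the warehouse name reporting_wh, the 100-credit quota and the thresholds are all illustrative assumptions; creating resource monitors normally requires the ACCOUNTADMIN role.

```sql
-- Illustrative monthly budget of 100 credits with escalating actions.
create or replace resource monitor monthly_budget
    with credit_quota = 100
    frequency = monthly
    start_timestamp = immediately
    triggers
        on 75  percent do notify             -- send an alert
        on 100 percent do suspend            -- suspend once running queries finish
        on 110 percent do suspend_immediate; -- suspend and cancel running queries

-- Attach the monitor to a (hypothetical) warehouse.
alter warehouse reporting_wh set resource_monitor = monthly_budget;
```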



