In this lesson we will:

  • Explain what a table engine is;
  • Introduce the four categories of table engines.

What Are Table Engines?

When we create a table in ClickHouse, we must choose an engine, which is responsible for storing and querying the table's data behind the scenes.

This decision is fairly unique to ClickHouse, as most databases either do not have this concept or do not expose it directly to end users.

Different table engines are suitable for different use cases, data patterns and access patterns. It is therefore important to know broadly how they work and how to choose the appropriate one.

Table Engine Families

There are at least 25 table engines in ClickHouse, but they can be broadly grouped into four categories:

  • MergeTree Family - This is the most commonly used family of table engines. These engines accept data inserts and then merge and optimise the data in the background. Example engines in this family include ReplacingMergeTree and SummingMergeTree;
  • Log Family - These are lightweight, append-only engines suited to writing many small tables quickly and reading them back later. Some engines in this family, such as Log and StripeLog, support parallel reads of the data;
  • Integration - These engines provide an interface to external systems or databases rather than storing the data within ClickHouse. Example engines in this family include JDBC, S3 and Kafka;
  • Special - These are a set of miscellaneous engines for specific requirements. Example engines in this family include Memory, which keeps its data in RAM, and MaterializedView.
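
To see which engines a particular server actually offers, one option is to query the system.table_engines system table. This is a minimal sketch, assuming a running ClickHouse server and a client such as clickhouse-client. The list returned will vary between versions.

```sql
-- List every table engine available on this server.
SELECT name
FROM system.table_engines
ORDER BY name;

-- Narrow the list to the MergeTree family by name.
SELECT name
FROM system.table_engines
WHERE name LIKE '%MergeTree'
ORDER BY name;
```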

Specifying Your Table Engine

The table engine is specified when the table is created, using the ENGINE clause of the CREATE TABLE statement.
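
As an illustration, the statements below create two tables with the same columns but different engines. This is a sketch only: the table names (page_views, page_views_scratch) and the columns are hypothetical and chosen purely for the example.

```sql
-- Hypothetical table using an engine from the MergeTree family.
-- MergeTree tables need a sorting key, given here with ORDER BY.
CREATE TABLE page_views
(
    event_time DateTime,
    user_id    UInt64,
    url        String
)
ENGINE = MergeTree()
ORDER BY (user_id, event_time);

-- The same hypothetical columns in a Memory table (a Special engine).
-- The data lives in RAM and is lost when the server restarts.
CREATE TABLE page_views_scratch
(
    event_time DateTime,
    user_id    UInt64,
    url        String
)
ENGINE = Memory;

-- Check which engine each table in the current database uses.
SELECT name, engine
FROM system.tables
WHERE database = currentDatabase();
```

The only difference between the two definitions is the ENGINE clause, plus the ORDER BY that MergeTree requires.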




© 2023 Timeflow Academy. All rights reserved