Lesson Overview

In this lesson we will introduce the concept of The Modern Data Stack, or what is sometimes referred to as the Modern Data Platform.

Timeflow Academy is based around the tools and architectures of the Modern Data Stack, so this lesson provides a high-level overview of the approach before we look at specific tools.

What Is The Modern Data Stack?

In order to completely meet the Data and Analytics requirements of a modern business, it is likely that you will need to combine multiple tools into an end-to-end solution. This includes tooling for ETL, data storage, querying, dashboarding, data science model building and other requirements.

As there is no one tool that does everything, it is likely that you will need to combine and integrate different technologies from different vendors. The collection of tools that you choose is sometimes informally referred to as a "Stack".

Modern Data Stack is a phrase that came into wide use around 2020, as a set of new data tools and architectural practices emerged rapidly, just as businesses saw an uplift in their demand for data and analytics. This period of rapid evolution and innovation produced the tools, technologies and approaches that today form the Modern Data Stack.

Characteristics Of Modern Data Stack Tools

Though there is no hard and fast definition of what exactly represents a Modern Data Stack tool, they would typically have the following characteristics:

  • Cloud Native - Modern Data Stack tooling typically runs in the Cloud as a hosted, SaaS or IaaS solution. This avoids the need for businesses to install and manage infrastructure and reduces requirements for operational management such as upgrades and backups;

  • Rapid Innovation - Because Modern Data Stack tools can usually be consumed as a service, they are fast to deploy and require minimal configuration. This allows data teams to deliver value quickly and keep their effort focused on high-value initiatives specific to their business;

  • Scalable - Modern Data Stack tooling scales to support large volumes of data and large numbers of users. This is important as businesses continue to capture more and more data and have more complex use cases for it;

  • Open - Modern Data Stack tools are often Open Source or have an Open Source core with commercial addons. This is important to businesses that are looking to avoid lock-in to vendor technology as they modernise their tooling;

  • Composable - Where data vendors previously focused on delivering an entire stack, the Modern Data Stack accepts that customers will want to combine best-of-breed tools into their overall solution. We therefore often see friendly collaboration between vendors of Modern Data Stack tooling;

  • Consumption-Based Pricing - Most of the tools in the Modern Data Stack are priced on consumption, for instance on data volumes processed or number of queries served. This means there are no large up-front costs, and cost should scale with usage and value delivered;

  • Accessible - Tools in the Modern Data Stack aim to be simple and easy to use, with a strong focus on No-Code and Low-Code solutions. This means that all data professionals (Engineers, Scientists, Analysts etc.) can contribute, rather than being restricted to their traditional siloed roles.

Of course, some tools have more of these characteristics than others, and some of the characteristics are quite subjective, so there is some debate as to whether a given tool qualifies as part of the Modern Data Stack.
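To make the consumption-based pricing characteristic concrete, here is a minimal sketch of how a pay-per-use bill could be calculated. The rates, the `UsageRates` class and the `monthly_cost` function are hypothetical and chosen only for illustration; real vendors publish their own pricing models, which are usually more detailed.

```python
from dataclasses import dataclass


@dataclass
class UsageRates:
    """Hypothetical per-unit prices, not taken from any real vendor."""
    per_tb_processed: float  # price per terabyte of data processed
    per_1k_queries: float    # price per thousand queries served


def monthly_cost(rates: UsageRates, tb_processed: float, queries: int) -> float:
    """Return one month's bill under a pure pay-per-use model."""
    data_cost = rates.per_tb_processed * tb_processed
    query_cost = rates.per_1k_queries * (queries / 1000)
    return data_cost + query_cost


# Illustrative rates only.
rates = UsageRates(per_tb_processed=5.0, per_1k_queries=0.5)

# No usage means no charge: this model has no up-front fee.
idle = monthly_cost(rates, tb_processed=0, queries=0)

# Doubling usage doubles the bill, so cost tracks consumption.
small = monthly_cost(rates, tb_processed=10, queries=20_000)
large = monthly_cost(rates, tb_processed=20, queries=40_000)

print(idle, small, large)
```

In this simplified model an idle month costs nothing and the bill grows in proportion to usage, which is the property the pricing characteristic above describes.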

Example Tools

As discussed, a Modern Data Platform is made up of multiple components, which are combined to deliver the end-to-end capabilities that the business needs. Within each category of component, various tools are commonly acknowledged to be part of the Modern Data Stack. These include:

Category                    Tools
Ingestion                   Fivetran, Stitch, Airbyte
Data Warehouse              Snowflake, Azure Synapse Analytics, AWS Redshift, GCP BigQuery
Data Lake                   Databricks, AWS, Azure, GCP
Parallel Data Processing    Spark, Databricks
Data Streaming              Kafka, Flink
Transformations             dbt
Business Intelligence       Tableau, Power BI, Looker

This is not an exhaustive list, but tools such as the above are definitely front and centre in the conversation about the Modern Data Stack, and display many of the characteristics in our list above.
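One way to picture a composed stack is as a simple mapping from category to chosen tool. The sketch below is illustrative only: the `uncovered` helper, the example selection and the `required` list are hypothetical, and the category names are taken from the list above.

```python
# Categories taken from the lesson's list of example tools.
CATEGORIES = (
    "Ingestion",
    "Data Warehouse",
    "Data Lake",
    "Parallel Data Processing",
    "Data Streaming",
    "Transformations",
    "Business Intelligence",
)


def uncovered(stack: dict[str, str], required: tuple[str, ...] = CATEGORIES) -> list[str]:
    """Return the required categories that have no tool chosen yet."""
    return [category for category in required if category not in stack]


# A hypothetical selection: one tool per category, mixed across vendors.
stack = {
    "Ingestion": "Airbyte",
    "Data Warehouse": "Snowflake",
    "Business Intelligence": "Looker",
}

# Suppose this business only needs batch analytics for now (an assumption).
required = ("Ingestion", "Data Warehouse", "Transformations", "Business Intelligence")
gaps = uncovered(stack, required)

print(gaps)
```

Here `gaps` lists the categories that still need a tool, which mirrors the idea that a stack is assembled piece by piece and not bought whole from one vendor.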

Benefits Of The Modern Data Stack

Ultimately, we believe that the Modern Data Stack is the way to go when building a new data platform. The tools considered to be part of it are industry leading, and many of its architectural patterns and features, such as low maintenance and low-code, are desirable.

The Modern Data Stack is a better platform and approach for data innovation. It allows you to rapidly get data and analytics products to market and to deliver value to businesses quickly.

Modern Data Stack solutions should be delivered with a lower total cost of ownership. Data platforms built with Modern Data Stack principles should have a lower cost profile when all infrastructure, staff and license costs are taken into account. The pricing model also means you can avoid large up-front capital costs, and try new ideas and initiatives without a high financial cost of failure.

Ultimately, this is about supporting and enabling a Data-Driven Business, with the ability to collect and process data from right across the business in order to make better strategic and operational decisions.

Summary

In this lesson we introduced the concept of the Modern Data Stack.

We discussed the key characteristics of Modern Data Stack tools, and looked at specific vendors which are typically considered to sit within this category.

We considered the benefits of going down this route in terms of increased innovation, building a data-driven business, and doing so with less cost and up-front capital investment than traditional approaches to data.

In the next lesson, we will look at the architecture of Modern Data Stack deployments in more detail.


© 2022 Timeflow Academy.