A couple of weeks ago, dbt Labs made a big splash at their annual conference by announcing the new dbt Semantic Layer. This was a big deal, spawning excited tweets, in-depth thinkpieces, and celebration from partners like us.
The term “semantic layer” (also known as a “metrics layer”) has been around for decades. dbt didn’t invent the concept, nor the phrase, though their version is definitely worth paying attention to.
“But Austin, what is a semantic layer then?” So glad you asked.
In this article, I’ll break down what a semantic layer is in simple terms and why you should care about dbt’s Semantic Layer.
What’s a semantic layer, and where did it come from?
Semantic layer is a very literal term: it’s the “layer” in a data architecture that uses “semantics” (words) that the business user will actually understand. Sometimes it’s called the “business layer” or the “metrics layer”.
Instead of raw tables with column names like “A000_CUST_ID_PROD”, data teams build a semantic layer and rename that column “Customer”. Semantic layers help to hide complex code from business users. This code can get pretty complex as data teams try to capture the business logic for key metrics, dimensions, and schemas.
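To make the renaming idea concrete, here is a minimal sketch in plain Python. It is not how any particular BI tool or dbt implements a semantic layer; the mapping, the `to_business_view` helper, and the second column name are hypothetical and exist only for illustration.

```python
# Hypothetical mapping from raw warehouse column names to business-friendly labels.
SEMANTIC_NAMES = {
    "A000_CUST_ID_PROD": "Customer",
    "A000_ORD_AMT_USD": "Order Amount",  # assumed column, for illustration only
}


def to_business_view(rows):
    """Return copies of the rows with raw column names replaced by business labels.

    Columns without an entry in SEMANTIC_NAMES keep their raw name.
    """
    return [
        {SEMANTIC_NAMES.get(column, column): value for column, value in row.items()}
        for row in rows
    ]


raw_rows = [{"A000_CUST_ID_PROD": "C-1001", "A000_ORD_AMT_USD": 250.0}]
business_rows = to_business_view(raw_rows)
print(business_rows)
```

The business user only ever sees “Customer” and “Order Amount”; the cryptic names stay behind the layer.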
So where did this idea come from? Back in the day (I’m talking about the ’90s and early 2000s), we had pretty basic data tech. It was very slow and very hard to use if you didn’t have a deep IT background.
Big companies like IBM, SAP, and Oracle built Business Intelligence (BI) tools like Cognos, Business Objects, and Oracle BI, which could take smaller chunks of data from a clunky data warehouse and let IT people build these semantic layers for business users. Essentially, they were more human-readable data layers for business users.
The problem with early semantic layers
This business-friendly layer sounds like a “nice to have” improvement, but it was really a necessity because trying to run even a basic report across an entire data warehouse could take hours or even days. (Yes, days.)
Enter the first problem: old-school semantic layers took wayyyyy too long to build, since people depended on IT to set up and modify them. To make matters worse, they were cumbersome to maintain since business needs were always changing.
The business users’ solution… export to Excel!
Enter fancy new BI tools like Tableau, Qlik, and Power BI. The theory was that if we empowered business users to “self-serve” with low-code or no-code BI tools, the IT bottleneck would go away and analytics would officially be democratized! At least, that was the idea.
Enter the second problem: we abandoned the semantic layer concept for years, in favor of agility.
Unlike old IT tools, more personas could buy and use these new BI tools. Instead of 1 BI tool using 1 semantic layer, built by 1 team from 1 data warehouse, we had multiple BI tools being used by all kinds of teams, with no real semantic layer.
Just picture this scenario, which probably seems all too real to most data people. I bring my Tableau dashboard to a meeting, someone else brings their Excel workbook, and someone else brings a Power BI dashboard. We all then show different numbers for “total revenue last quarter”. Uh oh!
After years of alternately ignoring and chasing the self-service BI dream, this topic blew up in the data world again. (We even flagged this as one of the six big ideas from 2022 in our Future of the Modern Data Stack report.)
This started in January, when Base Case proposed “Headless Business Intelligence”, a new approach to solving problems with business metrics and terms. A couple of months later, Benn Stancil talked about the “missing metrics layer” in today’s data stack.
That’s when things really took off. Airbnb announced that it had been building a home-grown metrics platform called Minerva to solve this issue. Other prominent tech companies soon followed suit, including LinkedIn, Uber, and Spotify. Then dbt opened a PR hinting at a metrics or semantics layer, which included links to those foundational blogs by Benn and Base Case.
This was such a hot topic that one of our Great Data Debates was all about the metrics layer, with a fiery discussion between Drew Banin from dbt Labs and Nick Handel from Transform.
The result has been a big open question in the data and analytics world: how do we bring back all the great things that IT loved about semantic layers (consistency, clear governance, and trusted, reliable data) without compromising the agility that analysts and business users demand?
Now, less than two years after this debate kicked off, it seems that the future of the semantic layer has finally become a reality.
The dbt Semantic Layer
Enter dbt Labs and its new Semantic Layer!
The dbt Semantic Layer is the interface between your data and your analyses: A platform for compiling and accessing dbt assets in downstream tools.
Data practitioners can define metrics in their dbt projects, then data consumers can query consistently defined metrics in downstream tools.
Cameron Afzal, Product Manager for the dbt Semantic Layer
The core concept behind dbt’s Semantic Layer is: define things once, use them anywhere.
Why does that make people happy? This brings the concept of a semantic layer and its universal metrics into dbt’s transformation layer. As dbt Labs put it, “Data practitioners can define metrics in their dbt projects, then data consumers can query consistently defined metrics in downstream tools.”
Data teams can build these models and metrics in dbt, and then, with the Semantic Layer, tie them into their other developer tools like version control and release management.
Regardless of what BI tool they use, analysts and business users can then grab data and go into that meeting, confident that their answer will be the same because they pulled the metric from a centralized place.
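The “define once, use anywhere” idea can be sketched in a few lines of plain Python. This is only an illustration of the principle, not dbt’s metric syntax or the Semantic Layer’s API; the order records, the `total_revenue` definition, and the two consumer functions are all hypothetical.

```python
from datetime import date

# Hypothetical order records; the values are made up for illustration.
ORDERS = [
    {"order_date": date(2022, 7, 15), "amount": 120.0, "status": "completed"},
    {"order_date": date(2022, 8, 3), "amount": 80.0, "status": "refunded"},
    {"order_date": date(2022, 9, 21), "amount": 200.0, "status": "completed"},
    {"order_date": date(2022, 10, 2), "amount": 50.0, "status": "completed"},
]


def total_revenue(orders, start, end):
    """The one shared definition: completed orders dated within [start, end]."""
    return sum(
        order["amount"]
        for order in orders
        if order["status"] == "completed" and start <= order["order_date"] <= end
    )


# Two hypothetical consumers. Neither re-implements the business logic;
# both call the shared definition, so they cannot drift apart.
def dashboard_tile(orders):
    return total_revenue(orders, date(2022, 7, 1), date(2022, 9, 30))


def spreadsheet_export(orders):
    return total_revenue(orders, date(2022, 7, 1), date(2022, 9, 30))


print(dashboard_tile(ORDERS), spreadsheet_export(ORDERS))
```

Because the rule “revenue counts completed orders only” lives in exactly one place, changing it changes every downstream number at once, which is the property the meeting scenario above was missing.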
Learn more and get started with the dbt Semantic Layer here.
dbt + Atlan
The dbt Semantic Layer is great in its own right, but what makes it even more exciting is how it ties in with key tools across the modern data stack… and we’re one of them!
Alongside the dbt keynote, we announced our partnership with dbt Labs and our integration with the Semantic Layer. With this, joint customers will have access to an end-to-end governance framework for data models and metrics in the modern data stack.
The dbt Semantic Layer created a standard way to define metrics across your transformations and models. Now our integration brings these rich metrics into the rest of the data stack.
With this integration, dbt metrics and models are first-class assets in Atlan. This means that they’re searchable and discoverable through our platform and part of auto-generated, column-level lineage, just like any Snowflake table, Fivetran pipeline, or Looker dashboard.
Our native dbt Cloud integration ingests all dbt metrics and metadata about dbt models, merges it with metadata from all other tools in the data stack, creates column-level lineage from source to BI, and sends that unified context back into tools like Snowflake and the BI tools where people work daily.
With powerful impact and root cause analysis, modern data teams finally have the tools they need for end-to-end data governance and change management at every stage of the data lifecycle.
Learn more about our integration with the dbt Semantic Layer here.