Tomasz Tunguz: Semantic Cultivators : The Critical Future Role to Enable AI

Source URL: https://www.tomtunguz.com/semantic-layer/
Source: Tomasz Tunguz
Title: Semantic Cultivators : The Critical Future Role to Enable AI

Feedly Summary: By 2026, AI agents will consume 10x more enterprise data than humans, but with none of the contextual understanding that prevents catastrophic misinterpretations.

This is the main argument of a presentation I shared yesterday.
Historically, our data pipelines have served people. We’ve architected complex pipelines to ingest, filter, and transform information in different systems of record: cloud data warehouses, security information and event management systems (SIEMs), and observability platforms.
We then interpreted these outputs and acted upon them.
But very soon the end consumer won’t be people. So, we need to fundamentally reconsider the interface between these systems of record and their transformed data.
People thrive in ambiguity because we’re great at contextual interpretation. When a VP of Sales mentions revenue, a CFO understands the demarcation between bookings, billings, GAAP revenue, and contracted ARR. Humans navigate these nuances effortlessly; machines don’t.
What happens when your AI agent pulls “customer acquisition cost” data but doesn’t recognize that marketing measures it by campaign spend, sales calculates it based on AE + BDR costs, and finance includes fully-loaded employee costs?
The result: expensive nonsense masquerading as intelligence.
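To make the ambiguity concrete, here is a minimal sketch of how one label can produce three different numbers. Every figure and name below (`QuarterCosts`, `cac_by_team`, the dollar amounts) is invented for illustration, and the finance formula (campaign spend plus fully-loaded employee costs) is one assumed reading of "includes fully-loaded employee costs", not a definition taken from any real company.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuarterCosts:
    """Hypothetical quarterly figures, invented purely for illustration."""
    campaign_spend: float               # marketing program spend
    ae_bdr_costs: float                 # AE + BDR compensation
    fully_loaded_employee_costs: float  # salaries, benefits, overhead
    new_customers: int


def cac_by_team(q: QuarterCosts) -> dict:
    """Compute 'customer acquisition cost' under three assumed team definitions."""
    return {
        # Marketing: campaign spend only.
        "marketing": q.campaign_spend / q.new_customers,
        # Sales: AE + BDR costs only.
        "sales": q.ae_bdr_costs / q.new_customers,
        # Finance (assumed): campaign spend plus fully-loaded employee costs.
        "finance": (q.campaign_spend + q.fully_loaded_employee_costs) / q.new_customers,
    }


quarter = QuarterCosts(
    campaign_spend=200_000,
    ae_bdr_costs=300_000,
    fully_loaded_employee_costs=800_000,
    new_customers=100,
)

for team, value in cac_by_team(quarter).items():
    print(f"{team:>9}: ${value:,.0f} per customer")
```

With these made-up inputs the same metric name yields $2,000, $3,000, or $10,000 per customer. An agent that picks one of these without knowing which definition it is using has no way to notice the fivefold spread.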
To combat this misinformation, the teams formerly responsible for maintaining and monitoring pipelines will become cultivators of a constantly evolving collection of cross-domain semantic layers that answer the questions AI agents ask via MCP or another protocol layer.
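One way to picture such a layer is as a registry of per-domain metric definitions that refuses to answer when a request is ambiguous. The sketch below is a hypothetical in-memory illustration only: `MetricDefinition`, `SemanticLayer`, and `AmbiguousMetricError` are invented names, and no MCP or other protocol code is shown, since how the layer is exposed to agents is left open here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class MetricDefinition:
    """Hypothetical record describing what a metric means in one domain."""
    name: str
    domain: str
    description: str
    owner: str


class AmbiguousMetricError(LookupError):
    """Raised when a metric has several definitions and no domain was given."""


class SemanticLayer:
    """Toy registry mapping metric name -> domain -> definition."""

    def __init__(self) -> None:
        self._definitions: Dict[str, Dict[str, MetricDefinition]] = {}

    def register(self, definition: MetricDefinition) -> None:
        self._definitions.setdefault(definition.name, {})[definition.domain] = definition

    def resolve(self, name: str, domain: Optional[str] = None) -> MetricDefinition:
        by_domain = self._definitions.get(name)
        if not by_domain:
            raise KeyError(f"unknown metric: {name!r}")
        if domain is None:
            if len(by_domain) == 1:
                # Only one meaning exists, so there is nothing to disambiguate.
                return next(iter(by_domain.values()))
            # Refuse to guess: list the competing domains instead.
            raise AmbiguousMetricError(
                f"{name!r} is defined differently by: {', '.join(sorted(by_domain))}"
            )
        if domain not in by_domain:
            raise KeyError(f"{name!r} has no definition for domain {domain!r}")
        return by_domain[domain]


layer = SemanticLayer()
CAC = "customer acquisition cost"
layer.register(MetricDefinition(CAC, "marketing", "campaign spend per new customer", "marketing"))
layer.register(MetricDefinition(CAC, "sales", "AE + BDR costs per new customer", "sales"))
layer.register(MetricDefinition(CAC, "finance", "fully-loaded costs per new customer", "finance"))

print(layer.resolve(CAC, domain="finance").description)
```

The design choice worth noting is that `resolve` raises rather than picking a default. A human would ask "which CAC do you mean?"; the registry forces an agent to do the same.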
The major question in all this is how to deliver the semantic layer. Historically, it’s been difficult to sell a semantic layer as a standalone product. Looker was successful with its LookML language, and other companies have developed their own query languages, which to some extent have enforced a loose semantic layer.
The coming years will see a major shift as enterprises realize that their most valuable digital asset isn’t their data lake or their AI models—it’s the semantic layer that makes those investments meaningful.
Software is the business of selling promotions, and no one has been promoted for implementing a semantic layer. However, many people will be promoted for massively improving the accuracy of AI systems across data, security, and observability.
The semantic layer is the keystone of that project and, consequently, the most strategic part of any data pipeline today.

AI Summary and Description: Yes

**Summary:** The text outlines a future trend in enterprise data management where AI agents will significantly increase their consumption of data without the contextual understanding necessary to interpret it accurately. It argues for the importance of establishing a robust semantic layer to enable AI systems to function effectively, navigating ambiguities that human interpreters manage effortlessly. The piece emphasizes the shift in focus from raw data to the semantic relationships that contextualize that data, marking a strategic evolution in data pipelines.

**Detailed Description:**

The provided text presents a compelling narrative about the impending transformation in how enterprise data is managed, particularly in the context of artificial intelligence (AI) usage.

– **Key Points:**
  – By 2026, AI agents are expected to consume ten times more enterprise data than humans, while lacking the contextual understanding necessary for accurate interpretation.
  – Traditional data pipelines have historically catered to human interpretation, ingesting, filtering, and transforming data from multiple systems, including cloud data warehouses, SIEMs, and observability platforms.
  – The text warns of the potential misinterpretations that can arise when AI agents, which cannot recognize contextual nuances, process data.
    – Example: Confusion can occur when different departments have varying definitions and calculations for the same metrics, such as “customer acquisition cost.”
  – As a solution, the text advocates for the development of a semantic layer that provides clarity and meaning to the data consumed by AI agents, allowing for more informed and accurate outputs.
  – It notes that a semantic layer has historically been difficult to sell as a standalone product, but argues that its importance to effective AI operation will drive demand in the coming years.
  – It highlights the strategic significance of the semantic layer, proposing it, rather than the data lake or the AI models, as the most valuable digital asset an enterprise holds.

– **Implications for Professionals:**
  – **AI Preparation:** As enterprises lean towards AI-driven data analysis, security and compliance professionals must prepare for the potential risks associated with misinterpretation by AI systems.
  – **Focus on Governance:** Emphasizing the need to implement robust semantic layers could lead to improved data governance and oversight frameworks, ensuring that AI outputs remain reliable and secure.
  – **Interdepartmental Collaboration:** Encouraging collaboration between departments to establish a common understanding of metrics and terms can help reduce ambiguity and improve data quality.
  – **True Value of Data Assets:** Professionals should recognize that the semantic layer will become a core data asset and position themselves to manage and enhance this part of the data pipeline.

In summary, the strategic importance placed on semantic layers in data pipelines highlights a significant change in the landscape of data management, particularly for AI-related security and compliance, and emphasizes a need for professionals in these fields to adapt and innovate.