Source URL: https://www.tomtunguz.com/hidden-technical-debt-in-ai/
Source: Tomasz Tunguz
Title: Hidden Technical Debt in AI
Feedly Summary: That little black box in the middle is machine learning code.
I remember reading Google’s 2015 Hidden Technical Debt in ML paper & thinking how little of a machine learning application was actual machine learning.
The vast majority was infrastructure, data management, & operational complexity.
With the dawn of AI, it seemed large language models would subsume these boxes. The promise was simplicity: drop in an LLM & watch it handle everything from customer service to code generation. No more complex pipelines or brittle integrations.
But in building internal applications, we’ve observed a similar dynamic with AI.
Agents need lots of context, just as a human does: how is the CRM structured, what do we enter into each field? But all that input is expensive to feed the Hungry, Hungry AI model.
Reducing cost means writing deterministic software to replace the reasoning of AI.
For example, automating email management means writing tools to create Asana tasks & update the CRM.
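A minimal sketch of what "deterministic software replacing AI reasoning" might look like: cheap rules handle the common cases, and only unmatched emails fall through to the expensive model. The function names (`create_asana_task`, `update_crm`) and rules are illustrative stand-ins, not the author's actual implementation or any real API.

```python
import re

# Illustrative stand-ins for real integrations (not actual Asana/CRM APIs).
def create_asana_task(subject: str) -> str:
    return f"asana:{subject}"

def update_crm(sender: str, field: str, value: str) -> dict:
    return {"sender": sender, field: value}

def route_email(sender: str, subject: str) -> str:
    # Deterministic rules run for free; only emails that no rule matches
    # would be escalated to the (expensive) language model.
    if re.search(r"\binvoice\b", subject, re.I):
        return create_asana_task(subject)
    if sender.endswith("@customer.example.com"):
        update_crm(sender, "last_contact", subject)
        return "crm-updated"
    return "needs-llm"
```

Each rule added here is one less reasoning step billed to the model.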
As the number of tools grows beyond ten or fifteen, tool calling no longer works reliably. Time to spin up a classical machine learning model to select among them.
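One way that tool-selection step could work, sketched with a tiny nearest-centroid text classifier: score the user's request against example phrases for each tool and expose only the best match to the model. The tool names and training phrases are invented for illustration; a production version would use learned embeddings rather than bag-of-words.

```python
import math
from collections import Counter

# Invented examples: a few phrases per tool stand in for real training data.
TRAINING = {
    "create_asana_task": ["make a task", "add todo", "track this work"],
    "update_crm": ["log the contact", "update the account record"],
    "search_email": ["find the email", "search messages from sender"],
}

def _vec(text: str) -> Counter:
    return Counter(text.lower().split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# One centroid (summed bag-of-words) per tool.
CENTROIDS = {
    tool: sum((_vec(s) for s in examples), Counter())
    for tool, examples in TRAINING.items()
}

def select_tool(query: str) -> str:
    # Pick the tool whose centroid is most similar to the request.
    q = _vec(query)
    return max(CENTROIDS, key=lambda tool: _cosine(q, CENTROIDS[tool]))
```

The point of the classifier is to shrink the tool list the LLM sees, not to replace the LLM's call itself.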
Then there’s observability to watch the system, evaluation to check whether it’s performant, & routing to send each request to the right model. In addition, there’s a whole category of software around making sure the AI does what it’s supposed to.
Guardrails prevent inappropriate responses. Rate limiting stops costs from spiraling out of control when a system goes haywire.
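Rate limiting is the most mechanical of these safeguards. A minimal sketch, assuming a token-bucket policy in front of model calls (the capacity and refill numbers are illustrative, not from the source):

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: a haywire agent loop drains the bucket
    and gets refused, capping spend instead of letting it spiral."""

    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = capacity
        self.refill = refill_per_sec
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Wrapping every model call in `bucket.allow()` turns "the agent looped all night" into a refused request rather than a surprise bill.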
Information retrieval (RAG – retrieval augmented generation) is essential for any production system. In my email app, I use a LanceDB vector database to find all emails from a particular sender & match their tone.
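The retrieval step behind that can be sketched without any vector database: rank stored emails by cosine similarity to a query embedding, after filtering by sender. This is not the author's LanceDB pipeline, and the toy 3-dimensional vectors stand in for real learned embeddings; a production system would use an ANN index.

```python
import math

# Toy corpus: (sender, subject, embedding). Real embeddings would be
# hundreds of dimensions and come from an embedding model.
EMAILS = [
    ("alice@example.com", "Q3 pipeline review", [0.9, 0.1, 0.0]),
    ("bob@example.com", "Lunch on Friday?", [0.0, 0.2, 0.9]),
    ("alice@example.com", "Pipeline follow-up", [0.8, 0.3, 0.1]),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, sender=None, k=2):
    # Metadata filter first (e.g. "all emails from this sender"),
    # then rank the survivors by vector similarity.
    pool = [e for e in EMAILS if sender is None or e[0] == sender]
    return sorted(pool, key=lambda e: cosine(query_vec, e[2]), reverse=True)[:k]
```

The sender filter plus similarity ranking is exactly what makes "match their tone" possible: the model only sees that person's prior emails.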
There are other techniques for knowledge management around graph RAG & specialized vector databases.
More recently, memory has become much more important. The command line interfaces for AI tools save conversation history as markdown files.
When I publish charts, I want the Theory Ventures caption at the bottom right, a particular font, colors, & styles. Those are now all saved within .gemini or .claude files in a series of cascading directories.
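A sketch of how that cascading-directory lookup might work, assuming the convention of a memory file inside a dot-directory at each level (the `.claude`/`CLAUDE.md` names follow the CLI tools' convention; the ordering logic here is an illustration, not their documented implementation):

```python
from pathlib import Path

def cascading_memory(start: Path,
                     dirname: str = ".claude",
                     filename: str = "CLAUDE.md") -> list[Path]:
    """Walk from the working directory up to the filesystem root,
    collecting memory files. Returned root-most first, so files nearer
    the working directory load last and can override general settings."""
    found = []
    for directory in [start, *start.parents]:
        candidate = directory / dirname / filename
        if candidate.is_file():
            found.append(candidate)
    return list(reversed(found))
```

Chart styles saved at the home-directory level apply everywhere, while a per-project file can override the caption or font for just that repository.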
The original simplicity of large language models has been subsumed by enterprise-grade production complexity.
This isn’t identical to the previous generation of machine learning systems, but it follows a clear parallel. What appeared to be a simple “AI magic box” turns out to be an iceberg, with most of the engineering work hidden beneath the surface.
AI Summary and Description: Yes
Summary: The text discusses the complexities involved in deploying AI and machine learning systems, particularly in the context of large language models (LLMs). It highlights the infrastructure and data management efforts that often overshadow the actual machine learning components, revealing that integrating AI into enterprise applications is far from simple.
Detailed Description: The passage brings to light the intricate realities of incorporating machine learning and AI technologies into organizational operations. While the initial appeal of LLMs was their potential to streamline processes and reduce complexity, the experience of building internal applications suggests otherwise. Key insights include:
– **Initial Promises vs. Reality**: The initial simplicity offered by LLMs is contrasted with the operational and infrastructural demands that accompany them.
– **Infrastructure and Management**: A significant portion of machine learning applications involves infrastructure management and operational complexity rather than pure ML code. This aspect has not diminished with the adoption of AI.
– **Context and Input Costs**: LLMs require extensive context to function effectively, leading to increased input costs. This necessitates deterministic software solutions to manage tasks like email automation, thus complicating the AI deployment landscape.
– **Integration Challenges**: As organizations introduce multiple tools, challenges arise in ensuring effective communication and seamless operations across these tools, leading to a potential need for classical machine learning models.
– **Observability and Monitoring**: Continuous monitoring is critical to ensuring that these systems perform accurately and efficiently. Techniques like rate limiting and guardrails are necessary to prevent excessive costs and mishandled outputs.
– **Knowledge Management Techniques**: New methodologies, such as retrieval augmented generation (RAG), and the use of specialized vector databases, are essential for handling information effectively in production settings.
– **Memory and Data Management**: The importance of memory in AI systems has grown, necessitating the preservation of conversation histories and formatting requirements in various file types.
– **Complexity of Enterprise Deployment**: The text emphasizes that while LLMs seem like a “magic box,” they involve complex engineering efforts beneath the surface, akin to an iceberg where most of the work is hidden.
This analysis underscores critical factors for security and compliance professionals, particularly in ensuring that AI deployments are well-architected, monitored, and compliant with data governance standards. The complexities of integrating AI systems also highlight the importance of robust security measures to protect sensitive data and adhere to compliance regulations.