Scott Logic: Extracting Data From AI Models: A Tale of Three Approaches

Source URL: https://blog.scottlogic.com/2025/07/23/extracting-data-from-ai-models-a-tale-of-three-approaches.html
Source: Scott Logic
Title: Extracting Data From AI Models: A Tale of Three Approaches

Feedly Summary: After building a React application with three AI assistants, our developer discovered that extracting your conversation history afterwards is like trying to collect debts in a frontier town: ChatGPT eventually pays up after some serious negotiation, Claude charms you while keeping the vault locked, and Copilot confidently hands you a treasure map to gold buried on someone else’s land. The lesson? These AI partners can help you build impressive applications but somehow can’t easily tell you what you discussed last Tuesday, so document as you go or risk spending more time archaeological than architectural.

**Summary:** The post recounts the author's attempts to extract conversation history from three AI assistants (ChatGPT, Claude, and Copilot) after using them to build a React application. The author argues for proactive data management in AI-assisted projects, because recovering the conversations afterwards proved difficult or impossible depending on the tool. The account gives a practical view of data extraction and portability in AI tools, which is relevant to professionals in AI, cloud computing, and data governance.

**Detailed Description:**
The blog post presents the author’s journey in extracting and analyzing conversation data from three AI models during the development of a React application. The author’s experiences reveal significant differences in data extraction capabilities and usability among the models, categorized as “The Good” (ChatGPT), “The Bad” (Claude), and “The Ugly” (Copilot). Here are the major points discussed:

– **ChatGPT (The Good)**:
  – Allowed conversation data to be exported as a downloadable JSON file.
  – The export was minified JSON, which made it difficult to read and parse at first.
  – After working with ChatGPT through trial-and-error debugging, the author arrived at a workable Python script that produced structured, human-readable output.
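
The kind of script described above can be sketched as follows. This is a minimal illustration, not the author's script. It assumes the export is a JSON list of conversations, each with a `title` and a `mapping` of nodes whose `message` carries `author.role`, `create_time`, and `content.parts`; those field names are assumptions about the export layout and should be checked against a real file. The function names `extract_messages` and `render` are hypothetical.

```python
from datetime import datetime, timezone


def extract_messages(conversation):
    """Return (timestamp, role, text) tuples for one conversation, oldest first.

    Assumed layout (verify against your own export): conversation["mapping"]
    maps node ids to nodes, and a node may hold a "message" dict.
    """
    rows = []
    for node in conversation.get("mapping", {}).values():
        message = node.get("message")
        if not message:
            continue  # structural nodes may carry no message
        parts = (message.get("content") or {}).get("parts") or []
        # Keep only plain-text parts; other part types are skipped.
        text = "\n".join(p for p in parts if isinstance(p, str)).strip()
        if not text:
            continue
        role = (message.get("author") or {}).get("role", "unknown")
        rows.append((message.get("create_time") or 0.0, role, text))
    rows.sort(key=lambda row: row[0])
    return rows


def render(conversations):
    """Render a list of conversations as plain, human-readable text."""
    lines = []
    for conversation in conversations:
        lines.append(f"# {conversation.get('title') or 'Untitled'}")
        for created, role, text in extract_messages(conversation):
            stamp = datetime.fromtimestamp(created, tz=timezone.utc)
            lines.append(f"[{stamp:%Y-%m-%d %H:%M}] {role}: {text}")
        lines.append("")  # blank line between conversations
    return "\n".join(lines)
```

To use it, load the exported file with `json.load` and pass the resulting list to `render`. Sorting by timestamp is a simplification: it flattens any branching in a conversation into a single timeline.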

– **Claude (The Bad)**:
  – Offered no direct way to extract conversation data, which made for a frustrating experience.
  – First suggested logging conversations manually, then proposed a complicated API-based solution that turned out not to support retrieving previous chat histories.
  – The author resorted to building a custom logging system because no basic data portability feature was available.
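
A custom logging system of the sort mentioned above might look like the sketch below. It is an assumption-laden illustration rather than the author's implementation: it appends each prompt and response to a JSON Lines file, and the names `ConversationLog` and `logged_exchange` are hypothetical. The `send` argument stands in for whatever function actually talks to the assistant; no real vendor API is called here.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


class ConversationLog:
    """Append-only JSON Lines log of prompts and responses."""

    def __init__(self, path):
        self.path = Path(path)

    def record(self, assistant, role, text):
        """Append one entry and return it."""
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "assistant": assistant,
            "role": role,
            "text": text,
        }
        with self.path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(entry, ensure_ascii=False) + "\n")
        return entry

    def read(self):
        """Return all logged entries in the order they were written."""
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as handle:
            return [json.loads(line) for line in handle if line.strip()]


def logged_exchange(log, assistant, prompt, send):
    """Send a prompt via the caller-supplied `send` callable, logging both sides."""
    log.record(assistant, "user", prompt)
    reply = send(prompt)
    log.record(assistant, "assistant", reply)
    return reply
```

Writing one JSON object per line keeps the log append-only and easy to search later, so the history exists whether or not the tool ever offers an export.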

– **Copilot (The Ugly)**:
  – Appeared at first to offer an accessible API for retrieving data, but it required enterprise-level permissions that the author did not have.
  – Corporate governance and security policies effectively put data extraction out of reach for an individual user.
  – Discrepancies between Microsoft product offerings (Microsoft 365 Copilot vs. Copilot Chat) and the API documentation caused confusion and further complicated the author’s attempts to access the data.

**Key Insights**:
– Proactive data management is critical when working with AI models. Developers should assume that extracting historical data after the fact may not be feasible and should implement logging mechanisms early on in the project.
– The differences in API functionality across AI models highlight the need for improved transparency regarding data portability and access controls.
– The blog illustrates broader themes in AI development, including the importance of API documentation accuracy, the necessity for user-centered design in data management tools, and the impact of corporate policies on individual developers.

Overall, the post is a useful read for professionals working in AI, software development, and data governance, as it shows the practical consequences of relying on AI tools whose conversation history is hard to export. The experiences shared serve as both a guide and a cautionary tale for others embarking on similar projects.