Scott Logic: Extracting Data From AI Models: A Tale of Three Approaches

Jul 25, 2025

—

Source URL: https://blog.scottlogic.com/2025/07/23/extracting-data-from-ai-models-a-tale-of-three-approaches.html
Source: Scott Logic
Title: Extracting Data From AI Models: A Tale of Three Approaches

Feedly Summary: After building a React application with three AI assistants, our developer discovered that extracting your conversation history afterwards is like trying to collect debts in a frontier town: ChatGPT eventually pays up after some serious negotiation, Claude charms you while keeping the vault locked, and Copilot confidently hands you a treasure map to gold buried on someone else’s land. The lesson? These AI partners can help you build impressive applications but somehow can’t easily tell you what you discussed last Tuesday, so document as you go or risk spending more time archaeological than architectural.

AI Summary and Description: Yes

**Summary:** The text provides a comprehensive account of the challenges faced by the author while trying to extract data from previous interactions with various AI models (ChatGPT, Claude, and Copilot) during a React application development project. The author emphasizes the importance of proactive data management in AI-assisted projects to avoid complications later on. This analysis offers valuable insights into the nuances of data extraction and portability in AI tools, which is crucial for professionals in the fields of AI, cloud computing, and data governance.

**Detailed Description:**
The blog post presents the author’s journey in extracting and analyzing conversation data from three AI models during the development of a React application. The author’s experiences reveal significant differences in data extraction capabilities and usability among the models, categorized as “The Good” (ChatGPT), “The Bad” (Claude), and “The Ugly” (Copilot). Here are the major points discussed:

– **ChatGPT (The Good)**:
– Successfully allowed for the extraction of conversation data through a downloadable JSON file.
– Initial challenges included the JSON file format being minified and difficult to parse.
– After collaboration with ChatGPT and trial-and-error debugging, the author developed a workable Python script that could extract structured, human-readable output.

– **Claude (The Bad)**:
– Lacked any direct data extraction capabilities, creating a frustrating experience for the author.
– Initially suggested manual logging of conversations before revealing a complicated API-based solution that ultimately did not support extracting previous chat histories.
– The author resorted to building a custom logging system because the basic data portability feature was absent.

– **Copilot (The Ugly)**:
– Initially appeared to offer an accessible API for data interaction, but required enterprise-level permissions that the author did not have.
– The limitations enforced by corporate governance and security policies made individual data extraction highly problematic.
– The discrepancies between different Microsoft product offerings (Microsoft 365 Copilot vs. Copilot Chat) and API documentation led to confusion, further complicating the author’s data access efforts.

**Key Insights**:
– Proactive data management is critical when working with AI models. Developers should assume that extracting historical data after the fact may not be feasible and should implement logging mechanisms early on in the project.
– The differences in API functionality across AI models highlight the need for improved transparency regarding data portability and access controls.
– The blog illustrates broader themes in AI development, including the importance of API documentation accuracy, the necessity for user-centered design in data management tools, and the impact of corporate policies on individual developers.

Overall, this text is an essential read for professionals working in AI, software development, and data governance, as it sheds light on the practical implications of using AI technologies in real-world applications. The experiences shared could serve as a guide and a cautionary tale for others embarking on similar projects.

2 2025 3 5 7 a access access control access controls account accuracy Act actions after AI AI assistants AI development ai model AI models AI technologies AI tool AI tools analysis and API app Application application development applications Arch architectural ARM art as assistant assistants assisted at ated based being Bi Bug building by C capabilities caution centered Centered Design challenge challenges chat ChatGPT CI CIA Claude Cloud cloud computing co Col collaboration Computing control controls conversation Copilot corporate governance corporate policies CoT critical cross custom logging D data data access data extraction data governance data management data portability day de debt Debugging design developer developers development document documentation dual e end enterprise EoL ERP error event exp experience extraction face fact feature file for front full function functionality g Go governance GPT gs H hands harm high Highlight historical data HR http HTTPS human implications in insights inter interaction interactions io ite J json k keeping Key l Labor land led level Li limitations logging logging mechanisms logic low M made man management management tools Micro Microsoft Microsoft 365 Microsoft 365 Copilot Mila mini mission ML Mode model models N negotiation no o of off on one opilot OPM ory oS other out output over partners pay per permissions pilot point policies portability post practical implications pre pro proactive problem product professionals project projects ps Py Python Python script Q R rate RCE re react React application real real-world applications red Risk Ro RSA Rust s SD sec security security policies SHA Sig Sim size sizes software software development source SSE SSO structured support system T tech technologies ted text the Time to tool tools Tor town TP transparency trial UI up US usability use user uth V val Vault Ware Wi world world application world applications x yt z