Tag: prompt-engineering
-
Simon Willison’s Weblog: A comparison of ChatGPT/GPT-4o’s previous and current system prompts
Source URL: https://simonwillison.net/2025/Apr/29/chatgpt-sycophancy-prompt/ Source: Simon Willison’s Weblog Title: A comparison of ChatGPT/GPT-4o’s previous and current system prompts Feedly Summary: A comparison of ChatGPT/GPT-4o’s previous and current system prompts GPT-4o’s recent update caused it to be way too sycophantic and disingenuously praise anything the user said. OpenAI’s Aidan McLaughlin: last night we rolled out our first…
-
Simon Willison’s Weblog: A comparison of ChatGPT/GPT-4o’s previous and current system prompts
Source URL: https://simonwillison.net/2025/Apr/29/a-comparison-of-chatgptgpt-4os-previous-and-current-system-promp/#atom-everything Source: Simon Willison’s Weblog Title: A comparison of ChatGPT/GPT-4o’s previous and current system prompts Feedly Summary: A comparison of ChatGPT/GPT-4o’s previous and current system prompts GPT-4o’s recent update caused it to be way too sycophantic and disingenuously praise anything the user said. OpenAI’s Aidan McLaughlin: last night we rolled out our first…
-
Simon Willison’s Weblog: Exploring Promptfoo via Dave Guarino’s SNAP evals
Source URL: https://simonwillison.net/2025/Apr/24/exploring-promptfoo/#atom-everything Source: Simon Willison’s Weblog Title: Exploring Promptfoo via Dave Guarino’s SNAP evals Feedly Summary: I used part three (here’s parts one and two) of Dave Guarino’s series on evaluating how well LLMs can answer questions about SNAP (aka food stamps) as an excuse to explore Promptfoo, an LLM eval tool. SNAP (Supplemental…
-
Simon Willison’s Weblog: An LLM Query Understanding Service
Source URL: https://simonwillison.net/2025/Apr/9/an-llm-query-understanding-service/#atom-everything Source: Simon Willison’s Weblog Title: An LLM Query Understanding Service Feedly Summary: An LLM Query Understanding Service Doug Turnbull recently wrote about how all search is structured now: Many times, even a small open source LLM will be able to turn a search query into reasonable structure at relatively low cost. In…
-
Simon Willison’s Weblog: debug-gym
Source URL: https://simonwillison.net/2025/Mar/31/debug-gym/#atom-everything Source: Simon Willison’s Weblog Title: debug-gym Feedly Summary: debug-gym New paper and code from Microsoft Research that experiments with giving LLMs access to the Python debugger. They found that the best models could indeed improve their results by running pdb as a tool. They saw the best results overall from Claude 3.7…
-
Simon Willison’s Weblog: Function calling with Gemma
Source URL: https://simonwillison.net/2025/Mar/26/function-calling-with-gemma/#atom-everything Source: Simon Willison’s Weblog Title: Function calling with Gemma Feedly Summary: Function calling with Gemma Google’s Gemma 3 model (the 27B variant is particularly capable, I’ve been trying it out via Ollama) supports function calling exclusively through prompt engineering. The official documentation describes two recommended prompts – both of them suggest that…
-
Simon Willison’s Weblog: The "think" tool: Enabling Claude to stop and think in complex tool use situations
Source URL: https://simonwillison.net/2025/Mar/21/the-think-tool/#atom-everything Source: Simon Willison’s Weblog Title: The "think" tool: Enabling Claude to stop and think in complex tool use situations Feedly Summary: The “think" tool: Enabling Claude to stop and think in complex tool use situations Fascinating new prompt engineering trick from Anthropic. They use their standard tool calling mechanism to define a…
-
Simon Willison’s Weblog: How ProPublica Uses AI Responsibly in Its Investigations
Source URL: https://simonwillison.net/2025/Mar/14/propublica-ai/ Source: Simon Willison’s Weblog Title: How ProPublica Uses AI Responsibly in Its Investigations Feedly Summary: How ProPublica Uses AI Responsibly in Its Investigations Charles Ornstein describes how ProPublic used an LLM to help analyze data for their recent story A Study of Mint Plants. A Device to Stop Bleeding. This Is the…
-
Simon Willison’s Weblog: Xata Agent
Source URL: https://simonwillison.net/2025/Mar/13/xata-agent/ Source: Simon Willison’s Weblog Title: Xata Agent Feedly Summary: Xata Agent Xata are a hosted PostgreSQL company who also develop the open source pgroll and pgstream schema migration tools. Their new “Agent" tool is a system that helps monitor and optimize a PostgreSQL server using prompts to LLMs. Any time I see…
-
Hacker News: AI is blurring the line between PMs and Engineers
Source URL: https://humanloop.com/blog/ai-is-blurring-the-lines-between-pms-and-engineers Source: Hacker News Title: AI is blurring the line between PMs and Engineers Feedly Summary: Comments AI Summary and Description: Yes **Summary:** The text discusses the emerging trend of prompt engineering in AI applications, emphasizing how it increasingly involves product managers (PMs) rather than just software engineers. This shift indicates a blurring…