Tag: GPT-4o
-
Simon Willison’s Weblog: A comparison of ChatGPT/GPT-4o’s previous and current system prompts
Source URL: https://simonwillison.net/2025/Apr/29/chatgpt-sycophancy-prompt/ Source: Simon Willison’s Weblog Title: A comparison of ChatGPT/GPT-4o’s previous and current system prompts Feedly Summary: GPT-4o’s recent update caused it to be way too sycophantic and disingenuously praise anything the user said. OpenAI’s Aidan McLaughlin: last night we rolled out our first…
-
Simon Willison’s Weblog: OpenAI: Introducing our latest image generation model in the API
Source URL: https://simonwillison.net/2025/Apr/24/openai-images-api/ Source: Simon Willison’s Weblog Title: OpenAI: Introducing our latest image generation model in the API Feedly Summary: The astonishing native image generation capability of GPT-4o – a feature which continues to not have an obvious name – is now available via OpenAI’s…
-
Simon Willison’s Weblog: Exploring Promptfoo via Dave Guarino’s SNAP evals
Source URL: https://simonwillison.net/2025/Apr/24/exploring-promptfoo/#atom-everything Source: Simon Willison’s Weblog Title: Exploring Promptfoo via Dave Guarino’s SNAP evals Feedly Summary: I used part three (here’s parts one and two) of Dave Guarino’s series on evaluating how well LLMs can answer questions about SNAP (aka food stamps) as an excuse to explore Promptfoo, an LLM eval tool. SNAP (Supplemental…
-
Simon Willison’s Weblog: OpenAI o3 and o4-mini System Card
Source URL: https://simonwillison.net/2025/Apr/21/openai-o3-and-o4-mini-system-card/ Source: Simon Willison’s Weblog Title: OpenAI o3 and o4-mini System Card Feedly Summary: I’m surprised to see a combined System Card for o3 and o4-mini in the same document – I’d expect to see these covered separately. The opening paragraph calls out the most interesting new…
-
Simon Willison’s Weblog: GPT-4.1: Three new million token input models from OpenAI, including their cheapest model yet
Source URL: https://simonwillison.net/2025/Apr/14/gpt-4-1/ Source: Simon Willison’s Weblog Title: GPT-4.1: Three new million token input models from OpenAI, including their cheapest model yet Feedly Summary: OpenAI introduced three new models this morning: GPT-4.1, GPT-4.1 mini and GPT-4.1 nano. These are API-only models right now, not available through the ChatGPT interface (though you can try them out…
-
Slashdot: OpenAI Unveils Coding-Focused GPT-4.1 While Phasing Out GPT-4.5
Source URL: https://slashdot.org/story/25/04/14/1726250/openai-unveils-coding-focused-gpt-41-while-phasing-out-gpt-45 Source: Slashdot Title: OpenAI Unveils Coding-Focused GPT-4.1 While Phasing Out GPT-4.5 Feedly Summary: OpenAI’s launch of the GPT-4.1 model family emphasizes enhanced coding capabilities and instruction adherence. The new models expand token context significantly and introduce a tiered pricing strategy, offering a more cost-effective alternative while…
-
Slashdot: After Meta Cheating Allegations, ‘Unmodified’ Llama 4 Maverick Model Tested – Ranks #32
Source URL: https://tech.slashdot.org/story/25/04/13/2226203/after-meta-cheating-allegations-unmodified-llama-4-maverick-model-tested—ranks-32?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: After Meta Cheating Allegations, ‘Unmodified’ Llama 4 Maverick Model Tested – Ranks #32 Feedly Summary: The text discusses claims made by Meta about its Maverick AI model’s performance compared to leading models like GPT-4o and Gemini Flash 2, alongside criticisms regarding the reliability…
-
Cloud Blog: Introducing Ironwood TPUs and new innovations in AI Hypercomputer
Source URL: https://cloud.google.com/blog/products/compute/whats-new-with-ai-hypercomputer/ Source: Cloud Blog Title: Introducing Ironwood TPUs and new innovations in AI Hypercomputer Feedly Summary: Today’s innovation isn’t born in a lab or at a drafting board; it’s built on the bedrock of AI infrastructure. AI workloads have new and unique demands — addressing these requires a finely crafted combination of hardware…
-
Slashdot: Meta Got Caught Gaming AI Benchmarks
Source URL: https://tech.slashdot.org/story/25/04/08/133257/meta-got-caught-gaming-ai-benchmarks?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: Meta Got Caught Gaming AI Benchmarks Feedly Summary: Meta’s release of the Llama 4 models, Scout and Maverick, has stirred the competitive landscape of AI. Maverick’s claims of superiority over established models like GPT-4o and Gemini 2.0 Flash raise questions about evaluation fairness,…