Tag: system prompt
-
Hamel’s Blog: Selecting The Right AI Evals Tool
Source URL: https://hamel.dev/blog/posts/eval-tools/ Source: Hamel’s Blog Title: Selecting The Right AI Evals Tool Feedly Summary: Over the past year, I’ve focused heavily on AI Evals, both in my consulting work and teaching. A question I get constantly is, “What’s the best tool for evals?”. I’ve always resisted answering directly for two reasons. First, people focus…
-
Simon Willison’s Weblog: How to stop AI’s “lethal trifecta”
Source URL: https://simonwillison.net/2025/Sep/26/how-to-stop-ais-lethal-trifecta/ Source: Simon Willison’s Weblog Title: How to stop AI’s “lethal trifecta” Feedly Summary: How to stop AI’s “lethal trifecta” This is the second mention of the lethal trifecta in the Economist in just the last week! Their earlier coverage was Why AI systems may never be secure on September 22nd – I…
-
Simon Willison’s Weblog: Improved Gemini 2.5 Flash and Flash-Lite
Source URL: https://simonwillison.net/2025/Sep/25/improved-gemini-25-flash-and-flash-lite/#atom-everything Source: Simon Willison’s Weblog Title: Improved Gemini 2.5 Flash and Flash-Lite Feedly Summary: Improved Gemini 2.5 Flash and Flash-Lite Two new preview models from Google – updates to their fast and inexpensive Flash and Flash Lite families: The latest version of Gemini 2.5 Flash-Lite was trained and built based on three key…
-
Simon Willison’s Weblog: CompileBench: Can AI Compile 22-year-old Code?
Source URL: https://simonwillison.net/2025/Sep/22/compilebench/ Source: Simon Willison’s Weblog Title: CompileBench: Can AI Compile 22-year-old Code? Feedly Summary: CompileBench: Can AI Compile 22-year-old Code? Interesting new LLM benchmark from Piotr Grabowski and Piotr Migdał: how well can different models handle compilation challenges such as cross-compiling gucr for ARM64 architecture? This is one of my favorite applications of…
-
Simon Willison’s Weblog: I think "agent" may finally have a widely enough agreed upon definition to be useful jargon now
Source URL: https://simonwillison.net/2025/Sep/18/agents/ Source: Simon Willison’s Weblog Title: I think "agent" may finally have a widely enough agreed upon definition to be useful jargon now Feedly Summary: I’ve noticed something interesting over the past few weeks: I’ve started using the term “agent" in conversations where I don’t feel the need to then define it, roll…
-
Simon Willison’s Weblog: GPT‑5-Codex and upgrades to Codex
Source URL: https://simonwillison.net/2025/Sep/15/gpt-5-codex/#atom-everything Source: Simon Willison’s Weblog Title: GPT‑5-Codex and upgrades to Codex Feedly Summary: GPT‑5-Codex and upgrades to Codex OpenAI half-released a new model today: GPT‑5-Codex, a fine-tuned GPT-5 variant explicitly designed for their various AI-assisted programming tools. I say half-released because it’s not yet available via their API, but they “plan to make…
-
Bulletins: Vulnerability Summary for the Week of August 25, 2025
Source URL: https://www.cisa.gov/news-events/bulletins/sb25-245 Source: Bulletins Title: Vulnerability Summary for the Week of August 25, 2025 Feedly Summary: High Vulnerabilities PrimaryVendor — Product Description Published CVSS Score Source Info 1000projects–Online Project Report Submission and Evaluation System A vulnerability has been found in 1000projects Online Project Report Submission and Evaluation System 1.0. This issue affects some unknown…
-
Cloud Blog: vLLM Performance Tuning: The Ultimate Guide to xPU Inference Configuration
Source URL: https://cloud.google.com/blog/topics/developers-practitioners/vllm-performance-tuning-the-ultimate-guide-to-xpu-inference-configuration/ Source: Cloud Blog Title: vLLM Performance Tuning: The Ultimate Guide to xPU Inference Configuration Feedly Summary: Additional contributors include Hossein Sarshar, Ashish Narasimham, and Chenyang Li. Large Language Models (LLMs) are revolutionizing how we interact with technology, but serving these powerful models efficiently can be a challenge. vLLM has rapidly become…
-
Embrace The Red: Windsurf: Memory-Persistent Data Exfiltration (SpAIware Exploit)
Source URL: https://embracethered.com/blog/posts/2025/windsurf-spaiware-exploit-persistent-prompt-injection/ Source: Embrace The Red Title: Windsurf: Memory-Persistent Data Exfiltration (SpAIware Exploit) Feedly Summary: In this second post about Windsurf Cascade we are exploring the SpAIware attack, which allows memory persistent data exfiltration. SpAIware is an attack we first successfully demonstrated with ChatGPT last year and OpenAI mitigated. While inspecting the system prompt…
-
Simon Willison’s Weblog: too many model context protocol servers and LLM allocations on the dance floor
Source URL: https://simonwillison.net/2025/Aug/22/too-many-mcps/#atom-everything Source: Simon Willison’s Weblog Title: too many model context protocol servers and LLM allocations on the dance floor Feedly Summary: too many model context protocol servers and LLM allocations on the dance floor Useful reminder from Geoffrey Huntley of the infrequently discussed significant token cost of using MCP. Geoffrey estimate estimates that…