Tag: Claude

Source URL: https://simonwillison.net/2025/Feb/5/o3-mini-documentation/#atom-everything Source: Simon Willison’s Weblog Title: o3-mini is really good at writing internal documentation Feedly Summary: o3-mini is really good at writing internal documentation I wanted to refresh my knowledge of how the Datasette permissions system works today. I already have extensive hand-written documentation for that, but I thought it would be interesting…

Slashdot: Air Force Documents On Gen AI Test Are Just Whole Pages of Redactions

—

by

Source URL: https://tech.slashdot.org/story/25/02/03/2018259/air-force-documents-on-gen-ai-test-are-just-whole-pages-of-redactions?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: Air Force Documents On Gen AI Test Are Just Whole Pages of Redactions Feedly Summary: AI Summary and Description: Yes Summary: The text discusses the Air Force Research Laboratory’s (AFRL) funding of generative AI services through a contract with Ask Sage. It highlights concerns over transparency due to extensive…

Slashdot: Anthropic Makes ‘Jailbreak’ Advance To Stop AI Models Producing Harmful Results

—

by

Source URL: https://slashdot.org/story/25/02/03/1810255/anthropic-makes-jailbreak-advance-to-stop-ai-models-producing-harmful-results?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: Anthropic Makes ‘Jailbreak’ Advance To Stop AI Models Producing Harmful Results Feedly Summary: AI Summary and Description: Yes Summary: Anthropic has introduced a new technique called “constitutional classifiers” designed to enhance the security of large language models (LLMs) like its Claude chatbot. This system aims to mitigate risks associated…

Simon Willison’s Weblog: Constitutional Classifiers: Defending against universal jailbreaks

—

by

Source URL: https://simonwillison.net/2025/Feb/3/constitutional-classifiers/ Source: Simon Willison’s Weblog Title: Constitutional Classifiers: Defending against universal jailbreaks Feedly Summary: Constitutional Classifiers: Defending against universal jailbreaks Interesting new research from Anthropic, resulting in the paper Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming. From the paper: In particular, we introduce Constitutional Classifiers, a framework…

Hacker News: Constitutional Classifiers: Defending against universal jailbreaks

—

by

Source URL: https://www.anthropic.com/research/constitutional-classifiers Source: Hacker News Title: Constitutional Classifiers: Defending against universal jailbreaks Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses a novel approach by the Anthropic Safeguards Research Team to defend AI models against jailbreaks through the use of Constitutional Classifiers. This system demonstrates robustness against various jailbreak techniques while…

Simon Willison’s Weblog: OpenAI reasoning models: Advice on prompting

Feb 2, 2025

—

by

Source URL: https://simonwillison.net/2025/Feb/2/openai-reasoning-models-advice-on-prompting/ Source: Simon Willison’s Weblog Title: OpenAI reasoning models: Advice on prompting Feedly Summary: OpenAI reasoning models: Advice on prompting OpenAI’s documentation for their o1 and o3 “reasoning models" includes some interesting tips on how to best prompt them: Developer messages are the new system messages: Starting with o1-2024-12-17, reasoning models support developer…

Simon Willison’s Weblog: llm-anthropic

Feb 2, 2025

—

by

Source URL: https://simonwillison.net/2025/Feb/2/llm-anthropic/#atom-everything Source: Simon Willison’s Weblog Title: llm-anthropic Feedly Summary: llm-anthropic I’ve renamed my llm-claude-3 plugin to llm-anthropic, on the basis that Claude 4 will probably happen at some point so this is a better name for the plugin. If you’re a previous user of llm-claude-3 you can upgrade to the new plugin like…

Simon Willison’s Weblog: A professional workflow for translation using LLMs

Feb 2, 2025

—

by

Source URL: https://simonwillison.net/2025/Feb/2/workflow-for-translation/#atom-everything Source: Simon Willison’s Weblog Title: A professional workflow for translation using LLMs Feedly Summary: A professional workflow for translation using LLMs Tom Gally is a professional translator who has been exploring the use of LLMs since the release of GPT-4. In this Hacker News comment he shares a detailed workflow for how…

Simon Willison’s Weblog: On DeepSeek and Export Controls

Jan 29, 2025

—

by