Tag: generative

Source URL: https://simonwillison.net/2025/Feb/3/constitutional-classifiers/ Source: Simon Willison’s Weblog Title: Constitutional Classifiers: Defending against universal jailbreaks Feedly Summary: Constitutional Classifiers: Defending against universal jailbreaks Interesting new research from Anthropic, resulting in the paper Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming. From the paper: In particular, we introduce Constitutional Classifiers, a framework…

Hacker News: Show HN: Klarity – Open-source tool to analyze uncertainty/entropy in LLM output

Feb 3, 2025

Source URL: https://github.com/klara-research/klarity Source: Hacker News Title: Show HN: Klarity – Open-source tool to analyze uncertainty/entropy in LLM output Feedly Summary: Comments AI Summary and Description: Yes **Summary:** Klarity is a robust tool designed for analyzing uncertainty in generative model predictions. By leveraging both raw probability and semantic comprehension, it provides unique insights into model…

Slashdot: Mozilla Adapts ‘Fakespot’ Into an AI-Detecting Firefox Add-on

Source URL: https://news.slashdot.org/story/25/02/02/2156241/mozilla-adapts-fakespot-into-an-ai-detecting-firefox-add-on?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: Mozilla Adapts ‘Fakespot’ Into an AI-Detecting Firefox Add-on Feedly Summary: AI Summary and Description: Yes Summary: Mozilla’s Fakespot Deepfake Detector is a free Firefox add-on that identifies whether online text is generated by AI or written by a human. This tool employs Mozilla’s proprietary engine and promises to enhance…

Simon Willison’s Weblog: OpenAI reasoning models: Advice on prompting

Source URL: https://simonwillison.net/2025/Feb/2/openai-reasoning-models-advice-on-prompting/ Source: Simon Willison’s Weblog Title: OpenAI reasoning models: Advice on prompting Feedly Summary: OpenAI reasoning models: Advice on prompting OpenAI’s documentation for their o1 and o3 “reasoning models” includes some interesting tips on how to best prompt them: Developer messages are the new system messages: Starting with o1-2024-12-17, reasoning models support developer…

Simon Willison’s Weblog: Quoting Benedict Evans

Source URL: https://simonwillison.net/2025/Feb/2/benedict-evans/#atom-everything Source: Simon Willison’s Weblog Title: Quoting Benedict Evans Feedly Summary: Part of the concept of ‘Disruption’ is that important new technologies tend to be bad at the things that matter to the previous generation of technology, but they do something else important instead. Asking if an LLM can do very specific and…

Simon Willison’s Weblog: Quoting Sam Altman

Source URL: https://simonwillison.net/2025/Feb/2/sam-altman/#atom-everything Source: Simon Willison’s Weblog Title: Quoting Sam Altman Feedly Summary: [In response to a question about releasing model weights] Yes, we are discussing. I personally think we have been on the wrong side of history here and need to figure out a different open source strategy; not everyone at OpenAI shares this…

Simon Willison’s Weblog: llm-anthropic

Source URL: https://simonwillison.net/2025/Feb/2/llm-anthropic/#atom-everything Source: Simon Willison’s Weblog Title: llm-anthropic Feedly Summary: llm-anthropic I’ve renamed my llm-claude-3 plugin to llm-anthropic, on the basis that Claude 4 will probably happen at some point so this is a better name for the plugin. If you’re a previous user of llm-claude-3 you can upgrade to the new plugin like…

Hacker News: Andrew Ng on DeepSeek

Source URL: https://www.deeplearning.ai/the-batch/issue-286/ Source: Hacker News Title: Andrew Ng on DeepSeek Feedly Summary: Comments AI Summary and Description: Yes **Summary:** The text outlines significant advancements and trends in the field of generative AI, particularly emphasizing China’s emergence as a competitor to the U.S. in this domain, the implications of open weight models, and the innovative…

Simon Willison’s Weblog: A professional workflow for translation using LLMs
