Tag: llm

Source URL: https://blog.cloudflare.com/guardrails-in-ai-gateway/ Source: The Cloudflare Blog Title: Keep AI interactions secure and risk-free with Guardrails in AI Gateway Feedly Summary: Deploy AI safely with built-in Guardrails in AI Gateway. Flag and block harmful or inappropriate content, protect personal data, and ensure compliance in real-time AI Summary and Description: Yes Short Summary with Insight: The…

Simon Willison’s Weblog: olmOCR

Feb 26, 2025

Source URL: https://simonwillison.net/2025/Feb/26/olmocr/#atom-everything Source: Simon Willison’s Weblog Title: olmOCR Feedly Summary: New from Ai2 – olmOCR is “an open-source tool designed for high-throughput conversion of PDFs and other documents into plain text while preserving natural reading order”. At its core is allenai/olmOCR-7B-0225-preview, a Qwen2-VL-7B-Instruct variant trained on ~250,000 pages of diverse PDF content (both…

Simon Willison’s Weblog: Quoting Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

Source URL: https://simonwillison.net/2025/Feb/25/emergent-misalignment/ Source: Simon Willison’s Weblog Title: Quoting Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs Feedly Summary: In our experiment, a model is finetuned to output insecure code without disclosing this to the user. The resulting model acts misaligned on a broad range of prompts that are unrelated to coding: it asserts…

Simon Willison’s Weblog: Deep research System Card

Source URL: https://simonwillison.net/2025/Feb/25/deep-research-system-card/#atom-everything Source: Simon Willison’s Weblog Title: Deep research System Card Feedly Summary: OpenAI are rolling out their Deep research “agentic” research tool to their $20/month ChatGPT Plus users today, who get 10 queries a month. $200/month ChatGPT Pro gets 120 uses. Deep research is the best version of this…

Simon Willison’s Weblog: Gemini 2.0 Flash and Flash-Lite

Source URL: https://simonwillison.net/2025/Feb/25/gemini-20-flash-and-flash-lite/ Source: Simon Willison’s Weblog Title: Gemini 2.0 Flash and Flash-Lite Feedly Summary: Gemini 2.0 Flash-Lite is now generally available – previously it was available just as a preview – and has announced pricing. The model is $0.075/million input tokens and $0.30/million output – the same price as…

Hacker News: Narrow finetuning can produce broadly misaligned LLM [pdf]

Source URL: https://martins1612.github.io/emergent_misalignment_betley.pdf Source: Hacker News Title: Narrow finetuning can produce broadly misaligned LLM [pdf] Feedly Summary: Comments AI Summary and Description: Yes **Summary:** The document presents findings on the phenomenon of “emergent misalignment” in large language models (LLMs) like GPT-4o when finetuned on specific narrow tasks, particularly the creation of insecure code. The results…

Hacker News: Hard problems that reduce to document ranking

Source URL: https://noperator.dev/posts/document-ranking-for-complex-problems/ Source: Hacker News Title: Hard problems that reduce to document ranking Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses the innovative application of large language models (LLMs) in document ranking, particularly for locating vulnerabilities in code patches. It presents a novel approach to addressing complex security problems by…

Simon Willison’s Weblog: Leaked Windsurf prompt

Source URL: https://simonwillison.net/2025/Feb/25/leaked-windsurf-prompt/ Source: Simon Willison’s Weblog Title: Leaked Windsurf prompt Feedly Summary: The Windsurf Editor is Codeium’s highly regarded entrant into the fork-of-VS-Code AI-enhanced IDE model first pioneered by Cursor (and by VS Code itself). I heard online that it had a quirky system prompt, and was able to replicate that…

Hacker News: DOGE will use AI to assess the responses of federal workers
