Tag: model security
-
Wired: This Tool Probes Frontier AI Models for Lapses in Intelligence
Source URL: https://www.wired.com/story/this-tool-probes-frontier-ai-models-for-lapses-in-intelligence/ Source: Wired Title: This Tool Probes Frontier AI Models for Lapses in Intelligence Feedly Summary: A new platform from data training company Scale AI will let artificial intelligence developers find their models’ weak spots. AI Summary and Description: Yes Summary: The text introduces a new platform by Scale AI designed to assist…
-
Cloud Blog: Enhance Gemini model security with content filters and system instructions
Source URL: https://cloud.google.com/blog/products/ai-machine-learning/enhance-gemini-model-security-with-content-filters-and-system-instructions/ Source: Cloud Blog Title: Enhance Gemini model security with content filters and system instructions Feedly Summary: As organizations rush to adopt generative AI-driven chatbots and agents, it’s important to reduce the risk of exposure to threat actors who force AI models to create harmful content. We want to highlight two powerful capabilities…
-
Simon Willison’s Weblog: Trading Inference-Time Compute for Adversarial Robustness
Source URL: https://simonwillison.net/2025/Jan/22/trading-inference-time-compute/ Source: Simon Willison’s Weblog Title: Trading Inference-Time Compute for Adversarial Robustness Feedly Summary: Trading Inference-Time Compute for Adversarial Robustness Brand new research paper from OpenAI, exploring how inference-scaling “reasoning" models such as o1 might impact the search for improved security with respect to things like prompt injection. We conduct experiments on the…
-
Simon Willison’s Weblog: Quoting Jack Clark
Source URL: https://simonwillison.net/2024/Nov/18/jack-clark/ Source: Simon Willison’s Weblog Title: Quoting Jack Clark Feedly Summary: The main innovation here is just using more data. Specifically, Qwen2.5 Coder is a continuation of an earlier Qwen 2.5 model. The original Qwen 2.5 model was trained on 18 trillion tokens spread across a variety of languages and tasks (e.g, writing,…
-
Hacker News: AI Progress Stalls as OpenAI, Google and Anthropic Hit Roadblocks
Source URL: https://www.nasdaq.com/articles/ai-progress-stalls-openai-google-and-anthropic-hit-roadblocks Source: Hacker News Title: AI Progress Stalls as OpenAI, Google and Anthropic Hit Roadblocks Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses the challenges faced by major AI companies such as OpenAI, Google, and Anthropic in their quest to develop more advanced AI models. It highlights setbacks related…
-
Cloud Blog: GenOps: learning from the world of microservices and traditional DevOps
Source URL: https://cloud.google.com/blog/products/devops-sre/genops-learnings-from-microservices-and-traditional-devops/ Source: Cloud Blog Title: GenOps: learning from the world of microservices and traditional DevOps Feedly Summary: Who is supposed to manage generative AI applications? While AI-related ownership often lands with data teams, we’re seeing requirements specific to generative AI applications that have distinct differences from those of a data and AI team,…