red team – Page 7 – Experimental News Clipping Site

Slashdot: TSA’s Airport Facial-Recognition Tech Faces Audit Probe

Feb 4, 2025

—

by

Source URL: https://yro.slashdot.org/story/25/02/03/2353253/tsas-airport-facial-recognition-tech-faces-audit-probe?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: TSA’s Airport Facial-Recognition Tech Faces Audit Probe Feedly Summary: AI Summary and Description: Yes Summary: The Department of Homeland Security’s Inspector General is conducting an audit on the TSA’s facial recognition technology due to concerns raised by lawmakers and privacy advocates, focusing on its efficacy in enhancing security while…

Simon Willison’s Weblog: Constitutional Classifiers: Defending against universal jailbreaks

Feb 3, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://simonwillison.net/2025/Feb/3/constitutional-classifiers/ Source: Simon Willison’s Weblog Title: Constitutional Classifiers: Defending against universal jailbreaks Feedly Summary: Constitutional Classifiers: Defending against universal jailbreaks Interesting new research from Anthropic, resulting in the paper Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming. From the paper: In particular, we introduce Constitutional Classifiers, a framework…

Hacker News: Constitutional Classifiers: Defending against universal jailbreaks

Feb 3, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://www.anthropic.com/research/constitutional-classifiers Source: Hacker News Title: Constitutional Classifiers: Defending against universal jailbreaks Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses a novel approach by the Anthropic Safeguards Research Team to defend AI models against jailbreaks through the use of Constitutional Classifiers. This system demonstrates robustness against various jailbreak techniques while…

OpenAI : OpenAI o3-mini System Card

Jan 31, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://openai.com/index/o3-mini-system-card Source: OpenAI Title: OpenAI o3-mini System Card Feedly Summary: This report outlines the safety work carried out for the OpenAI o3-mini model, including safety evaluations, external red teaming, and Preparedness Framework evaluations. AI Summary and Description: Yes Summary: The text discusses safety work related to the OpenAI o3-mini model, emphasizing safety evaluations…

CSA: How Can CISOs Ensure Safe AI Adoption?

Jan 30, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://normalyze.ai/blog/unlocking-the-value-of-safe-ai-adoption-insights-for-security-practitioners/ Source: CSA Title: How Can CISOs Ensure Safe AI Adoption? Feedly Summary: AI Summary and Description: Yes Summary: The text discusses critical strategies for security practitioners, particularly CISOs, to safely adopt AI technologies within organizations. It emphasizes the need for visibility, education, balanced policies, and proactive threat modeling to ensure both innovation…

Slashdot: Microsoft Makes DeepSeek’s R1 Model Available On Azure AI and GitHub

Jan 29, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://slashdot.org/story/25/01/29/2218253/microsoft-makes-deepseeks-r1-model-available-on-azure-ai-and-github?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: Microsoft Makes DeepSeek’s R1 Model Available On Azure AI and GitHub Feedly Summary: AI Summary and Description: Yes Summary: Microsoft has enhanced its Azure AI Foundry platform by integrating DeepSeek’s R1 model, facilitating efficient experimentation and deployment of AI applications for developers. The model has passed extensive security evaluations,…

Hacker News: DeepSeek R1 Is Now Available on Azure AI Foundry and GitHub

Jan 29, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://azure.microsoft.com/en-us/blog/deepseek-r1-is-now-available-on-azure-ai-foundry-and-github/ Source: Hacker News Title: DeepSeek R1 Is Now Available on Azure AI Foundry and GitHub Feedly Summary: Comments AI Summary and Description: Yes **Summary:** The text discusses the availability of DeepSeek R1 in the Azure AI Foundry model catalog, emphasizing the model’s integration into a trusted and scalable platform for businesses. It…

Cloud Blog: Adversarial Misuse of Generative AI

Jan 29, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://cloud.google.com/blog/topics/threat-intelligence/adversarial-misuse-generative-ai/ Source: Cloud Blog Title: Adversarial Misuse of Generative AI Feedly Summary: Rapid advancements in artificial intelligence (AI) are unlocking new possibilities for the way we work and accelerating innovation in science, technology, and beyond. In cybersecurity, AI is poised to transform digital defense, empowering defenders and enhancing our collective security. Large language…

OpenAI : Operator System Card

Jan 23, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://openai.com/index/operator-system-card Source: OpenAI Title: Operator System Card Feedly Summary: Drawing from OpenAI’s established safety frameworks, this document highlights our multi-layered approach, including model and product mitigations we’ve implemented to protect against prompt engineering and jailbreaks, protect privacy and security, as well as details our external red teaming efforts, safety evaluations, and ongoing work…

Simon Willison’s Weblog: Lessons From Red Teaming 100 Generative AI Products

Jan 18, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://simonwillison.net/2025/Jan/18/lessons-from-red-teaming/ Source: Simon Willison’s Weblog Title: Lessons From Red Teaming 100 Generative AI Products Feedly Summary: Lessons From Red Teaming 100 Generative AI Products New paper from Microsoft describing their top eight lessons learned red teaming (deliberately seeking security vulnerabilities in) 100 different generative AI models and products over the past few years.…

Tag: red team