jailbreaks – Page 2 – Experimental News Clipping Site

OpenAI : Operator System Card

Jan 23, 2025

—

by

Source URL: https://openai.com/index/operator-system-card Source: OpenAI Title: Operator System Card Feedly Summary: Drawing from OpenAI’s established safety frameworks, this document highlights our multi-layered approach, including model and product mitigations we’ve implemented to protect against prompt engineering and jailbreaks, protect privacy and security, as well as details our external red teaming efforts, safety evaluations, and ongoing work…

Hacker News: Human study on AI spear phishing campaigns

Jan 5, 2025

—

by

system automation

in Uncategorized

Source URL: https://www.lesswrong.com/posts/GCHyDKfPXa5qsG2cP/human-study-on-ai-spear-phishing-campaigns Source: Hacker News Title: Human study on AI spear phishing campaigns Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses a study evaluating the effectiveness of AI models in executing personalized phishing attacks, revealing a disturbing increase in the capabilities of AI-generated spear phishing. The findings indicate high click-through…

Hacker News: Garak, LLM Vulnerability Scanner

Nov 17, 2024

—

by

system automation

in Uncategorized

Source URL: https://github.com/NVIDIA/garak Source: Hacker News Title: Garak, LLM Vulnerability Scanner Feedly Summary: Comments AI Summary and Description: Yes **Summary:** The text describes “garak,” a command-line vulnerability scanner specifically designed for large language models (LLMs). This tool aims to uncover various weaknesses in LLMs, such as hallucination, prompt injection attacks, and data leakage. Its development…

Cloud Blog: Arize, Vertex AI API: Evaluation workflows to accelerate generative app development and AI ROI

Oct 31, 2024

—

by

system automation

in Uncategorized

Source URL: https://cloud.google.com/blog/topics/partners/benefits-of-arize-ai-in-tandem-with-vertex-ai-api-for-gemini/ Source: Cloud Blog Title: Arize, Vertex AI API: Evaluation workflows to accelerate generative app development and AI ROI Feedly Summary: In the rapidly evolving landscape of artificial intelligence, enterprise AI engineering teams must constantly seek cutting-edge solutions to drive innovation, enhance productivity, and maintain a competitive edge. In leveraging an AI observability…

Wired: This Prompt Can Make an AI Chatbot Identify and Extract Personal Details From Your Chats

Oct 17, 2024

—

by

system automation

in Uncategorized

Source URL: https://www.wired.com/story/ai-imprompter-malware-llm/ Source: Wired Title: This Prompt Can Make an AI Chatbot Identify and Extract Personal Details From Your Chats Feedly Summary: Security researchers created an algorithm that turns a malicious prompt into a set of hidden instructions that could send a user’s personal information to an attacker. AI Summary and Description: Yes Summary:…

Hacker News: Show HN: Arch – an intelligent prompt gateway built on Envoy

Oct 15, 2024

—

by

system automation

in Uncategorized

Source URL: https://github.com/katanemo/arch Source: Hacker News Title: Show HN: Arch – an intelligent prompt gateway built on Envoy Feedly Summary: Comments AI Summary and Description: Yes Summary: This text introduces “Arch,” an intelligent Layer 7 gateway designed specifically for managing LLM applications and enhancing the security, observability, and efficiency of generative AI interactions. Arch provides…

Slashdot: LLM Attacks Take Just 42 Seconds On Average, 20% of Jailbreaks Succeed

Oct 13, 2024

—

by

system automation

in Uncategorized

Source URL: https://it.slashdot.org/story/24/10/12/213247/llm-attacks-take-just-42-seconds-on-average-20-of-jailbreaks-succeed?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: LLM Attacks Take Just 42 Seconds On Average, 20% of Jailbreaks Succeed Feedly Summary: AI Summary and Description: Yes Summary: The article discusses alarming findings from Pillar Security’s report on attacks against large language models (LLMs), revealing that such attacks are not only alarmingly quick but also frequently result…

Tag: jailbreaks

OpenAI : Operator System Card

Hacker News: Human study on AI spear phishing campaigns

Hacker News: Garak, LLM Vulnerability Scanner

Cloud Blog: Arize, Vertex AI API: Evaluation workflows to accelerate generative app development and AI ROI

Wired: This Prompt Can Make an AI Chatbot Identify and Extract Personal Details From Your Chats

Hacker News: Show HN: Arch – an intelligent prompt gateway built on Envoy

Slashdot: LLM Attacks Take Just 42 Seconds On Average, 20% of Jailbreaks Succeed