researchers – Page 7 – Experimental News Clipping Site

The Register: LegalPwn: Tricking LLMs by burying badness in lawyerly fine print

Sep 1, 2025

—

by

Source URL: https://www.theregister.com/2025/09/01/legalpwn_ai_jailbreak/ Source: The Register Title: LegalPwn: Tricking LLMs by burying badness in lawyerly fine print Feedly Summary: Trust and believe – AI models trained to see ‘legal’ doc as super legit Researchers at security firm Pangea have discovered yet another way to trivially trick large language models (LLMs) into ignoring their guardrails. Stick…

The Register: ChatGPT hates LA Chargers fans

Aug 28, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://www.theregister.com/2025/08/27/chatgpt_has_a_problem_with/ Source: The Register Title: ChatGPT hates LA Chargers fans Feedly Summary: Harvard researchers find model guardrails tailor query responses to user’s inferred politics and other affiliations OpenAI’s ChatGPT appears to be more likely to refuse to respond to questions posed by fans of the Los Angeles Chargers football team than to followers…

Slashdot: One Long Sentence is All It Takes To Make LLMs Misbehave

Aug 27, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://slashdot.org/story/25/08/27/1756253/one-long-sentence-is-all-it-takes-to-make-llms-misbehave?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: One Long Sentence is All It Takes To Make LLMs Misbehave Feedly Summary: AI Summary and Description: Yes Summary: The text discusses a significant security research finding from Palo Alto Networks’ Unit 42 regarding vulnerabilities in large language models (LLMs). The researchers explored methods that allow users to bypass…

OpenAI : OpenAI and Anthropic share findings from a joint safety evaluation

Aug 27, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://openai.com/index/openai-anthropic-safety-evaluation Source: OpenAI Title: OpenAI and Anthropic share findings from a joint safety evaluation Feedly Summary: OpenAI and Anthropic share findings from a first-of-its-kind joint safety evaluation, testing each other’s models for misalignment, instruction following, hallucinations, jailbreaking, and more—highlighting progress, challenges, and the value of cross-lab collaboration. AI Summary and Description: Yes Summary:…

The Register: First AI-powered ransomware spotted, but it’s not active – yet

Aug 26, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://www.theregister.com/2025/08/26/first_aipowered_ransomware_spotted_by/ Source: The Register Title: First AI-powered ransomware spotted, but it’s not active – yet Feedly Summary: Oh, look, a use case for OpenAI’s gpt-oss-20b model ESET malware researchers Anton Cherepanov and Peter Strycek have discovered what they describe as the “first known AI-powered ransomware," which they named PromptLock. … AI Summary and Description:…

Schneier on Security: Encryption Backdoor in Military/Police Radios

Aug 26, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://www.schneier.com/blog/archives/2025/08/encryption-backdoor-in-military-police-radios.html Source: Schneier on Security Title: Encryption Backdoor in Military/Police Radios Feedly Summary: I wrote about this in 2023. Here’s the story: Three Dutch security analysts discovered the vulnerabilities—five in total—in a European radio standard called TETRA (Terrestrial Trunked Radio), which is used in radios made by Motorola, Damm, Hytera, and others. The…

The Register: One long sentence is all it takes to make LLMs misbehave

Aug 26, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://www.theregister.com/2025/08/26/breaking_llms_for_fun/ Source: The Register Title: One long sentence is all it takes to make LLMs misbehave Feedly Summary: Chatbots ignore their guardrails when your grammar sucks, researchers find Security researchers from Palo Alto Networks’ Unit 42 have discovered the key to getting large language model (LLM) chatbots to ignore their guardrails, and it’s…

Slashdot: Perplexity’s AI Browser Comet Vulnerable To Prompt Injection Attacks That Hijack User Accounts

Aug 25, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://it.slashdot.org/story/25/08/25/1654220/perplexitys-ai-browser-comet-vulnerable-to-prompt-injection-attacks-that-hijack-user-accounts?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: Perplexity’s AI Browser Comet Vulnerable To Prompt Injection Attacks That Hijack User Accounts Feedly Summary: AI Summary and Description: Yes Summary: The text highlights significant vulnerabilities in Perplexity’s Comet browser linked to its AI summarization functionalities. These vulnerabilities allow attackers to hijack user accounts and execute malicious commands, posing…

Slashdot: FBI Warns Russian Hackers Targeted ‘Thousands’ of Critical US Infrastructure IT Systems

Aug 24, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://news.slashdot.org/story/25/08/24/0638238/fbi-warns-russian-hackers-targeted-thousands-of-critical-us-infrastructure-it-systems?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: FBI Warns Russian Hackers Targeted ‘Thousands’ of Critical US Infrastructure IT Systems Feedly Summary: AI Summary and Description: Yes Summary: The text outlines a significant security threat posed by Russian state-sponsored hackers targeting U.S. critical infrastructure through vulnerabilities in Cisco devices. The report emphasizes the risks posed by unpatched…

The Register: Tinker with LLMs in the privacy of your own home using Llama.cpp

Aug 24, 2025

—

by

Kurt Seifried

in Uncategorized

Source URL: https://www.theregister.com/2025/08/24/llama_cpp_hands_on/ Source: The Register Title: Tinker with LLMs in the privacy of your own home using Llama.cpp Feedly Summary: Everything you need to know to build, run, serve, optimize and quantize models on your PC Hands on Training large language models (LLMs) may require millions or even billion of dollars of infrastructure, but…

Tag: researchers