Tag: manipulation
-
Slashdot: OpenAI Is Scanning Users’ ChatGPT Conversations and Reporting Content To Police
Source URL: https://yro.slashdot.org/story/25/08/31/2311231/openai-is-scanning-users-chatgpt-conversations-and-reporting-content-to-police
Summary: The text highlights OpenAI’s controversial practice of monitoring user conversations in ChatGPT for threats, revealing significant security and privacy implications. This admission raises questions about the balance between safety and…
-
The Register: ChatGPT hates LA Chargers fans
Source URL: https://www.theregister.com/2025/08/27/chatgpt_has_a_problem_with/
Feedly Summary: Harvard researchers find model guardrails tailor query responses to user’s inferred politics and other affiliations. OpenAI’s ChatGPT appears to be more likely to refuse to respond to questions posed by fans of the Los Angeles Chargers football team than to followers…
-
Slashdot: One Long Sentence is All It Takes To Make LLMs Misbehave
Source URL: https://slashdot.org/story/25/08/27/1756253/one-long-sentence-is-all-it-takes-to-make-llms-misbehave?utm_source=rss1.0mainlinkanon&utm_medium=feed
Summary: The text discusses a significant security research finding from Palo Alto Networks’ Unit 42 regarding vulnerabilities in large language models (LLMs). The researchers explored methods that allow users to bypass…
-
Slashdot: Google Improves Gemini AI Image Editing With ‘Nano Banana’ Model
Source URL: https://slashdot.org/story/25/08/26/215246/google-improves-gemini-ai-image-editing-with-nano-banana-model?utm_source=rss1.0mainlinkanon&utm_medium=feed
Summary: Google DeepMind has launched the “nano banana” model (Gemini 2.5 Flash Image), which excels in AI image editing by offering improved consistency in edits. This advancement enhances the practical use cases…
-
The Register: One long sentence is all it takes to make LLMs misbehave
Source URL: https://www.theregister.com/2025/08/26/breaking_llms_for_fun/
Feedly Summary: Chatbots ignore their guardrails when your grammar sucks, researchers find. Security researchers from Palo Alto Networks’ Unit 42 have discovered the key to getting large language model (LLM) chatbots to ignore their guardrails, and it’s…
-
Cloud Blog: Intelligent code conversion: Databricks Spark SQL to BigQuery SQL via Gemini
Source URL: https://cloud.google.com/blog/products/data-analytics/automate-sql-translation-databricks-to-bigquery-with-gemini/
Feedly Summary: As data platforms evolve and businesses diversify their cloud ecosystems, the need to migrate SQL workloads between engines is becoming increasingly common. Recently, I had the opportunity to work on translating a set of Databricks…
-
Unit 42: Logit-Gap Steering: A New Frontier in Understanding and Probing LLM Safety
Source URL: https://unit42.paloaltonetworks.com/logit-gap-steering-impact/
Feedly Summary: New research from Unit 42 on logit-gap steering reveals how internal alignment measures can be bypassed, making external AI security vital.