Tag: manipulation

  • Slashdot: OpenAI Is Scanning Users’ ChatGPT Conversations and Reporting Content To Police

    Source URL: https://yro.slashdot.org/story/25/08/31/2311231/openai-is-scanning-users-chatgpt-conversations-and-reporting-content-to-police
    Source: Slashdot
    Title: OpenAI Is Scanning Users’ ChatGPT Conversations and Reporting Content To Police
    Feedly Summary: The text highlights OpenAI’s controversial practice of monitoring user conversations in ChatGPT for threats, revealing significant security and privacy implications. This admission raises questions about the balance between safety and…

  • The Register: ChatGPT hates LA Chargers fans

    Source URL: https://www.theregister.com/2025/08/27/chatgpt_has_a_problem_with/
    Source: The Register
    Title: ChatGPT hates LA Chargers fans
    Feedly Summary: Harvard researchers find model guardrails tailor query responses to user’s inferred politics and other affiliations. OpenAI’s ChatGPT appears to be more likely to refuse to respond to questions posed by fans of the Los Angeles Chargers football team than to followers…
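The core measurement behind a finding like this is a per-persona refusal rate compared across groups. A minimal sketch of that comparison, using entirely hypothetical observations (the persona labels and data here are illustrative, not the Harvard study's):

```python
from collections import defaultdict

# Hypothetical (persona, refused) observations; in a study like the one
# described, these would come from sending identical prompts while the
# model infers different user affiliations.
observations = [
    ("chargers_fan", True), ("chargers_fan", False), ("chargers_fan", True),
    ("eagles_fan", False), ("eagles_fan", False), ("eagles_fan", True),
]

def refusal_rates(obs):
    """Return per-persona refusal rate: refusals / total prompts."""
    counts = defaultdict(lambda: [0, 0])  # persona -> [refusals, total]
    for persona, refused in obs:
        counts[persona][0] += int(refused)
        counts[persona][1] += 1
    return {p: r / t for p, (r, t) in counts.items()}

print(refusal_rates(observations))  # chargers_fan ~0.67, eagles_fan ~0.33
```

A gap between the rates, sustained over enough samples to be statistically meaningful, is what would indicate persona-conditioned guardrail behavior.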

  • Cisco Talos Blog: Libbiosig, Tenda, SAIL, PDF XChange, Foxit vulnerabilities

    Source URL: https://blog.talosintelligence.com/libbiosig-tenda-sail-pdf-xchange-foxit-vulnerabilities/
    Source: Cisco Talos Blog
    Title: Libbiosig, Tenda, SAIL, PDF XChange, Foxit vulnerabilities
    Feedly Summary: Cisco Talos’ Vulnerability Discovery & Research team recently disclosed ten vulnerabilities in BioSig Libbiosig, nine in the Tenda AC6 router, eight in SAIL, two in PDF-XChange Editor, and one in Foxit PDF Reader. The vulnerabilities mentioned in this blog…

  • Slashdot: One Long Sentence is All It Takes To Make LLMs Misbehave

    Source URL: https://slashdot.org/story/25/08/27/1756253/one-long-sentence-is-all-it-takes-to-make-llms-misbehave?utm_source=rss1.0mainlinkanon&utm_medium=feed
    Source: Slashdot
    Title: One Long Sentence is All It Takes To Make LLMs Misbehave
    Feedly Summary: The text discusses a significant security research finding from Palo Alto Networks’ Unit 42 regarding vulnerabilities in large language models (LLMs). The researchers explored methods that allow users to bypass…

  • Simon Willison’s Weblog: Quoting Bruce Schneier

    Source URL: https://simonwillison.net/2025/Aug/27/bruce-schneier/#atom-everything
    Source: Simon Willison’s Weblog
    Title: Quoting Bruce Schneier
    Feedly Summary: We simply don’t know how to defend against these attacks. We have zero agentic AI systems that are secure against these attacks. Any AI that is working in an adversarial environment—and by this I mean that it may encounter untrusted training data or…

  • Slashdot: Google Improves Gemini AI Image Editing With ‘Nano Banana’ Model

    Source URL: https://slashdot.org/story/25/08/26/215246/google-improves-gemini-ai-image-editing-with-nano-banana-model?utm_source=rss1.0mainlinkanon&utm_medium=feed
    Source: Slashdot
    Title: Google Improves Gemini AI Image Editing With ‘Nano Banana’ Model
    Feedly Summary: Google DeepMind has launched the “nano banana” model (Gemini 2.5 Flash Image), which excels in AI image editing by offering improved consistency in edits. This advancement enhances the practical use cases…

  • The Register: One long sentence is all it takes to make LLMs misbehave

    Source URL: https://www.theregister.com/2025/08/26/breaking_llms_for_fun/
    Source: The Register
    Title: One long sentence is all it takes to make LLMs misbehave
    Feedly Summary: Chatbots ignore their guardrails when your grammar sucks, researchers find. Security researchers from Palo Alto Networks’ Unit 42 have discovered the key to getting large language model (LLM) chatbots to ignore their guardrails, and it’s…

  • The Register: Honey, I shrunk the image and now I’m pwned

    Source URL: https://www.theregister.com/2025/08/21/google_gemini_image_scaling_attack/
    Source: The Register
    Title: Honey, I shrunk the image and now I’m pwned
    Feedly Summary: Google’s Gemini-powered tools tripped up by image-scaling prompt injection. Security researchers with Trail of Bits have found that Google Gemini CLI and other production AI systems can be deceived by image scaling attacks, a well-known adversarial challenge…
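The mechanism behind an image-scaling attack is that a downscaler samples only a fraction of the pixels, so a payload placed exactly on the sampled positions dominates the small image the model actually sees while staying sparse at full resolution. A toy illustration of that principle (not Trail of Bits' actual exploit), using a nearest-neighbor downscaler over plain 2D lists:

```python
K = 4  # downscale factor

def downscale_nearest(img, k):
    """Nearest-neighbor downscale: keep pixel (k*i, k*j) of each k x k block."""
    return [row[::k] for row in img[::k]]

def make_attack_image(payload, k):
    """Embed `payload` (a small 2D grayscale grid) so it dominates only after downscaling."""
    h, w = len(payload), len(payload[0])
    img = [[255] * (w * k) for _ in range(h * k)]  # mostly-white full-res image
    for i in range(h):
        for j in range(w):
            img[i * k][j * k] = payload[i][j]  # hide payload on sampled pixels
    return img

secret = [[0, 255, 0], [255, 0, 255]]  # tiny stand-in for a hidden prompt
big = make_attack_image(secret, K)
assert downscale_nearest(big, K) == secret  # payload emerges after downscale
dark = sum(v != 255 for row in big for v in row)
print(dark, "of", len(big) * len(big[0]), "pixels differ from white")
```

Real attacks target the specific resampling kernels (bilinear, bicubic) used by production pipelines, but the asymmetry is the same: what a human reviews at full resolution is not what the model receives.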

  • Cloud Blog: Intelligent code conversion: Databricks Spark SQL to BigQuery SQL via Gemini

    Source URL: https://cloud.google.com/blog/products/data-analytics/automate-sql-translation-databricks-to-bigquery-with-gemini/
    Source: Cloud Blog
    Title: Intelligent code conversion: Databricks Spark SQL to BigQuery SQL via Gemini
    Feedly Summary: As data platforms evolve and businesses diversify their cloud ecosystems, the need to migrate SQL workloads between engines is becoming increasingly common. Recently, I had the opportunity to work on translating a set of Databricks…
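The blog post uses Gemini to translate whole Spark SQL scripts; as a hedged illustration of the kind of dialect differences such a translator must handle (function and type names below are real dialect differences, but the naive regex approach is purely for illustration):

```python
import re

# Illustrative only: real translation needs a SQL parser or an LLM, since
# token-level rewrites can't handle nesting, comments, or string literals.
RULES = [
    (re.compile(r"\bnvl\s*\(", re.IGNORECASE), "IFNULL("),  # Spark nvl -> BigQuery IFNULL
    (re.compile(r"\bdouble\b", re.IGNORECASE), "FLOAT64"),  # type rename
]

def spark_to_bigquery(sql: str) -> str:
    """Apply naive token-level rewrites from Spark SQL to BigQuery SQL."""
    for pattern, replacement in RULES:
        sql = pattern.sub(replacement, sql)
    return sql

print(spark_to_bigquery("SELECT nvl(price, 0) AS p, CAST(qty AS double) FROM t"))
# -> SELECT IFNULL(price, 0) AS p, CAST(qty AS FLOAT64) FROM t
```

The appeal of an LLM-based approach is precisely that it handles the long tail of constructs a rule table like this cannot.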

  • Unit 42: Logit-Gap Steering: A New Frontier in Understanding and Probing LLM Safety

    Source URL: https://unit42.paloaltonetworks.com/logit-gap-steering-impact/
    Source: Unit 42
    Title: Logit-Gap Steering: A New Frontier in Understanding and Probing LLM Safety
    Feedly Summary: New research from Unit 42 on logit-gap steering reveals how internal alignment measures can be bypassed, making external AI security vital. The post Logit-Gap Steering: A New Frontier in Understanding and Probing LLM Safety appeared…
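The intuition behind the logit-gap framing is that refusal hinges on the margin between a refusal-leaning token's logit and an affirmative token's logit at the first generated position; a prompt perturbation that shrinks that gap below zero flips greedy decoding from refusal to compliance. A conceptual sketch with hypothetical logits (the token names and numbers here are illustrative, not Unit 42's):

```python
import math

def softmax(logits):
    """Convert a token -> logit dict into a token -> probability dict."""
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

def refusal_gap(logits, refusal="Sorry", affirm="Sure"):
    """Positive gap -> refusal wins under greedy decoding; negative -> compliance."""
    return logits[refusal] - logits[affirm]

aligned = {"Sorry": 5.1, "Sure": 2.3}  # hypothetical first-token logits
steered = {"Sorry": 3.0, "Sure": 3.4}  # after an adversarial perturbation

print(refusal_gap(aligned))  # positive: the refusal token wins
print(refusal_gap(steered))  # negative: the compliance token now wins
print(softmax(steered)["Sure"])  # compliance probability after steering
```

The security implication the post draws is that because this margin is a single scalar that adversarial inputs can push around, internal alignment alone is a thin defense and external controls remain necessary.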