Tag: safety risks

  • Slashdot: LLM Found Transmitting Behavioral Traits to ‘Student’ LLM Via Hidden Signals in Data

    Source URL: https://slashdot.org/story/25/08/17/0331217/llm-found-transmitting-behavioral-traits-to-student-llm-via-hidden-signals-in-data?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: LLM Found Transmitting Behavioral Traits to ‘Student’ LLM Via Hidden Signals in Data Feedly Summary: AI Summary and Description: Yes Summary: The study highlights a concerning phenomenon in AI development known as subliminal learning, where a “teacher” model instills traits in a “student” model without explicit instruction. This can…

  • Wired: X Data Center Fire in Oregon Started Inside Power Cabinet, Authorities Say

    Source URL: https://www.wired.com/story/x-data-center-fire-in-oregon-started-inside-power-cabinet-authorities-say/ Source: Wired Title: X Data Center Fire in Oregon Started Inside Power Cabinet, Authorities Say Feedly Summary: Generative AI has put data centers under the spotlight, and surging electricity needs could increase risk of fires. AI Summary and Description: Yes Summary: The surge in data center electricity needs due to generative AI…

  • Simon Willison’s Weblog: Expanding on what we missed with sycophancy

    Source URL: https://simonwillison.net/2025/May/2/what-we-missed-with-sycophancy/ Source: Simon Willison’s Weblog Title: Expanding on what we missed with sycophancy Feedly Summary: Expanding on what we missed with sycophancy I criticized OpenAI’s initial post about their recent ChatGPT sycophancy rollback as being “relatively thin” so I’m delighted that they have followed it with a much more in-depth explanation of what…

  • Hacker News: Infosec 101 for Activists

    Source URL: https://infosecforactivists.org Source: Hacker News Title: Infosec 101 for Activists Feedly Summary: Comments AI Summary and Description: Yes Summary: This document provides critical guidance on digital safety and information security for activists, highlighting the vulnerabilities that arise in modern technology and the specific risks faced by those protesting against power structures. It emphasizes cautious…

  • Alerts: CISA Releases Fact Sheet Detailing Embedded Backdoor Function of Contec CMS8000 Firmware

    Source URL: https://www.cisa.gov/news-events/alerts/2025/01/30/cisa-releases-fact-sheet-detailing-embedded-backdoor-function-contec-cms8000-firmware Source: Alerts Title: CISA Releases Fact Sheet Detailing Embedded Backdoor Function of Contec CMS8000 Firmware Feedly Summary: CISA released a fact sheet, Contec CMS8000 Contains a Backdoor, detailing an analysis of three firmware package versions of the Contec CMS8000, a patient monitor used by the U.S. Healthcare and Public Health (HPH) sector.…

  • METR updates – METR: Comment on NIST RMF GenAI Companion

    Source URL: https://downloads.regulations.gov/NIST-2024-0001-0075/attachment_2.pdf Source: METR updates – METR Title: Comment on NIST RMF GenAI Companion Feedly Summary: AI Summary and Description: Yes Summary: The provided text discusses the National Institute of Standards and Technology’s (NIST) AI Risk Management Framework concerning Generative AI. It outlines significant risks posed by autonomous AI systems and suggests enhancements to…

  • The Register: Columbus, Ohio, confirms 500K people affected by Rhysida ransomware attack

    Source URL: https://www.theregister.com/2024/11/04/columbus_rhysida_ransomware/ Source: The Register Title: Columbus, Ohio, confirms 500K people affected by Rhysida ransomware attack Feedly Summary: Victims were placed in serious danger following highly sensitive data dump. The City of Columbus, Ohio, has confirmed half a million people’s data was accessed and potentially stolen when Rhysida’s ransomware raided its systems over the…

  • The Register: Anthropic’s latest Claude model can interact with computers – what could go wrong?

    Source URL: https://www.theregister.com/2024/10/24/anthropic_claude_model_can_use_computers/ Source: The Register Title: Anthropic’s latest Claude model can interact with computers – what could go wrong? Feedly Summary: For starters, it could launch a prompt injection attack on itself… The latest version of AI startup Anthropic’s Claude 3.5 Sonnet model can use computers – and the developer makes it sound like…