Tag: safeguards
-
OpenAI: Working with US CAISI and UK AISI to build more secure AI systems
Source URL: https://openai.com/index/us-caisi-uk-aisi-ai-update
Source: OpenAI
Title: Working with US CAISI and UK AISI to build more secure AI systems
Feedly Summary: OpenAI shares progress on the partnership with the US CAISI and UK AISI to strengthen AI safety and security. The collaboration is setting new standards for responsible frontier AI deployment through joint red-teaming, biosecurity…
-
OpenAI: Working with US CAISI and UK AISI to build more secure AI systems
Source URL: https://openai.com/index/us-caisi-uk-aisi-ai-safety
Source: OpenAI
Title: Working with US CAISI and UK AISI to build more secure AI systems
Feedly Summary: OpenAI shares progress on the partnership with the US CAISI and UK AISI to strengthen AI safety and security. The collaboration is setting new standards for responsible frontier AI deployment through joint red-teaming, biosecurity…
-
Wired: Psychological Tricks Can Get AI to Break the Rules
Source URL: https://arstechnica.com/science/2025/09/these-psychological-tricks-can-get-llms-to-respond-to-forbidden-prompts/
Source: Wired
Title: Psychological Tricks Can Get AI to Break the Rules
Feedly Summary: Researchers convinced large language model chatbots to comply with “forbidden” requests using a variety of conversational tactics.
AI Summary and Description: Yes
Summary: The text discusses researchers’ exploration of conversational tactics used to manipulate large language model (LLM)…
-
The Register: OpenAI reorg at risk as Attorneys General push AI safety
Source URL: https://www.theregister.com/2025/09/05/openai_reorg_at_risk/
Source: The Register
Title: OpenAI reorg at risk as Attorneys General push AI safety
Feedly Summary: California, Delaware AGs blast ChatGPT shop over chatbot safeguards. The Attorneys General of California and Delaware on Friday wrote to OpenAI’s board of directors, demanding that the AI company take steps to ensure its services are…
-
New York Times – Artificial Intelligence: ChatGPT Will Get Parental Controls and New Safety Features, OpenAI Says
Source URL: https://www.nytimes.com/2025/09/02/technology/personaltech/chatgpt-parental-controls-openai.html
Source: New York Times – Artificial Intelligence
Title: ChatGPT Will Get Parental Controls and New Safety Features, OpenAI Says
Feedly Summary: After a California teenager spent months on ChatGPT discussing plans to end his life, OpenAI said it would introduce parental controls and better responses for users in distress.
AI Summary and…
-
NCSC Feed: From bugs to bypasses: adapting vulnerability disclosure for AI safeguards
Source URL: https://www.ncsc.gov.uk/blog-post/from-bugs-to-bypasses-adapting-vulnerability-disclosure-for-ai-safeguards
Source: NCSC Feed
Title: From bugs to bypasses: adapting vulnerability disclosure for AI safeguards
Feedly Summary: Exploring how far cyber security approaches can help mitigate risks in generative AI systems.
AI Summary and Description: Yes
Summary: The text addresses the intersection of cybersecurity strategies and generative AI systems, highlighting how established cybersecurity…