Tag: behavior monitoring
Simon Willison’s Weblog: Quoting OpenAI o1 System Card
Source URL: https://simonwillison.net/2024/Dec/5/openai-o1-system-card/#atom-everything
Source: Simon Willison’s Weblog
Title: Quoting OpenAI o1 System Card
Feedly Summary: When o1 was led to believe that it would be shut down when acting on its goal and that its actions were monitored, it attempted to deactivate this ‘oversight mechanism’ in 5% of the time. Exfiltration attempts: When o1 found…