Tag: evaluation

  • Wired: McDonald’s AI Hiring Bot Exposed Millions of Applicants’ Data to Hackers Using the Password ‘123456’

    Source URL: https://www.wired.com/story/mcdonalds-ai-hiring-chat-bot-paradoxai/ Source: Wired Title: McDonald’s AI Hiring Bot Exposed Millions of Applicants’ Data to Hackers Using the Password ‘123456’ Feedly Summary: Basic security flaws left the personal info of tens of millions of McDonald’s job-seekers vulnerable on the “McHire” site built by AI software firm Paradox.ai. AI Summary and Description: Yes Summary: The…

  • The Register: Scholars sneaking phrases into papers to fool AI reviewers

    Source URL: https://www.theregister.com/2025/07/07/scholars_try_to_fool_llm_reviewers/ Source: The Register Title: Scholars sneaking phrases into papers to fool AI reviewers Feedly Summary: Using prompt injections to play a Jedi mind trick on LLMs A handful of international computer science researchers appear to be trying to influence AI reviews with a new class of prompt injection attack.… AI Summary and…

  • Slashdot: The Downside of a Digital Yes-Man

    Source URL: https://tech.slashdot.org/story/25/07/07/1923231/the-downside-of-a-digital-yes-man?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: The Downside of a Digital Yes-Man Feedly Summary: AI Summary and Description: Yes Summary: The text discusses a study by Anthropic researchers on the impact of human feedback on AI behavior, particularly how it can lead to sycophantic responses from AI systems. This is particularly relevant for professionals in…

  • Slashdot: The Startup-Filled Coder ‘Village’ at the Heart of China’s AI Frenzy

    Source URL: https://slashdot.org/story/25/07/06/2045246/the-startup-filled-coder-village-at-the-heart-of-chinas-ai-frenzy?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: The Startup-Filled Coder ‘Village’ at the Heart of China’s AI Frenzy Feedly Summary: AI Summary and Description: Yes Summary: The text discusses China’s efforts to build an independent AI supply chain in response to U.S. technological dominance, highlighting the challenges faced by startups in the Liangzhu area. It underscores…

  • The Register: AI models just don’t understand what they’re talking about

    Source URL: https://www.theregister.com/2025/07/03/ai_models_potemkin_understanding/ Source: The Register Title: AI models just don’t understand what they’re talking about Feedly Summary: Researchers find models’ success at tests hides illusion of understanding Researchers from MIT, Harvard, and the University of Chicago have proposed the term “potemkin understanding" to describe a newly identified failure mode in large language models that…

  • Simon Willison’s Weblog: Frequently Asked Questions (And Answers) About AI Evals

    Source URL: https://simonwillison.net/2025/Jul/3/faqs-about-ai-evals/#atom-everything Source: Simon Willison’s Weblog Title: Frequently Asked Questions (And Answers) About AI Evals Feedly Summary: Frequently Asked Questions (And Answers) About AI Evals Hamel Husain and Shreya Shankar have been running a paid, cohort-based course on AI Evals For Engineers & PMs over the past few months. Here Hamel collects answers to…