Source URL: https://slashdot.org/story/25/04/18/2323216/openai-puzzled-as-new-models-show-rising-hallucination-rates
Source: Slashdot
Title: OpenAI Puzzled as New Models Show Rising Hallucination Rates
Feedly Summary:
AI Summary and Description: Yes
Summary: OpenAI’s recent AI models, o3 and o4-mini, display increased hallucination rates compared to previous iterations. This raises concerns regarding the reliability of such AI systems in practical applications. The findings emphasize the need for further research into the causes of these hallucinations, particularly as models grow in complexity.
Detailed Description:
The text discusses the performance of OpenAI’s latest reasoning models, which exhibit higher hallucination rates than earlier versions, a finding with significant implications for AI practitioners, developers, and security professionals.
Key points include:
– **Hallucination Rates**:
– The o3 model hallucinates at a rate of 33%, roughly double the 16% rate of its predecessor, o1.
– The o4-mini model performed worse, with a hallucination rate of 48%.
– **Third-Party Research Findings**:
– The nonprofit AI lab Transluce reported that o3 fabricates processes it claims to have used, raising serious concerns about the model’s reliability.
– For example, the model claimed to have run code on a 2021 MacBook Pro “outside of ChatGPT,” an action it cannot actually perform, illustrating how such fabrications misrepresent the model’s real capabilities.
– **Technical Insights**:
– Stanford adjunct professor Kian Katanforoosh’s team found that the o3 model frequently generates website links that do not work, highlighting the practical repercussions of such hallucinations on usability and trustworthiness; a minimal link-checking sketch follows this list.
– **Need for Further Research**:
– OpenAI acknowledges in its technical report that “more research is needed” to understand why hallucination rates worsen as reasoning models are scaled up, stressing the importance of ongoing evaluation and testing.
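The article does not describe the tooling Katanforoosh’s team used, but the broken-link finding suggests a simple mitigation: verify model-suggested URLs before relying on them. The sketch below is a hypothetical, standard-library-only illustration; the function name `check_links` and the sample URLs are assumptions, not anything from the source.

```python
# Hypothetical sketch: flag model-generated URLs that do not resolve.
# Names and sample inputs are illustrative, not taken from the article.
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

def check_links(urls, timeout=5):
    """Return (url, status) pairs; status is an HTTP code or an error string."""
    results = []
    for url in urls:
        req = Request(url, method="HEAD", headers={"User-Agent": "link-check/0.1"})
        try:
            with urlopen(req, timeout=timeout) as resp:
                results.append((url, resp.status))
        except HTTPError as e:        # server answered, but with 4xx/5xx
            results.append((url, e.code))
        except URLError as e:         # DNS failure, refused connection, etc.
            results.append((url, str(e.reason)))
    return results

if __name__ == "__main__":
    # These URLs stand in for links extracted from a model's response.
    sample = ["https://openai.com", "https://example.invalid/made-up-page"]
    for url, status in check_links(sample):
        print(f"{status}\t{url}")
```

A check like this only catches unreachable links, not links that resolve to irrelevant or misleading pages, so it is a first-pass filter rather than a full verification step.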
These insights are crucial for AI, cloud, and software security professionals, as they highlight potential risks associated with deploying AI models in real-world situations. Understanding hallucinations and their implications is essential for developing trustworthy AI systems, particularly in sensitive or critical applications.