Tag: training method

  • OpenAI : Addendum to o3 and o4-mini system card: Codex

    Source URL: https://openai.com/index/o3-o4-mini-codex-system-card-addendum
    Summary: Codex is a cloud-based coding agent powered by codex-1, a version of OpenAI o3 optimized for software engineering. codex-1 was trained using reinforcement learning on real-world coding tasks in a variety of environments to generate code that…
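    RL on coding tasks needs an automatically checkable reward. OpenAI has not published codex-1's actual reward function, so the sketch below is a hypothetical stand-in: a binary reward that runs a candidate solution against the task's unit tests in a subprocess and pays out 1.0 only if they pass.

```python
import os
import subprocess
import sys
import tempfile

def unit_test_reward(candidate_code: str, test_code: str) -> float:
    """Hypothetical binary reward for RL on coding tasks: 1.0 if the
    candidate passes the task's tests, else 0.0. Not codex-1's actual
    reward; a minimal illustration of verifiable-reward training."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".py", delete=False
    ) as f:
        # Concatenate the candidate and its tests into one script.
        f.write(candidate_code + "\n" + test_code + "\n")
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path], capture_output=True, timeout=10
        )
        return 1.0 if result.returncode == 0 else 0.0
    finally:
        os.unlink(path)
```

    In a real pipeline the sandbox, timeout, and partial-credit shaping all matter; a pass/fail subprocess is only the simplest possible form.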

  • Slashdot: Asking Chatbots For Short Answers Can Increase Hallucinations, Study Finds

    Source URL: https://slashdot.org/story/25/05/12/2114214/asking-chatbots-for-short-answers-can-increase-hallucinations-study-finds?utm_source=rss1.0mainlinkanon&utm_medium=feed
    Summary: The research from Giskard highlights a critical concern for AI professionals regarding the trade-off between response length and factual accuracy among leading AI models. This finding is particularly relevant for those…

  • Wired: These Startups Are Building Advanced AI Models Without Data Centers

    Source URL: https://www.wired.com/story/these-startups-are-building-advanced-ai-models-over-the-internet-with-untapped-data/
    Summary: A new crowd-trained way to develop LLMs over the internet could shake up the AI industry with a giant 100 billion-parameter model later this year.

  • The Register: AI training license will allow LLM builders to pay for content they consume

    Source URL: https://www.theregister.com/2025/04/24/uk_publishing_body_launches_ai/
    Summary: UK org backing it promises ‘legal certainty’ for devs, money for creators… but is it too late? A UK non-profit is planning to introduce a new licensing model which will allow developers of…

  • Simon Willison’s Weblog: Quoting Andriy Burkov

    Source URL: https://simonwillison.net/2025/Apr/6/andriy-burkov/#atom-everything
    Summary: […] The disappointing releases of both GPT-4.5 and Llama 4 have shown that if you don’t train a model to reason with reinforcement learning, increasing its size no longer provides benefits. Reinforcement learning is limited only to domains where a reward can…

  • Slashdot: OpenAI’s Motion to Dismiss Copyright Claims Rejected by Judge

    Source URL: https://news.slashdot.org/story/25/04/05/0323213/openais-motion-to-dismiss-copyright-claims-rejected-by-judge?utm_source=rss1.0mainlinkanon&utm_medium=feed
    Summary: The ongoing lawsuit filed by The New York Times against OpenAI raises significant issues regarding copyright infringement related to AI training datasets. The case underscores the complex intersection of AI technology, copyright…

  • Hacker News: Tao: Using test-time compute to train efficient LLMs without labeled data

    Source URL: https://www.databricks.com/blog/tao-using-test-time-compute-train-efficient-llms-without-labeled-data
    Summary: The text introduces a new model tuning method for large language models (LLMs) called Test-time Adaptive Optimization (TAO) that enhances model quality without requiring large amounts of labeled…
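    The summary only names the idea: spend test-time compute instead of labels. A common pattern behind such methods, shown below as a generic sketch (not the actual TAO pipeline, whose details are in the Databricks post), is to generate several candidates per prompt and keep the one a scoring function prefers; the selected outputs can then serve as tuning targets.

```python
from typing import Callable

def best_of_n_selection(
    prompt: str,
    generate: Callable[[str, int], str],
    score: Callable[[str], float],
    n: int = 4,
) -> str:
    """Generic generate-then-score loop: spend extra test-time compute
    to produce n candidates and keep the highest-scoring one.
    TAO-style pipelines would then fine-tune on such selected outputs
    in place of human-labeled targets (names here are hypothetical)."""
    candidates = [generate(prompt, i) for i in range(n)]
    return max(candidates, key=score)
```

    The scoring function is the crux: with no labels, it must come from a reward model or another automatic signal, and its quality bounds what the tuned model can learn.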

  • Hacker News: IETF setting standards for AI preferences

    Source URL: https://www.ietf.org/blog/aipref-wg/
    Summary: The text discusses the formation of the AI Preferences (AIPREF) Working Group, aimed at standardizing how content preferences are expressed for AI model training, amid concerns from content publishers about unauthorized use. This…

  • Hacker News: Understanding R1-Zero-Like Training: A Critical Perspective

    Source URL: https://github.com/sail-sg/understand-r1-zero
    Summary: The text presents a novel approach to LLM training called R1-Zero-like training, emphasizing a new reinforcement learning method termed Dr. GRPO that enhances reasoning capabilities. It highlights significant improvements in model performance through…
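    The core change Dr. GRPO makes is to GRPO's advantage computation: standard GRPO normalizes each group-relative reward by the group's reward standard deviation (and applies per-token length normalization), which the paper argues biases updates; Dr. GRPO drops those normalizations. A minimal sketch of just the advantage step, under that reading (the repo above has the authoritative implementation):

```python
def group_advantages(rewards, drop_std_norm=True):
    """Group-relative advantages for a batch of sampled completions.

    GRPO-style RL centers each reward on the group mean; plain GRPO
    then divides by the group's reward std, while Dr. GRPO (per the
    paper's argument) omits that division to avoid biased updates.
    Sketch only; length normalization of the loss is not shown."""
    mean = sum(rewards) / len(rewards)
    centered = [r - mean for r in rewards]
    if drop_std_norm:
        return centered  # Dr. GRPO: mean-centered only
    std = (sum(c * c for c in centered) / len(rewards)) ** 0.5
    return [c / (std + 1e-8) for c in centered]
```

    With binary rewards the std term rescales advantages depending on how mixed the group is, so easy and hard prompts get systematically different update magnitudes; dropping it keeps the advantage in raw reward units.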