Tag: reasoning process

  • Hacker News: Bolt: Bootstrap Long Chain-of-Thought in LLMs Without Distillation [pdf]

    Source URL: https://arxiv.org/abs/2502.03860 Source: Hacker News Title: Bolt: Bootstrap Long Chain-of-Thought in LLMs Without Distillation [pdf] Feedly Summary: Comments AI Summary and Description: Yes Summary: The paper introduces BOLT, a method designed to enhance the reasoning capabilities of large language models (LLMs) by generating long chains of thought (LongCoT) without relying on knowledge distillation. The…

  • Hacker News: Experience the DeepSeek R1 Distilled ‘Reasoning’ Models on Ryzen AI and Radeon

    Source URL: https://community.amd.com/t5/ai/experience-the-deepseek-r1-distilled-reasoning-models-on-amd/ba-p/740593 Source: Hacker News Title: Experience the DeepSeek R1 Distilled ‘Reasoning’ Models on Ryzen AI and Radeon Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses the DeepSeek R1 model, a newly developed reasoning model in the realm of large language models (LLMs). It highlights its unique ability to perform…

  • Hacker News: Mini-R1: Reproduce DeepSeek R1 "Aha Moment"

    Source URL: https://www.philschmid.de/mini-deepseek-r1 Source: Hacker News Title: Mini-R1: Reproduce DeepSeek R1 "Aha Moment" Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses the release of DeepSeek R1, an open model for complex reasoning tasks that utilizes reinforcement learning algorithms, specifically Group Relative Policy Optimization (GRPO). It offers insight into the model’s training…

  • Hacker News: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via RL

    Source URL: https://arxiv.org/abs/2501.12948 Source: Hacker News Title: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via RL Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses the introduction of new language models, DeepSeek-R1 and DeepSeek-R1-Zero, developed to enhance reasoning capabilities in large language models (LLMs) through reinforcement learning. This research represents a significant advancement…

  • Slashdot: Cutting-Edge Chinese ‘Reasoning’ Model Rivals OpenAI O1

    Source URL: https://slashdot.org/story/25/01/21/2138247/cutting-edge-chinese-reasoning-model-rivals-openai-o1?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: Cutting-Edge Chinese ‘Reasoning’ Model Rivals OpenAI O1 Feedly Summary: AI Summary and Description: Yes Summary: The release of DeepSeek’s R1 model family marks a significant advancement in the availability of high-performing AI models, particularly in the realms of math and coding tasks. With an open MIT license, these models…

  • Hacker News: Contemplative LLMs

    Source URL: https://maharshi.bearblog.dev/contemplative-llms-prompt/ Source: Hacker News Title: Contemplative LLMs Feedly Summary: Comments AI Summary and Description: Yes **Short Summary with Insight:** The text discusses the novel approach of prompting Large Language Models (LLMs) to engage in a contemplation phase before generating answers. By mimicking a reasoning process, which encourages exploration and questioning assumptions, this method…

  • Hacker News: Learning How to Think with Meta Chain-of-Thought

    Source URL: https://arxiv.org/abs/2501.04682 Source: Hacker News Title: Learning How to Think with Meta Chain-of-Thought Feedly Summary: Comments AI Summary and Description: Yes Summary: The document presents a novel framework called Meta Chain-of-Thought (Meta-CoT) aimed at enhancing reasoning capabilities in Large Language Models (LLMs). This framework is positioned to advance AI behavior toward more human-like reasoning,…

  • OpenAI : Deliberative alignment: reasoning enables safer language models

    Source URL: https://openai.com/index/deliberative-alignment Source: OpenAI Title: Deliberative alignment: reasoning enables safer language models Feedly Summary: Deliberative alignment: reasoning enables safer language models Introducing our new alignment strategy for o1 models, which are directly taught safety specifications and how to reason over them. AI Summary and Description: Yes Summary: The text discusses a new alignment strategy…

  • Hacker News: Coconut by Meta AI – Better LLM Reasoning with Chain of Continuous Thought?

    Source URL: https://aipapersacademy.com/chain-of-continuous-thought/ Source: Hacker News Title: Coconut by Meta AI – Better LLM Reasoning with Chain of Continuous Thought? Feedly Summary: Comments AI Summary and Description: Yes Summary: This text presents an innovative approach to enhancing reasoning capabilities in large language models (LLMs) through a method called Chain of Continuous Thought (COCONUT). It highlights…

  • Simon Willison’s Weblog: Trying out QvQ – Qwen’s new visual reasoning model

    Source URL: https://simonwillison.net/2024/Dec/24/qvq/#atom-everything Source: Simon Willison’s Weblog Title: Trying out QvQ – Qwen’s new visual reasoning model Feedly Summary: I thought we were done for major model releases in 2024, but apparently not: Alibaba’s Qwen team just dropped the Apache2 2 licensed QvQ-72B-Preview, “an experimental research model focusing on enhancing visual reasoning capabilities". Their blog…