Source URL: https://simonwillison.net/2025/Jan/22/r1py/
Source: Simon Willison’s Weblog
Title: r1.py script to run R1 with a min-thinking-tokens parameter
Feedly Summary: r1.py script to run R1 with a min-thinking-tokens parameter
Fantastically creative hack by Theia Vogel. The DeepSeek R1 family of models output their chain of thought inside a `<think>...</think>` block. Theia's script watches for the closing `</think>` tag and, if a minimum number of thinking tokens has not yet been generated, replaces it with a phrase like "Wait, but" or "So," to push the model into continuing its reasoning.
You can stop doing this after a few iterations, or you can keep on denying the </think> string and effectively force the model to "think" forever.
Theia’s code here works against Hugging Face transformers but I’m confident the same approach could be ported to llama.cpp or MLX.
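The intervention described above can be sketched as a simple token loop. The version below is a minimal illustration of the idea only, not Theia's actual r1.py: it uses a stubbed `next_token_fn` in place of a real model's sampling step (in practice this logic would sit inside a transformers generation loop), and the `min_thinking_tokens` parameter name follows the post's title.

```python
import random

THINK_END = "</think>"
# Phrases swapped in whenever the model tries to close its thinking
# block too early (these particular phrases come from the post).
CONTINUATIONS = ["Wait, but", "So,"]

def generate_with_min_thinking(next_token_fn, min_thinking_tokens, max_tokens=1000):
    """Run a token loop, denying `</think>` until enough thinking tokens
    have been emitted. `next_token_fn(tokens)` stands in for a real
    model's per-step sampling call."""
    tokens = []
    thinking = 0
    while len(tokens) < max_tokens:
        tok = next_token_fn(tokens)
        if tok == THINK_END and thinking < min_thinking_tokens:
            # Deny the closing tag: substitute a continuation phrase
            # so the model keeps reasoning instead of stopping.
            tok = random.choice(CONTINUATIONS)
        tokens.append(tok)
        if tok == THINK_END:
            break  # the model is finally allowed to stop thinking
        thinking += 1
    return tokens

def stub_model(tokens):
    """Toy stand-in for a model: tries to emit </think> every 4th token."""
    return THINK_END if len(tokens) % 4 == 3 else "step"

out = generate_with_min_thinking(stub_model, min_thinking_tokens=10)
```

Setting `min_thinking_tokens` very high (or never lifting the denial) reproduces the "think forever" behaviour the post mentions.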
Via @voooooogel
Tags: generative-ai, deepseek, transformers, ai, llms
AI Summary and Description: Yes
Summary: The text discusses a creative technique by Theia Vogel to enhance the output of generative AI models, particularly those in the DeepSeek R1 family, by manipulating the model’s internal thought process coding. This method allows for extended reasoning within AI prompts, yielding potentially better outcomes.
Detailed Description:
The text highlights an innovative approach to improving generative AI outputs through strategic manipulation of model processing. Here are key points worth noting:
– **Methodology**: The technique involves intercepting the closing `</think>` tag in the model’s output. By replacing it with phrases like “Wait, but” or “So,” the script encourages the model to continue its reasoning.
– **Implications**: This manipulation allows for deeper thinking, which can lead to more refined and improved solutions from AI models. The ability to “think” indefinitely opens up creative avenues for problem-solving within AI applications.
– **Implementation**: The code provided by Theia Vogel is designed for Hugging Face transformers, signaling its applicability within popular frameworks used in AI development.
– **Potential Expansion**: The author expresses confidence that this technique could also be adapted for other frameworks such as llama.cpp or MLX, indicating its versatility.
– **Industry Relevance**: The discussion touches on generative AI, a rapidly evolving field within AI development and security. Techniques like this one are useful for AI professionals looking to improve model reasoning performance in applications.
– **Engagement with AI Community**: The mention of tagging and sharing within the AI community emphasizes collaborative innovation, which is vital for staying at the forefront of AI security and generative model development.
Overall, this text is of significant interest to professionals in AI security and development, presenting a novel method to enhance model capabilities while encouraging broader discussions within the community.