Source URL: https://simonwillison.net/2025/Sep/30/designing-agentic-loops/
Source: Simon Willison’s Weblog
Title: Designing agentic loops
Feedly Summary: Coding agents like Anthropic’s Claude Code and OpenAI’s Codex CLI represent a genuine step change in how useful LLMs can be for producing working code. These agents can now directly exercise the code they are writing, correct errors, dig through existing implementation details, and even run experiments to find effective code solutions to problems.
As is so often the case with modern AI, there is a great deal of depth involved in unlocking the full potential of these new tools.
A critical new skill to develop is designing agentic loops.
One way to think about coding agents is that they are brute-force tools for finding solutions to coding problems. If you can reduce your problem to a clear goal and a set of tools that can iterate towards that goal, a coding agent can often brute-force its way to an effective solution.
My preferred definition of an LLM agent is something that runs tools in a loop to achieve a goal. The art of using them well is to carefully design the tools and loop for them to use.
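In pseudo-shell the loop really is that simple. This sketch is illustrative only, not real agent code, and ask_model is a hypothetical stand-in for a call to the LLM:

```bash
# Illustrative pseudocode, not a real agent: the model proposes a shell
# command, we run it, and the output feeds back into the next prompt.
transcript="GOAL: make the test suite pass"
while true; do
  command=$(ask_model "$transcript")    # ask_model is a hypothetical LLM call
  [ "$command" = "DONE" ] && break      # the model decides when it is finished
  output=$(bash -c "$command" 2>&1)     # execute the tool call
  transcript=$(printf '%s\n$ %s\n%s' "$transcript" "$command" "$output")
done
```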
The joy of YOLO mode
Picking the right tools for the loop
Issuing tightly scoped credentials
When to design an agentic loop
This is still a very fresh area
The joy of YOLO mode
Agents are inherently dangerous – they can make poor decisions or fall victim to malicious prompt injection attacks, either of which can result in harmful tool calls. Since the most powerful coding agent tool is “run this command in the shell”, a rogue agent can do anything that you could do by running a command yourself.
To quote Solomon Hykes:
An AI agent is an LLM wrecking its environment in a loop.
Coding agents like Claude Code counter this by defaulting to asking you for approval of almost every command that they run.
This is kind of tedious, but more importantly, it dramatically reduces their effectiveness at solving problems through brute force.
Each of these tools provides its own version of what I like to call YOLO mode, where everything gets approved by default.
This is so dangerous, but it’s also key to getting the most productive results!
Here are three key risks to consider from unattended YOLO mode.
Bad shell commands deleting or mangling things you care about.
Exfiltration attacks where something steals files or data visible to the agent – source code or secrets held in environment variables are particularly vulnerable here.
Attacks that use your machine as a proxy to attack another target – for DDoS or to disguise the source of other hacking attacks.
If you want to run YOLO mode anyway, you have a few options:
Run your agent in a secure sandbox that restricts the files and secrets it can access and the network connections it can make.
Use someone else’s computer. That way if your agent goes rogue, there’s only so much damage it can do – and it’s wasting someone else’s CPU cycles.
Take a risk! Try to avoid exposing the agent to potential sources of malicious instructions and hope you catch any mistakes before they cause damage.
Most people choose option 3.
Despite the existence of container escapes, I think option 1, using Docker or the new Apple container tool, is a reasonable risk for most people to accept.
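For option 1, the shape of a Docker-based sandbox looks something like this. This is a minimal sketch, not a complete sandbox – the image, mounts, and resource limits are assumptions to adapt:

```bash
# Start a throwaway container that can only see the current project
# directory and inherits none of the host's environment variables or
# secrets; the agent CLI gets installed and run inside it.
docker run --rm -it \
  -v "$PWD:/workspace" \
  -w /workspace \
  --memory 2g \
  --cpus 2 \
  python:3.12-slim \
  bash
```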
Option 2 is my favorite. I like to use GitHub Codespaces for this – it provides a full container environment on-demand that’s accessible through your browser and has a generous free tier too. If anything goes wrong it’s a Microsoft Azure machine somewhere that’s burning CPU and the worst that can happen is code you checked out into the environment might be exfiltrated by an attacker, or bad code might be pushed to the attached GitHub repository.
There are plenty of other agent-like tools that run code on other people’s computers. Code Interpreter mode in both ChatGPT and Claude can go a surprisingly long way here. I’ve also had a lot of success (ab)using OpenAI’s Codex Cloud.
Coding agents themselves implement various levels of sandboxing, but so far I’ve not seen convincing enough documentation of these to trust them.
Picking the right tools for the loop
Now that we’ve found a safe (enough) way to run in YOLO mode, the next step is to decide which tools we need to make available to the coding agent.
You can bring MCP into the mix at this point, but I find it’s usually more productive to think in terms of shell commands instead. Coding agents are really good at running shell commands!
If your environment allows them the necessary network access, they can also pull down additional packages from NPM and PyPI and similar. Ensuring your agent runs in an environment where random package installs don’t break things on your main computer is an important consideration as well!
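For Python projects, one low-effort way to contain those installs is to give the agent its own virtual environment – a sketch of one approach, not the only option:

```bash
# Anything the agent installs lands in .agent-venv, not your system Python.
python -m venv .agent-venv
source .agent-venv/bin/activate
pip install requests  # example install; stays contained inside the venv
```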
Rather than leaning on MCP, I like to create an AGENTS.md (or equivalent) file with details of packages I think they may need to use.
For a project that involved taking screenshots of various websites I installed my own shot-scraper CLI tool and dropped the following in AGENTS.md:
```
To take a screenshot, run:
shot-scraper http://www.example.com/ -w 800 -o example.jpg
```
Just that one example is enough for the agent to guess how to swap out the URL and filename for other screenshots.
Good LLMs already know how to use a bewildering array of existing tools. If you say "use playwright python" or "use ffmpeg" most models will use those effectively – and since they’re running in a loop they can usually recover from mistakes they make at first and figure out the right incantations without extra guidance.
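For instance, given nothing more than “use playwright python” or “use ffmpeg”, a model might converge on commands like these (illustrative guesses, not commands from the post):

```bash
# Screenshot a page with Playwright's CLI
# (pip install playwright && playwright install)
playwright screenshot https://www.example.com/ example.png

# Re-encode a video to H.264 with ffmpeg
ffmpeg -i input.mov -c:v libx264 -crf 23 output.mp4
```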
Issuing tightly scoped credentials
In addition to exposing the right commands, we also need to consider what credentials we should expose to those commands.
Ideally we wouldn’t need any credentials at all – plenty of work can be done without signing into anything or providing an API key – but certain problems will require authenticated access.
This is a deep topic in itself, but I have two key recommendations here:
Try to provide credentials to test or staging environments where any damage can be well contained.
If a credential can spend money, set a tight budget limit.
I’ll use an example to illustrate. A while ago I was investigating slow cold start times for a scale-to-zero application I was running on Fly.io.
I realized I could work a lot faster if I gave Claude Code the ability to directly edit Dockerfiles, deploy them to a Fly account and measure how long they took to launch.
Fly allows you to create organizations, and you can set a budget limit for those organizations and issue a Fly API key that can only create or modify apps within that organization…
So I created a dedicated organization for just this one investigation, set a $5 budget, issued an API key and set Claude Code loose on it!
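The setup looked roughly like this – a sketch rather than the exact commands, since the flyctl flags may differ, the organization name is invented, and the budget limit itself is set in the Fly dashboard rather than the CLI:

```bash
# Create a dedicated throwaway organization for this one investigation
fly orgs create agent-experiment

# Issue an API token scoped to apps in that organization only
fly tokens create org --org agent-experiment
```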
In that particular case the results weren’t useful enough to describe in more detail, but this was the project where I first realized that "designing an agentic loop" was an important skill to develop.
When to design an agentic loop
Not every problem responds well to this pattern of working. What to look for are problems with clear success criteria where finding a good solution is likely to involve (potentially slightly tedious) trial and error.
Any time you find yourself thinking "ugh, I’m going to have to try a lot of variations here" is a strong signal that an agentic loop might be worth trying!
A few examples:
Debugging: a test is failing and you need to investigate the root cause. Coding agents that can already run your tests can likely do this without any extra setup.
Performance optimization: this SQL query is too slow, would adding an index help? Have your agent benchmark the query and then add and drop indexes (in an isolated development environment!) to measure their impact – see the sketch after this list.
Upgrading dependencies: you’ve fallen behind on a bunch of dependency upgrades? If your test suite is solid an agentic loop can upgrade them all for you and make any minor updates needed to reflect breaking changes. Make sure a copy of the relevant release notes is available, or that the agent knows where to find them itself.
Optimizing container sizes: Docker container feeling uncomfortably large? Have your agent try different base images and iterate on the Dockerfile to try to shrink it, while keeping the tests passing.
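For the SQL indexing example above, the loop might look something like this (a hypothetical SQLite version – the database, table, and index names are invented):

```bash
# Time the query without an index...
time sqlite3 app.db "SELECT count(*) FROM orders WHERE user_id = 42;"

# ...add a candidate index and time it again...
sqlite3 app.db "CREATE INDEX idx_orders_user ON orders(user_id);"
time sqlite3 app.db "SELECT count(*) FROM orders WHERE user_id = 42;"

# ...and drop it if the win doesn't justify the cost.
sqlite3 app.db "DROP INDEX idx_orders_user;"
```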
A common theme in all of these is automated tests. The value you can get from coding agents and other LLM coding tools is massively amplified by a good, cleanly passing test suite. Thankfully LLMs are great for accelerating the process of putting one of those together, if you don’t have one yet.
This is still a very fresh area
Designing agentic loops is a very new skill – Claude Code was first released only in February 2025!
I’m hoping that giving it a clear name can help us have productive conversations about it. There’s so much more to figure out about how to use these tools as effectively as possible.
Tags: ai, generative-ai, llms, ai-assisted-programming, ai-agents, coding-agents