Tag: llm

  • Simon Willison’s Weblog: gpt-5 and gpt-5-mini rate limit updates

    Source URL: https://simonwillison.net/2025/Sep/12/gpt-5-rate-limits/#atom-everything Source: Simon Willison’s Weblog Title: gpt-5 and gpt-5-mini rate limit updates Feedly Summary: gpt-5 and gpt-5-mini rate limit updates OpenAI have increased the rate limits for their two main GPT-5 models. These look significant: gpt-5 Tier 1: 30K β†’ 500K TPM (1.5M batch) Tier 2: 450K β†’ 1M (3M batch) Tier 3:…

  • Simon Willison’s Weblog: Comparing the memory implementations of Claude and ChatGPT

    Source URL: https://simonwillison.net/2025/Sep/12/claude-memory/#atom-everything Source: Simon Willison’s Weblog Title: Comparing the memory implementations of Claude and ChatGPT Feedly Summary: Claude Memory: A Different Philosophy Shlok Khemani has been doing excellent work reverse-engineering LLM systems and documenting his discoveries. Last week he wrote about ChatGPT memory. This week it’s Claude. Claude’s memory system has two fundamental characteristics.…

  • Simon Willison’s Weblog: Defeating Nondeterminism in LLM Inference

    Source URL: https://simonwillison.net/2025/Sep/11/defeating-nondeterminism/#atom-everything Source: Simon Willison’s Weblog Title: Defeating Nondeterminism in LLM Inference Feedly Summary: Defeating Nondeterminism in LLM Inference A very common question I see about LLMs concerns why they can’t be made to deliver the same response to the same prompt by setting a fixed random number seed. Like many others I had…

  • AWS Open Source Blog: Strands Agents and the Model-Driven Approach

    Source URL: https://aws.amazon.com/blogs/opensource/strands-agents-and-the-model-driven-approach/ Source: AWS Open Source Blog Title: Strands Agents and the Model-Driven Approach Feedly Summary: Until recently, building AI agents meant wrestling with complex orchestration frameworks. Developers wrote elaborate state machines, predefined workflows, and extensive error-handling code to guide language models through multi-step tasks. We needed to build elaborate decision trees to handle…

  • Simon Willison’s Weblog: Claude API: Web fetch tool

    Source URL: https://simonwillison.net/2025/Sep/10/claude-web-fetch-tool/#atom-everything Source: Simon Willison’s Weblog Title: Claude API: Web fetch tool Feedly Summary: Claude API: Web fetch tool New in the Claude API: if you pass the web-fetch-2025-09-10 beta header you can add {“type": "web_fetch_20250910", "name": "web_fetch", "max_uses": 5} to your "tools" list and Claude will gain the ability to fetch content from…

  • Cloud Blog: Fast and efficient AI inference with new NVIDIA Dynamo recipe on AI Hypercomputer

    Source URL: https://cloud.google.com/blog/products/compute/ai-inference-recipe-using-nvidia-dynamo-with-ai-hypercomputer/ Source: Cloud Blog Title: Fast and efficient AI inference with new NVIDIA Dynamo recipe on AI Hypercomputer Feedly Summary: As generative AI becomes more widespread, it’s important for developers and ML engineers to be able to easily configure infrastructure that supports efficient AI inference, i.e., using a trained AI model to make…

  • Cloud Blog: Our approach to carbon-aware data centers: Central data center fleet management

    Source URL: https://cloud.google.com/blog/topics/sustainability/googles-approach-to-carbon-aware-data-center/ Source: Cloud Blog Title: Our approach to carbon-aware data centers: Central data center fleet management Feedly Summary: Data centers are the engines of the cloud, processing and storing the information that powers our daily lives. As digital services grow, so do our data centers and we are working to responsibly manage them.…

  • Docker: From Hallucinations to Prompt Injection: Securing AI Workflows at Runtime

    Source URL: https://www.docker.com/blog/secure-ai-agents-runtime-security/ Source: Docker Title: From Hallucinations to Prompt Injection: Securing AI Workflows at Runtime Feedly Summary: How developers are embedding runtime security to safely build with AI agents Introduction: When AI Workflows Become Attack Surfaces The AI tools we use today are powerful, but also unpredictable and exploitable. You prompt an LLM and…