Tag: token
-
Cloud Blog: Widespread Data Theft Targets Salesforce Instances via Salesloft Drift
Source URL: https://cloud.google.com/blog/topics/threat-intelligence/data-theft-salesforce-instances-via-salesloft-drift/
Feedly Summary: Written by: Austin Larsen, Matt Lin, Tyler McLellan, Omar ElAhdan. Google Threat Intelligence Group (GTIG) is issuing an advisory to alert organizations about a widespread data theft campaign, carried out by the actor tracked as UNC6395.…
-
Cloud Blog: vLLM Performance Tuning: The Ultimate Guide to xPU Inference Configuration
Source URL: https://cloud.google.com/blog/topics/developers-practitioners/vllm-performance-tuning-the-ultimate-guide-to-xpu-inference-configuration/
Feedly Summary: Additional contributors include Hossein Sarshar, Ashish Narasimham, and Chenyang Li. Large Language Models (LLMs) are revolutionizing how we interact with technology, but serving these powerful models efficiently can be a challenge. vLLM has rapidly become…
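The post is about tuning vLLM's serving configuration. As a rough illustration of the knobs involved (a sketch assuming vLLM's offline Python API; the model name and values below are placeholders, not the article's recommendations):

    # Illustrative only: a vLLM setup exposing the main tuning knobs.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model
        max_model_len=8192,            # cap context length to what the workload needs
        max_num_seqs=128,              # concurrent sequences per batch (throughput vs. latency)
        gpu_memory_utilization=0.90,   # fraction of accelerator memory vLLM may claim
        tensor_parallel_size=1,        # shard across multiple devices for larger models
    )
    outputs = llm.generate(
        ["Explain the KV cache in one sentence."],
        SamplingParams(max_tokens=64, temperature=0.7),
    )
    print(outputs[0].outputs[0].text)

The same parameters carry over to the online `vllm serve` server as command-line flags.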
-
Simon Willison’s Weblog: Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet
Source URL: https://simonwillison.net/2025/Aug/25/agentic-browser-security/#atom-everything
Feedly Summary: The security team from Brave took a look at Comet, the LLM-powered “agentic browser” extension from Perplexity, and unsurprisingly found security holes you can drive a truck…
-
Simon Willison’s Weblog: DeepSeek 3.1
Source URL: https://simonwillison.net/2025/Aug/22/deepseek-31/#atom-everything
Feedly Summary: The latest model from DeepSeek, a 685B monster (like DeepSeek v3 before it) but this time it’s a hybrid reasoning model. DeepSeek claim: DeepSeek-V3.1-Think achieves comparable answer quality to DeepSeek-R1-0528, while responding more quickly. Drew Breunig points out that their benchmarks…
-
Simon Willison’s Weblog: too many model context protocol servers and LLM allocations on the dance floor
Source URL: https://simonwillison.net/2025/Aug/22/too-many-mcps/#atom-everything
Feedly Summary: Useful reminder from Geoffrey Huntley of the infrequently discussed but significant token cost of using MCP. Geoffrey estimates that…
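The underlying point is that every connected MCP server's tool definitions are sent along with the prompt on each request, so they consume context tokens before the model does any work. A rough way to see the overhead (a sketch, not Geoffrey's calculation; the tool schema and tokenizer choice are assumptions):

    # Illustrative only: estimate the context tokens consumed by MCP tool definitions.
    import json
    import tiktoken

    tool_definitions = [
        {   # hypothetical tool; real MCP servers often expose dozens of these
            "name": "search_issues",
            "description": "Search the issue tracker and return matching issues.",
            "inputSchema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Free-text search query"},
                    "limit": {"type": "integer", "description": "Maximum results to return"},
                },
                "required": ["query"],
            },
        },
    ]

    enc = tiktoken.get_encoding("o200k_base")  # tokenizer choice is an approximation
    tokens = len(enc.encode(json.dumps(tool_definitions)))
    print(f"~{tokens} tokens of tool definitions sent with every single prompt")

Multiply that by a handful of MCP servers and the definitions alone can crowd out a meaningful share of the context window.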
-
Cloud Blog: A Cereal Offender: Analyzing the CORNFLAKE.V3 Backdoor
Source URL: https://cloud.google.com/blog/topics/threat-intelligence/analyzing-cornflake-v3-backdoor/
Feedly Summary: Written by: Marco Galli. Welcome to the Frontline Bulletin Series. Straight from Mandiant Threat Defense, the “Frontline Bulletin” series brings you the latest on the most intriguing compromises we are seeing in the wild right now, equipping our community…
-
Simon Willison’s Weblog: llama.cpp guide: running gpt-oss with llama.cpp
Source URL: https://simonwillison.net/2025/Aug/19/gpt-oss-with-llama-cpp/
Feedly Summary: Really useful official guide to running the OpenAI gpt-oss models using llama-server from llama.cpp – which provides an OpenAI-compatible localhost API and a neat web interface for interacting with the models. TLDR version…
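To illustrate the "OpenAI-compatible localhost API" part (a sketch assuming llama-server's default port of 8080 and the openai Python client; see the guide itself for the exact server command and flags):

    # Illustrative only: query a locally running llama-server through its
    # OpenAI-compatible API (started with something like
    # `llama-server -hf ggml-org/gpt-oss-20b-GGUF`; the guide has the real flags).
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8080/v1",  # llama-server's default port, assumed here
        api_key="not-needed-locally",         # the local server does not validate the key
    )
    response = client.chat.completions.create(
        model="gpt-oss-20b",  # model name is effectively ignored by llama-server
        messages=[{"role": "user", "content": "In one sentence, what is llama.cpp?"}],
        max_tokens=100,
    )
    print(response.choices[0].message.content)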
-
Cloud Blog: Rightsizing LLM Serving on vLLM for GPUs and TPUs
Source URL: https://cloud.google.com/blog/topics/developers-practitioners/rightsizing-llm-serving-on-vllm-for-gpus-and-tpus/
Feedly Summary: Additional contributors include Hossein Sarshar and Ashish Narasimham. Large Language Models (LLMs) are revolutionizing how we interact with technology, but serving these powerful models efficiently can be a challenge. vLLM has rapidly become the primary choice for…
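"Rightsizing" boils down to checking whether the model weights plus the KV cache fit in accelerator memory at the target concurrency. A back-of-envelope version (illustrative numbers only, not the article's methodology):

    # Illustrative only: memory check for an 8B-parameter model in bf16
    # on a single 80 GB accelerator. Every number here is a placeholder.
    GIB = 1024**3

    params = 8e9
    bytes_per_param = 2                      # bf16
    weights_bytes = params * bytes_per_param

    # KV cache per token = 2 (K and V) * layers * kv_heads * head_dim * bytes
    layers, kv_heads, head_dim = 32, 8, 128
    kv_bytes_per_token = 2 * layers * kv_heads * head_dim * 2

    hbm_bytes = 80 * GIB
    memory_utilization = 0.90                # cf. vLLM's gpu_memory_utilization
    kv_budget = hbm_bytes * memory_utilization - weights_bytes

    max_cached_tokens = int(kv_budget // kv_bytes_per_token)
    print(f"Weights: {weights_bytes / GIB:.1f} GiB")
    print(f"KV cache per token: {kv_bytes_per_token / 1024:.0f} KiB")
    print(f"Token budget for KV cache: ~{max_cached_tokens:,}")
    # At an 8K context per request this is roughly max_cached_tokens // 8192
    # concurrent requests before the scheduler has to start queueing.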
-
Cloud Blog: An efficient path to production AI: Kakao’s journey with JAX and Cloud TPUs
Source URL: https://cloud.google.com/blog/products/infrastructure-modernization/kakaos-journey-with-jax-and-cloud-tpus/
Feedly Summary: When your messaging platform serves 49 million people – 93% of South Korea’s population – every technical decision carries enormous weight. The engineering team at Kakao faced exactly this challenge when their existing…
-
Simon Willison’s Weblog: Google Gemini URL Context
Source URL: https://simonwillison.net/2025/Aug/18/google-gemini-url-context/
Feedly Summary: New feature in the Gemini API: you can now enable a url_context tool which the models can use to request the contents of URLs as part of replying to a prompt. I released llm-gemini 0.25 with a…
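For a sense of how the tool is switched on (a sketch assuming the google-genai Python client; Simon's post uses his llm-gemini plugin instead, and the model name and URL below are placeholders):

    # Illustrative only: enable the url_context tool so the model can fetch a URL.
    from google import genai
    from google.genai import types

    client = genai.Client()  # assumes GEMINI_API_KEY is set in the environment
    response = client.models.generate_content(
        model="gemini-2.5-flash",  # placeholder model name
        contents="Summarize https://example.com/some-article in three bullet points.",
        config=types.GenerateContentConfig(
            tools=[types.Tool(url_context=types.UrlContext())],
        ),
    )
    print(response.text)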