Tag: llms
-
Hacker News: Task-Specific LLM Evals That Do and Don’t Work
Source URL: https://eugeneyan.com/writing/evals/
Source: Hacker News
Title: Task-Specific LLM Evals That Do and Don’t Work
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text presents a comprehensive overview of evaluation metrics for machine learning tasks, specifically focusing on classification, summarization, and translation within the context of large language models (LLMs). It highlights the…
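
For context, the article covers classification-style evals built on metrics such as precision, recall, and ROC-AUC. A minimal sketch of that kind of eval is below; the `classify()` stub and the labeled examples are hypothetical placeholders, not taken from the article.

```python
# Illustrative sketch of a task-specific classification eval (precision,
# recall, ROC-AUC) over a small labeled set. classify() is a placeholder
# standing in for a real LLM call that returns P(label == 1).
from sklearn.metrics import precision_score, recall_score, roc_auc_score

def classify(text: str) -> float:
    """Placeholder for an LLM classifier; returns P(label == 1)."""
    return 0.9 if "refund" in text.lower() else 0.1  # stand-in heuristic

eval_set = [                      # (text, gold label) -- hypothetical examples
    ("refund not processed after 30 days", 1),
    ("thanks, the issue is resolved now", 0),
    ("still waiting on my refund", 1),
    ("great support, no complaints", 0),
]

texts, labels = zip(*eval_set)
probs = [classify(t) for t in texts]
preds = [int(p >= 0.5) for p in probs]

print("precision:", precision_score(labels, preds))
print("recall:   ", recall_score(labels, preds))
print("roc_auc:  ", roc_auc_score(labels, probs))
```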
-
The Register: Microsoft dangles $10K for hackers to hijack LLM email service
Source URL: https://www.theregister.com/2024/12/09/microsoft_llm_prompt_injection_challenge/
Source: The Register
Title: Microsoft dangles $10K for hackers to hijack LLM email service
Feedly Summary: Outsmart an AI, win a little Christmas cash. Microsoft and friends have challenged AI hackers to break a simulated LLM-integrated email client with a prompt injection attack – and the winning teams will share a $10,000…
-
Simon Willison’s Weblog: Quoting Ethan Mollick
Source URL: https://simonwillison.net/2024/Dec/7/ethan-mollick/#atom-everything
Source: Simon Willison’s Weblog
Title: Quoting Ethan Mollick
Feedly Summary: A test of how seriously your firm is taking AI: when o1 (& the new Gemini) came out this week, were there assigned folks who immediately ran the model through internal, validated, firm-specific benchmarks to see how useful it was? Did you…
-
Simon Willison’s Weblog: New Gemini model: gemini-exp-1206
Source URL: https://simonwillison.net/2024/Dec/6/gemini-exp-1206/#atom-everything
Source: Simon Willison’s Weblog
Title: New Gemini model: gemini-exp-1206
Feedly Summary: New Gemini model: gemini-exp-1206. Google’s Jeff Dean: Today’s the one year anniversary of our first Gemini model releases! And it’s never looked better. Check out our newest release, Gemini-exp-1206, in Google AI Studio and the Gemini API! I upgraded my llm-gemini…
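
The llm-gemini plugin exposes the new model through Simon Willison’s llm tool. A minimal sketch of calling it from the llm Python API follows; it assumes `pip install llm llm-gemini` and an API key configured for the plugin (e.g. via `llm keys set gemini`), so treat the exact setup as an assumption.

```python
# Sketch: prompt gemini-exp-1206 through the llm Python API.
# Assumes llm and llm-gemini are installed and a Gemini API key is configured.
import llm

model = llm.get_model("gemini-exp-1206")   # model id registered by llm-gemini
response = model.prompt("Summarize prompt injection in one sentence.")
print(response.text())
```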
-
Embrace The Red: Terminal DiLLMa: LLM-powered Apps Can Hijack Your Terminal Via Prompt Injection
Source URL: https://embracethered.com/blog/posts/2024/terminal-dillmas-prompt-injection-ansi-sequences/
Source: Embrace The Red
Title: Terminal DiLLMa: LLM-powered Apps Can Hijack Your Terminal Via Prompt Injection
Feedly Summary: Last week Leon Derczynski described how LLMs can output ANSI escape codes. These codes, also known as control characters, are interpreted by terminal emulators and modify behavior. This discovery resonates with areas I had…
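
The obvious defensive move this implies is to strip ANSI escape sequences and other control characters from untrusted LLM output before echoing it to a terminal. A minimal sketch of such sanitization is below; the regex and function name are illustrative, not taken from the post.

```python
# Illustrative mitigation: remove ANSI CSI/OSC escape sequences and remaining
# C0 control characters from model output before printing it, so untrusted
# output cannot drive the terminal emulator. The regex is a simplified sketch.
import re

# CSI sequences (ESC [ ... final byte) and OSC sequences (ESC ] ... BEL or ESC \)
ANSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)")

def sanitize_llm_output(text: str) -> str:
    text = ANSI_RE.sub("", text)
    # drop any other C0 control characters, keeping newline and tab
    return "".join(ch for ch in text if ch in "\n\t" or ord(ch) >= 0x20)

untrusted = "Hello\x1b]0;pwned\x07 world \x1b[31mred\x1b[0m"
print(sanitize_llm_output(untrusted))   # -> "Hello world red"
```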