Tag: evaluator
-
Hacker News: Coping with dumb LLMs using classic ML
Source URL: https://softwaredoug.com/blog/2025/01/21/llm-judge-decision-tree Source: Hacker News Title: Coping with dumb LLMs using classic ML Feedly Summary: Comments AI Summary and Description: Yes Summary: The text provides an innovative approach to utilizing local LLMs (large language models) to assess product relevance for e-commerce search queries. By collecting data on LLM decisions and comparing them against human…
-
Chip Huyen: Common pitfalls when building generative AI applications
Source URL: https://huyenchip.com//2025/01/16/ai-engineering-pitfalls.html Source: Chip Huyen Title: Common pitfalls when building generative AI applications Feedly Summary: As we’re still in the early days of building applications with foundation models, it’s normal to make mistakes. This is a quick note with examples of some of the most common pitfalls that I’ve seen, both from public case…
-
Unit 42: Bad Likert Judge: A Novel Multi-Turn Technique to Jailbreak LLMs by Misusing Their Evaluation Capability
Source URL: https://unit42.paloaltonetworks.com/?p=138017 Source: Unit 42 Title: Bad Likert Judge: A Novel Multi-Turn Technique to Jailbreak LLMs by Misusing Their Evaluation Capability Feedly Summary: The jailbreak technique “Bad Likert Judge" manipulates LLMs to generate harmful content using Likert scales, exposing safety gaps in LLM guardrails. The post Bad Likert Judge: A Novel Multi-Turn Technique to…