
LLM Judges Prefer Fluff—Just Like Humans!


An LLM-as-judge will give you a higher score if you throw in a lot of relevant information that doesn’t actually answer the question than it gives a model that does answer the question but is concise :).

Come to think of it, that often works with humans as well! https://x.com/corbtt/status/1814056457626862035
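If you want to check whether your own judge shows this bias, one rough way is to see how strongly its scores track answer length. The sketch below is only an illustration: `word_count`, `pearson`, and `length_score_correlation` are hypothetical helper names, and it assumes you already have a list of answers and the score the judge gave each one.

```python
from math import sqrt


def word_count(text: str) -> int:
    """Count whitespace-separated words in an answer."""
    return len(text.split())


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def length_score_correlation(answers: list[str], judge_scores: list[float]) -> float:
    """Correlate answer length (in words) with the judge's score.

    Assumption: judge_scores[i] is the score the judge gave answers[i].
    """
    lengths = [float(word_count(a)) for a in answers]
    return pearson(lengths, [float(s) for s in judge_scores])
```

A strongly positive value on answers you know to be similarly correct would be consistent with the judge rewarding length, though a correlation alone doesn't prove it: longer answers may also genuinely be better on your data.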
