LLM Judges Prefer Fluff?

20 July 2024

An LLM-as-judge will give a higher score to a response that throws in a lot of relevant information without actually answering the question than to a response that does answer the question but is concise :).
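One rough way to check whether your own judge leans this way is to correlate response length with the score the judge assigned. Below is a minimal sketch, assuming you already have the responses and the judge's numeric scores in two lists. `length_score_correlation` and `pearson` are hypothetical helpers written for this post, not part of any library.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


def length_score_correlation(responses, scores):
    """Correlate response length (whitespace-separated words) with judge scores.

    Hypothetical helper: `responses` is a list of strings and `scores` the
    numeric scores your judge gave them, in the same order.
    """
    lengths = [len(r.split()) for r in responses]
    return pearson(lengths, scores)
```

A value close to 1 only hints at a length bias, since longer answers can also be genuinely better. Comparing a concise correct answer against a padded non-answer to the same question is a more direct test.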

Come to think of it, that often works with humans as well! https://x.com/corbtt/status/1814056457626862035
