When two LLMs debate, both think they’ll win
Absolutely fascinating paper shows that LLMs basically cannot judge their own performance. None of the prompting techniques worked.
arxiv.org/abs/2505.19184
Humans are also overconfident, but they adjust that confidence much more often than LLMs
RLHF exacerbates overconfidence
i.e. when we train LLMs to be more like us, that’s when the overconfidence gets introduced
again, LLMs are a mirror into society
The entire discussion section is a ride
obviously this is very important to be aware of when building agents