Bluesky Thread

HRM confirmed by ARC-AGI team, but also dismissed as non-generalizable

August 15, 2025

the magic wasn’t in the hierarchical structure, it was in the outer loop. And the outer loop benefited mostly at training time

i.e. it mostly just memorized answers

arcprize.org/blog/hrm-ana...

The Hidden Drivers of HRM's Performance on ARC-AGI

We scored on hidden tasks, ran ablations, and found that performance from the Hierarchical Reasoning Model comes from an unexpected source

Tim Kellogg @timkellogg.me

HRM: Hierarchical Reasoning Model

ngl this sounds like bullshit but i don’t think it is

- 27M (million) parameters
- 1000 training examples
- beats o3-mini on ARC-AGI

arxiv.org/abs/2506.21734


to be clear — it was a legit result. it completed ARC as advertised, no cheating

the problem (as interpreted by the ARC team) is that it won’t work beyond ARC-style problems

