Surprising: Math requires a lot of memorization
Goodfire is at it again!
They developed a PCA-like method that measures how much of an LLM’s weights is dedicated to memorization
www.goodfire.ai/research/und...
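A rough intuition for a PCA-style measurement on weights (this is an illustrative sketch, not the paper’s actual method): decompose a weight matrix with SVD and see how much of its spectral energy sits outside a few dominant components. All names and the low-rank-vs-noise framing below are assumptions for illustration only.

```python
import numpy as np

# Toy weight matrix: a low-rank "generalizing" part plus diffuse noise
# standing in for memorized detail (purely illustrative assumption).
rng = np.random.default_rng(0)
low_rank = rng.normal(size=(256, 8)) @ rng.normal(size=(8, 256))
noise = 0.1 * rng.normal(size=(256, 256))
W = low_rank + noise

# PCA-style decomposition via SVD: squared singular values measure how
# much of the matrix's total energy each component carries.
s = np.linalg.svd(W, compute_uv=False)
energy = s**2 / np.sum(s**2)

# Crude proxy: fraction of energy outside the top-k components.
k = 8
tail_fraction = float(np.sum(energy[k:]))
print(f"fraction of spectral energy outside top {k} components: {tail_fraction:.4f}")
```

With a genuinely low-rank signal the tail fraction is tiny; a matrix dominated by diffuse, example-specific detail would spread energy across many more components.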
this really highlights how LLMs do math
math is a long chain of operations, so one small error (e.g. a misremembered shortcut) cascades into calculation errors everywhere downstream
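The cascading-error point can be made concrete with a toy arithmetic chain (illustrative only, nothing to do with the paper’s method): one misremembered early step corrupts every result after it.

```python
# Apply a chain of operations in sequence, like multi-step arithmetic.
def chained(x, steps):
    for f in steps:
        x = f(x)
    return x

steps = [lambda x: x * 7, lambda x: x + 12, lambda x: x * 3]
correct = chained(5, steps)   # (5*7 + 12) * 3 = 141

# Same chain, but the first step uses a "misremembered" multiplication.
bad_steps = [lambda x: x * 6] + steps[1:]
wrong = chained(5, bad_steps)  # (5*6 + 12) * 3 = 126

print(correct, wrong)  # the single early slip shifts the final answer by 15
```

The later steps are all performed correctly, yet the final answer is still wrong, which is why long multi-step math is so unforgiving of small memorization errors.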
a big reason for this research is figuring out what a “cognitive core” might look like: a 1B model that relies on external knowledge banks
it’s interesting that math suffers, but i don’t think that would be the case for a 1B model trained from scratch; it wouldn’t rely on those shortcuts
i’m curious if you could also patch a lot of this
go back and post-train it to distrust its memorized answers, or maybe RL it with an external memory bank