Bluesky Thread

oh!!! o3-mini now shows its thought trace

February 07, 2025

chatgpt.com/share/67a556...

my theory after reading the s1 paper:

high-difficulty (long) traces are the most valuable to train on, so you won’t see top models like o1-pro and o3 offering up their traces

bsky.app/profile/timk...

Tim Kellogg @timkellogg.me

s1: The $6 R1 Competitor?

This isn't an R1 replication, it's a brilliant breakthrough in data reduction, and just plain dumb engineering ingenuity. I considered not writing this up, but I don't think it's obvious why it's so important. Enjoy!

timkellogg.me/blog/2025/02...
