Skills are learned through RL!
pre-training — individual skills learned
post-training (RL) — composed skills learned
this really clarified for me how RL works
husky-morocco-f72.notion.site/From-f-x-and...
in high school wrestling, we spent weeks practicing single moves, over and over, 50 times in a row
that’s what pre-training does. Muscle memory
then we spent a bit more time sparring and learning to chain the moves together to win matches
that’s RL
how do you learn skills? put them in pre-training
that’s where synthetic data comes in, frequently
i’ve noticed a lot of agentic models being trained on traces from other agents. it’s all about baking in the individual skills
RL is ineffective if the model hasn’t already learned the core skills