i read through ~5 pages of the Olmo 3 tech report.. whoa
this is the best and most detailed summary of the current state of SOTA LLM training
nanochat is good for understanding LLM training, this tech report catches you up to SOTA methods
7 hours later
read a bit on the plane — SO MUCH time spent on data curation and benchmark selection
they have a whole section on dynamically selecting benchmarks based on signal-to-noise
afaict they don’t even talk about baguettotron-style rephrasing