Bluesky Thread

Olmo 3 7B & 32B base & thinking models

@ai2.bsky.social has done it again, fully open models, fully open process

seems competitive with Qwen 3, except you can fully reproduce any part of the training process

allenai.org/blog/olmo3
[Image: two-panel line chart on a dark teal background. Left panel "Base Model Training" (y-axis: Base Eval Average %, covering pretraining, midtraining, and long context); right panel "Post-Training" (y-axis: Adapt Eval Average %, covering SFT, DPO, and RL). A pink OLMo 3 curve climbs steadily across both panels, passing baselines such as Marin 32B, OLMo 2 32B, and Apertus 70B, and ends near Qwen 3 32B at roughly 80%, above Gemma 3 27B and Qwen 2.5 32B.]
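since the weights are fully open, here's a minimal sketch of trying them with Hugging Face transformers. the repo id below is an assumption, not confirmed by the thread; check the blog post for the official model names.

```python
# Minimal sketch: load an Olmo 3 checkpoint and generate text.
# NOTE: the repo id is an assumption; see allenai.org/blog/olmo3
# for the official Hugging Face model names.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Olmo-3-7B"  # hypothetical id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Open language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```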
the qwen models have been fantastic for accelerating AI research. i’m hoping this helps even more
tech report: www.datocms-assets.com/64837/176364...
7 hours later
bsky.app/profile/nato...
Nathan Lambert @natolambert.bsky.social
We present Olmo 3, our next family of fully open, leading language models.
This family of 7B and 32B models represents:

1. The best 32B base model.
2. The best 7B Western thinking & instruct models.
3. The first 32B (or larger) fully open reasoning model.
