gpt-oss, OpenAI's open weights model
120B & 20B variants, both MoE with 4 experts active
openai.com/index/introd...
gpt-oss, OpenAI's open weights model
View original thread
44
11
paper: Estimating risks in open weights LLMs
An "academic" internal rationalization for OpenAI to release an open weights model
Basically they fine tuned gpt-oss a few times and determined that it's no worse than closed-weights models
cdn.openai.com/pdf/231bf018...
An "academic" internal rationalization for OpenAI to release an open weights model
Basically they fine tuned gpt-oss a few times and determined that it's no worse than closed-weights models
cdn.openai.com/pdf/231bf018...
7
1
paper: gpt-oss system card
it's a full dive into model architecture, data, training phases, and of course the safety report
interesting: it indeed was post-trained as in 4-bit (albeit MXFP4 not FP4). Maybe the leak was real?
cdn.openai.com/pdf/419b6906...
it's a full dive into model architecture, data, training phases, and of course the safety report
interesting: it indeed was post-trained as in 4-bit (albeit MXFP4 not FP4). Maybe the leak was real?
cdn.openai.com/pdf/419b6906...
6
1
1 hour later
this is the first model i'm aware of that runs on ollama by default how it was trained (MXFP4 quantized)
1
this model is super nerdy
feels like the character of all openai models combined and boiled for hours until all that's left is.. whatever this is
feels like the character of all openai models combined and boiled for hours until all that's left is.. whatever this is
6
is a ChatGPT Pro subscription too expensive? try gpt-oss in pro mode!
Python function docs:
“””
Fan out n_runs parallel generations at T=0.9 and synthesize a final answer at T=0.2.
“””
github.com/mshumer/gpt-...
Python function docs:
“””
Fan out n_runs parallel generations at T=0.9 and synthesize a final answer at T=0.2.
“””
github.com/mshumer/gpt-...
6
1
gpt-oss follows the harmony message format
messages will come back as commentary, analysis and final. this is different than just reasoning and not. the models stick firmly to this format
cookbook.openai.com/articles/ope...
messages will come back as commentary, analysis and final. this is different than just reasoning and not. the models stick firmly to this format
cookbook.openai.com/articles/ope...
7
2