Bluesky Thread

Sparse Circuits

November 13, 2025


a new mech interp paper from OpenAI proposes a way to train models so that they’re natively easier to understand

openai.com/index/unders...


Understanding neural networks through sparse circuits

We trained models to think in simpler, more traceable steps—so we can better understand how they work.


it’s exactly what it sounds like. a much larger model is trained so that its interconnections are disentangled, which makes them easier to trace and observe

But what if we trained untangled neural networks, with many more neurons, but where each neuron has only a few dozen connections? Then maybe the resulting network will be simpler, and easier to understand. This is the central research bet of our work.
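
To make the "only a few connections per neuron" idea concrete, here is a minimal sketch in plain Python. It assumes one simple way of imposing the constraint: keep the k largest-magnitude incoming weights of each neuron and zero the rest. That choice, the function names, and the layer sizes are all illustrative assumptions, not the paper's training procedure.

```python
import random


def keep_top_k_per_neuron(weights, k):
    """Zero all but the k largest-magnitude incoming weights of each neuron.

    `weights` is a list of rows; row i holds the incoming weights of neuron i.
    This is an illustrative sparsity rule, not the paper's method.
    """
    pruned = []
    for row in weights:
        # Indices of the k entries with the largest absolute value.
        order = sorted(range(len(row)), key=lambda j: abs(row[j]), reverse=True)
        keep = set(order[:k])
        pruned.append([w if j in keep else 0.0 for j, w in enumerate(row)])
    return pruned


def forward(weights, inputs):
    """Matrix-vector product followed by ReLU, one output per neuron."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs))) for row in weights]


def connections(weights):
    """For each neuron, list the input indices it is still wired to."""
    return [[j for j, w in enumerate(row) if w != 0.0] for row in weights]


rng = random.Random(0)
n_inputs, n_neurons, k = 64, 8, 4  # hypothetical sizes chosen for readability

dense = [[rng.gauss(0.0, 1.0) for _ in range(n_inputs)] for _ in range(n_neurons)]
sparse = keep_top_k_per_neuron(dense, k)

# With only k live inputs per neuron, the wiring can be read off directly.
for i, wired in enumerate(connections(sparse)):
    print(f"neuron {i} reads inputs {wired}")
```

In the dense layer every neuron depends on all 64 inputs. After pruning, each neuron's behavior can be explained by looking at just 4 of them, which is the kind of traceability the quoted passage is betting on.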
