Large Language Diffusion Models
A wildly new AI architecture: this uses diffusion (all tokens at once), not next-token prediction
ml-gsai.github.io/LLaDA-demo/
i’m trying to wrap my head around why this is a good thing, but the reliable parts of twitter are taken with it, so i guess i just have to look harder
9 hours later
consider my head wrapped, here's the write-up: bsky.app/profile/timk...
LLMs That Don't Gaslight You
A new language model uses diffusion instead of next-token prediction. That means it can back out of a hallucination before it commits to the text. This is a big win for areas like law & contracts, where global consistency is valued
timkellogg.me/blog/2025/02...