I got access to Gemini Diffusion. It definitely has small model feels, but i like it
long responses appear in evenly-sized chunks, so i think they're doing ~1000 tokens at a time. i did not anticipate that, but it makes sense
ooooo, i'm definitely right about that, not making it up. It did this in chunks and eventually gave a "server busy" error
also, easy problems go A LOT faster than hard problems
(btw this is about Grok, a dinosaur hunter, in Gemini's own story)
so i suppose that means this could be unlimited output, it just loops until it generates an <|eot|> token
i was thinking these are good for fixed upper-bound latency, but that's really not true. i suppose the bigger thing is that they do easy problems fast (code editing????)
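one way easy inputs could finish faster is an early-exit denoising loop that stops refining once predictions stop changing between steps. this is pure assumption on my part (the function and convergence check are made up, not how Gemini Diffusion is documented to work):

```python
# Toy early-exit denoiser: run refinement steps until the output
# stabilizes, so "easy" inputs converge in fewer steps than hard ones.

def denoise_until_stable(step_fn, x, max_steps=64):
    """Apply step_fn repeatedly; return (result, steps_used).
    step_fn is a stand-in for one denoising/refinement step."""
    for steps in range(1, max_steps + 1):
        new_x = step_fn(x)
        if new_x == x:  # predictions converged -> stop early
            return x, steps
        x = new_x
    return x, max_steps
```

with a scheme like this, latency is data-dependent rather than fixed: a quick code edit converges in a couple of steps while a hard logic problem burns the full budget.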
i gave it a logic problem and it indeed tried to solve it step-by-step, like a reasoning model
it got cut off by the "server busy" error, but it certainly seems like traditional test-time-compute is not off the table for diffusion models