Bluesky Thread

even MORE??

Omar Sanseviero
@osansev... • 2h
We're just warming up :) see you tomorrow
4:01
Omar Sanseviero
@osanseviero
Developer Experience Lead at @GoogleDeepMind Building Gemini API, Gemma, AI Studio and more AI products. My views ex-Chief Llama Officer @huggingface I
1 hour later
Google DeepMind employees are teasing again

just imagine if this is true..
A dark scatterplot titled “ARC-AGI-2 LEADERBOARD” overlaid with bold red annotations.

The base chart shows model performance (Score %) on the vertical axis and cost per task (log scale) on the horizontal axis. Most points cluster near the bottom; two bright green triangles mark standout Gemini models near 30% and 50%.

The annotations add:
	•	A large red circle drawn around an empty region left of the “Gemini 3 Pro” point, roughly spanning 15%–30% score and costs between $0.05 and $0.50.
	•	Three large red question marks (???) inside the circle.
	•	A thick red arrow pointing from the circled region toward the Gemini 3 Pro point around 30% at ~$1.
	•	A second long red arrow extending toward the top-right Gemini 3 Deep Think (Preview) point near 50% at $100+.

The annotation is questioning the large empty gap in the chart—why no models appear between the mid-cost, mid-score region and the high-performing Gemini models.
28 likes 1 reposts
