Bluesky Thread

NVIDIA’s moat just got bigger

the Rubin CPX is for inference, and it’s a beast

sold with the rack as the unit, it bakes optimizations for the different LLM inference phases into silicon

notably: it does prefill in fp4 with low memory bandwidth and huge compute

semianalysis.com/2025/09/10/a...
semianalysis.com
Another Giant Leap: The Rubin CPX Specialized Accelerator & Rack
Nvidia announced the Rubin CPX, a solution that is specifically designed to be optimized for the prefill phase, with the single-die Rubin CPX heavily emphasizing compute FLOPS over memory bandwidth…
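
a rough sketch of why prefill can get away with low memory bandwidth: it pushes many tokens through each weight matrix at once, so every weight byte loaded gets reused a lot, while decode handles one token per sequence per step. the numbers below (`D_MODEL`, the token counts, 2-byte activations) are made-up illustration values, not Rubin CPX specs, and the model ignores attention, the KV cache, and batching across sequences.

```python
def arithmetic_intensity(tokens: int, d_model: int,
                         bytes_per_weight: float,
                         bytes_per_act: float = 2.0) -> float:
    """FLOPs per byte moved for one (d_model x d_model) weight matmul.

    Simplified model: weights are read once, activations are read once
    and written once. Everything else is ignored.
    """
    flops = 2 * tokens * d_model * d_model                # multiply + add per weight per token
    weight_bytes = d_model * d_model * bytes_per_weight   # load the weight matrix once
    act_bytes = 2 * tokens * d_model * bytes_per_act      # read inputs, write outputs
    return flops / (weight_bytes + act_bytes)


D_MODEL = 8192   # hypothetical hidden size
FP4_BYTES = 0.5  # 4 bits per weight

# prefill: the whole prompt goes through the matmul in one pass
prefill = arithmetic_intensity(tokens=4096, d_model=D_MODEL, bytes_per_weight=FP4_BYTES)

# decode: one new token per step for a single sequence
decode = arithmetic_intensity(tokens=1, d_model=D_MODEL, bytes_per_weight=FP4_BYTES)

print(f"prefill: {prefill:.1f} FLOPs/byte")
print(f"decode:  {decode:.1f} FLOPs/byte")
```

under these assumptions prefill lands in the thousands of FLOPs per byte and decode in the single digits, which is the gap that makes a compute-heavy, bandwidth-light part plausible for prefill.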
there’s always been someone out there saying, “yeah! just do the transformer directly in hardware”

that’s not what this is. this merely provides transistor layouts such that it’s really easy to make LLMs go screaming fast

you could still do scientific compute on them, albeit not in fp4
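
to see why fp4 is a poor fit for scientific compute, here's a minimal sketch that rounds a few values to a 4-bit float. it assumes the E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit); the thread doesn't say which fp4 format the chip uses, and real deployments typically add per-block scale factors that this sketch leaves out. `quantize_e2m1` is a hypothetical helper, not a library call.

```python
# the eight non-negative magnitudes a 4-bit E2M1 float can represent
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def quantize_e2m1(x: float) -> float:
    """Round x to the nearest E2M1 value, saturating at +/-6.

    Ties go to the smaller magnitude here, which is simpler than the
    round-to-nearest-even rule that real hardware would typically use.
    """
    sign = -1.0 if x < 0 else 1.0
    nearest = min(E2M1_MAGNITUDES, key=lambda m: abs(abs(x) - m))
    return sign * nearest


for x in [0.1, 0.3, 1.2, 2.4, 3.14159, 100.0]:
    q = quantize_e2m1(x)
    print(f"{x:>9} -> {q:>4}  (abs error {abs(x - q):.5f})")
```

with only eight magnitudes per sign, small values collapse to zero and anything above 6 saturates. neural nets tolerate that with the help of scaling; most numerical simulation does not.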