NVIDIA’s moat just got bigger
the Rubin CPX is for inference, and it’s a beast
the rack is the unit of sale, and it maps the different LLM inference phases onto specialized silicon
notably: it does prefill in fp4, trading memory bandwidth for huge compute (prefill is compute-bound, so that trade fits)
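rough sketch of why prefill can live with low memory bandwidth: count FLOPs per byte moved for a single linear layer. everything here is an illustrative assumption, not a spec of the chip: the function name, the 8192 hidden size, the 4096-token prompt, and 0.5 bytes per value for fp4 are all made up for the example.

```python
def arithmetic_intensity(tokens: int, d_model: int, bytes_per_value: float) -> float:
    """FLOPs per byte moved for one d_model x d_model linear layer (toy model)."""
    # One multiply and one add per weight, for every token in the batch.
    flops = 2 * tokens * d_model * d_model
    # Weights are read once, no matter how many tokens share them.
    weight_bytes = d_model * d_model * bytes_per_value
    # Read the input activations and write the output activations.
    activation_bytes = 2 * tokens * d_model * bytes_per_value
    return flops / (weight_bytes + activation_bytes)


# Hypothetical sizes: fp4 is treated as 0.5 bytes per value.
prefill = arithmetic_intensity(tokens=4096, d_model=8192, bytes_per_value=0.5)
decode = arithmetic_intensity(tokens=1, d_model=8192, bytes_per_value=0.5)

print(f"prefill: {prefill:.1f} FLOPs/byte")
print(f"decode:  {decode:.1f} FLOPs/byte")
```

prefill pushes the whole prompt through each weight read, so FLOPs per byte come out orders of magnitude higher than in decode, which handles one token at a time. in this toy model, that's the case for spending silicon on compute rather than bandwidth in the prefill phase.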
semianalysis.com/2025/09/10/a...