DeepSeek did the “one more thing” 🙄
but guys, check this out: they go into detail on how they run inference on V3/R1, how they partition the experts across lots of nodes, how they pipeline attention, and...
just read this 🤯
github.com/deepseek-ai/...