been getting deep into LLM serving and the most frustrating part is that EVERY model has different preferences about everything
i’d like to say, “oh, we’ll just use sglang everywhere”, but no. one model only supports vLLM, another only hangs with tokasaurus (what?)
i think a lot of headaches go away if you stick to H100s, but i’m not sure about that either