Bluesky Thread

entropix: qwen vs llama

when using entropix, qwen 2.5 7B coder seems to produce much clearer entropy paths for entropix to follow, vs llama 3.1 8B

this makes qwen+entropix a great combination for writing code
entropy graphed against varentropy. this shows clearish parabolic curves, whereas the llama graph is more scattered and less clear
for the uninitiated: entropix is a dynamic sampler that uses the entropy and varentropy (the variance of the surprisal) of the next-token distribution given by the logits to decide between sampling strategies like highest prob, high temp, CoT, etc.

timkellogg.me/blog/2024/10...
What is entropix doing? - Tim Kellogg
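
to make the entropy/varentropy idea concrete, here is a minimal sketch in plain Python. it is not the entropix code: the function names, the cutoff values of 1.0, the "branch" label, and the mapping of each quadrant to a strategy are all assumptions made up for illustration. it only shows how the two statistics can be computed from a logit vector and used to pick a strategy.

```python
import math


def entropy_varentropy(logits):
    """Return (entropy, varentropy) in nats for softmax(logits)."""
    # log-softmax computed stably: subtract the max before exponentiating
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    probs = [math.exp(lp) for lp in log_probs]
    # entropy is the expected surprisal, where surprisal = -log p
    entropy = -sum(p * lp for p, lp in zip(probs, log_probs))
    # varentropy is the variance of the surprisal around the entropy
    varentropy = sum(p * (-lp - entropy) ** 2 for p, lp in zip(probs, log_probs))
    return entropy, varentropy


def choose_strategy(entropy, varentropy, ent_cut=1.0, varent_cut=1.0):
    """Hypothetical quadrant rule; the cutoffs and labels are illustrative only."""
    high_ent = entropy > ent_cut
    high_var = varentropy > varent_cut
    if not high_ent and not high_var:
        return "argmax"      # confident: take the highest-probability token
    if high_ent and not high_var:
        return "cot"         # evenly unsure: pause and reason (assumed mapping)
    if not high_ent and high_var:
        return "branch"      # mostly sure, odd tail: explore (assumed mapping)
    return "high_temp"       # unsure and uneven: sample at higher temperature


peaked = [10.0, 0.0, 0.0, 0.0]  # one token dominates
flat = [0.0] * 8                # uniform over eight tokens
for name, logits in [("peaked", peaked), ("flat", flat)]:
    ent, varent = entropy_varentropy(logits)
    print(name, round(ent, 3), round(varent, 3), choose_strategy(ent, varent))
```

a uniform distribution has high entropy but zero varentropy, since every token is equally surprising; a sharply peaked one has both near zero. plotting these two numbers per generated token gives the kind of entropy vs varentropy scatter described above.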
the graphs above are produced with llmri github.com/xjdr-alt/llm...
llmri/plots.ipynb at main · xjdr-alt/llmri
22 likes 2 reposts
