Bytedance SEED-X: a 7B that beats Gemini-2.5-pro on language translation
- reasoning model
- pre-trained on 6T tokens
- structured like a mistral
its entire pre-training dataset is oriented around language translation, often using bilingual samples
i have mixed feelings on this
A) it’s just translation, it’s a narrow task
B) it’s translation! you have to deeply understand the topic and have some level of theory of mind in order to do it well
i was an intern for a linguistics org out of college and they were kind enough to force me to do linguistics courses. i did just enough basic translation to appreciate that it’s not a simple isomorphic mapping between languages
it’s basically art. the literal translation is usually not that good