Bluesky Thread

fascinating blog by a developer on K2

they talk about what chatbots can be. why markdown? why not directly emit frontends?

i’ve only done one Kimi Research, and it was formatted like a React app

Chinese (original) bigeagle.me/2025/07/kimi...

English (ChatGPT) chatgpt.com/share/687410...
bigeagle.me
Written after the Kimi K2 release: no longer just a ChatBot | K.I.S.S
this is cool — during training they generated tools, lots of tools. simple tools, complex tools, tools that are intertwined with each other, etc.

they RL’d merely for a model that’s really dang good at figuring out how to use a tool set to solve a problem

any tool set
Therefore, we designed a fairly elegant workflow, letting the model synthesize massive Tool Specs and usage scenarios itself, synthesizing very diverse tool-calling data through a multiagent approach. The results were indeed good.
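
To make the idea of synthesized Tool Specs concrete, here is a minimal sketch. It is not Moonshot's pipeline: the blog does not describe their generator, so the vocabulary, the spec shape, and the function names `synthesize_tool_spec`, `example_arguments`, and `is_valid_call` are all hypothetical. It only shows the general pattern of generating many made-up tool specs and checking that a tool call conforms to one.

```python
import random

# Hypothetical vocabulary for illustration only; the real generator is not public.
VERBS = ["search", "create", "delete", "summarize"]
NOUNS = ["file", "calendar_event", "invoice", "web_page"]
PARAM_TYPES = {"query": "string", "limit": "integer", "dry_run": "boolean"}
PY_TYPES = {"string": str, "integer": int, "boolean": bool}


def synthesize_tool_spec(rng):
    """Build one made-up tool spec in a JSON-Schema-like shape."""
    name = f"{rng.choice(VERBS)}_{rng.choice(NOUNS)}"
    count = rng.randint(1, len(PARAM_TYPES))
    param_names = rng.sample(sorted(PARAM_TYPES), k=count)
    return {
        "name": name,
        "parameters": {p: {"type": PARAM_TYPES[p]} for p in param_names},
        "required": param_names[:1],  # assumption: first parameter is required
    }


def example_arguments(spec):
    """Fill every parameter of a spec with a placeholder value of the right type."""
    defaults = {"string": "kimi", "integer": 3, "boolean": True}
    return {p: defaults[s["type"]] for p, s in spec["parameters"].items()}


def is_valid_call(spec, call):
    """Check name, required arguments, unknown arguments, and argument types."""
    if call.get("name") != spec["name"]:
        return False
    args = call.get("arguments", {})
    if any(required not in args for required in spec["required"]):
        return False
    for key, value in args.items():
        if key not in spec["parameters"]:
            return False
        expected = PY_TYPES[spec["parameters"][key]["type"]]
        # bool is a subclass of int in Python, so reject it for integer params
        if expected is int and isinstance(value, bool):
            return False
        if not isinstance(value, expected):
            return False
    return True


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the synthetic specs are reproducible
    for _ in range(3):
        spec = synthesize_tool_spec(rng)
        call = {"name": spec["name"], "arguments": example_arguments(spec)}
        print(spec["name"], is_valid_call(spec, call))
```

In a setup like this, a check such as `is_valid_call` could serve as a cheap automatic filter on generated tool-calling data; whether Moonshot filters its data this way is not stated in the blog.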
the fact that it’s MCP is beside the point, other than that the MCP ecosystem is huge

they thought about training for specific tools, but Blender requires installed software and others require logins. basically nothing is universally useful

so instead they decided to learn general tool use
all of this makes oh so much sense
Why Open Source
First, of course, it's to gain reputation. If K2 were only a closed-source service, it definitely wouldn't have this much attention and discussion now. It might even suffer criticism like Grok4 despite doing well.
Second, we can leverage community efforts to improve the technical ecosystem. Within 24 hours of open-sourcing, we saw the community create K2's MLX implementation, 4-bit quantization, etc. - things we really couldn't do with our limited manpower.
But more importantly: open source means higher technical standards, forcing us to create better models, more aligned with AGI goals.
explaining why they open sourced — to ensure that it’s broadly useful

OpenAI itself admits that they optimize their models for ChatGPT; o3 was made for DeepResearch

Moonshot was dissatisfied with that
For a closed-source ChatBot service, users have no idea what workflow or how many models are behind it. I've heard rumors that some major companies have dozens of models, hundreds of scenario classifications, and countless workflows behind their interfaces, claiming this is an "MoE model." Under "application-first" or "user experience-first" values, this approach is very natural and far more cost-effective than a single model.
But this clearly isn't what AGI should look like. For a startup like Kimi, this approach not only makes you increasingly mediocre and greatly hinders technical progress but also makes it impossible to compete with major companies that have PMs polishing every button.