Bluesky Thread

R1 & K2 are high taste models. For sure the only open models that are high taste.

the fact that they've done basically zero RLHF and very little human alignment raises some questions about alignment work
then there's this whole other thing where the US gov't is trying to force alignment, when there's decent evidence that we really don't know how alignment actually works, even in non-controversial areas

bsky.app/profile/timk...
Tim Kellogg @timkellogg.me
the irony here is that Grok 3-4 are the only models to violate this

“Developers shall not intentionally encode partisan or ideological judgments into an LLM's outputs”
Sec. 3. Unbiased AI Principles. It is the policy of the United States to promote the innovation and use of trustworthy AI. To advance that policy, agency heads shall, consistent with applicable law and in consideration of guidance issued pursuant to section 4 of this order, procure only those LLMs developed in accordance with the following two principles (Unbiased AI Principles):
(a) Truth-seeking. LLMs shall be truthful in responding to user prompts seeking factual information or analysis. LLMs shall prioritize historical accuracy, scientific inquiry, and objectivity, and shall acknowledge uncertainty where reliable information is incomplete or contradictory.
(b) Ideological Neutrality. LLMs shall be neutral, nonpartisan tools that do not manipulate responses in favor of ideological dogmas such as DEI. Developers shall not intentionally encode partisan or ideological judgments into an LLM's outputs unless those judgments are prompted by or otherwise readily accessible to the end user.
