holy cow, an 8b comparing to o3-mini
DeepSeek's DeepSeek-R1-0528-Qwen3-8B
They distilled the chain-of-thought from DeepSeek-R1-0528 to post-train Qwen3 8B Base, obtaining DeepSeek-R1-0528-Qwen3-8B.
huggingface.co/deepseek-ai/...
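for context, the "distillation" here is basically ordinary supervised fine-tuning on reasoning traces sampled from the big R1-0528. a minimal sketch, assuming you already have (prompt, <think> trace + answer) pairs exported from the teacher; the example record, field names, and training args below are made up, and the chat template is skipped for brevity:

```python
# sketch only: real CoT distillation uses a large corpus of traces from R1-0528;
# the single example record and hyperparameters here are placeholders.
import torch
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B-Base")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B-Base",
                                             torch_dtype=torch.bfloat16)

# each record pairs a prompt with the teacher's full <think> trace plus answer
traces = [
    {"prompt": "What is 17 * 23?",
     "completion": "<think>17*23 = 17*20 + 17*3 = 340 + 51 = 391</think>\n391"},
]

def tokenize(example):
    # plain concatenation, skipping the chat template for brevity
    text = example["prompt"] + "\n" + example["completion"] + tokenizer.eos_token
    enc = tokenizer(text, truncation=True, max_length=4096)
    enc["labels"] = enc["input_ids"].copy()  # causal-LM loss over the whole sequence
    return enc

dataset = Dataset.from_list(traces).map(
    tokenize, remove_columns=["prompt", "completion"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="r1-0528-qwen3-8b-distill",
                           per_device_train_batch_size=1,
                           num_train_epochs=1, bf16=True, logging_steps=1),
    train_dataset=dataset,
)
trainer.train()
```

nothing exotic at this level: same causal-LM loss as any SFT run, just with the teacher's chain-of-thought sitting inside the target text.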
1 hour later
i grabbed the biblically correct bf16 version of R1-0528-8b off unsloth and... oh my dog this shit is slow
apparently ollama is upcasting it to fp32
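quick sanity check on why an fp32 upcast hurts: the weights alone double in size, which makes it a lot more likely the model spills out of VRAM and gets offloaded to CPU. rough numbers below, with the parameter count as an approximation:

```python
# back-of-the-envelope memory for the 8B model's weights only
# (ignores KV cache and activations); 8.19e9 params is approximate
params = 8.19e9
print(f"bf16: ~{params * 2 / 1e9:.1f} GB")  # 2 bytes/weight -> ~16.4 GB
print(f"fp32: ~{params * 4 / 1e9:.1f} GB")  # 4 bytes/weight -> ~32.8 GB
```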
first impressions: poor instruction adherence
this AI is avoiding my real question and instead throwing into question my competence for even asking such a question
uh.. thanks?
safe to say it doesn't quite feel like the full size version bsky.app/profile/timk...
i’m noticing that R1-0528 doesn’t use words like “Wait”, it just keeps digging deeper on its own, more naturally
9 hours later
I gave it this problem and I think I've hit an infinite loop
i'm also starting to see weird behavior. here: it adds ** around two lines. in other spots it just has random ** that aren't part of formatting. Granted, it's the thought trace, but it feels like it's lost its mind (tbf i would too if i was forced to solve this)
let it run?