if i could wish something into being: a complete decoupling of LLM knowledge from reasoning
seems like the key would be a “database” model that answers queries with raw vectors, raw information rather than snippets of documents
the innovation behind such a database would be (rough sketch after this list)
1. most of the n*n attention computation is cacheable across requests, so multi-gigabyte contexts become feasible
2. the returned vectors carry information already reasoned across documents, raw info inferred from snippets spread over hundreds of docs
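a minimal sketch of what that vector-returning knowledge database could look like, assuming a PyTorch-style setup; every name here (KnowledgeDB, encode_corpus, query) is made up for illustration, not an existing API:

```python
# Hypothetical sketch of a vector-returning "knowledge database".
# The expensive pass over the corpus (the analogue of the n*n attention over
# a huge context) is computed once and cached across requests; each query then
# returns a raw information vector pooled across many chunks/documents,
# instead of text snippets.

import torch
import torch.nn.functional as F


class KnowledgeDB:
    def __init__(self, encoder_dim: int = 256):
        self.encoder_dim = encoder_dim
        # Cache of precomputed document tensors, reused across requests.
        self.doc_keys: torch.Tensor | None = None    # (num_chunks, dim)
        self.doc_values: torch.Tensor | None = None  # (num_chunks, dim)

    def encode_corpus(self, chunk_embeddings: torch.Tensor) -> None:
        """One-time, cacheable pass over the corpus.

        `chunk_embeddings` stands in for whatever encoder produces per-chunk
        representations; here they are just projected into key/value spaces.
        """
        w_k = torch.randn(chunk_embeddings.shape[-1], self.encoder_dim)
        w_v = torch.randn(chunk_embeddings.shape[-1], self.encoder_dim)
        self.doc_keys = chunk_embeddings @ w_k
        self.doc_values = chunk_embeddings @ w_v

    def query(self, query_vec: torch.Tensor, top_k: int = 32) -> torch.Tensor:
        """Return an information vector, not document snippets.

        Attends over the cached corpus and returns a weighted mixture of
        value vectors, i.e. information blended across hundreds of chunks.
        """
        assert self.doc_keys is not None, "call encode_corpus() first"
        scores = (query_vec @ self.doc_keys.T) / self.encoder_dim ** 0.5
        topk = scores.topk(min(top_k, scores.shape[-1]), dim=-1)
        weights = F.softmax(topk.values, dim=-1)
        return weights @ self.doc_values[topk.indices]


# Usage: encode once, then answer many queries against the cached tensors.
db = KnowledgeDB()
db.encode_corpus(torch.randn(10_000, 512))   # expensive, done once
info_vec = db.query(torch.randn(256))        # cheap, per request
print(info_vec.shape)                        # torch.Size([256])
```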
on my mind: highly sparse MoE models are hot right now, but they’re lame af. you’re forcing roughly 7/8ths of your hardware to sit idle on any given token (or 31/32nds, in the case of R1/V3), which is nuts
this is the way: dense reasoning models paired with huge remote async “knowledge” models
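a toy, purely hypothetical sketch of the call pattern that split implies: the local dense model keeps decoding while a query to the remote knowledge model is in flight, and the returned vector is injected whenever it lands. none of these function names are real APIs.

```python
# Hypothetical sketch of "dense reasoning model + remote async knowledge model".
# The knowledge lookup is fired off immediately but never blocks decoding.

import asyncio
import random


async def remote_knowledge_lookup(query: str) -> list[float]:
    """Stand-in for the big remote knowledge model: slow, returns a vector."""
    await asyncio.sleep(0.2)                    # network + model latency
    return [random.random() for _ in range(8)]  # placeholder info vector


async def dense_reasoning_step(step: int, knowledge: list[float] | None) -> str:
    """Stand-in for one decode step of the local dense reasoning model."""
    await asyncio.sleep(0.01)
    tag = "with-knowledge" if knowledge is not None else "pending"
    return f"token_{step} ({tag})"


async def generate(prompt: str, num_steps: int = 20) -> list[str]:
    # Kick off the knowledge query right away; keep decoding while it runs.
    lookup = asyncio.create_task(remote_knowledge_lookup(prompt))
    knowledge: list[float] | None = None
    tokens: list[str] = []
    for step in range(num_steps):
        if knowledge is None and lookup.done():
            knowledge = lookup.result()  # inject the vector once it arrives
        tokens.append(await dense_reasoning_step(step, knowledge))
    return tokens


print(asyncio.run(generate("why is the sky blue?")))
```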
and RAG still doesn’t go away; you still need it for rapidly fluctuating or highly structured data
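a trivial, made-up dispatch sketch of where RAG keeps its place next to the vector knowledge store; all three backends here are stubs:

```python
# Hypothetical routing: structured data goes to the source of truth, rapidly
# changing data goes through classic RAG, stable knowledge goes to the vector
# knowledge store. Every function is a stub for illustration only.

def query_structured_source(q: str) -> str:
    return f"rows matching {q!r} from the live database"  # stub

def rag_snippets(q: str) -> str:
    return f"fresh text snippets about {q!r}"  # stub

def knowledge_vectors(q: str) -> str:
    return f"pooled info vector for {q!r}"  # stub

def route(q: str, structured: bool = False, fast_changing: bool = False) -> str:
    if structured:
        return query_structured_source(q)  # e.g. prices, inventory, schedules
    if fast_changing:
        return rag_snippets(q)             # e.g. today's news
    return knowledge_vectors(q)            # stable background knowledge

print(route("current BTC price", structured=True))
```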