Bluesky Thread

A medical paper from Microsoft lists the previously unknown model sizes of popular closed LLMs

- Claude 3.5 Sonnet: ~175B
- GPT-3.5 Turbo: ~175B
- GPT-4: ~1.76T
- GPT-4o: ~200B
- GPT-4o-mini: ~8B
- o1-mini: ~100B
- o1-preview: ~300B

arxiv.org/abs/2412.19260
The attached image shows Section 5.1, Language Models, from the paper’s “Experiments & Results” chapter. It details the language models used in the study:

1. Phi-3-7B: a Small Language Model (SLM) with 7 billion parameters [Abdin et al., 2024].
2. Claude 3.5 Sonnet (dated 2024-10-22): a ~175B-parameter model offering state-of-the-art performance in coding, vision, and reasoning [Anthropic, 2024].
3. Gemini 2.0 Flash: described as Google’s most advanced Gemini model [Google, 2024], with mentions of other large models such as Med-PaLM (540B) [Singhal et al., 2023].
4. ChatGPT (~175B) and GPT-4 (~1.76T): called “high-intelligence” models [OpenAI, 2023a, 2023b].
5. GPT-4o (~200B): promoted as offering “GPT-4-level intelligence but faster” [OpenAI, 2024a], alongside GPT-4o-mini (~8B, dated 2024-05-13) [OpenAI, 2024b].
6. o1-mini (~100B, dated 2024-09-12) [OpenAI, 2024c] and o1-preview (~300B, same date) [OpenAI, 2024d], both described as having “new AI capabilities” for complex reasoning.

A note at the bottom explains that parameter sizes are estimates from public articles, not officially confirmed, and provided only for context. It advises readers to consult original documentation for accurate details. Also mentioned: Phi-3 and Claude required minimal post-processing for formatting fixes.
Seems like an incidental information leak. The writers clearly thought this information was public; it definitely is not.
43 likes 9 reposts
