Why The Llama 3.1 Announcement Is Huge

Tue July 23, 2024

Today Meta announced a new LLM, Llama 3.1 405B and along with it, a great letter by Mark Zuckerburg about why open source is good for developers, Meta & the world. It might seem redundant, amidst the flood of other AI models being released, but I do think this is a big moment, for 4 reasons.

1. Data Sovereignity

Security is a top concern of CISOs. The concern is that data you type into ChatGPT will be captured by OpenAI and used to train other models, in which case it’ll leak into other people’s chat sessions.

Llama has always been open source. This means that companies can run or train their own models based on Llama without ever sending their data to anyone. It never leaves their walls. An entire class of exploits gone.

Until now, there haven’t been any frontier-quality open source models. But Llama 3.1 405B competes directly with the best — GPT 4o & Claude Sonnet 3.5. Now companies can have both performance and dota sovereignity.

2. Cost

Open source is cheaper. Cost is a big concern around LLMs for many companies. And why not? Nvidia is the most valuable company in the world because they sell GPUs for $40k and keep up with demand. On top of that, companies like OpenAI charge enough to cover not only inference hardware, but also the cost to train future models.

Open source AI saves money for companies because they don’t have to pay the OpenAI tax. Furthermore, they can save money on the Nvidia tax as well.

While expensive GPUs are necessary for training, inference can often be done with cheaper and faster hardware. Apple, AMD and Qualcomm each offer neural accelerators, or CPU modules or extensions to make AI inference fast. These chips sell for far less than a pricey Nvidia H100.

3. Independence

Open source enables companies to be independent. The Mark Zuckerberg letter gives a great example:

Between the way they tax developers, the arbitrary rules Apple applies, and all the product innovations they block from shipping, it’s clear that Meta and many other companies would be freed up to build much better services for people if we could build the best versions of our products and competitors were not able to constrain what we could build.

When you build on proprietary services, you’re beholden to their policies, which are not frozen. There’s lots of examples of companies changing their customer-facing policies in a way that hurts customers. With open source, you’re guaranteed to always have access to the current release, worst case.

4. Customizable

We don’t talk about this enough, but there are some WILD things you can do with LLMs if you have access to their inner-workings.

For example:

Representation Engineering — Explain why the LLM said that. Or force an LLM to do something, in a way that can’t easily be bypassed by attackers.
Knowlege unlearning — Target a specific fact and erase it from the LLM.
Schema enforcement — Force an LLM to respond in a specific JSON schema.
Adapters — A way to create a custom model that’s a lot cheaper than fine-tuning. It’s something that can be done on a laptop in a weekend.
Knowledge Distillation — Use a more powerful model (e.g. Llama 3.1) to train a smaller model that has cheaper or faster inference. Basically use an LLM to generate synthetic data. This is great for making models that can run on a phone or an embedded device.

In general, getting access to a model’s internals cracks wide open the full potential. As we saw with open source, it’s hard to predict what will be discovered next when anyone can make an advancement.

Conclusion

Expect Llama 3.1 to cause the AI world to evolve even faster, as companies are no longer beholden to big AI providers like OpenAI or Anthropic. What advance will happen next? I don’t know. It’s exciting times!