Meta's Llama LLMs Spark Debate Over Open Source AI
As Meta positions its Llama large language models as "open source AI," critics challenge whether restrictive usage terms and undisclosed training data truly fulfill open source principles in the evolving artificial intelligence landscape.

The modern large language models (LLMs) developed by OpenAI are, despite the company's name, notably not open source. Neither are most of the major LLMs from companies like Google and Anthropic.
But there's another key player in the AI ecosystem that has placed a major bet on open source LLMs: Meta, which has committed to open source AI by releasing its Llama models to the public.
At least, that's how Meta would like to position its AI strategy. Some of Meta's critics, however, have a different take: they contend that the Llama models do not qualify as open source "by any stretch of the imagination," as the Open Source Initiative (OSI) puts it.
Even if you don't plan on using the Llama models, the controversy surrounding whether Meta's LLMs are truly open source is worth noting. It poses some key questions about what open source means in the context of AI and suggests how the open source ecosystem must adapt if it wants to encourage development of open source LLMs.
What Makes Meta's LLMs Open Source?
In Meta's view, the Llama models, which the company began releasing in early 2023, are open source because of the following characteristics:
The models' code is freely available.
Anyone can use the models for most research and commercial purposes. (The key word there is "most"; we'll discuss the exceptions in a bit.)
The models can be customized through fine-tuning.
These characteristics distinguish the Llama models from most other major LLMs, whose code is not open and which can only be used for use cases that their vendors explicitly approve.
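To make the practical difference concrete, here is a minimal Python sketch of what that openness allows in practice. It assumes the Hugging Face transformers library and an approved access request for Meta's gated model repository on Hugging Face; the model ID shown is one published Llama 2 variant, and this is an illustration, not an official Meta workflow.

```python
# Minimal sketch: downloading and running Llama weights locally.
# Assumes: `pip install transformers torch` and that Meta has approved
# your access request for the gated meta-llama repository (accepting
# Meta's own license terms is required before the download succeeds).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # one published Llama 2 variant

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# With the weights on disk, inference runs entirely on local hardware...
inputs = tokenizer("Open source AI is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# ...and the same weights can be fine-tuned with ordinary training loops
# or adapter libraries such as PEFT, neither of which is possible with
# a closed model served only behind a vendor's API.
```

Note that even this sketch runs into the dispute at the heart of the debate: the download only works after you accept Meta's own license terms, not an OSI-approved license.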
Are the Llama Models Actually Open Source?
Meta hasn't just released a set of LLMs that it calls open source. It has also invested heavily in messaging related to those models. On a microsite devoted to showcasing its open source AI initiatives, for example, the company proclaims that it is committed to "making innovation available to all" by ensuring that anyone can access open source AI technology.
That sounds nice, and it's a compelling argument in an era when the AI space is dominated by a handful of very large companies (like Microsoft and Amazon) and startups closely aligned with them (like OpenAI and Anthropic).
The problem, though, is that Meta's definition of open source AI contradicts the way other groups — like the OSI, a nonprofit that advocates for the use of open source software and has long been an institutional presence in the open source ecosystem — define open source AI.
Specifically, the OSI says that Meta's models "fail in spectacular ways" at qualifying as open source, due mainly to the following limitations:
The Llama licenses prohibit use of the models (without Meta's explicit permission) by companies that have more than 700 million monthly active users — which is another way of saying "companies that compete with Meta," because it's typically only other large tech companies that operate at that scale.
By barring large companies, the Llama licenses discriminate against a particular group of users. According to the OSI, open source licenses cannot include terms that discriminate against persons or groups.
Other critics have noted that Meta hasn't released the data it used to train its LLMs, contending that this is another factor that contradicts the true definition of open source AI. For its part, the OSI hasn't said that training data needs to be released for a model to qualify as open source, but it does say that developers should describe their training data and identify where to find it if they want to release an open source LLM.
What Counts as Open Source AI, Exactly?
It may be tempting for some to write off Meta's Llama open source claims as a form of "open source washing" — meaning attempts by a company to enhance its brand image by purporting to care about open source while making few meaningful contributions to the open source ecosystem. If Meta really cared about open source AI, it might have chosen to release its models under terms developed by the open source community, such as the OSI's open source AI definition, rather than writing its own licensing rules and choosing to call them open source.
But a more charitable view is that it's truly hard to create a universal definition of open source AI technology. In the absence of stronger consensus about what counts as an open source LLM, everyone — including Meta — is entitled to their own reasonable interpretation.
After all, when free and open source software emerged in the 1980s, developers couldn't imagine the era of the cloud, let alone a world where large language models trained on vast quantities of data would become key IT technologies. For decades, open source licenses focused on keeping software source code open to the public. They didn't address questions like allowable use cases or whether training data used for machine learning should be accessible along with source code because no one had yet thought to ask those questions.
Now that those questions have become relevant, it's perhaps time for the open source community to develop new licenses to govern AI projects. The OSI's open source AI definition is just that — a definition, not a license. It clarifies some questions about open source AI from the perspective of one organization, but it doesn't provide a ready-made AI software license similar to the GNU General Public License (GPL).
To avoid ongoing conflict between companies willing to invest in open source AI and the open source community, the world needs an AI-centric counterpart to the GPL.