On July 19, Meta unveiled Llama 2, an enhanced version of its language model, in a surprise collaboration with Microsoft. It will soon be available on Microsoft Azure and Amazon SageMaker, with licensing options for both research and commercial use.
The latest release includes 7B, 13B, and 70B models, trained on 40% more data than the original Llama. It also doubles the context length and adopts grouped-query attention (GQA) to improve inference scalability in the larger models.
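To make the GQA idea concrete, here is a minimal NumPy sketch (toy shapes and head counts, not Llama 2's actual configuration): several query heads share each key/value head, which shrinks the KV cache that dominates inference memory.

```python
import numpy as np

def gqa(q, k, v):
    """Grouped-query attention sketch.
    q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d), n_kv < n_q."""
    n_q, n_kv = q.shape[0], k.shape[0]
    group = n_q // n_kv                       # query heads per KV head
    k = np.repeat(k, group, axis=0)           # share each KV head across its group
    v = np.repeat(v, group, axis=0)
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w = w / w.sum(-1, keepdims=True)          # softmax over keys
    return w @ v                              # (n_q_heads, seq, d)

q = np.random.randn(8, 4, 16)   # 8 query heads
k = np.random.randn(2, 4, 16)   # only 2 KV heads -> 4x smaller KV cache
v = np.random.randn(2, 4, 16)
out = gqa(q, k, v)
print(out.shape)  # (8, 4, 16)
```

The output is shaped like standard multi-head attention; only the cached `k`/`v` tensors shrink, which is why the technique mainly helps the larger models at inference time.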
Recently, many companies have debuted their own language models, including TII's Falcon, Stanford's Alpaca, Vicuna-13B, and Anthropic's Claude 2. But instead of getting lost in headlines claiming Llama's superiority, it's worth examining how these models truly measure up.
Evaluating Llama 2-Chat
Llama 2-Chat was built through supervised fine-tuning followed by reinforcement learning from human feedback (RLHF): Meta gathered human preference data, trained reward models, and introduced techniques such as Ghost Attention (GAtt) to keep multi-turn conversations consistent. The tuning reportedly even drew inspiration from GPT-4's outputs.
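The reward-model step above can be sketched with the standard pairwise preference objective used in RLHF: given scalar reward scores for a chosen and a rejected response, minimise `-log(sigmoid(r_chosen - r_rejected))`. This is a simplified illustration with made-up scores, not Meta's exact training code (the Llama 2 recipe adds further refinements on top of this basic loss).

```python
import math

def preference_loss(r_chosen, r_rejected):
    """Pairwise loss: low when the chosen response scores higher."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A reward model that already ranks the preferred answer higher
# incurs a small loss; a reversed ranking incurs a large one.
print(preference_loss(2.0, 0.5))   # small loss
print(preference_loss(0.5, 2.0))   # large loss
```

Training the reward model amounts to minimising this loss over many human-labelled (chosen, rejected) pairs; the resulting scorer then guides the RL stage.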
Meta evaluated Llama 2's efficacy on roughly 4,000 prompts, comparing it against both open-source and proprietary models such as ChatGPT and PaLM. The results indicate that the 70B version is comparable to GPT-3.5-0301 and outperforms Falcon, MPT, and Vicuna.
When rated on "helpfulness", Llama 2-Chat surpassed its open-source competitors, achieving a win rate of roughly 75% against Vicuna-33B and Falcon-40B. In coding tasks, however, Llama 2 lags behind models like GPT-3.5 and GPT-4.
A Look at Claude-2
Claude 2 stands out in coding, math, and logical reasoning, and even handles tasks such as interpreting PDFs, something GPT-4 struggles with. On literary prowess, ChatGPT adopts a sophisticated tone, while Llama 2 tends to produce simpler, rhyme-focused poetry.
Meta took steps to ensure high-quality training data for Llama 2, recognizing the limitations of public data, and early users have applauded the results. Llama's development reportedly cost Meta over USD 20 million. Despite its "open-source" label, the model is offered under conditions, including a clause requiring companies with very large user counts to request a separate license. This follows a prior incident in which the first Llama model's weights were leaked online.
Other models like GPT-4 and Claude 2, while not open-source, provide access via APIs.
The Microsoft-Meta Collaboration
Microsoft's recent partnership with Meta raised eyebrows, especially given its longstanding relationship with OpenAI. In contrast to OpenAI's GPT-4, which was criticised for the sparse details of its release paper, Meta's documentation of Llama 2 is detailed and transparent. Industry experts suggest Llama 2 could be a formidable competitor to OpenAI's offerings.
With Llama 2's introduction, Microsoft has diversified its portfolio, securing a backup should its initial investment falter.