Mark Zuckerberg's Meta Says Llama 3 Beats Google's Gemini, Mistral And Jeff Bezos-backed Anthropic's Claude 3, But OpenAI's GPT-4 Is Notably Missing From Its Comparison

Mark Zuckerberg's Meta Platforms Inc. META revealed that its new large language model, Llama 3, has surpassed other AI models in benchmark tests. However, OpenAI's latest flagship model, GPT-4, is notably missing from its comparison.

What Happened: The Llama 3 model, which Meta unveiled on Thursday, has outperformed several other AI models in benchmark tests, including those from Alphabet Inc.‘s GOOGL GOOG Google, Jeff Bezos-backed Anthropic, and Mistral AI.

The new model is expected to be available to cloud providers like Amazon.com Inc.'s AMZN Amazon Web Services (AWS) and model libraries such as Hugging Face soon.

Subscribe to the Benzinga Tech Trends newsletter to get all the latest tech developments delivered to your inbox.

The Llama 3 model, which comes in two sizes with 8B and 70B parameters, has shown significant improvements in text-based responses. It has demonstrated a higher diversity in answering prompts, fewer false refusals, and improved reasoning abilities.

The model also shows a better understanding of instructions and improved code-writing capabilities.

See Also: Perplexity AI Vs. ChatGPT Vs. Google Gemini: Which AI Chatbot Is The Best For You?

Meta’s blog post stated that Llama 3 outperformed similarly sized models like Google’s Gemma and Gemini, Anthropic’s Claude 3, and Mistral 7B in specific benchmarking tests.

Meta’s Llama 3 AI model benchmark comparison | Image credit: Meta

In the MMLU benchmark, which measures general knowledge, the 8B version of Llama 3 outperformed both Gemma 7B and Mistral 7B, while the 70B version slightly edged Gemini Pro 1.5.

See Also: Tesla CEO Elon Musk Disagrees With Nvidia’s Jensen Huang That Anyone Can Make Their Own Neural Network

It is worth noting that Meta did not mention OpenAI‘s GPT-4 in its post. The company also highlighted that benchmark testing AI models, while helpful, is not perfect, as the datasets used for benchmarking are often part of the model’s training.

Why It Matters: Earlier in March, Anthropic introduced its latest generative AI model, Claude 3, claiming it outperforms rivals like OpenAI's GPT-4.

In a similar vein, Elon Musk’s xAI announced a major update to Grok with improvements across the board in various metrics.

OpenAI, on the other hand, is expected to release a "significantly better" GPT-5 model later this year.

Check out more of Benzinga's Consumer Tech coverage by following this link.

Read Next: ‘When Elon Musk Met Jack Ma And Instantly Regretted It’

Disclaimer: This content was partially produced with the help of Benzinga Neuro and was reviewed and published by Benzinga editors.

Photo courtesy: Shutterstock

Market News and Data brought to you by Benzinga APIs
Comments
Loading...
Posted In:
Benzinga simplifies the market for smarter investing

Trade confidently with insights and alerts from analyst ratings, free reports and breaking news that affects the stocks you care about.

Join Now: Free!