In a significant advancement for the AI developer community, Mistral AI has released Mixtral 8x7B, a cutting-edge sparse mixture of experts model (SMoE) with open weights.
This new model sets a benchmark in AI technology, promising faster and more efficient performance compared to existing models.
magnet:?xt=urn:btih:5546272da9065eddeb6fcd7ffddeef5b75be79a7&dn=mixtral-8x7b-32kseqlen
RELEASE a6bbd9affe0c2725c1b7410d66833e24
— Mistral AI (@MistralAI) December 8, 2023
What Is Mixtral-8x7B?
Mixtral 8x7B, available on Hugging Face, stands out for its high-quality performance and is licensed under Apache 2.0.
The model boasts a range of capabilities, including the ability to handle a context of 32k tokens and support for multiple languages, including English, French, Italian, German, and Spanish.
Mixtral is a decoder-only sparse mixture-of-experts network. Its architecture enables an increase in parameters while managing cost and latency.
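Conceptually, a sparse mixture-of-experts layer routes each token to a small subset of expert feed-forward blocks, so only a fraction of the total parameters run for any given token. The toy PyTorch sketch below illustrates that routing idea with made-up dimensions and top-2 selection over eight experts; it is an illustration of the technique, not Mistral's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySparseMoE(nn.Module):
    """Illustrative top-2 mixture-of-experts layer (not Mistral's code)."""

    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # One small feed-forward "expert" per slot.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        # The router scores every expert for every token.
        self.router = nn.Linear(dim, num_experts, bias=False)

    def forward(self, x):  # x: (tokens, dim)
        scores = self.router(x)                           # (tokens, num_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)              # normalize the two winners
        out = torch.zeros_like(x)
        # Only the chosen experts run for each token, so compute per token
        # grows with top_k, not with the total parameter count.
        for slot in range(self.top_k):
            for expert_id, expert in enumerate(self.experts):
                mask = chosen[:, slot] == expert_id
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

layer = ToySparseMoE()
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])
```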
Mixtral-8x7B Performance Metrics
The new model is designed to better understand and generate text, a key capability for anyone looking to use AI for writing or communication tasks.
It outperforms Llama 2 70B and matches GPT-3.5 on most benchmarks, demonstrating how well its performance scales.
The company notes that it also delivers six times faster inference than Llama 2 70B.
Mixtral shows improvements in reducing hallucinations and biases, as seen in its results on the TruthfulQA, BBQ, and BOLD benchmarks.
It produces more truthful responses and shows less bias than Llama 2, along with more positive sentiment.
Mixtral 8x7B’s proficiency in multiple languages is confirmed through its success in multilingual benchmarks.
Alongside Mixtral 8x7B, Mistral AI releases Mixtral 8x7B Instruct, optimized for instruction following. It scores 8.30 on MT-Bench, making it a top-performing open-source model.
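If you want to run the open-weights Instruct model yourself, a minimal sketch using Hugging Face's transformers library looks like the following. The repo ID matches the Hugging Face release; the assumption that you have enough GPU memory to hold the full model is mine.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # spread the weights across available GPUs
    torch_dtype="auto",
)

# The tokenizer's chat template wraps the prompt in Mixtral's instruction format.
messages = [{"role": "user", "content": "Explain what a sparse mixture of experts is."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

output = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```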
Mixtral can be integrated into existing systems via the open-source vLLM project, supported by Skypilot for cloud deployment. Mistral AI also offers early access to the model through its platform.
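As a rough sketch of what that vLLM integration looks like in practice, the offline Python API can load the Hugging Face checkpoint and generate completions as shown below; the tensor-parallel setting and prompt are illustrative assumptions for a multi-GPU machine.

```python
from vllm import LLM, SamplingParams

# Load the Hugging Face checkpoint; tensor_parallel_size shards it across GPUs.
llm = LLM(model="mistralai/Mixtral-8x7B-Instruct-v0.1", tensor_parallel_size=2)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["[INST] What is SEO? [/INST]"], params)

for out in outputs:
    print(out.outputs[0].text)
```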
This latest addition to the Mistral family promises to revolutionize the AI landscape with its enhanced performance metrics, as shown in benchmark results shared by OpenCompass.
What makes Mixtral-8x7B stand out is not just its improvement over Mistral AI’s previous version, but the way it measures up to models like Llama2-70B and Qwen-72B.
How To Try Mixtral-8x7B: 4 Demos
You can experiment with Mistral AI’s new model, Mixtral-8x7B, to see how it responds to queries and how it performs compared to other open-source models and OpenAI’s GPT-4.
Please note that, like all generative AI content, platforms running this new model may produce inaccurate information or otherwise unintended results.
User feedback for new models like this one will help companies like Mistral AI improve future versions and models.
1. Perplexity Labs Playground
In Perplexity Labs, you can try Mixtral-8x7B along with Meta AI’s Llama 2, Mistral-7b, and Perplexity’s new online LLMs.
In this example, I ask about the model itself and notice that new instructions are added after the initial response to extend the generated content about my query.
While the answer looks correct, it begins to repeat itself.
The model did provide an answer of more than 600 words to the question, “What is SEO?”
Again, additional instructions appear as “headers” to seemingly ensure a comprehensive answer.
2. Poe
Poe hosts bots for popular LLMs, including OpenAI’s GPT-4 and DALL·E 3, Meta AI’s Llama 2 and Code Llama, Google’s PaLM 2, Anthropic’s Claude-instant and Claude 2, and StableDiffusionXL.
These bots cover a wide spectrum of capabilities, including text, image, and code generation.
The Mixtral-8x7B-Chat bot is operated by Fireworks AI.
It’s worth noting that the Fireworks page specifies it is an “unofficial implementation” that was fine-tuned for chat.
When asked what the best backlinks for SEO are, it provided a valid answer.
Compare this to the response offered by Google Bard.
3. Vercel
Vercel offers a demo of Mixtral-8x7B that allows users to compare responses from popular Anthropic, Cohere, Meta AI, and OpenAI models.
It offers an interesting perspective on how each model interprets and responds to user questions.
Like many LLMs, it does occasionally hallucinate.
4. Replicate
The mixtral-8x7b-32 demo on Replicate is based on this source code. It is also noted in the README that “Inference is quite inefficient.”
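If you prefer to script the demo rather than use the web form, Replicate’s official Python client can call hosted models along these lines. The model slug and input field names below are illustrative assumptions; check the model page for the exact values.

```python
import replicate  # requires REPLICATE_API_TOKEN in the environment

# Hypothetical slug and input names, shown for illustration only.
output = replicate.run(
    "mistralai/mixtral-8x7b-instruct-v0.1",
    input={"prompt": "Write a meta description for a bakery homepage.", "max_new_tokens": 200},
)
print("".join(output))  # the client streams the result back as text chunks
```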
Mistral AI Platform
In addition to the Mixtral-8x7B, Mistral AI announced beta access to its platform services, introducing three text-generating chat endpoints and an embedding endpoint.
These models are pre-trained on open web data and fine-tuned for instructions, supporting multiple languages and coding.
- Mistral-tiny utilizes Mistral 7B Instruct v0.2, operates exclusively in English, and is the most cost-effective option.
- Mistral-small employs Mixtral 8x7B for multilingual support and coding capabilities.
- Mistral-medium features a high-performing prototype model that supports the same languages and coding capabilities as Mistral-small.
The Mistral-embed endpoint features a 1024-dimension embedding model designed for retrieval capabilities.
The API, compatible with popular chat interfaces, offers Python and JavaScript client libraries and includes moderation control features.
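As a rough sketch of what a call to the chat endpoint looks like, here is a plain-HTTP example using the requests library; the payload follows the OpenAI-style chat format the platform advertises, and the API key placeholder is mine.

```python
import os
import requests

API_KEY = os.environ["MISTRAL_API_KEY"]  # issued when you register for the beta

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "mistral-tiny",  # or mistral-small / mistral-medium
        "messages": [
            {"role": "user", "content": "Summarize what Mixtral 8x7B is in two sentences."}
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])

# The embedding endpoint works the same way with the mistral-embed model,
# returning 1024-dimension vectors for retrieval use cases.
```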
Registration for API access is open, with the platform gradually moving towards full availability.
Mistral AI acknowledges NVIDIA’s support in integrating TensorRT-LLM and Triton for their services.
Conclusion
Mistral AI’s latest release sets a new benchmark in the AI field, offering enhanced performance and versatility. But like many LLMs, it can provide inaccurate and unexpected answers.
As AI continues to evolve, models like the Mixtral-8x7B could become integral in shaping advanced AI tools for marketing and business.
Featured image: T. Schneider/Shutterstock