Highlights:
- The company says Mistral 7B can generate prose, summarize documents, and perform other text-processing tasks.
- Mistral AI claims the model matches the performance of larger neural networks while requiring less hardware.
Mistral AI, a well-funded artificial intelligence startup founded five months ago, has released an open-source language model with 7 billion parameters.
The model, named Mistral 7B in reference to its parameter count, is available on GitHub under the Apache 2.0 license. The company says it can be used for both research and commercial applications.
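For readers who want to experiment with the release, the following is a minimal sketch of loading and prompting the model with the Hugging Face transformers library. The repository id mistralai/Mistral-7B-v0.1 is an assumption; the company announced the weights on GitHub, and the exact hosting details may differ.

```python
# A minimal sketch of loading and prompting Mistral 7B with Hugging Face
# transformers. The repository id below is an assumption, not confirmed by
# the announcement. Requires the "accelerate" package for device_map.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map="auto",          # place layers on available GPUs/CPU
)

prompt = "Summarize the key ideas behind transformer language models:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```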
Paris-based Mistral AI was founded in May by former researchers from Meta Platforms Inc. and Google LLC. Before launching the business, Chief Executive Officer Arthur Mensch worked at DeepMind, the search giant's machine learning research unit. Chief Science Officer Guillaume Lample led the development of Meta's open-source Llama language model.
Four weeks after its May launch, Mistral AI closed a funding round of about €105 million (approximately $110 million) at a €240 million (roughly $253 million) valuation. Lightspeed Venture Partners, Index Ventures, Redpoint Ventures, and more than a half-dozen other backers contributed to the investment. At the time, Mistral AI stated that it intended to release its first language models in 2024.
The release of Mistral 7B suggests that development is moving faster than anticipated. In a blog post, the company explained that building the model took three months. During that time, Mistral AI's founders assembled an engineering team and built the company's MLOps stack, a set of specialized software tools used to develop neural networks.
According to the company, Mistral 7B can generate prose, summarize documents, and perform other text-processing tasks. It can also autocomplete developer-written software code. The model has an 8k context window, meaning each prompt a user enters can contain up to roughly 8,000 tokens.
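As a rough illustration of what that limit means in practice, the sketch below counts a prompt's tokens with the model's tokenizer (again assuming the Hugging Face repository id above) and reads "8k" as 8,192 tokens, a common convention the announcement does not spell out.

```python
# A rough sketch of checking a prompt against the 8k context window.
# Both the repository id and the exact 8,192-token limit are assumptions.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
CONTEXT_LENGTH = 8192  # assumed exact value for "8k"

prompt = "Long document text to be summarized... " * 400
n_tokens = len(tokenizer.encode(prompt))
print(f"Prompt uses {n_tokens} of {CONTEXT_LENGTH} tokens")
if n_tokens > CONTEXT_LENGTH:
    print("Prompt exceeds the context window; truncate or split it.")
```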
Mistral 7B's architecture comprises 7 billion parameters, the configuration settings that determine how a neural network processes data. The most advanced AI systems available today contain hundreds of billions of such settings.
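The headline figure is straightforward to verify once the weights are downloaded; the following sketch (same assumed repository id) tallies the parameters directly.

```python
# A minimal sketch of tallying a model's parameters. Each parameter is one
# learned weight that shapes how the network transforms its input.
# Loading full-precision weights needs substantial RAM (tens of gigabytes).
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e9:.2f}B parameters")  # expect a value close to 7B
```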
Mistral 7B, according to the company, “outperforms all currently available open models up to 13B parameters on all standard English and code benchmarks.” That includes the 13-billion-parameter version of Llama 2, the advanced language model Meta released earlier this year. Additionally, Mistral 7B’s performance was “on par” with a 34-billion-parameter version of Meta’s original Llama model, the predecessor to Llama 2.
According to Mistral AI, the model matches the performance of larger neural networks while requiring less hardware. Lowering a model’s hardware requirements cuts operating costs and can also speed up inference. The company therefore expects Mistral 7B to be adopted for use cases that demand low latency.
Mistral 7B is the first in a series of large language models the company intends to release. Upcoming additions to the lineup are expected to support more languages and perform better on reasoning tasks. Over the longer term, Mistral AI plans to host neural networks for businesses.