Highlights:
- Stability AI intends to develop a series of language models, of which StableLM is the first. Future installments in the series are expected to feature more complex architectures.
- The new StableLM model from Stability AI can generate text and code, a set of tasks comparable to those performed by much larger models.
StableLM, an open-source language model that can generate text and code, was recently released by Stability AI Ltd., an artificial intelligence startup.
The startup intends to develop a series of language models, of which StableLM is the first. Future additions to the series are expected to feature more complex architectures.
Stability AI, based in London, is supported by USD 101 million in funding. It is best known as the creator of the open-source neural network Stable Diffusion, which can generate images based on text input. A few days before the latest introduction of the StableLM language model, the startup released a significant update to Stable Diffusion.
StableLM is initially available in two versions. The first comprises three billion parameters, the configuration settings that determine how a neural network processes data. The second version contains seven billion of those settings.
The more parameters a neural network has, the more tasks it can complete. PaLM, a large language model that Google LLC detailed last year, comprises over 500 billion parameters. It has demonstrated the ability to generate code and text and solve relatively complex mathematical problems.
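To give a rough sense of what these parameter counts mean in practice, the sketch below estimates the memory needed just to store a model's weights. It is a back-of-the-envelope illustration, assuming 16-bit (2-byte) weights; actual requirements vary with numerical precision and runtime overhead.

```python
# Back-of-the-envelope memory estimate for holding model weights in memory.
# Assumes each parameter is stored as a 16-bit (2-byte) float; real
# deployments differ (fp32, int8/int4 quantization, activation overhead).

def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate memory needed to hold the weights alone, in gigabytes."""
    return num_params * bytes_per_param / 1e9

for name, params in [("3B model", 3_000_000_000),
                     ("7B model", 7_000_000_000),
                     ("540B model", 540_000_000_000)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB at 16-bit precision")
```

The gap between a 7-billion-parameter model and a 500-billion-plus model is therefore not just capability but also the hardware required to run it.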
The new StableLM model from Stability AI can perform comparable operations. However, the startup has yet to disclose specific information regarding the model’s capabilities. Stability AI intends to publish a technical overview of StableLM later on.
While the startup did not reveal specific information about StableLM’s capabilities, it did detail how the model was trained. Stability AI created it using an enhanced version of The Pile, an open-source training dataset. The enhanced dataset contains 1.5 trillion tokens, data elements that each consist of a few characters.
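To illustrate what a token is, the toy sketch below splits text into small subword units by greedily matching against a tiny hand-written vocabulary. This is only an illustration of the concept; StableLM uses a learned subword vocabulary, not this scheme or word list.

```python
# Toy illustration of tokenization: breaking text into small subword units.
# The vocabulary and greedy longest-match scheme are hypothetical; real
# language models learn their subword vocabularies from data.

TOY_VOCAB = {"open", "source", "lang", "uage", "model", "s", " ", "-"}

def toy_tokenize(text: str) -> list[str]:
    """Greedily match the longest vocabulary entry at each position."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest substring first
            if text[i:j] in TOY_VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character becomes its own token
            i += 1
    return tokens

print(toy_tokenize("open-source language models"))
```

A 1.5-trillion-token dataset is simply a corpus that, after this kind of splitting, yields 1.5 trillion such units.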
StableLM is licensed under the CC BY-SA 4.0 open-source license. The model can be used in research and commercial endeavors, and its code can be modified as needed.
Stability AI stated in a blog post, “We open-source our models to promote transparency and foster trust. Researchers can ‘look under the hood’ to verify performance, work on interpretability techniques, identify potential risks, and help develop safeguards. Organizations across the public and private sectors can adapt (‘fine-tune’) these open-source models for their own applications.”
Stability AI also released five StableLM variants trained on datasets beyond The Pile. Training an artificial intelligence model on additional data enables it to incorporate more information into its responses and perform new tasks. The five specialized variants of StableLM may be restricted to academic research use.
Among the datasets Stability AI used to train the specialized variants of StableLM was Dolly, a collection of 15,000 chatbot queries and replies that Databricks Inc. released earlier this month. Databricks used the dataset to train an advanced language model of its own, which, like StableLM, is available under an open-source license.
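The sketch below shows the general shape of one such query-and-reply record and how a fine-tuning pipeline might turn it into a training example. The field names follow the instruction/context/response pattern common to instruction-tuning datasets; treat them and the prompt template as illustrative assumptions, not Dolly's exact schema.

```python
import json

# Illustrative shape of an instruction-tuning record such as those in Dolly.
# Field names and prompt template are assumptions modeled on common
# instruction-tuning conventions, not a documented Dolly schema.

record = json.loads("""
{
  "instruction": "What is an open-source language model?",
  "context": "",
  "response": "A language model whose weights are publicly released.",
  "category": "open_qa"
}
""")

# A fine-tuning pipeline would render each record into a prompt/target pair:
prompt = f"### Instruction:\n{record['instruction']}\n\n### Response:\n"
target = record["response"]
print(prompt + target)
```

Fine-tuning on thousands of such pairs teaches a base model to answer questions in a conversational style, which is the role these specialized StableLM variants play.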
StableLM is in the alpha phase and is the first language model Stability AI has released. As part of its development roadmap, the startup plans to create StableLM variants with 15 billion to 65 billion parameters.