Highlights:
- AI21 states that Maestro automatically enforces user-defined constraints, reducing the need for manual coding.
- In internal testing, AI21 used Maestro with several popular LLMs and found that it can improve model accuracy by up to 50% in some cases.
AI21 Labs Ltd. introduced Maestro, a software system that promises to improve the output quality of large language models.
Israel-based AI21 is an AI startup backed by USD 336 million in funding from Nvidia Corp., Google LLC, and other investors. It specializes in enterprise-focused LLMs known as Jamba. These models can process prompts of up to 256,000 tokens and incorporate retrieval-augmented generation (RAG), a technique that enables AI to analyze information beyond its training data.
Before deploying an LLM in production, enterprises typically implement measures to minimize output quality issues. This often involves setting up software workflows that automatically detect and correct errors in AI-generated responses. While effective in reducing hallucinations, these workflows can be complex to develop and maintain.
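AI21 has not detailed how customers build these workflows, but in practice they often come down to a validate-and-retry loop wrapped around the model call. The Python sketch below illustrates that generic pattern only; `call_llm` and the example checks are hypothetical stand-ins, not any vendor's actual code.

```python
# Generic validate-and-retry guardrail loop; purely illustrative, not AI21's code.

def call_llm(prompt: str) -> str:
    """Placeholder for any chat-completion client call."""
    return "stub answer for: " + prompt[:60]

def violations(answer: str) -> list[str]:
    """Return a list of rule violations found in the model's answer (example checks only)."""
    problems = []
    if len(answer) > 2000:
        problems.append("answer exceeds length limit")
    if "as an ai language model" in answer.lower():
        problems.append("boilerplate disclaimer present")
    # A production workflow might also verify citations, validate JSON structure,
    # or run a separate fact-checking model here.
    return problems

def answer_with_guardrails(prompt: str, max_retries: int = 3) -> str:
    for _ in range(max_retries):
        answer = call_llm(prompt)
        issues = violations(answer)
        if not issues:
            return answer
        # Feed the detected problems back into the prompt and try again.
        prompt = f"{prompt}\n\nThe previous answer had these problems: {issues}. Please fix them."
    raise RuntimeError("could not produce a compliant answer")

print(answer_with_guardrails("Summarize the Q3 report in two sentences."))
```

Maintaining many such checks, retry policies, and model-specific prompts by hand is the overhead Maestro is pitched to remove.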
AI21’s newly launched Maestro platform aims to streamline this process. Described as an AI planning and orchestration system, it reduces the effort required to mitigate LLM output errors while simplifying several related tasks.
To use Maestro, users provide a prompt along with specific requirements that must be met during processing. For instance, they can set a limit on the cost of generating an LLM response. AI21 states that Maestro automatically enforces these user-defined constraints, reducing the need for manual coding.
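AI21 has not published Maestro's interface, so the sketch below only illustrates the general idea of attaching machine-checkable requirements, such as a cost ceiling, to a request. Every name and the price constant are assumptions made for the example, not Maestro's API or real pricing.

```python
# Illustrative cost-ceiling check around an LLM call; names and the price
# constant are assumptions for the example, not Maestro's API or real pricing.
from dataclasses import dataclass

PRICE_PER_1K_TOKENS_USD = 0.002   # assumed example rate

def call_llm(prompt: str, max_tokens: int) -> str:
    """Placeholder for any chat-completion client."""
    return f"stub answer (capped at {max_tokens} tokens)"

@dataclass
class Requirements:
    max_cost_usd: float            # spending ceiling for this single request
    max_output_tokens: int = 512   # cap on response length

def estimated_cost(prompt: str, reqs: Requirements) -> float:
    prompt_tokens = len(prompt) / 4          # rough heuristic: ~4 characters per token
    total_tokens = prompt_tokens + reqs.max_output_tokens
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS_USD

def run_with_requirements(prompt: str, reqs: Requirements) -> str:
    if estimated_cost(prompt, reqs) > reqs.max_cost_usd:
        raise ValueError("request would exceed the declared cost budget")
    return call_llm(prompt, max_tokens=reqs.max_output_tokens)

print(run_with_requirements("Draft a two-line product description.",
                            Requirements(max_cost_usd=0.01)))
```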
When handling complex prompts, Maestro breaks the task into smaller substeps, a method proven to enhance LLM response quality. It then runs simulations to determine the most efficient way to structure the request for accurate results.
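Decomposing a prompt into substeps is a well-established prompting pattern, sometimes called plan-then-solve. The sketch below shows that generic pattern, not Maestro's algorithm; `call_llm` is again a hypothetical stand-in that returns canned text so the example runs end to end.

```python
# Illustrative plan-then-solve decomposition; a generic prompting pattern,
# not AI21's algorithm. call_llm() is a stand-in for any chat-completion client.

def call_llm(prompt: str) -> str:
    """Placeholder client; returns a canned plan so the example runs end to end."""
    if "one per line" in prompt:
        return "1. Extract the key facts\n2. Check them against the source\n3. Draft the answer"
    return "stub result for: " + prompt.splitlines()[-1][:60]

def decompose(task: str) -> list[str]:
    plan = call_llm(f"Break this task into short, ordered substeps, one per line:\n{task}")
    return [line.strip() for line in plan.splitlines() if line.strip()]

def solve(task: str) -> str:
    work = ""
    for step in decompose(task):
        # Each substep sees the accumulated results of earlier substeps.
        work += call_llm(f"Task: {task}\nWork so far:\n{work}\nNow perform: {step}") + "\n"
    return call_llm(f"Task: {task}\nIntermediate work:\n{work}\nWrite the final answer.")

print(solve("Summarize the contract's termination clauses."))
```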
Maestro evaluates multiple processing strategies and selects the one most likely to produce a correct response. If needed, it can also allocate additional computational resources during inference to improve the accuracy of reasoning-optimized LLMs.
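Picking the candidate most likely to be correct is commonly implemented as best-of-N selection with a verifier. The sketch below shows that generic technique; the random scorer is a stand-in for a real verifier model and has nothing to do with AI21's evaluator.

```python
# Generic best-of-N candidate selection: sample several responses, score each,
# keep the highest-scoring one. The scorer here is a random stand-in; a real
# system would use a verifier model or rule-based checks.
import random

def call_llm(prompt: str, temperature: float = 0.7) -> str:
    """Placeholder for a sampling LLM call."""
    return f"candidate answer #{random.randint(0, 9999)} (temperature={temperature})"

def score(prompt: str, answer: str) -> float:
    """Stand-in quality score in [0, 1]; replace with a verifier model."""
    return random.random()

def best_of_n(prompt: str, n: int = 5) -> str:
    candidates = [call_llm(prompt, temperature=0.9) for _ in range(n)]
    return max(candidates, key=lambda answer: score(prompt, answer))

print(best_of_n("What does clause 7.2 of the agreement require?"))
```

Spending more inference-time compute this way, generating and scoring more candidates, is the same lever the company says Maestro can pull for reasoning-optimized models.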
After generating a response, Maestro checks for errors and logs each step of the process. This log enables users to review and verify how the response was produced, ensuring greater transparency and accuracy.
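A reviewable run typically takes the form of a structured step trace. The minimal sketch below illustrates that idea only; it is not Maestro's actual log format, and the record fields are assumptions chosen for the example.

```python
# Minimal step-trace pattern for reviewable runs; illustrative only,
# not Maestro's actual log format.
from dataclasses import dataclass, field
import time

@dataclass
class StepRecord:
    name: str
    prompt: str
    output: str
    checks_passed: bool
    timestamp: float = field(default_factory=time.time)

trace: list[StepRecord] = []

def record_step(name: str, prompt: str, output: str, checks_passed: bool) -> None:
    trace.append(StepRecord(name, prompt, output, checks_passed))

record_step("decompose", "Break the task into substeps", "3 substeps produced", True)
record_step("final answer", "Write the final answer", "draft v1", True)

# Reviewing the trace shows how the response was assembled, step by step.
for step in trace:
    print(f"{step.name}: checks_passed={step.checks_passed}")
```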
In internal testing, AI21 used Maestro with several popular LLMs and found that it can improve model accuracy by up to 50% in some cases. According to the company, reasoning-optimized LLMs like o3-mini can correctly answer more than 95% of prompts when integrated with Maestro.
AI21 envisions Maestro being applied across various use cases. The system can enhance LLMs’ ability to analyze complex documents, respond to user inquiries more accurately, and automate repetitive business tasks such as data entry.
“Mass adoption of AI by enterprises is the key to the next industrial revolution,” said AI21 Co-Chief Executive Officer Ori Goshen. “AI21’s Maestro is the first step toward that future – moving beyond the unpredictability of available solutions to deliver AI that is reliable at scale.”
Maestro is currently in its early access phase, with AI21 planning a broader release later this year.