Highlights:
- Vectara Inc. now attaches a factual consistency score to every response generated on its platform, estimating the likelihood that the response is a hallucination.
- The company provides Retrieval-Augmented Generation (RAG) as a service.
- Numerous studies find that concerns about bias, inaccuracy, and contextually irrelevant responses, commonly called "hallucinations," are major obstacles to broader deployment of externally facing generative AI models.
Vectara Inc., the developer of a platform that lets organizations ground generative AI models in their own data, has recently added a factual consistency score to every generated response. The anti-hallucination feature is designed to gauge how accurate each response is.
The company provides Retrieval-Augmented Generation, or RAG, as a service. RAG is a natural language processing technique that improves the quality, relevance, and accuracy of generated text by first retrieving pertinent information from large datasets or knowledge bases and incorporating it into the generation process. With RAG, organizations can have their models produce responses grounded in their own documents and databases rather than relying solely on public information.
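To make the retrieve-then-generate flow concrete, here is a minimal Python sketch. The bag-of-words retriever, the sample corpus, and the prompt format are illustrative assumptions, not Vectara's implementation; a production RAG system would use dense vector search over a real document store and send the finished prompt to an LLM.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; real systems use a dense embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    # Stuff the retrieved passages into the prompt so the model answers
    # from the organization's own documents, not just public knowledge.
    context = "\n".join(retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday through Friday, 9am to 5pm.",
    "Gift cards are non-refundable.",
]
# The resulting prompt would be sent to an LLM for the generation step.
print(build_prompt("Can I return an item after three weeks?", corpus))
```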
The company is tackling a significant obstacle to wider enterprise adoption of generative AI. Research has repeatedly found that concerns about bias, inaccuracy, and contextually irrelevant responses, often referred to as "hallucinations," are the primary deterrents keeping companies from deploying generative AI models for external use at scale. According to Vectara's published findings, hallucination rates range from 3% to 16.2%, depending on the large language model used.
Vectara reduces this uncertainty for enterprises by providing a score that estimates the probability that a generated response is a hallucination. The scoring system is built on an enhanced version of the company's Hughes Hallucination Evaluation Model, or HHEM.
HHEM compares the original source document with the summary produced by the LLM and assigns a score from 0 to 1, where 0 denotes complete hallucination and 1 denotes perfect factual consistency.
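Vectara has published an open version of the HHEM checkpoint on Hugging Face. As a hedged sketch, assuming the checkpoint still loads as a sentence-transformers cross-encoder (the pattern its earlier model cards documented; newer releases may require a different loading path), scoring a source/summary pair might look like this:

```python
# Sketch only: assumes `pip install sentence-transformers` and that the
# public checkpoint loads as a cross-encoder, as earlier releases did.
from sentence_transformers import CrossEncoder

model = CrossEncoder("vectara/hallucination_evaluation_model")

source = "The board approved the expansion plan on March 3, 2024."
summary = "The expansion plan was approved by the board in early March 2024."

# predict() takes (source, summary) pairs and returns a factual consistency
# score in [0, 1]: 0 means fully hallucinated, 1 means fully consistent.
score = model.predict([(source, summary)])[0]
print(f"factual consistency: {score:.2f}")
```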
The score is a calibrated probability: a value of 0.98, for example, indicates a 98% likelihood of factual consistency. Most other classifiers, according to the company, ignore calibration. This lets users set an acceptance threshold on the score and act only on responses that clear it.
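Because the score behaves like a probability, gating responses reduces to a simple threshold check. In the sketch below, the 0.8 cutoff and the fallback message are arbitrary illustrative choices, not Vectara defaults:

```python
ACCEPT_THRESHOLD = 0.8  # illustrative cutoff; tune to the application's risk tolerance

def gate_response(answer: str, consistency_score: float) -> str:
    # A calibrated score of 0.8 means an estimated 80% chance the answer
    # is factually consistent with its source documents.
    if consistency_score >= ACCEPT_THRESHOLD:
        return answer
    return "Low-confidence answer withheld; please consult the source documents."

print(gate_response("Refunds are allowed within 30 days.", 0.95))
print(gate_response("Refunds are allowed within 90 days.", 0.42))
```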
Vectara claims that HHEM is the leading hallucination detection model on Hugging Face, with more than 100,000 downloads in five months.