Highlights:

  • Vectara can extract data from a wide range of document types, including Office and PDF files, as well as from code in JSON, HTML, XML, and CommonMark formats.
  • The company unveiled a new feature called “Grounded Generation” that, to reduce AI “hallucinations,” supplements queries with information drawn from a business’s own data.

Vectara Inc., a provider of generative AI search platforms, recently disclosed that it had raised USD 28 million in seed funding, led by Race Capital, to give developers access to a new feature that significantly lowers AI errors when producing search results.

Businesses can have intelligent conversations with their own data, such as documents, knowledge bases, and code, using Vectara’s cloud-based “search-as-a-service” platform, which is built on a conversational generative large language model (LLM). Its proprietary AI is comparable to OpenAI LP’s ChatGPT but draws on company-specific data. The business offers an application programming interface (API) so programmers can easily access its service and incorporate it into their applications, websites, chatbots, and help desks.

To lessen “hallucinations” caused by artificial intelligence, the company introduced a new feature called “Grounded Generation” that supplements queries with information from the company’s own data. A hallucination occurs when an LLM, like ChatGPT, confidently provides a response that is insufficient, biased, or seemingly random and entirely incorrect.

These problems have dogged the industry from the start, with notable examples including the strange behavior of Microsoft Corp.’s Bing Chat and a factual error made by Google LLC’s Bard in its very first demo.

LLM makers have tried to address these issues by giving AIs more memory to work with, aligning training datasets more closely with the data the models will use, and limiting the answers they can provide, but hallucinations continue to be a problem.

Vectara takes a different angle on the issue with Grounded Generation. Rather than training the AI model on a company’s documents and knowledge base, it uses search to pull relevant information from the company’s data and adds it to the conversational prompt. As a result, the AI’s responses can be constrained to actual answers to the search query, which can then be verified and supported by citations to the company’s records and other data.
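The general idea of augmenting a prompt with retrieved data can be illustrated with a minimal Python sketch. This is not Vectara’s implementation or API: the `Passage` type, the word-overlap `retrieve` function, the prompt wording, and the sample records are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import re
from dataclasses import dataclass


@dataclass
class Passage:
    doc_id: str  # hypothetical identifier for a company record
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, corpus: list[Passage], top_k: int = 2) -> list[Passage]:
    """Toy retriever: rank passages by how many words they share with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(p.text)), p) for p in corpus]
    # Drop passages with no overlap, then sort by overlap count, best first.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [p for _, p in ranked[:top_k]]


def build_grounded_prompt(query: str, passages: list[Passage]) -> str:
    """Place the retrieved passages, tagged with citation ids, ahead of the question."""
    sources = "\n".join(f"[{p.doc_id}] {p.text}" for p in passages)
    return (
        "Answer using only the sources below and cite them by id.\n"
        f"Sources:\n{sources}\n"
        f"Question: {query}"
    )


# Made-up company records, used only for illustration.
CORPUS = [
    Passage("policy-12", "Refunds are issued within 14 days of a return."),
    Passage("faq-3", "Shipping takes three to five business days."),
    Passage("hr-7", "Employees accrue vacation monthly."),
]

question = "How many days do refunds take?"
prompt = build_grounded_prompt(question, retrieve(question, CORPUS))
print(prompt)
```

The resulting prompt would then be sent to a language model. Because each source carries an id, any claim in the answer can be traced back to a specific record, which is what makes verification and citation possible.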

In addition to significantly reducing hallucinations, Vectara claims it improves search effectiveness by combining several search techniques, including semantic search, Boolean search, and exact keyword matching, to produce highly relevant results. The searches can still find information that has been mistyped, has alternative meanings, or is in a different language.
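How several search techniques might be blended into one ranking can be sketched as follows. This is an assumption-laden toy, not Vectara’s method: character trigram similarity stands in for neural semantic search (it tolerates typos but does not handle synonyms or other languages), and the `hybrid_search` function, its `must_contain` filter, and the `alpha` weight are hypothetical names introduced for illustration.

```python
import math
from collections import Counter


def trigrams(text: str) -> Counter:
    """Character trigram counts; a crude stand-in for a neural embedding."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hybrid_search(query, docs, must_contain=(), alpha=0.7):
    """Rank docs by a weighted mix of fuzzy similarity and exact keyword hits.

    must_contain acts as a Boolean AND filter; alpha (an assumed weight)
    balances the fuzzy score against the exact-match score.
    """
    query_vec = trigrams(query)
    query_terms = query.lower().split()
    results = []
    for doc in docs:
        lowered = doc.lower()
        if not all(term.lower() in lowered for term in must_contain):
            continue  # Boolean filter: skip docs missing a required term
        fuzzy = cosine(query_vec, trigrams(doc))
        doc_words = set(lowered.split())
        # Fraction of query terms that appear verbatim as whole words.
        exact = sum(term in doc_words for term in query_terms) / max(len(query_terms), 1)
        results.append((alpha * fuzzy + (1 - alpha) * exact, doc))
    return [doc for _, doc in sorted(results, key=lambda r: r[0], reverse=True)]


# Made-up documents, used only for illustration.
DOCS = [
    "reset your password from the account page",
    "invoice history and billing",
    "password rules for admin accounts",
]

print(hybrid_search("pasword reset", DOCS))            # query contains a typo
print(hybrid_search("password", DOCS, must_contain=["admin"]))
```

Blending the scores lets a misspelled query still surface the right document through the fuzzy component, while exact matches and required terms keep precise queries precise.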

Amr Awadallah, Co-founder and Chief Executive of Vectara, said, “These new features and capabilities make Vectara’s neural retrieval platform among the best in the world. The breakthroughs our team has accomplished over the last eight months are changing the face of AI and how companies can safely use it to expand and improve their value propositions.”

Vectara can extract data from a wide range of document types, including Office and PDF files, as well as from code in JSON, HTML, XML, and CommonMark formats.

The business emphasized that customer privacy can be easily maintained because its LLMs are not trained on company data, such as indexed data or queries. Vectara can therefore offer companies customizable data retention models, such as ones that let them discard loaded documents after indexing and keep neither the original text nor any leftover data.

Along with the funding round, the company also established a new strategic board of advisors made up of AI-sector veterans from organizations such as Cloudera, Stanford University, and Northeastern University. Matei Zaharia, Databricks Inc.’s Co-founder and Chief Technologist, will also serve on the board.

Zaharia said, “Vectara’s new funding round will be instrumental in developing and expanding access to the platform’s breakthrough hybrid search and grounded generative AI features.”