Highlights:
- The work of developers is made more difficult by the fact that many conventional databases were not created with AI model embeddings in mind and amp; the Chroma database addresses this issue.
- Chroma claims to use the most recent funding round to develop new features. The startup is preparing a feature that will enable developers to judge whether the database’s retrieved information is pertinent to a particular query, among other additions.
Chroma Inc., a provider of databases, recently disclosed that it has secured USD 18 million in seed funding.
Quiet Capital headed the investment. In addition, executives from over a dozen other tech firms, including Hugging Face Inc. and publicly traded database maker MongoDB Inc., contributed. A pre-seed round of undisclosed size was previously closed by Chroma last year.
Chroma creates an open-source database intended to power applications for artificial intelligence. The database has been downloaded over 35,000 times since it was made available less than two months ago.
Decisions made by AI models are based on a data bank. For instance, a shopping recommendation model might keep a database of the most recent product listings. Information about hacker activity is stored in neural networks designed for cybersecurity tasks.
AI models store their data as abstract mathematical structures known as vectors rather than in their raw form. An embedding is a collection of such vectors. The open-source database for Chroma, also known as Chroma, was created specifically to house the embeddings of AI models.
In a recent blog post, Huber and his founding partner Anton Troynikov wrote, “Developers use Chroma to give LLMs pluggable knowledge about their data, facts, tools, and prevent hallucinations. Many developers have said they want ‘ChatGPT but for my data’ — and Chroma provides the ‘for my data’ bridge through embedding-based document retrieval.”
The ability of embeddings to highlight similarities between the data points they store is one of their most crucial features. For instance, an embedding with vectors representing phones can draw attention to phones with similar prices. On the other hand, it is also possible to identify phones with a sizable price disparity.
The ability of embeddings to draw attention to similarities between data points is crucial to the operation of AI models. For instance, recommendation models produce shopping recommendations by looking up comparable goods based on a user’s purchase. When detecting malware, neural networks search for the network activity that resembles well-known hacking techniques.
Additionally, embeddings allow for the detection of differences between two objects, which is useful for AI applications. In the cybersecurity market, a few AI-powered breach prevention tools work by identifying how users frequently use an application. Then, these tools search for an activity that differs from a customer’s typical access patterns.
The work of developers becomes more complicated because many conventional databases were not created with AI model embeddings in mind. The database created by Chroma addresses that problem. The startup claims that because its platform is specially tailored to store AI embeddings, it can offer relatively straightforward developer experiences.
Specialized algorithms transform the data an AI model consumes into embeddings it can process. Chroma claims that its database has features that make using such algorithms simpler. As a result, software teams now need more manual work.
Chroma supports many algorithms made for open source embedding generation. It can also simplify using various paid tools in the group, such as OpenAI LLC’s cloud-based embedding service. Developers can use their unique algorithms if they have more complex requirements.
Chrome provides an in-memory mode to expedite queries. Databases typically store information on disk or flash storage and only access it from memory when necessary. An in-memory system doesn’t wait to retrieve data from storage; instead, it keeps information in RAM immediately, speeding up computations.
Chroma claims to use the most recent funding round to develop new features. The startup is preparing a feature that will enable developers to judge whether the database’s retrieved information is pertinent to a particular query, among other additions. A managed, commercial version of its database is also being created; it will debut in the third quarter.