Highlights:
- Neo4j stated that the collaboration will empower Azure users to organize unstructured data and seamlessly load it into a knowledge graph.
- Developers can utilize Azure OpenAI embedding APIs to generate embeddings and store them in the Neo4j database.
Neo4j Inc. and Microsoft Corp. unveiled a collaboration to integrate graph database functionalities into Microsoft’s Fabric and Azure OpenAI services. The Neo4j and Microsoft collaboration aims to assist users in uncovering patterns and relationships in structured and unstructured data that may not be as readily apparent in other data stores.
The Azure OpenAI Service offers access to managed versions of various OpenAI language models. Fabric integrates existing Microsoft platforms, including Data Factory, Synapse, and Power BI, into a unified software-as-a-service product.
Neo4j stated that the collaboration will empower Azure users to organize unstructured data and seamlessly load it into a knowledge graph. Users can then leverage Neo4j tools such as Bloom data visualization or the Power BI connector for business intelligence.
For example, developers can use Azure Data Factory to ingest data from Microsoft’s OneLake data lake into Neo4j, extract data from the Azure Synapse Data Warehouse using the Neo4j data warehouse connector, run graph data science algorithms from Synapse data science notebooks, and build interactive dashboards in Power BI on top of Neo4j knowledge graphs.
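The "ingest into Neo4j" step of a pipeline like the one above is commonly done with a parameterized, batched `UNWIND ... MERGE` Cypher statement. The sketch below assumes rows pulled from upstream arrive as a list of dicts; the `Part` label and `part_id`/`name` properties are hypothetical, not from the article.

```python
# Hypothetical sketch: batch-loading tabular rows into Neo4j.
# Labels and property names (Part, part_id, name) are illustrative only.

LOAD_PARTS_CYPHER = """
UNWIND $rows AS row
MERGE (p:Part {part_id: row.part_id})
SET p.name = row.name
"""

def to_batches(rows, batch_size=1000):
    """Split rows into fixed-size batches so each write transaction stays small."""
    for i in range(0, len(rows), batch_size):
        yield rows[i:i + batch_size]

def load_rows(session, rows):
    """Write each batch in its own transaction via the official neo4j Python driver."""
    for batch in to_batches(rows):
        session.execute_write(
            lambda tx, b=batch: tx.run(LOAD_PARTS_CYPHER, rows=b)
        )

if __name__ == "__main__":
    rows = [{"part_id": i, "name": f"part-{i}"} for i in range(2500)]
    print([len(b) for b in to_batches(rows)])
```

Batching keeps transactions bounded, which matters when a Data Factory pipeline hands over millions of rows at once.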
New Take on RAG
Azure users will also gain access to GraphRAG, which uses a large language model to generate a knowledge graph from a private dataset. GraphRAG is an advanced iteration of retrieval-augmented generation. It outperforms conventional RAG on questions that require connecting disparate pieces of information through shared attributes, and on questions that require a holistic understanding of semantic concepts summarized across large data collections.
A typical use case for GraphRAG involves a search that combines an unstructured document, such as an automobile engine manual, with a structured bill of materials outlining the parts. “You can do a dual search between a vectorized document search plus a graph search and blend it into one single answer,” said Sudhir Hasbe, Neo4j’s chief product officer.
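The "blend it into one single answer" step Hasbe describes can be thought of as score fusion over two result sets. This is a minimal, runnable sketch, not Neo4j's actual implementation: the two score dicts stand in for results from a vector index query and a graph traversal, and the weighting scheme is an assumption.

```python
# Hypothetical sketch of blending a vector search and a graph search
# into one ranked answer via weighted score fusion.

def blend_results(vector_scores, graph_scores, alpha=0.5):
    """alpha weights the vector side, (1 - alpha) the graph side.
    Items missing from one side contribute 0 from that side."""
    keys = set(vector_scores) | set(graph_scores)
    combined = {
        k: alpha * vector_scores.get(k, 0.0)
           + (1 - alpha) * graph_scores.get(k, 0.0)
        for k in keys
    }
    # Highest combined score first
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    # e.g. engine-manual chunks scored by embedding similarity...
    vector_scores = {"manual_chunk_42": 0.91, "manual_chunk_7": 0.55}
    # ...and parts scored by proximity in the bill-of-materials graph
    graph_scores = {"part:gasket-12": 0.80, "manual_chunk_42": 0.30}
    print(blend_results(vector_scores, graph_scores))
```

An item that scores well on both sides (like `manual_chunk_42` here) naturally rises to the top, which is the point of the dual search.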
Neo4j offers long-term memory for LLMs, with native support for vector embeddings: numerical representations of otherwise non-numerical data objects that concisely capture their inherent properties and relationships. Vector embeddings have extensive applications in natural language processing and recommendation systems.
The software also provides vector storage and search capabilities. Using the Azure OpenAI embedding APIs, developers can generate embeddings and store them in the Neo4j database.
Hasbe said, “I can take a large document, chunk it using various technologies, use Azure OpenAI to create vector embeddings, and Neo4j can store it at every node or relationship level.” He demonstrated an example wherein the Wikipedia entry for Microsoft Chief Executive Satya Nadella was parsed, showcasing relationships between elements in the biography as graph nodes.
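The chunk → embed → store flow Hasbe describes can be sketched as follows. The chunker is real and runnable; `get_embedding` is a placeholder for a call to the Azure OpenAI embeddings API, and the `Chunk` label and property names in the Cypher are hypothetical.

```python
# Hypothetical sketch of chunking a document, embedding each chunk,
# and storing the vector as a node property in Neo4j.

def chunk_text(text, max_words=200, overlap=20):
    """Split a document into overlapping word-window chunks."""
    words = text.split()
    step = max_words - overlap
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), step)]

STORE_CHUNK_CYPHER = """
MERGE (c:Chunk {chunk_id: $chunk_id})
SET c.text = $text,
    c.embedding = $embedding
"""

def index_document(session, doc_id, text, get_embedding):
    """Embed each chunk and store it on its own node.
    get_embedding would wrap an Azure OpenAI embeddings call in practice."""
    for i, chunk in enumerate(chunk_text(text)):
        session.run(
            STORE_CHUNK_CYPHER,
            chunk_id=f"{doc_id}-{i}",
            text=chunk,
            embedding=get_embedding(chunk),
        )
```

Storing the embedding directly on each `Chunk` node mirrors Hasbe's point that vectors can live "at every node or relationship level" alongside the graph structure itself.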
He added, “We provide similar capabilities to any vector database now, with vector store and vector search built into the database. We see more and more production workloads blending vectors with knowledge graphs in a single solution.”
Neo4j and Microsoft are also collaborating to deliver the graph database as a native workload for graph analytics on Microsoft Fabric. Hasbe said the companies aim to release a preview of the new capabilities in the second quarter of 2024.