Highlights:
- Nvidia confirmed that the personalized assistants will be able to handle inquiries similar to those typically posed to ChatGPT, such as requests for restaurant recommendations.
- Customizability is the main benefit of a local chat assistant. Users can tailor the interaction by controlling which content the assistant can access to generate responses.
Nvidia Corp. is again leading the way in artificial intelligence, introducing a groundbreaking feature known as Chat with RTX. This feature empowers users to craft their personal artificial intelligence (AI) assistant directly on their laptop or personal computer, bypassing the need for cloud-based solutions.
Nvidia Corp.’s Chat with RTX is offered as a free technology showcase that highlights how users can access customized AI functionality directly from their devices. It harnesses retrieval-augmented generation (RAG) techniques and Nvidia’s TensorRT-LLM software. Despite this advanced technology, it’s designed to use computing resources efficiently, so users should experience no discernible decrease in their machine’s performance.
Furthermore, Chat with RTX operates on the user’s machine, so all conversations remain private. This ensures that no one else will have access to or knowledge of the discussions held with their personal AI chatbot. Until now, generative AI chatbots like ChatGPT have primarily been confined to the cloud, operating on centralized servers powered by Nvidia’s graphics processing units (GPUs).
With Chat with RTX, this paradigm shifts, allowing generative AI to operate locally, leveraging the computing power of the GPU within the user’s computer. To utilize this feature, users will require a laptop or PC equipped with a GeForce RTX 30 Series GPU or a more recent model, such as the newly introduced RTX 2000 Ada Generation GPU. Additionally, they’ll need to have a minimum of 8 gigabytes of video random-access memory (VRAM).
The primary benefit of having a local chat assistant is the ability for users to personalize it according to their preferences. They can control the type of content the assistant can access to generate responses, ensuring a tailored interaction experience. Users also gain privacy advantages and faster response generation, since running locally avoids the latency typically associated with cloud-based solutions.
Chat with RTX utilizes RAG techniques, allowing it to enhance its foundational knowledge by incorporating additional data sources, such as local files stored on the user’s computer. Furthermore, the TensorRT-LLM and Nvidia RTX acceleration software provide a significant speed enhancement to the overall performance. Nvidia also mentioned that users can select from various underlying open-source large language model (LLM) options, such as Llama 2 and Mistral.
In Nvidia’s Chat with RTX, the personalized assistants can handle inquiries similar to those typically posed to ChatGPT, such as requests for restaurant recommendations. Moreover, they will offer contextual responses when needed, directing users to the pertinent file from which the information was derived.
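To make the idea of retrieval-augmented generation with source attribution more concrete, here is a minimal Python sketch. It is not Nvidia’s implementation and does not use TensorRT-LLM: the file names, sample text, and helper functions (`tokenize`, `cosine`, `retrieve`, `build_prompt`) are all hypothetical, and a simple word-overlap score stands in for the embedding-based retrieval a real system would likely use. The sketch only shows the general pattern: find the most relevant local file, place its text in the prompt, and keep track of which file the context came from.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, top_k=1):
    """Return the names of the top_k documents that best match the query."""
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(text))), name)
        for name, text in documents.items()
    ]
    scored.sort(reverse=True)
    # Drop documents that share no words with the query.
    return [name for score, name in scored[:top_k] if score > 0]


def build_prompt(query, documents, top_k=1):
    """Build an LLM prompt from retrieved context and report the source files."""
    sources = retrieve(query, documents, top_k)
    context = "\n".join(f"[{name}] {documents[name]}" for name in sources)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return prompt, sources


# Hypothetical local files standing in for a user's own documents.
documents = {
    "trip_notes.txt": "My partner recommended the restaurant Red Lantern in Las Vegas.",
    "budget.txt": "Monthly budget: rent, groceries, and GPU upgrade savings.",
}

prompt, sources = build_prompt(
    "Which restaurant did my partner recommend?", documents
)
print(sources)
print(prompt)
```

In a full pipeline, the returned prompt would be passed to a locally running model, and the `sources` list is what would let the assistant point the user back to the file behind its answer.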
In addition to accessing local files, Chat with RTX users can specify the sources they want the chatbot to utilize on platforms like YouTube. Users can instruct their personal chat assistant to offer travel recommendations solely based on the content of their favorite YouTubers, tailoring the suggestions to their specific preferences and interests.
Users also need to be running Windows 10 or 11 and have the latest Nvidia GPU drivers installed on their device to fully utilize Chat with RTX’s capabilities.
Developers can also experiment with Chat with RTX through the TensorRT-LLM RAG reference project available on GitHub. The company is hosting a Generative AI on Nvidia RTX contest for developers, encouraging them to submit applications that harness the technology’s capabilities. Prizes for the competition include a GeForce RTX 4090 GPU and an exclusive invitation to the 2024 Nvidia GTC conference scheduled for March.
Holger Mueller of Constellation Research Inc. remarked that with Chat with RTX, Nvidia is shifting its focus away from the cloud and data center, aiming to establish itself as a PC software platform. He explained, “It provides the key benefits of privacy, flexibility, and performance for generative artificial intelligence applications that can run locally on the machine. For Nvidia, this is primarily about developer adoption, and that is a smart move as the biggest winners in the AI race will be the software platforms that have the most developers using them.”