Nvidia's Chat with RTX will connect an LLM with YouTube videos and documents locally on your PC

Summary

Nvidia's Chat with RTX allows users to converse with documents and YouTube videos using AI technology, powered by retrieval-augmented generation (RAG).
Minimum requirements for using Chat with RTX include an RTX 30 or RTX 40 series GPU, minimum 8GB vRAM, Windows 10 or Windows 11, latest Nvidia drivers, and up to 100GB of storage.
Users can create datasets, ask questions about documents, and obtain accurate transcripts from YouTube videos using Chat with RTX, providing a faster and more privacy-conscious alternative to watching entire videos.

With AI being all the rage in recent months, there has been somewhat of a struggle to find uses that actually make sense for it. While it's a powerful technology, most people don't have a reason to use it, though some aspects of it work well in certain industries. Nvidia's Chat with RTX feature is one that strives to give AI a purpose, and in this case, can be used to converse with documents and YouTube videos.

Nvidia utilizes a technology called retrieval-augmented generation (RAG) to power much of Chat with RTX. It improves prediction accuracy by using an external dataset during inference and lacing the responses with relevant information from the documents that are in its dataset. This also works around the "knowledge cut-offs" that LLMs have, which people may have experienced with ChatGPT, because answers can draw on documents that are newer than the model's training data.
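To make the idea concrete, here is a minimal sketch of the retrieval half of RAG in Python. It is not how Chat with RTX is implemented: the function names are hypothetical, the sample sentences are made up, and simple word-overlap scoring stands in for the embedding search a real system would likely use. The point is only to show how relevant text gets found and placed in front of the model's prompt.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, top_k=1):
    """Return the top_k documents whose wording best matches the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine_similarity(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Prepend the retrieved text to the question before it goes to the LLM."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Made-up sample documents, purely for illustration.
docs = [
    "The remaster adds ray tracing to the classic game.",
    "The new driver fixes a fan control bug.",
]
print(build_prompt("What does the remaster add to the game?", docs))
```

The model never needs to have seen the documents during training. It only needs to read the retrieved context at question time, which is why a knowledge cut-off matters less.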

Following our brief testing of the feature at this year's CES, we gained access to Chat with RTX over the past weekend and have been playing around with it. There are some hefty requirements to use it though, so it's not for the faint of heart.

Chat with RTX: What you'll need

These are the minimum requirements for Chat with RTX

Nvidia RTX 4080 Super graphics card signed by Jensen Huang (Image: Nvidia)

To use Chat with RTX, you'll need to meet the following requirements:

An RTX 30 or RTX 40 series GPU
Minimum 8GB vRAM
Windows 10 or Windows 11
The latest Nvidia drivers
Up to 100GB of storage
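
If you're not sure whether your GPU qualifies, the series and vRAM checks can be sketched in a few lines of Python. This assumes you feed it a line in the "name, memory" format that `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader` prints. The function names are hypothetical, the 256 MiB tolerance is my own assumption (drivers can report slightly less than the nominal capacity), and it doesn't check Windows, drivers, or storage.

```python
import re

# 8GB minimum from the list above, minus an assumed 256 MiB tolerance
# because drivers may report slightly less than the nominal capacity.
MIN_VRAM_MIB = 8 * 1024 - 256


def parse_gpu_line(line):
    """Split a 'name, 12345 MiB' line into (name, vram_in_mib)."""
    name, memory = line.rsplit(",", 1)
    return name.strip(), int(memory.strip().split()[0])


def meets_requirements(line):
    """Check only the GPU series and vRAM requirements."""
    name, vram_mib = parse_gpu_line(line)
    # Matches model numbers such as "RTX 3070" or "RTX 4080".
    is_supported_series = re.search(r"RTX\s*[34]0\d{2}", name) is not None
    return is_supported_series and vram_mib >= MIN_VRAM_MIB


print(meets_requirements("NVIDIA GeForce RTX 4080 SUPER, 16376 MiB"))
```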

The advantages of using Chat with RTX include the ability to process data locally, eliminating the need to send potentially personal information to a cloud server. However, this requires having sufficiently powerful hardware to support it.

Setting up Chat with RTX

Takes a while, but fairly straightforward

The Chat with RTX home page on a Windows PC

Unlike other tools aimed at enthusiasts who are willing to get knee-deep in terminal commands to play around with AI, Nvidia recognizes that people may just want a program that works. Chat with RTX is distributed as a single 37GB zip file which, once extracted, reveals a setup.exe file. Running that will take you through all of the steps to install Chat with RTX and, at the end, will launch it for you in a web browser. It genuinely couldn't be easier.

Once it launches, you'll be directed to a webpage featuring a prompt window, four pre-filled prompts, and an option to select an AI model. You can skip adjusting the settings and stick with the default options, but doing so mostly limits you to querying the information supplied by Nvidia. That said, you can still ask about any topic, as the default model, Mistral 7B, doesn't restrict you to the provided dataset. Even so, the real point of Chat with RTX is to query your own data.

You can also change the AI model to Meta's Llama 2 if you want, though it requires more resources as it's a 13B model. Mistral 7B also outperforms Meta's 13B Llama 2 model in pretty much every conceivable way, so there's no reason not to use Mistral for better quality answers and lower resource usage. Either way, at this point you're ready to start playing around with it.

Using Chat with RTX

Creating a dataset

Asking Chat with RTX what's important about Half-Life 2 RTX

The best way to use Chat with RTX is to create a dataset, point it to it, and then ask questions. You can do that by creating a folder anywhere on your PC and filling it with text or PDF documents. These PDF documents can be obtained from anywhere, and Chat with RTX comes with a few text files that you can ask questions about as well. Once you point it at a folder, you'll need to wait a couple of minutes while it generates datapoints from the files in the folder.
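
If you want to check what's in a folder before pointing Chat with RTX at it, a short Python sketch like this lists the text and PDF files it contains. The function name is hypothetical, and it only looks at the top level of the folder, since I can't confirm whether Chat with RTX itself scans subfolders.

```python
from pathlib import Path

# Text and PDF documents, the two file types mentioned above.
SUPPORTED_SUFFIXES = {".txt", ".pdf"}


def list_dataset_files(folder):
    """Return supported files directly inside the folder, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )


if __name__ == "__main__":
    # Lists the current directory as a stand-in for your dataset folder.
    for path in list_dataset_files("."):
        print(path.name)
```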

This feature works quite well, and is akin to ChatPDF, a service where you can upload a PDF and ask questions about the document. This runs entirely locally on your PC, though, and doesn't require any outside connection. Not only is it better from a privacy standpoint, but it's quicker and free if you own an Nvidia RTX 30 or 40 series card, too. It generates text extremely quickly too, giving the answer in the above screenshot in seconds.

Using a YouTube video

Asking Chat with RTX about a YouTube video comparing the OnePlus 12 and the Google Pixel 8 Pro

Where Chat with RTX excels above anything else I've really seen is in how it can handle YouTube videos. It has support built-in for downloading transcripts from YouTube videos, and can then answer questions about what the video is talking about. While it requires the video to have accurate transcriptions, it's a great way to get the general gist of a video without needing to watch the entire thing.

I used it on one of XDA's YouTube videos, and found that it was quite fast as well. It's definitely significantly quicker than sitting down and watching the entire thing, and you can ask more questions about the video rather than just the conclusion. For anything important, you should verify the answer is correct, though. The XML file with the text from the transcript will go into the youtube_dataset folder under 'trt-llm-rag-windows-main', so you can check that file instead of going through the whole video to verify if you wish.
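
Checking that XML file by hand can be tedious, so here is a small Python sketch that pulls the readable text out of it and searches for a phrase. I haven't documented the file's exact schema here, so the sketch deliberately ignores tag names and joins every text node it finds. The function names and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def transcript_text(xml_string):
    """Join every text node in the XML into one whitespace-normalized string."""
    root = ET.fromstring(xml_string)
    return " ".join(" ".join(root.itertext()).split())


def mentions(xml_string, phrase):
    """Case-insensitive check for a phrase anywhere in the transcript text."""
    return phrase.lower() in transcript_text(xml_string).lower()


# Made-up sample; the real file's tags and attributes may differ.
sample = "<transcript><text>The camera is</text><text>very sharp.</text></transcript>"
print(transcript_text(sample))
print(mentions(sample, "very sharp"))
```

To run it against a real transcript, read the file's contents into a string first and pass that in.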

Downloading Chat with RTX

It's available now

Nvidia has made Chat with RTX available for anyone to download, and you'll have access to the same dataset that we were given here, too. Developers can incorporate RAG in their own applications if they wish, but if you just want to play around with it, then you can go to Nvidia's site to download and try it out. It'll likely get better over time and, given that it's based on an open-source project, it can be modified by developers and the community to add support for more AI models and other features, too.

Definitely give Chat with RTX a try if you're interested in AI and LLMs. If you want to get a taste for more afterward, be sure to give LM Studio a try, one of the best AI applications out there.

Chat with RTX

Chat with RTX lets you create datasets or query YouTube videos from a locally-run application on any Nvidia RTX 30 or 40 series card. There are a few other requirements to meet, but it's a powerful application that anyone can use for free.
