Nvidia Introduces Chat With RTX for Running an LLM on Your PC


Nvidia has just released Chat with RTX, which lets you run a chatbot based on a large language model locally on a PC with a sufficiently powerful graphics card. Three models are offered by default: Nvidia's own, trained by the company on a vast corpus of public documents, plus Llama 2 13B and Mistral 7B.

Running an LLM locally with ease

Running a model of this type locally is not new (LM Studio comes to mind in particular), but Chat with RTX simplifies things further: users simply point it at a directory of documents (.txt, .pdf, .doc/.docx, and .xml formats are supported) for the chatbot to draw on.
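To make that concrete, here is a minimal Python sketch (not Nvidia's actual code) of the first step such a tool performs: gathering the supported file types from a user-chosen folder. The folder path and function name are hypothetical.

```python
from pathlib import Path

# File extensions Chat with RTX is documented to support.
SUPPORTED = {".txt", ".pdf", ".doc", ".docx", ".xml"}

def collect_documents(folder: str) -> list[Path]:
    """Recursively list files with extensions the chatbot could index."""
    root = Path(folder).expanduser()
    return [p for p in root.rglob("*") if p.suffix.lower() in SUPPORTED]

# Hypothetical folder, purely for illustration.
print(collect_documents("~/Documents"))
```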


Chat with RTX requires a GeForce RTX 30-series or newer graphics card with at least 8 GB of VRAM; a capable CPU and 32 GB of system RAM are also recommended. The system uses RAG (retrieval-augmented generation) to ground the model's answers in the user's documents: rather than retraining the LLM, it retrieves relevant passages and feeds them into the prompt. Inference is accelerated with TensorRT-LLM. Nvidia says it eventually wants to open up the project so that companies can adapt it to their specific uses.
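As an illustration of the RAG idea, here is a self-contained Python sketch: a toy bag-of-words retriever picks the passages most similar to the question and pastes them into the prompt. Production systems like Chat with RTX use embedding models, a vector index, and GPU-accelerated inference instead; every name and sample passage below is hypothetical.

```python
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    """Toy bag-of-words representation of a text."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k document chunks most similar to the question."""
    q = vectorize(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, chunks: list[str]) -> str:
    # The retrieved passages are pasted into the prompt; the model's
    # weights are never updated, which is why RAG is not "training".
    context = "\n".join(retrieve(question, chunks))
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"

chunks = [
    "Chat with RTX runs entirely on the local GPU.",
    "TensorRT-LLM accelerates inference on GeForce RTX cards.",
    "The sky is blue.",
]
print(build_prompt("What accelerates inference?", chunks))
```

The prompt that results would then be handed to the local LLM, which is where TensorRT-LLM's acceleration comes in.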


Competition is fierce in the text-generation segment, and every player is jockeying for position. According to Statista, the number of users of generative AI tools is expected to grow from around 250 million this year to more than 700 million by the end of the decade. At the end of September 2023, the pioneer ChatGPT unsurprisingly led in market share (nearly 20%), closely followed by Jasper Chat (13%), YouChat (12%), DeepL (12%), and Simplified (nearly 10%), while several other players shared the remaining third of the market.
