Unveiling the chatbot’s “brain” – the Large Language Model

Publisher: RadiantEnergy | Last update: 2024-04-17 | Source: NVIDIA | Author: Lemontree

If we are in the midst of an AI “moment” that is changing history, then chatbots are among its first popular applications.

The birth of chatbots is inseparable from large language models, which are pre-trained on large-scale datasets and can recognize, summarize, translate, predict, and generate text and other forms of content. Such models can run locally on PCs and workstations powered by NVIDIA GeForce and NVIDIA RTX GPUs.

Large language models excel at summarizing large amounts of text, extracting insights through data classification and mining, and generating new text in a user-specified style, tone, or format. They can facilitate communication across many languages, even unconventional “languages” beyond human ones, such as computer code or protein and gene sequences.

The first generation of large language models could only process text, but subsequent iterations were trained on other types of data. These multimodal large language models can recognize and generate images, text, and other content forms.

Chatbots like ChatGPT were one of the first technology applications to bring large language models to consumers, providing a familiar interface that can converse and respond to natural language prompts. Since then, large language models have been used to help write code and assist scientists in advancing drug discovery and vaccine development.

However, the computing power demands of many AI models should not be underestimated. By combining advanced optimization techniques and algorithms, such as quantization, with RTX GPUs built specifically for AI, large language models can be slimmed down to run locally on a PC without an internet connection. The emergence of new lightweight large language models such as Mistral (one of the models supported by Chat with RTX) has further reduced the demand for computing power and storage space.
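As a rough illustration of why quantization shrinks a model, here is a minimal sketch of symmetric int8 weight quantization: each float weight is replaced by an 8-bit integer plus a shared scale factor, cutting storage to roughly a quarter of 32-bit floats. The function names are illustrative, not taken from any real library, and production quantizers (such as those used with TensorRT-LLM) are far more sophisticated.

```python
# Minimal sketch of symmetric int8 weight quantization (illustrative only).
# Storing int8 values plus one float scale uses ~4x less memory than float32.

def quantize_int8(weights):
    """Map float weights to int8 values in [-127, 127] plus a scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [x * scale for x in q]

weights = [0.42, -1.27, 0.05, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Each restored weight is within one quantization step of the original.
assert all(abs(a - b) <= scale for a, b in zip(weights, restored))
```

The trade-off is a small rounding error per weight, which in practice large language models tolerate well enough to keep output quality close to the full-precision model.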

Why are large language models important?

Large language models have wide applicability across industries and workflows. Given this versatility and their inherent speed, they can bring performance and efficiency improvements to almost all language-based tasks.

DeepL, running on NVIDIA GPUs in the cloud, provides accurate translation services through AI.

Large language models like DeepL’s are widely used in language translation because they rely on AI and deep learning to ensure the accuracy of the output.

Medical researchers are training large language models on textbooks and other medical data to improve patient care. Retailers are using chatbots powered by large language models to provide excellent customer support experiences. Financial analysts are using large language models to transcribe and summarize earnings calls and other important meetings. And these examples are just the tip of the iceberg.

Chatbots like Chat with RTX and writing assistants built on large language models are making their mark in all aspects of knowledge work, whether content marketing, copywriting, or legal tasks. Coding assistants were among the first applications powered by large language models, heralding a future of AI-assisted software development. Today, projects such as ChatDev combine large language models with AI agents (autonomous assistants that can answer questions or perform tasks on their own) to build AI-driven virtual software teams that provide services on demand. Users only need to tell the system what application they need and watch it work.

As easy as daily conversation

Many people’s first exposure to generative AI is through chatbots such as ChatGPT, which simplify the use of large language models through natural language, where users only need to tell the model what to do.

Chatbots powered by large language models can help draft marketing copy, provide vacation recommendations, write customer service emails, and even compose poetry.

Progress in image generation and multimodality has expanded the application areas of chatbots, adding the ability to analyze and generate images while retaining a simple, easy-to-use experience. Users need only describe an image to the bot, or upload a photo and ask the system to analyze it. Beyond chat, images can also serve as visual aids.

Future technological advances will help large language models expand their capabilities in logic, reasoning, mathematics, etc., giving them the ability to decompose complex requests into smaller subtasks.

There has also been progress in AI agents, which can take a complex prompt, break it into smaller prompts, and autonomously coordinate with large language models and other AI systems to complete the assigned tasks. ChatDev is a typical AI agent, but agents are by no means limited to technical tasks.

For example, a user could ask a personal AI travel agent to book an international vacation for the family. The agent could break down the task into multiple subtasks, including itinerary planning, booking tours and accommodation, creating a packing list, and finding a dog walker, and then perform each task independently in sequence.
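The plan-then-execute loop described above can be sketched in a few lines. Here `plan()` and `execute()` are hypothetical stand-ins: a real agent would make calls to a large language model and to external tools (booking APIs, calendars) at those two points.

```python
# Hedged sketch of the agent pattern: decompose a goal into subtasks,
# then execute them in sequence. plan() and execute() are stand-ins
# for LLM/tool calls; all names here are illustrative.

def plan(goal):
    """Stand-in planner: a real agent would have an LLM produce this list."""
    return [
        f"draft an itinerary for: {goal}",
        f"book tours and accommodation for: {goal}",
        f"create a packing list for: {goal}",
    ]

def execute(subtask):
    """Stand-in executor: a real agent would call an LLM or external tool."""
    return f"done: {subtask}"

def run_agent(goal):
    results = []
    for subtask in plan(goal):            # decompose the complex request
        results.append(execute(subtask))  # complete each subtask in order
    return results

for line in run_agent("family vacation to Lisbon"):
    print(line)
```

Real agent frameworks add feedback: the result of one subtask can change the plan for the next, rather than running a fixed list straight through.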

Unlock your personal data with RAG

While large language models and chatbots are powerful in general use cases, they become even more useful when combined with an individual user’s data, helping to analyze emails for trends, comb through a dense user manual for the answer to a technical question, or synthesize and analyze years of bank and credit card statements.

Retrieval-augmented generation (RAG) is one of the simplest and most effective methods to connect a specific dataset with a large language model.

Example of RAG on PC.

RAG improves the accuracy and reliability of generative AI models by grounding them in facts retrieved from external sources. By connecting a large language model to almost any external resource, RAG lets users “talk” to a data repository, while the model can cite its sources directly. The user experience is as simple as pointing the chatbot at a file or directory.

For example, a standard large language model has general knowledge of content-strategy best practices and marketing tactics, plus basic insight into specific industries or customer groups. Connected through RAG to the marketing assets prepared for publication, however, it can analyze that content and help plan a tailored strategy.
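The retrieval step behind RAG can be sketched with a toy bag-of-words retriever: score each document against the query, pick the best match, and splice it into the prompt sent to the model. Real systems use embedding models and vector databases instead of word counts; every function name here is illustrative.

```python
# Toy RAG retriever (illustrative): rank documents by word-overlap
# cosine similarity, then build a grounded prompt for the LLM.
from collections import Counter
import math

def similarity(a, b):
    """Cosine similarity over lowercase word counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * \
           math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    return sorted(documents, key=lambda d: similarity(query, d), reverse=True)[:k]

def build_prompt(query, documents):
    """Splice the retrieved context into the prompt sent to the LLM."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Q3 ad spend rose 12% on social channels.",
    "The office coffee machine needs descaling.",
]
prompt = build_prompt("How did ad spend change in Q3?", docs)
# The prompt now carries the relevant document, ready for the LLM.
```

Because the model answers from the retrieved context, it can also quote that context back as its source, which is what makes RAG answers verifiable.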

RAG works with any large language model, as long as the application supports RAG. NVIDIA Chat with RTX is a demonstration of connecting a large language model to a personal dataset through RAG. It runs natively on systems equipped with GeForce RTX GPUs or NVIDIA RTX professional GPUs.

Experience the speed and privacy of Chat with RTX

Chat with RTX is a free, easy-to-use, personalized chatbot demo application that runs locally. It is built on RAG and accelerated by TensorRT-LLM on RTX GPUs. Chat with RTX supports multiple open-source large language models, including Llama 2 and Mistral, and support for Google’s Gemma model will arrive in a subsequent update.

Chat with RTX uses RAG to connect users with their personal data.

Users can connect local files on their PC to a supported large language model simply by placing the files in a folder and pointing Chat with RTX to that location. Chat with RTX can then quickly answer queries with relevant responses.
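The “point it at a folder” workflow can be approximated in a few lines: gather the folder’s text files so a retriever can index them. This is only a sketch of the general pattern; Chat with RTX’s own loader is not public, and `load_folder` is a hypothetical helper.

```python
# Sketch of the "point the chatbot at a folder" pattern: collect text
# files from a directory into a {filename: text} mapping that a RAG
# retriever could index. Illustrative only, not Chat with RTX's loader.
from pathlib import Path

def load_folder(folder):
    """Read every .txt file in a folder into a {name: text} mapping."""
    docs = {}
    for path in Path(folder).glob("*.txt"):
        docs[path.name] = path.read_text(encoding="utf-8")
    return docs
```

A real indexer would also handle other formats (PDF, DOCX), split long files into chunks, and refresh the index when files change.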

Chat with RTX runs on Windows on GeForce RTX PCs and NVIDIA RTX Workstations, so it’s fast and your data stays local. Chat with RTX doesn’t rely on cloud-based services, so you can work with sensitive data on your local PC, without having to share data with third parties or connect to the internet.



Review editor: Liu Qing




Copyright © 2005-2024 EEWORLD.com.cn, Inc. All rights reserved 京ICP证060456号 京ICP备10001474号-1 电信业务审批[2006]字第258号函 京公网安备 11010802033920号