If we are in the midst of a “moment” that is changing history, then chat is one of its first popular applications.
The birth of chatbots is inseparable from large language models, which are pre-trained on large-scale datasets and can recognize, summarize, translate, predict, and generate text and other forms of content. Such models can run locally on PCs and workstations powered by NVIDIA GeForce RTX and NVIDIA RTX GPUs.
Large language models excel at summarizing large amounts of text, extracting insights through data classification and mining, and generating new text in a user-specified style, tone, or format. They can facilitate communication in a variety of languages, even non-conventional “languages” other than human, such as computer code or protein and gene sequences.
The first generation of large language models could only process text, but subsequent iterations were trained on other types of data. These multimodal large language models can recognize and generate images, text, and other content forms.
Chatbots like ChatGPT were one of the first technology applications to bring large language models to consumers, providing a familiar interface that can converse and respond to natural language prompts. Since then, large language models have been used to help write code and assist scientists in advancing drug discovery and vaccine development.
However, the computing power that many AI models demand should not be underestimated. By combining advanced optimization techniques and algorithms, such as quantization, with RTX GPUs built specifically for AI, large language models can be "slimmed down" to run locally on a PC without an internet connection. And new lightweight large language models such as Mistral, one of the models that powers Chat with RTX, are reducing the demands on computing power and storage.
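Quantization, mentioned above, shrinks a model by storing its weights at lower numeric precision. The following is a minimal, hypothetical sketch of per-tensor int8 quantization; it is a standalone illustration of the idea, not NVIDIA's or any library's actual pipeline.

```python
# Minimal sketch of post-training int8 quantization, the kind of
# optimization that shrinks LLM weights for local inference.
# Hypothetical standalone example, not a real toolkit's API.

def quantize_int8(weights):
    """Map float weights to int8 values with one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [round(w / scale) for w in weights]  # each value fits in [-127, 127]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.99, -0.04]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# int8 storage uses 1 byte per weight instead of 4 (fp32): 4x smaller,
# at the cost of a small rounding error bounded by half the scale.
```

Real LLM quantization schemes work per-channel or per-group and at 4-bit precision, but the storage-versus-accuracy trade-off is the same.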
Why are large language models important?
Large language models are broadly applicable across industries and workflows. Thanks to this versatility and their inherent speed, they can deliver performance and efficiency gains for almost any language-based task.
DeepL, running on NVIDIA GPUs in the cloud, provides accurate translation services through AI.

Large language models like DeepL are widely used in language translation because they use AI to ensure the accuracy of the output.
Medical researchers are training large language models on textbooks and other medical data to improve patient care. Retailers are using chatbots powered by large language models to deliver excellent customer support experiences. Financial analysts are using large language models to transcribe and summarize earnings calls and other important meetings. And these uses are just the tip of the iceberg.
Chatbots like Chat with RTX and writing assistants built on large language models are making their mark across knowledge work, whether in content marketing, copywriting, or legal tasks. Coding assistants were among the first applications powered by large language models, heralding a future of AI-assisted software development. Today, projects such as ChatDev combine large language models with AI agents (autonomous programs that can answer questions or perform tasks on their own) to build AI-driven virtual software teams that provide services on demand. Users simply tell the system what application they need and watch it do the work.
As easy as daily conversation
Many people’s first exposure to generative AI came through chatbots such as ChatGPT, which simplify the use of large language models through natural language: users simply tell the model what to do.
Chatbots powered by large language models can help draft marketing copy, provide vacation recommendations, write customer service emails, and even compose poetry.
Progress in image generation and multimodality has expanded what chatbots can do, adding the ability to analyze and generate images while retaining a simple, easy-to-use experience. Users simply describe an image to the chatbot, or upload a photo and ask the system to analyze it. Beyond text chat, images can now serve as visual aids.
Future technological advances will help large language models expand their capabilities in logic, reasoning, mathematics, etc., giving them the ability to decompose complex requests into smaller subtasks.
There has also been progress in AI agents, which can take a complex prompt, break it into smaller prompts, and autonomously coordinate with large language models and other AI systems to complete the assigned tasks. ChatDev is a typical AI agent, but agents are by no means limited to technical tasks.
For example, a user could ask a personal AI travel agent to book an international vacation for the family. The agent could break down the task into multiple subtasks, including itinerary planning, booking tours and accommodation, creating a packing list, and finding a dog walker, and then perform each task independently in sequence.
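The decompose-then-execute loop behind such an agent can be sketched in a few lines. This is a toy illustration of the control flow only: the subtask list is hard-coded here, whereas a real agent would ask a large language model to produce it, and each step would call out to tools or other models.

```python
# Toy sketch of an agent loop: decompose a request into subtasks,
# then execute each in sequence. Hypothetical names throughout.

def decompose(request):
    """Stand-in for an LLM call that plans subtasks for the request."""
    return [
        "plan itinerary",
        "book tours and accommodation",
        "create packing list",
        "find a dog walker",
    ]

def run_agent(request):
    """Execute each planned subtask in order, collecting results."""
    results = []
    for task in decompose(request):
        # A real agent would invoke tools, APIs, or LLMs here.
        results.append(f"done: {task}")
    return results

log = run_agent("Book an international family vacation")
```

The key design point is the separation of planning (`decompose`) from execution, which lets the agent retry or reorder individual subtasks without restarting the whole request.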
Unlock your personal data with RAG
While large language models and chatbots are powerful in general use cases, they become even more useful when combined with data about individual users, helping to analyze emails for trends, comb through a dense user manual to find the answer to a technical question, or synthesize and analyze years of bank and credit card statements.
Retrieval-augmented generation (RAG) is one of the simplest and most effective methods to connect a specific dataset with a large language model.
Example of RAG on PC.
RAG improves the accuracy and reliability of generative AI models by grounding them in facts retrieved from external sources. By connecting a large language model to almost any external resource, RAG lets users "talk" to a data repository, and lets the model cite its sources directly. The user experience can be as simple as pointing a chatbot at a file or directory.

For example, a standard large language model has general knowledge of content-strategy best practices and marketing tactics, along with basic insights into specific industries or customer groups. Connected through RAG to a brand's marketing assets, however, it can analyze that content and help plan a tailored strategy.
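The core RAG loop is: retrieve the most relevant local text, then prepend it to the prompt sent to the model. Here is a minimal sketch of that loop; the scoring is simple word overlap, whereas real systems use vector embeddings, and `build_prompt`'s output would go to an LLM (not shown).

```python
# Minimal sketch of retrieval-augmented generation (RAG).
# Hypothetical example: real pipelines use embedding similarity,
# not word overlap, to pick the grounding context.

def retrieve(query, chunks):
    """Return the chunk sharing the most words with the query."""
    q_words = set(query.lower().split())
    return max(chunks, key=lambda c: len(q_words & set(c.lower().split())))

def build_prompt(query, chunks):
    """Prepend the retrieved chunk so the LLM answers from local data."""
    context = retrieve(query, chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

chunks = [
    "The router's default admin password is printed on the rear label.",
    "Firmware updates are released quarterly and install automatically.",
]
prompt = build_prompt("Where do I find the admin password?", chunks)
# The prompt now carries the password chunk as grounding context.
```

Because the retrieved chunk travels inside the prompt, the model can also quote it back verbatim, which is how RAG systems cite their sources.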
RAG works with any large language model, as long as the application supports RAG. NVIDIA Chat with RTX is a demonstration of connecting a large language model to a personal dataset through RAG. It runs natively on systems equipped with GeForce RTX GPUs or NVIDIA RTX professional GPUs.
Experience the speed and privacy of Chat with RTX
Chat with RTX is a free, easy-to-use personalized chatbot demo application that runs locally. It is built on RAG and accelerated by TensorRT-LLM and RTX. Chat with RTX supports multiple open-source large language models, including Llama 2 and Mistral, with support for Google's Gemma model coming in a future update.
Chat with RTX uses RAG to connect users with their personal data.
Users can easily connect local files on their PC to a supported large language model by placing the files in a folder and pointing Chat With RTX to that folder location. Chat With RTX can then quickly answer queries with relevant responses.
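Under the hood, "pointing at a folder" means reading the files there and splitting them into chunks the retriever can search. The sketch below shows that indexing step under simple assumptions (plain-text files, paragraph-sized chunks); it is an illustration of the idea, not Chat with RTX's actual implementation.

```python
# Hedged sketch: indexing a local folder of text files for a RAG chatbot,
# similar in spirit to pointing Chat with RTX at a directory.
# Assumes .txt files with paragraphs separated by blank lines.
from pathlib import Path

def index_folder(folder):
    """Read every .txt file in the folder and split it into chunks."""
    chunks = []
    for path in Path(folder).glob("*.txt"):
        text = path.read_text(encoding="utf-8")
        # One chunk per paragraph; empty paragraphs are skipped.
        chunks.extend(p.strip() for p in text.split("\n\n") if p.strip())
    return chunks
```

Once indexed, the chunks feed the retrieval step, and everything stays on the local machine, which is what makes the data-privacy claim below possible.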
Chat with RTX runs on Windows on GeForce RTX PCs and NVIDIA RTX Workstations, so it’s fast and your data stays local. Chat with RTX doesn’t rely on cloud-based services, so you can work with sensitive data on your local PC, without having to share data with third parties or connect to the internet.