Large models connected to robots place high demands on end-side chips

Publisher: SerendipitySoul | Latest update: 2024-06-17 | Source: ElecFans (电子发烧友) | Author: Lemontree

ElecFans reports (Text/Li Wanwan): As an important branch of artificial intelligence development, embodied intelligence is frequently mentioned. Simply put, conventional artificial intelligence systems focus on data processing and optimization, like the human brain, while embodied intelligence pays more attention to the interaction between machines and the environment; it is a combination of brain and body.

So what are embodied intelligent terminals? At the recent Core Original Technology Seminar, Yuan Diwen, Chairman and CEO of Shending (Nanjing) Co., Ltd., gave some examples, such as humanoid robots, low-altitude flight vehicles, MR/AR devices, and AGV/AMR. Yuan Diwen believes that the development of embodied intelligent terminals requires both large models and 3D spatial computing.

Domestic and foreign technology companies are committed to connecting large models to robots

Since large model technology entered the public eye at the end of 2022, technology companies at home and abroad have been actively advancing large model technology and applications, and connecting large models to robots has become a key research direction for major technology and robotics companies.

As early as July 2023, the team led by AI scientist Fei-Fei Li released its latest results on embodied intelligence: a large model connected to a robot converts complex instructions into concrete action plans. Humans can freely give instructions to the robot in natural language, and the robot requires no additional data or training.

Fei-Fei Li's team named the system VoxPoser. Compared with traditional methods that require additional pre-training, this approach uses a large model to guide the robot in how to interact with the environment, directly addressing the problem of scarce robot training data.

It can be seen that, with the development of large model technology over the past year or so, almost all major technology companies with artificial intelligence expertise, from OpenAI to Google DeepMind, have been working to connect the multi-purpose learning algorithms behind chatbots to robots. The goal is to give robots common-sense knowledge so they can handle a variety of tasks.

According to reports, as investment in artificial intelligence robots heats up, OpenAI will restart its robotics business and is actively recruiting researchers to rebuild its once-disbanded robotics team.

Humanoid robots are also attracting much attention. In the early morning of June 14, Tesla held its 2024 shareholders' meeting at its headquarters in Texas, USA. Musk said at the meeting that Tesla will start "limited production" of the humanoid robot Optimus in 2025 and test humanoid robots in its own factories next year. He predicted that Tesla will have "more than 1,000, or even several thousand, Optimus robots in operation" next year.

Recently, the domestic robotics field has also been active: the world's first purely electrically driven full-size humanoid robot, "Tiangong", achieved "anthropomorphic running" for the first time at the Beijing Humanoid Robot Innovation Center; Unitree (Yushu Technology) released a new humanoid robot, the Unitree G1, targeting the elderly care market; and the humanoid robot Walker S entered NIO's assembly workshop for "practical training" as an "apprentice factory worker".

iFlytek also recently stated that it officially released a "large model + embodied intelligence" humanoid robot technology prototype at the 2023 Global "1024 Festival". On May 31, 2024, the company launched the iFlytek Robotics Super Brain Platform 2.0, which combines multi-modal perception and interaction with audio-visual fusion and a robot brain based on a large model to build a new kind of robot interaction in a software-hardware integrated way, further empowering the robotics field with the iFlytek Spark large model.

Over the past six months, with the rapid development of on-device large models, mobile phones and PCs have come to provide many services through large language models, such as smart office assistants, personalized systems, image processing, health monitoring and management, education and learning, and creative writing and content creation. Yuan Diwen discussed the development of large models on mobile phones and PCs at a recent conference; that progress has in fact also driven the development of robot large models, because a robot is itself a terminal whose required functions include the basic functions of mobile phones and PCs.

However, Yuan Diwen believes that robot large models face greater challenges than those on mobile phones and PCs. First, robots need multimodal data processing, so they can comprehensively use multiple perception channels and fully understand the environment and task requirements. Second, there are autonomous navigation and positioning requirements: robots need to move autonomously, plan routes, and avoid obstacles. Third, there is physical-space interaction: robots not only process information but also interact with the physical environment to perform tasks such as handling, assembly, and cleaning. Fourth, there are real-time requirements: robots need low latency when performing tasks to ensure the accuracy and timeliness of their actions, especially when moving at high speed or performing delicate operations.

Large model robots have higher requirements

According to Yuan Diwen, robots deploying large AI models face requirements for 3D spatial computing, multi-sensor fusion, and high real-time performance.
3D spatial computing means that robots navigate autonomously in real physical space and perform various operations, which requires precise, high-frame-rate spatial computing capabilities. Multi-sensor fusion means that the data produced by different sensors must be synchronized and fused in both space and time. High real-time performance means highly responsive 3D spatial computing capabilities together with hardware-software co-optimization.
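The "fused in space" half of that requirement boils down to expressing every sensor's data in one common coordinate frame. The sketch below illustrates this with a rigid transform applied to points from a hypothetical sensor; the mounting pose (0.2 m forward, yawed 90 degrees) is a made-up calibration, not a figure from the article.

```python
import numpy as np

def to_body_frame(points, R, t):
    """Transform (N, 3) points from a sensor's frame into the robot body
    frame using a rigid transform: rotation R (3, 3) plus translation t (3,)."""
    return points @ R.T + t

# Assumed calibration: sensor mounted 0.2 m forward, yawed 90 degrees.
yaw = np.pi / 2
R = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
              [np.sin(yaw),  np.cos(yaw), 0.0],
              [0.0,          0.0,         1.0]])
t = np.array([0.2, 0.0, 0.0])

pts = np.array([[1.0, 0.0, 0.0]])   # a point 1 m ahead of the sensor
print(to_body_frame(pts, R, t))     # lands at roughly [0.2, 1.0, 0.0] in the body frame
```

Once all sensors report in the body frame, their point clouds can be merged directly, which is what the fusion engines described below accelerate in hardware.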



This places high demands on the robot's computing resources, memory and bandwidth, and power consumption. Computing resources: robot large models usually contain billions of parameters or more, and powerful computing capability is needed to run their inference in real time. Memory and bandwidth: because of the huge number of parameters, a large amount of memory is required to store and access the parameters and intermediate results. Power consumption: for mobile robots, battery life is a key issue, so the chip must balance computing performance against power consumption.
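A back-of-envelope calculation shows why memory is such a constraint. The parameter count and precisions below are illustrative assumptions, not figures from the article: even before activations and KV caches, just holding the weights of a hypothetical 7-billion-parameter model takes gigabytes.

```python
def model_memory_gb(num_params, bytes_per_param):
    """Memory needed just to hold the model weights, in gigabytes."""
    return num_params * bytes_per_param / 1e9

params = 7e9  # a hypothetical 7-billion-parameter model
for label, nbytes in [("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    print(f"{label}: {model_memory_gb(params, nbytes):.1f} GB of weights")
# FP16: 14.0 GB, INT8: 7.0 GB, INT4: 3.5 GB
```

The same arithmetic governs bandwidth: every token generated must stream the active weights through memory, so halving the bits per parameter roughly halves both the footprint and the bandwidth demand.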

One key technology of the end-side chip for robot large models is a high-real-time NPU. First, it needs higher computing power and multi-core parallelism to improve the concurrency of multi-model workloads and raise overall throughput. Second, efficient Transformer operations improve the computing efficiency and utilization of Transformer layers and the real-time performance of large models. Third, low-bit quantization reduces memory, storage, and bandwidth requirements and speeds up computation. Fourth, weight compression improves bandwidth utilization, reduces system bandwidth requirements and data-transfer delays, and improves the real-time performance of model execution.
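The low-bit quantization mentioned above can be sketched in a few lines. This is a generic symmetric INT8 scheme for illustration only; real NPU toolchains use far more sophisticated per-channel and calibration-based methods.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: float32 weights -> int8 plus one scale."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights from int8 values and the scale."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((1024, 1024)).astype(np.float32)
q, s = quantize_int8(w)
err = float(np.abs(dequantize(q, s) - w).max())
print(f"{w.nbytes / 1e6:.1f} MB -> {q.nbytes / 1e6:.1f} MB, max error {err:.4f}")
```

The 4x reduction in weight bytes translates directly into the lower memory, storage, and bandwidth requirements the text describes, at the cost of a small, bounded rounding error per weight.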



The second key technology of the end-side chip for robot large models is 3D spatial computing. First, a dedicated spatial computing unit, the deep computing engine, provides computing resources far beyond those of typical embedded systems; an advanced 3D perception processor delivers industrial-grade three-dimensional point cloud information, human-like fusion of data streams, and synchronized multi-dimensional perception.

Second, another dedicated spatial computing unit, the perception fusion engine, can combine multiple 3D sensors to fuse higher-quality, more detailed three-dimensional information for perceiving the three-dimensional world. A unique time-fusion unit keeps the perception time deviation between multiple sensors below 0.1 ms, greatly improving the precision of fine multi-sensor perception and control. No single sensor suits every scenario, which is why multi-sensor fusion is so important.
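The time-fusion idea can be illustrated in software: pair readings from two sensors by timestamp and reject pairs whose skew exceeds a budget (the article cites under 0.1 ms for the dedicated hardware unit). The sensor names and timestamps below are made-up assumptions for the sketch.

```python
import bisect

def pair_by_timestamp(ts_a, ts_b, max_skew_s=1e-4):
    """For each timestamp in ts_a, find the closest timestamp in sorted ts_b;
    keep only pairs whose skew is within max_skew_s seconds (0.1 ms default)."""
    pairs = []
    for t in ts_a:
        i = bisect.bisect_left(ts_b, t)
        candidates = [c for c in (i - 1, i) if 0 <= c < len(ts_b)]
        best = min(candidates, key=lambda c: abs(ts_b[c] - t))
        if abs(ts_b[best] - t) <= max_skew_s:
            pairs.append((t, ts_b[best]))
    return pairs

lidar_ts = [0.0, 0.1, 0.2]                    # e.g. a 10 Hz lidar
camera_ts = [0.00004, 0.09995, 0.20350]       # a camera with timing jitter
pairs = pair_by_timestamp(lidar_ts, camera_ts)
print(pairs)  # the third frame is dropped: its 3.5 ms skew exceeds the budget
```

Doing this in software after the fact only discards misaligned data; the hardware unit described above instead synchronizes capture so that such skews rarely occur in the first place.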


Closing thoughts


Recently, people have been talking about "physical intelligence" and "physical AI". A Meta artificial intelligence researcher once said, "The last step of true intelligence must be physical intelligence." Robots are different from the mobile phones and PCs that came before them; they will further advance artificial intelligence, moving it from the digital world into the physical world.



