Haomo Zhixing becomes the first autonomous driving company to build its own intelligent computing center! The "No. 1 in mass production" of assisted driving will launch urban NOH in 100 cities next year
Jia Haonan from Aofei Temple
Qubit | Official account QbitAI
"Those who have ideals for software should make their own hardware."
This creed of Jobs's defined an era, and Musk put the same idea into practice at Tesla by, among other things, building its own supercomputing platform.
Now, the same script is playing out in China’s autonomous driving industry.
Haomo Zhixing, a startup recognized in the industry as "No. 1 in mass production of autonomous driving", recently unveiled its intelligent computing center at its AI Day.
It is self-built and dedicated entirely to autonomous driving research and development.
This is China's first intelligent computing center established by an autonomous driving company.
After the industry shakeout of 2022, the battle to bring mass-produced intelligent driving into cities has begun, and the intelligent computing center is the key guarantee that determines the implementation, user experience and development speed of urban assisted driving.
Tesla has prepared the Dojo intelligent computing center for FSD; in China, Haomo Zhixing, which likewise pursues large-scale deployment with a perception-heavy, map-light approach, has now brought the key tool behind its urban NOH to the forefront.
So how will Haomo Zhixing's intelligent computing center work?
Why is Haomo the first to go "asset-heavy" and become the first autonomous driving company to establish an intelligent computing center?
The first intelligent computing center established by an autonomous driving company?
Haomo Zhixing built its intelligent computing center in cooperation with ByteDance's Volcano Engine. A considerable share of the computing resources is reserved exclusively for the autonomous driving business, and a specialized computing cluster was built to the specific requirements of autonomous driving development.
The so-called "intelligent computing center" is not a supercomputer in the traditional sense.
The core function of autonomous driving, or of smart cars in general, is really AI: specifically, large-scale deep learning algorithms.
Whether training or testing such a model, the protagonist is no longer the logic-processing ability of the traditional CPU but the floating-point computing power of AI accelerators, among which GPUs are currently the mainstream.
△ Zhang Kai, Chairman of Haomo Zhixing
Therefore, the first feature of the intelligent computing center is to use large-scale GPU computing power as the basis for AI model iteration.
The second feature is the deep integration with the autonomous driving business. The Intelligent Computing Center provides computing power clusters, performance acceleration tools and AI big data platforms tailored to the characteristics of autonomous driving applications, greatly improving model training performance, GPU resource utilization and algorithm research and development efficiency.
The computing power of MANA OASIS, the Haomo intelligent computing center, is 0.67 EFLOPS (6.7 × 10^17 floating-point operations per second).
Almost all the computing power of MANA OASIS is used for autonomous driving. Its architecture has also been specially arranged according to the business characteristics of autonomous driving.
Training data for autonomous driving tasks is characteristically large and complex, and most of it is video and image data. Every time a user takes over from the system, a separate small file is created. With many cars and many users, a data set of more than 10 billion such files has accumulated.
Therefore, the first requirement is to access and transmit this data with high performance, and the storage bandwidth needs to reach 2T per second.
In addition, MANA's different autonomous driving models are deployed on different servers, so communication between servers is also critical. The communication bandwidth of MANA OASIS is designed for 800G/second.
Computing, storage and communication are the basic capabilities of MANA OASIS.
In terms of optimizing AI model training, the Volcano Engine also provides targeted basic optimization.
For example, AI keeps evolving, and new models and network structures emerge one after another. Transformer, which emerged in the field of NLP a few years ago, has now become the most promising technology for autonomous driving and is also Haomo's main "trump card". The intelligent computing center that Volcano Engine built for Haomo can support more than 200 network structures, including Transformer.
In addition, MANA OASIS can support more than 500 high-performance AI operators, high-bandwidth network communications, data parallelism, pipeline parallelism and sparse parallelism that specifically serve very large model tasks.
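To make the term "data parallelism" concrete, here is a minimal sketch in plain Python. It assumes a toy one-parameter model y = w * x with a mean-squared-error loss; the function names and the even split across workers are illustrative assumptions, not a description of how MANA OASIS is implemented.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y)


def grad_mse(w: float, batch: List[Sample]) -> float:
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def data_parallel_grad(w: float, batch: List[Sample], n_workers: int) -> float:
    """Split the batch into equal shards, compute one gradient per shard
    (as each worker would), then average the shard gradients."""
    if len(batch) % n_workers != 0:
        raise ValueError("this sketch assumes the batch splits evenly")
    shard_size = len(batch) // n_workers
    shards = [batch[i * shard_size:(i + 1) * shard_size] for i in range(n_workers)]
    local_grads = [grad_mse(w, shard) for shard in shards]  # one per worker
    return sum(local_grads) / n_workers  # the averaging ("all-reduce") step
```

With equal-sized shards, the averaged result matches the gradient computed on the whole batch, which is why the work can be spread over many GPUs without changing the update.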
According to Haomo, a complete training-to-convergence cycle for a large AI model with hundreds of billions of parameters now takes only one week, a 100-fold gain in efficiency.
MANA OASIS is also the first time ByteDance has delivered its intelligent computing center technology externally. Its basic architecture is fully consistent with the technical route Haomo has always followed: autonomous driving research and development based on very large models, very large data, and rapid iteration.
The company with the most successful consumer-facing applications of AI has joined forces with the number one in mass-produced autonomous driving. Behind the architecture of OASIS lies the development trend of autonomous driving:
At the data level, organizing data with the "frame" as the basic unit is giving way to the Clip format (a continuous video containing multiple frames), which offers higher annotation efficiency and data utilization.
Clips bring greater data volume and therefore require larger AI models and higher iteration efficiency. In other words, autonomous driving places higher demands on image and video processing technology.
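The difference between frame-based and Clip-based organization can be sketched as a data structure. The class and field names below (`Frame`, `Clip`, `group_into_clips`) are hypothetical and chosen only for illustration; the article does not describe Haomo's actual data schema.

```python
from dataclasses import dataclass, field
from itertools import groupby
from typing import Dict, List


@dataclass
class Frame:
    clip_id: str            # which continuous recording the frame belongs to
    timestamp: float        # seconds from the start of the recording
    labels: Dict[str, list] = field(default_factory=dict)  # empty if unlabeled


@dataclass
class Clip:
    clip_id: str
    frames: List[Frame]     # frames kept in time order

    def label_coverage(self) -> float:
        """Fraction of frames in the clip that carry at least one label."""
        if not self.frames:
            return 0.0
        return sum(1 for f in self.frames if f.labels) / len(self.frames)


def group_into_clips(frames: List[Frame]) -> List[Clip]:
    """Turn a flat, frame-by-frame data set into time-ordered clips."""
    ordered = sorted(frames, key=lambda f: (f.clip_id, f.timestamp))
    return [Clip(cid, list(group))
            for cid, group in groupby(ordered, key=lambda f: f.clip_id)]
```

Keeping frames together in time order is what lets labels on a few frames be related to the unlabeled frames around them.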
For Haomo's mass-produced autonomous driving, both the perception-heavy approach and large-model applications gain a broader stage with the intelligent computing center.
What can the Haomo intelligent computing center do?
OASIS has only one core mission: to accelerate the training of large models.
Specifically, that means training large models in five areas. This is the technical guarantee for deploying Haomo's urban NOH, and it is also the source of strength that keeps Haomo's NOH in the lead.
Video self-supervised large model
The problem solved is how to build a Clips data set more efficiently.
In the past, autonomous driving training data was organized by frame, and the required targets, such as pedestrians and passenger cars, were labeled frame by frame. However, each frame could be labeled with only one type of target, which wasted the value of the other targets in the image.
The purpose of Clips is to use labeled data to automatically label unlabeled data.
The video self-supervised large model is first trained on a large amount of unlabeled data to form a base model, and then learns from a small amount of labeled data. Through an encoder-decoder model, 90% of the unlabeled data can be annotated automatically.
Once annotation is complete, every obstacle in a video is annotated continuously throughout that video. This data format and annotation method also make it possible to mine data that previously went undiscovered.
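The idea of filling in labels for the unlabeled frames of a Clip can be illustrated with a deliberately simple stand-in. The sketch below uses linear interpolation of bounding boxes between hand-labeled keyframes; this is not Haomo's self-supervised model, only a toy example of propagating sparse labels across a continuous video. All names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def propagate_boxes(keyframes: Dict[int, Box], n_frames: int) -> List[Optional[Box]]:
    """Fill in boxes for frames between labeled keyframes.

    keyframes maps a frame index to its hand-labeled box. Frames before the
    first or after the last keyframe stay None (no extrapolation).
    """
    boxes: List[Optional[Box]] = [None] * n_frames
    indices = sorted(keyframes)
    for i in indices:
        boxes[i] = keyframes[i]
    for left, right in zip(indices, indices[1:]):
        a, b = keyframes[left], keyframes[right]
        span = right - left
        for t in range(left + 1, right):
            w = (t - left) / span  # 0 at the left keyframe, 1 at the right
            boxes[t] = tuple((1 - w) * p + w * q for p, q in zip(a, b))
    return boxes
```

With 10 keyframes spread over a 100-frame clip, 10% of the frames are labeled by hand and the remaining 90% are filled in automatically, which mirrors the ratio quoted in the article.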
3D reconstruction large model
According to Haomo Zhixing, ByteDance has actually accumulated a lot of experience in e-commerce AI applications. For example, if you take a few photos of a product, you can switch the perspective and restore the 3D model. Behind it is actually NeRF. The main function of this model is 3D modeling and the generation of new perspectives.
Haomo Zhixing applies the same technology to autonomous driving: it uses a NeRF model to reconstruct the Clips sent back from vehicles, and then uses the reconstructions to supplement its data.
The main function of the 3D reconstruction large model is to generate scarce data that cannot be found in the 2D images, by rendering the 3D scene from different viewpoints.
In addition, you can also use the generation network in the 3D scene to change the light and texture of the scene, generate new data, and reduce the error rate of the perception model.
Its significance is still to reduce the cost of manual annotation and generate more valuable data.
Multimodal mutual supervision large model
Autonomous driving has always faced a challenge: in the real environment, there will be many unknown obstacles, and it is impossible to label everything. What should we do?
The solution is to use a large mutual supervision model. First, universal object detection is performed, and the structure of the object is identified to determine the accessibility of this area.
The multimodal part works as follows: a vision model first extracts BEV features and detects general structures.
Lidar point clouds are then used as a supervisory check, continuously improving the visual detection results.
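The cross-check between vision and lidar can be sketched as a comparison of two occupancy grids in bird's-eye view. This is a toy illustration under stated assumptions: the grid representation, the cell size, and the function names are hypothetical, and the article does not describe how Haomo's model actually performs the check.

```python
import math
from typing import Dict, Iterable, Set, Tuple

Cell = Tuple[int, int]                # (column, row) of a BEV grid cell
Point = Tuple[float, float, float]    # lidar point (x, y, z) in meters


def lidar_to_cells(points: Iterable[Point], cell_size: float) -> Set[Cell]:
    """Project lidar points onto the ground plane and mark occupied cells."""
    return {(math.floor(x / cell_size), math.floor(y / cell_size))
            for x, y, _z in points}


def cross_check(vision_cells: Set[Cell], lidar_cells: Set[Cell]) -> Dict[str, Set[Cell]]:
    """Compare cells the vision model marks as occupied with lidar evidence."""
    return {
        "confirmed": vision_cells & lidar_cells,    # both sensors agree
        "vision_only": vision_cells - lidar_cells,  # possible false positive
        "lidar_only": lidar_cells - vision_cells,   # obstacle vision missed
    }
```

The "lidar_only" cells are the interesting ones for training: they mark places where the visual result could be corrected using the point cloud as supervision.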
Dynamic environment large model
Haomo Zhixing's technical route emphasizes perception and puts less emphasis on maps. This is also the direction currently recognized by most autonomous driving players.
Although high-precision maps appear to be critical to autonomous driving systems, there are issues behind them such as policy supervision, regulatory access, data collection, and information updating.
Especially in China, where infrastructure changes rapidly, the cost of keeping high-precision maps current and accurate is difficult to estimate.
However, the technical challenges of the map-light approach are huge. One of them is lane topology recognition: deciding which lane to take at complex intersections and ramps.
Haomo's dynamic environment large model first uses surround-view BEV to generate basic environmental features, then feeds the necessary information from an existing standard navigation map into the Topology Attention network, which predicts and backtracks over the various fork and merge points and assigns the appropriate lanes. The predicted topology is then handed over to the decision-making system.
Human-driving self-supervised cognition large model
Haomo Zhixing's autonomous driving research and development is trained on the real driving behavior of many users.
But the problem is that veteran drivers are a minority after all. Training on such a large and mixed body of data may simply produce an average driver. To polish its AI into a veteran driver, Haomo borrowed the idea behind ChatGPT, which is currently very popular.
ChatGPT is derived from GPT-3. GPT-3 has 175 billion parameters, a quantitative change that produced a qualitative change in the model's cognitive and comprehension abilities. The core method is to use feedback on human behavior for reinforcement learning.
For example, for a certain question, first train a basic network through human answers. For similar questions, multiple answers are sorted to let the AI know which answer is better. This will train a value model.
With the value model in place, AI can perform continuous training and iteration during the generation process, and finally screen out the best results and reduce the bad ones.
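One common way to turn ranked answers into a training signal for a value model is a pairwise ranking loss: the model is penalized whenever it scores a worse answer above a better one. The article does not say which loss is used, so the formulation and function names below are assumptions made for illustration.

```python
import math
from itertools import combinations
from typing import Sequence


def pairwise_ranking_loss(score_better: float, score_worse: float) -> float:
    """-log(sigmoid(score_better - score_worse)), written as log(1 + e^-d)."""
    return math.log1p(math.exp(-(score_better - score_worse)))


def mean_ranking_loss(scores_best_to_worst: Sequence[float]) -> float:
    """Average pairwise loss over every (better, worse) pair in a ranking.

    scores_best_to_worst holds the value model's scores for answers that
    humans have already sorted from best to worst.
    """
    pairs = list(combinations(scores_best_to_worst, 2))  # keeps ranking order
    return sum(pairwise_ranking_loss(b, w) for b, w in pairs) / len(pairs)
```

The loss is small when the model's scores agree with the human ranking and large when they contradict it, so minimizing it pushes the value model toward human preferences.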
For autonomous driving, Haomo has defined a set of rules. If the user lets the system drive in the recommended way and does not take over, that is a Good Case. If the user takes over, it is a Bad Case. Feeding the models trained on Good Cases and Bad Cases back into the large model forms a reinforcement-training closed loop.
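The Good Case / Bad Case rule can be sketched in a few lines. Only the takeover rule comes from the article; the `DriveSegment` fields and the idea of pairing good and bad segments from the same scenario into preference pairs are assumptions added here to show how such labels could feed a ranking-style value model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DriveSegment:
    segment_id: str
    scenario: str      # hypothetical tag, e.g. "unprotected_left_turn"
    taken_over: bool   # True if the user took over during the segment


def label_case(segment: DriveSegment) -> str:
    """Apply the rule from the article: a takeover makes it a Bad Case."""
    return "bad" if segment.taken_over else "good"


def preference_pairs(segments: List[DriveSegment]) -> List[Tuple[str, str]]:
    """Build (good_id, bad_id) pairs within the same scenario (an assumption)."""
    pairs = []
    for good in segments:
        if good.taken_over:
            continue
        for bad in segments:
            if bad.taken_over and bad.scenario == good.scenario:
                pairs.append((good.segment_id, bad.segment_id))
    return pairs
```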
Haomo said that due to limited computing resources before, it always had to be conservative when iterating the above five large models.
Now, with the help of the intelligent computing center and abundant computing resources, these five large models can be formally developed and moved towards the "data-driven" 3.0 era of autonomous driving.
The first act of the 3.0 era is the mass production of Haomo's urban NOH.
Urban NOH is essentially a navigation-assisted driving function for urban roads: the system takes over the vehicle in most situations, autonomously identifies lane lines, obstacles, traffic, speed limits and so on, and combines navigation information to plan the route. This truly intelligent driving "from P gear to P gear" greatly reduces the user's driving burden. It is also the function closest to delivering core value to ordinary people's everyday experience since autonomous driving technology was born.
Therefore, mass production of urban navigation-assisted driving is also the goal that autonomous driving companies and carmakers are racing toward this year to prove their strength.
Haomo's NOH is very likely to be the first to achieve large-scale mass production in this competition.
The core reason is its perception-heavy technical route, including the five large models above.
For example, the video self-supervised large model can automatically label 90% of the targets in a piece of video data, which is equivalent to labeling only 10 frames to obtain 100 frames of data. Annotation labor costs and time are greatly reduced.
The 3D reconstruction large model can automatically generate more valuable data from limited data.
The mutual supervision large model, the dynamic environment large model, and the human-driving self-supervised cognition large model improve NOH's target recognition accuracy, path prediction and planning, and overall riding experience respectively.
However much is said, seeing is believing:
For Haomo Zhixing in 2022, Chairman Zhang Kai summed up the "three major battles":
In the battle of data intelligence, the system is completed and we are heading towards the era of big models, big computing power, and big data.
In the battle for assisted driving in urban scenes, NOH has reached a state of delivery.
The battle for automatic distribution of terminal logistics has initially completed the commercial closed loop, with more than 1,000 units delivered.
In 2023, Haomo Zhixing still has the goal of leading in "mass production".
First, the urban NOH function will soon enter mass production and will debut on WEY brand models from Great Wall Motors.
By 2024, NOH will be rolled out in 100 cities in China. Most importantly:
Because Haomo's NOH does not rely on high-precision maps, it skips the mapping and compliance process, can reach mass production faster, and can cover major urban roads across the country without distinction.
In the race to implement urban pilot assisted driving, Haomo Zhixing NOH is currently the undisputed number one in terms of both speed and scale of mass production.
Why Haomo Zhixing?
Both ordinary users and practitioners who follow the development of smart cars are already very familiar with Haomo Zhixing.
"A team led by a legendary veteran of autonomous driving", "Great Wall Motors' trump card for transformation", "No. 1 in mass production of autonomous driving"...
These are all labels attached to Haomo Zhixing, which was founded only three years ago.
Objectively speaking, Haomo's three years have been the fastest three years for mass production of intelligent driving in China. The Haomo model and Haomo speed have been hotly discussed again and again over those three years.
Now, with its intelligent computing center, it leads the way once more as the first autonomous driving company to build one.
Why Haomo?
First, because its pace of deployment demands it.
Large-scale deployment of urban assisted driving brings the problem of large-scale data training. A self-built intelligent computing center can be more efficient, more cost-effective, and more sustainable, so any player that truly enters large-scale deployment of urban assisted driving may need to build its own intelligent computing center.
Haomo Zhixing has made the fastest progress in mass production, so it was the first to start building and became the first among autonomous driving companies.
The deeper reason is Haomo Zhixing's technical route: emphasizing perception over maps and using large models. This route places higher demands on data scale and iteration.
But the most fundamental reason, in the answer given by Gu Weihao, CEO of Haomo Zhixing, is "entrepreneurial spirit":
Haomo's most decisive weapon is the entrepreneurial spirit that Haomo colleagues have forged together in facing difficulties head-on. This indomitable entrepreneurial spirit is our greatest asset for meeting challenges, taking the lead, and continuing to move forward.
Autonomous driving companies tend to choose "light and agile" ways of building their technical systems in order to avoid "asset-heavy" investment. Even when a company like Tesla built an intelligent computing center earlier, it did so from a carmaker's perspective, where being asset-heavy matters less, and the focus was still on cost and efficiency.
But Haomo Zhixing reasons from first principles: once autonomous driving is deployed on a large scale, an intelligent computing center is inevitable. However hard it is and however heavy it is, it must be done. The path that looks hardest is the right one.
In fact, this is consistent with Haomo Zhixing's entrepreneurial spirit and technical background.
Before the intelligent computing center, Haomo Zhixing was the first to introduce new technologies such as Transformer. When "high-precision maps" were an article of faith, it went against the consensus and chose the perception-heavy, map-light route, which was not mainstream at the time but later became the industry consensus. These were the inevitable choices and results of thinking independently, refusing to follow the crowd, and facing difficulties head-on in an entrepreneurial spirit.
With this spirit as the "1", every technological advance and deployment result keeps adding another "0" behind it.
These achievements have been unveiled one after another at every AI Day organized by Haomo Zhixing, making the industry marvel at Haomo Zhixing's many achievements and rapid progress.
Haomo AI Day has become a test of technical capability comparable to Alibaba's Double 11. It is held once a quarter and has become a "household name".
As successive AI Days have gone deeper, the event itself has evolved from a single company displaying its results into one of the industry's most anticipated occasions for sharing cutting-edge autonomous driving technology and looking ahead to commercial deployment.
After the race and reshuffle the autonomous driving industry went through in 2022, and with the launch of the Haomo intelligent computing center, some have begun to believe that the next yardstick for mass-produced autonomous driving will be not only the scale of deployment and the on-road experience; the intelligent computing center can also serve as a reference for competitiveness.
What do you think?