The smart car track hides the most secretive AI vision player.
This player has yet to officially announce any smart car business, yet it has repeatedly demonstrated its competitiveness in the most core, cutting-edge, and most coveted capability of all, autonomous driving systems, with championship-level dominance at the world's top AI conference.
Not only is its research outstanding in specific technologies such as object detection, semantic segmentation, and visual reasoning; it has also won multiple championships in autonomous driving competitions, and has even used a pure-vision solution with just 7 cameras to complete autonomous driving on highways, in urban areas, and in parking environments.
This player is not Tesla's AI team. It is Megvii Technology.
At the recent CVPR, the top AI conference where vision research backed by large models is driving new work on autonomous driving, Megvii Research Institute competed against autonomous driving and smart car players and won the championship of a challenge examining autonomous driving environment perception.
In business terms, this superstar of AI vision has not yet touched smart cars.
But with technical research and results like these, can it really be purely academic?
Which autonomous driving competition did Megvii top?
The competition Megvii Research Institute entered is a challenge set up at CVPR 2023 specifically for autonomous driving perception and decision-making systems.
Among them, the championship of the OpenLane Topology Relationship Challenge was won by Megvii.
There are four tracks in the challenge. In addition to the OpenLane Topology track that Megvii entered, there are the Online HD Map Construction track, the 3D Occupancy Prediction track, and the nuPlan Planning track.
Among them, the OpenLane topological relationship track mainly examines the ability of autonomous driving technology to understand scenarios.
The track is based on the OpenLane-V2 (OpenLane-Huawei) dataset: given surround-view camera images, contestants must output perception results for lane centerlines and traffic elements, as well as predictions of the topological relationships between these elements.
In other words, this competition does not test the isolated recognition of lane boundary lines or traffic signs, as in earlier autonomous driving perception tasks. Instead, it requires the system to perceive lane centerlines and to understand the logical relationships between lane centerlines and traffic elements, such as which lane a given green light allows to pass.
So how is the winner determined? The OpenLane-V2 dataset provides a judging standard, the OLS score (OpenLane-V2 Score), computed by averaging the mAP of the perception results and the topology predictions.
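The scoring can be sketched in a few lines. This is a hedged sketch, not the official implementation: the article describes OLS as an average of perception and topology mAPs, and the published OpenLane-V2 benchmark additionally applies a square root to the topology terms before averaging.

```python
import math

def ols(det_l: float, det_t: float, top_ll: float, top_lt: float) -> float:
    """OpenLane-V2 Score: average the lane-centerline and traffic-element
    detection mAPs with the (square-rooted) topology-prediction mAPs."""
    return 0.25 * (det_l + det_t + math.sqrt(top_ll) + math.sqrt(top_lt))
```

A perfect score on all four terms yields an OLS of 1.0 (scores in the leaderboard are reported on a 0-100 scale).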
Among the 34 participating teams, the team from Megvii Research Institute was the only one to score above 55 points, reaching 55.19, a clear lead.
So, what method did Megvii use?
Megvii’s pure vision solution for autonomous driving
First, in the perception stage, Megvii adopted two different models for the two perception tasks: traffic element detection and lane centerline detection.
For traffic element detection, Megvii used YOLOv8, the latest generation of the mainstream YOLO series of 2D detectors, as the baseline. Compared with other 2D detection methods, YOLO is faster and more accurate.
△ Image source: GitHub user RangeKing
In addition, the OpenLane-V2 dataset used in the competition annotates the correspondence between traffic signs and lanes. Megvii added five tricks to the YOLOv8 training process: strong augmentation, reweighting the classification loss, resampling difficult samples, pseudo-label learning, and test-time augmentation, generating features for traffic elements by interacting with the front-view image.
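Of those five tricks, test-time augmentation is the easiest to illustrate: run the detector on both the original and a horizontally flipped image, map the flipped boxes back into original coordinates, and pool the detections. A minimal sketch, where the `detect` callable and the `(x1, y1, x2, y2)` box format are hypothetical stand-ins, not Megvii's actual code:

```python
def flip_boxes(boxes, img_w):
    # Mirror (x1, y1, x2, y2) boxes across the vertical image axis.
    return [(img_w - x2, y1, img_w - x1, y2) for (x1, y1, x2, y2) in boxes]

def tta_detect(detect, image, flipped_image, img_w):
    # detect(image) -> list of ((x1, y1, x2, y2), score)
    preds = detect(image)
    flipped = detect(flipped_image)
    # Map boxes found on the flipped image back into original coordinates.
    mapped = [(flip_boxes([box], img_w)[0], score) for box, score in flipped]
    # Pool both sets of detections; a real pipeline would apply NMS next.
    return preds + mapped
```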
For lane centerline detection, Megvii used its self-developed PETRv2 model as the baseline. PETRv2 provides a unified, purely visual 3D perception framework that can be used for 3D object detection and BEV segmentation.
In this competition, Megvii used PETRv2 to extract 2D features from multi-view images, used the camera frustum space to generate 3D coordinates, and fed both the 2D features and the 3D coordinates into the 3D position encoder.
The 3D position encoder then generates the key and value components for the Transformer decoder, and lane queries interact with the image features through the global attention mechanism to produce 3D lane centerline detection results and the corresponding lane centerline features.
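The frustum-to-3D-coordinate step can be illustrated with plain pinhole back-projection: each pixel, paired with a sampled depth, maps to a 3D point in camera coordinates. This is a simplified sketch under stated assumptions; PETRv2 actually samples a set of depths per pixel and further transforms the points with camera extrinsics, and the intrinsics `fx, fy, cx, cy` here are illustrative, not from the paper.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    # Invert the pinhole projection: pixel (u, v) at a given depth
    # becomes a 3D point (x, y, z) in the camera frame.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)

def frustum_points(u, v, depths, fx, fy, cx, cy):
    # One pixel yields a ray of candidate 3D coordinates, one per depth.
    return [backproject(u, v, d, fx, fy, cx, cy) for d in depths]
```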
In the topology prediction stage, Megvii built a multi-stage network framework on top of YOLOv8 and PETRv2: it concatenates the corresponding features produced by the two perception tasks, then uses a two-layer MLP to predict the corresponding topology matrix.
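The splice-then-MLP step amounts to scoring every (lane centerline, traffic element) feature pair. A toy sketch in plain Python, where the feature dimensions and weights are made up for illustration; the real model is trained end to end:

```python
import math

def mlp2(x, w1, b1, w2, b2):
    # Two-layer MLP: linear -> ReLU -> linear -> sigmoid.
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))

def topology_matrix(lane_feats, elem_feats, w1, b1, w2, b2):
    # Concatenate each (lane, element) feature pair and score it;
    # entry [i][j] is the predicted probability that lane i is
    # topologically related to traffic element j.
    return [[mlp2(lane + elem, w1, b1, w2, b2) for elem in elem_feats]
            for lane in lane_feats]
```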
(Caption: Megvii's final prediction results on the validation set, including bounding boxes, categories, and confidence scores)
Finally, judging from the OLS breakdown, the Megvii team's method leads the other contestants in traffic element perception (DETt), lane-to-lane topology prediction (TOPll), and lane-to-traffic-element topology prediction (TOPlt).
The most secretive AI vision player on the smart car track
Participating in this competition was the MFV (Megvii-Foundation model-Video) team of Megvii Research Institute.
The first author of the competition report, Wu Dongming, earned a bachelor's degree from the Xu Class of Beijing Institute of Technology in 2019, then went on to pursue a doctorate in the Department of Computer Science at Beijing Institute of Technology under Professor Shen Jianbing. In 2022, he became a research intern at Megvii Research Institute.
The other authors of the paper are also from Megvii Research Institute, including Chang Jiahao, who graduated from the University of Science and Technology of China, and Li Zhuoling, who graduated from the University of Hong Kong.
It is worth mentioning that the PETRv2 model used in this challenge was one of the research achievements released, before his death, by the team led by Dr. Sun Jian, the founding head of Megvii Research Institute.
Moreover, this is not Megvii's only autonomous driving research result.
In addition to the PETR series of large models, Megvii has also released BEVDepth (high-precision depth estimation for 3D objects), LargeKernel3D (the first to prove the feasibility and necessity of large convolution kernels for 3D vision tasks), BEVStereo (SOTA for pure-vision 3D object detection on nuScenes), and more, all industry-leading technical achievements.
△ BEVStereo model framework
Megvii Research Institute has always been the R&D "brain" of Megvii's AI technology, focusing on deep learning and computer vision. It is the birthplace of the AI productivity platform Brain++, the open-source deep learning framework Tianyuan MegEngine, the efficient mobile convolutional neural network ShuffleNet, and more. It has published more than 120 papers at the world's top conferences, won more than 40 championships in top competitions, and holds more than 1,300 business-related patent authorizations.
Moreover, unlike corporate research institutes devoted to pure R&D or pre-research on frontier technology, Megvii Research Institute has served as a combat force from the beginning. Its latest results, and the directions it aims at, are therefore generally neither whims nor research purely for research's sake.
That is why Megvii's string of top results in autonomous driving and smart cars deserves attention.
Compared with its old friend SenseTime, Megvii has not officially announced any smart car or autonomous driving business or partnership, whereas SenseTime has launched a dedicated smart car brand, Jueying, led by co-founder Wang Xiaogang, with the goal of making it SenseTime's new growth engine.
When it comes to trillion-yuan tracks like smart cars and autonomous driving, will Megvii stay calm and sit still? Not likely.
What's more, everything from research capability to implementation level has already been demonstrated at the top conference.
Moreover, Megvii Research Institute has also shown a self-driving pre-research demo that achieves autonomous driving on highways and urban roads using only 7 cameras, and can also complete parallel, perpendicular, and angled parking.
What level is this?
For reference, Tesla, the pure vision king, requires at least 8 cameras for its autonomous driving perception solution.