Baidu releases a trillion-edge graph search engine, open-sources four pre-trained models, and gives away 1.5 billion yuan
Jin Lei and Mengchen, reporting from Aofei Temple
QbitAI report | Official account: QbitAI
Another May 20 ("520"), another year of romance...
Stop! That is not the right way to open this story.
This gathering also marks the "festival", but it draws developers from around the world.
It is the long-established deep learning developer event, WAVE SUMMIT 2021.
Baidu PaddlePaddle, the largest deep learning platform in China, brought developers plenty of "sweets" on 520:
- Released version 2.1 of the PaddlePaddle open source framework
- Released a new large-scale graph search engine
- Open-sourced four Wenxin ERNIE pre-trained models
- Released a new inference deployment navigation map
- ……
In addition, there is 1.5 billion yuan in funding, of which 1 billion yuan will go to 100,000 companies and one million industrial AI talents.
Unlike previous summits, this one struck a new tone: great integration and great innovation.
Baidu Chief Technology Officer Wang Haifeng said:
From a technical perspective, multiple technologies are being integrated and innovated together. Combining knowledge with deep learning has produced breakthroughs in knowledge-enhanced deep semantic understanding, greatly improving results at the same parameter scale while making models more interpretable.
From a platform perspective, the deep learning platform is being integrated with chip hardware and software to meet the varied compute, power, and latency requirements of production environments with different hardware configurations, so that AI applications achieve the best results.
From an industrial perspective, AI technology is increasingly integrated with industry. Industry demand keeps honing AI technology and platform capabilities, which develop together with application scenarios.
△ Wang Haifeng, Baidu Chief Technology Officer
Lowering the threshold for AI was another focus of the summit, and it is central to accelerating diversity and industrial progress.
On how integrated innovation and a lower threshold can carry AI's value into industry and achieve efficient, high-quality mass production, Baidu Group Vice President Wu Tian said:
AI industrial mass production will first take hold, in stages, within the production activities of individual enterprises. As it develops, it will move from multi-person, multi-task collaboration inside an enterprise to AI mass production and collaboration across society.
△ Wu Tian, Vice President of Baidu Group
Next, let’s take a look at WAVE SUMMIT 2021 in one article.
Six new releases
PaddlePaddle Open Source Framework Version 2.1
PaddlePaddle, the largest deep learning platform in China, was upgraded again at this summit, this time to version 2.1.
Here are the key points.
Four major features were optimized:
- Automatic mixed precision optimization: taking ResNet50 and BERT as examples, enabling this feature can increase training speed by 3 times.
- Dynamic graph enhancements: added in-place operations, reducing GPU memory usage by 17%; reduced Python/C++ interaction overhead, increasing training speed by 10%.
- High-level API: added support for GPU preprocessing, mixed precision, and model sharing mechanisms.
- Custom operators: in particular, the upgraded Custom Operator feature greatly reduces the cost of learning to write and developing custom operators, and greatly improves development flexibility.
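The article does not describe how automatic mixed precision works internally. As general background, one common ingredient of mixed precision training is loss scaling, which keeps very small gradients from vanishing in half precision. The following is a minimal standard-library sketch of that idea only; the function names and the scale value 1024 are illustrative assumptions, and this is not PaddlePaddle's implementation.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def fp16_gradient(grad: float, loss_scale: float = 1.0) -> float:
    """Hold a gradient in fp16 after scaling, then unscale in full precision.

    Hypothetical helper for illustration; real frameworks apply the scale
    to the loss before the backward pass.
    """
    scaled = to_fp16(grad * loss_scale)  # value an fp16 backward pass would hold
    return scaled / loss_scale           # unscale before the optimizer update


tiny_grad = 1e-8
unscaled = fp16_gradient(tiny_grad)                    # underflows in fp16
rescued = fp16_gradient(tiny_grad, loss_scale=1024.0)  # survives the round trip

print(unscaled, rescued)
```

Without scaling, the gradient is lost entirely; with scaling, it comes back close to its original value, at the cost of a small rounding error.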
With version 2.1, the full picture of the Baidu PaddlePaddle platform comes into view.
Beyond the core framework optimizations described above, the upgrade extends much further.
Large-scale graph search engine
For distributed training, PaddlePaddle 2.1 introduces a large-scale graph search engine. Its core highlights are as follows:
It supports distributed storage and retrieval of graphs with trillions of edges, and it scales linearly.
For example, in a collaboration with NetEase Cloud Music, the engine was used in the "anchor recommendation" scenario.
It supported training billion-edge graph models, effectively addressed the cold start problem, and improved the effective playback rate of anchor recommendations.
The release of the large-scale graph search engine clearly strengthens Baidu PaddlePaddle's fit for industrial scenarios.
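The article gives no design details for the engine. One common way to make graph storage scale out is to partition adjacency lists across shards by hashing the source node, so that adding shards adds capacity. The sketch below illustrates that general idea with in-process dictionaries; the class `ShardedGraph`, its methods, and the node names are hypothetical and are not the PaddlePaddle API.

```python
import random
import zlib
from collections import defaultdict


class ShardedGraph:
    """Toy adjacency store partitioned by a hash of the source node."""

    def __init__(self, num_shards):
        self.num_shards = num_shards
        # Each shard stands in for one storage server.
        self.shards = [defaultdict(list) for _ in range(num_shards)]

    def shard_of(self, node):
        # crc32 is stable across processes, unlike the built-in hash() for str.
        return zlib.crc32(str(node).encode("utf-8")) % self.num_shards

    def add_edge(self, src, dst):
        self.shards[self.shard_of(src)][src].append(dst)

    def sample_neighbors(self, node, k, rng):
        """Return up to k neighbors of node, sampled without replacement."""
        neighbors = self.shards[self.shard_of(node)].get(node, [])
        if len(neighbors) <= k:
            return list(neighbors)
        return rng.sample(neighbors, k)


graph = ShardedGraph(num_shards=4)
for src, dst in [("user_1", "anchor_a"), ("user_1", "anchor_b"),
                 ("user_1", "anchor_c"), ("user_2", "anchor_a")]:
    graph.add_edge(src, dst)

rng = random.Random(0)
all_neighbors = graph.sample_neighbors("user_1", 10, rng)
two_neighbors = graph.sample_neighbors("user_1", 2, rng)
```

Neighbor sampling of this kind is what a graph model trainer typically requests from the store; each lookup touches only the shard that owns the node.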
Wenxin ERNIE's four pre-trained models are open source
After the framework layer comes the model suite layer.
Four Wenxin ERNIE pre-trained models are now open source:
- ERNIE-Gram: proposes an explicit n-gram masked language model that improves the pre-trained model by introducing multi-granularity language knowledge, and leads on five typical Chinese text tasks.
- ERNIE-Doc: to address insufficient modeling of long texts, it proposes retrospective modeling and an enhanced memory mechanism, achieving leading results on 13 long-text understanding tasks.
- ERNIE-ViL: targeting cross-modal understanding and building on knowledge enhancement, it performs cross-modal pre-training that incorporates scene knowledge and achieves leading results on five cross-modal understanding tasks.
- ERNIE-UNIMO: further strengthens knowledge fusion between modalities. Through cross-modal contrastive learning it improves both cross-modal semantic understanding and generation and text understanding and generation, leading on 13 cross-modal and text tasks.
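The phrase "explicit n-gram masked language model" can be illustrated with a toy contrast. Under one reading, ordinary contiguous masking replaces each token of a span with its own mask and predicts the tokens one by one, while explicit n-gram masking replaces the whole span with a single mask and predicts the n-gram as one unit. The sketch below assumes that reading; the function names are hypothetical, and the real model involves much more than this.

```python
MASK = "[MASK]"


def contiguous_mask(tokens, start, n):
    """Mask n consecutive tokens one by one; targets are the individual tokens."""
    masked = tokens[:start] + [MASK] * n + tokens[start + n:]
    targets = tokens[start:start + n]
    return masked, targets


def explicit_ngram_mask(tokens, start, n):
    """Replace the whole n-gram with one mask; the target is the n-gram itself."""
    masked = tokens[:start] + [MASK] + tokens[start + n:]
    target = " ".join(tokens[start:start + n])
    return masked, target


tokens = ["deep", "learning", "platform", "for", "developers"]
token_level = contiguous_mask(tokens, 0, 2)
ngram_level = explicit_ngram_mask(tokens, 0, 2)
```

In the second form the model is asked for "deep learning" as a single unit, which is one way coarser-grained language knowledge can enter pre-training.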
For complex semantic understanding requirements, each of these four pre-trained models can play to its strengths.
They can also be combined to achieve a "1+1>2" effect.
Together they understand not only language but also images, achieving unified cross-modal semantic understanding.
PaddlePaddle inference deployment toolchain and navigation map
Beyond development, training, and model suites, every node in the inference deployment toolchain has also been upgraded:
- PaddleSlim: further optimizes pruning and compression, adds unstructured sparsity tools, and is first to support the OFA compression mode, preserving accuracy after compression.
- Paddle Lite: released LiteKit, an "out-of-the-box" toolset for mobile developers that greatly reduces development costs for edge AI developers.
- Paddle Serving: added a Pipeline mode with a fully asynchronous design to better handle model composition in real business scenarios.
- Paddle.js: added support for multiple backends and for mainstream image segmentation and classification models, balancing broad compatibility with high performance.
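"Unstructured sparsity" generally means zeroing individual weights wherever they sit in a tensor, rather than removing whole channels or layers. A common criterion is weight magnitude. The sketch below shows magnitude-based unstructured pruning on a small nested list; the function name and the sample weights are illustrative assumptions, and this is not the PaddleSlim API.

```python
def prune_unstructured(weights, sparsity):
    """Zero the given fraction of entries with the smallest absolute value."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    # Rank every (row, column) position by the magnitude of its weight.
    positions = [(r, c) for r, row in enumerate(weights) for c in range(len(row))]
    positions.sort(key=lambda rc: abs(weights[rc[0]][rc[1]]))
    to_zero = set(positions[: int(len(positions) * sparsity)])
    # Build a new matrix; the input is left unchanged.
    return [
        [0.0 if (r, c) in to_zero else w for c, w in enumerate(row)]
        for r, row in enumerate(weights)
    ]


weights = [[0.9, -0.05, 0.3],
           [0.01, -0.7, 0.2]]
pruned = prune_unstructured(weights, 0.5)
```

The surviving weights keep their positions, so the matrix shape is unchanged; any speedup depends on a runtime that can exploit the zeros.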
Beyond upgrading the existing inference deployment toolchain, PaddlePaddle also provides an inference deployment navigation map.
More than 300 fully verified deployment paths have reportedly been covered, and together they form a navigation map organized as a tree.
In this tree there is a complete path from the root to each branch, which helps developers carry an AI deployment through successfully.
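The idea of a tree in which every root-to-branch route is a verified deployment path can be sketched as a nested dictionary plus a traversal that lists each path. The node names below are made up for illustration and are not taken from the actual navigation map.

```python
def deployment_paths(tree, prefix=()):
    """Yield every root-to-leaf path in a nested-dict tree as a tuple."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from deployment_paths(children, path)
        else:
            yield path


# Hypothetical miniature map; the real one reportedly covers 300+ paths.
nav_map = {
    "trained model": {
        "server": {"GPU": {}, "CPU": {}},
        "mobile": {"ARM CPU": {}},
    }
}

paths = list(deployment_paths(nav_map))
for path in paths:
    print(" -> ".join(path))
```

Because each leaf corresponds to exactly one path, a failed deployment can be traced back step by step to the node where it diverged from a verified route.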
Baidu PaddlePaddle explained its reasoning:
When a deployment works, there is a trail to follow; when it does not, the root cause can be traced.
Hardware Ecosystem Achievements
In terms of deployment, releases have both a "soft" and a "hard" side.
PaddlePaddle has reportedly carried out adaptation and joint optimization with 22 hardware manufacturers in China and abroad, including Baidu Kunlun, and has adapted or is adapting 31 chips or IPs.
These include chip companies such as Intel, Nvidia, Huawei, Hygon, Rockchip, and Ambarella.
As a more specific example, PaddlePaddle has adapted more than 50 models on the Hygon DCU.
In the hardware ecosystem for the deployment stage, Baidu PaddlePaddle has thus achieved comprehensive coverage of hardware manufacturers in China and abroad.
PaddleFlow, the core of cloud-native machine learning
As AI technology is applied across industries, a wider range of AI development scenarios has emerged, placing more diverse requirements on the platform:
- AI application development for a wider range of vertical industries
- Deeply customized AI development platforms
- AI-native container services
In response, Xin Zhou, director of Baidu AI Product R&D, announced the official opening of PaddleFlow, the core of PaddlePaddle Enterprise Edition.
△ Xin Zhou, Director of Baidu AI Product R&D Department
In short, it is a cloud-native machine learning core system designed for AI platform developers and built to be easy to integrate.
Its defining features are that it is cloud native, high performance, lightweight, and easy to use.
It helps AI platform developers efficiently build deeply customized AI platforms for more specialized scenarios.
……
Beyond the six major releases above, there were also some major upgrades.
PaddleHelix, officially released last year, was upgraded to version 1.0 today. It adds a new compound pre-training model, ChemRL, and applies ChemRL to more downstream tasks.
With PaddleHelix's capabilities, Baidu took first place in March this year on two drug-related datasets, HIV and PCBA, on OGB, the internationally authoritative graph neural network benchmark.
Paddle Quantum, which made PaddlePaddle the first deep learning platform in China to support quantum machine learning, has been updated in step with PaddlePaddle framework 2.0 and later. Its overall running speed has improved substantially, by 21.9% on average in core application scenarios and by up to 40.5%.
Paddle Quantum has also added new feature extraction methods such as the quantum kernel method.
For the difficult task of entanglement purification, Paddle Quantum has added an optimized quantum entanglement processing framework, which Baidu describes as providing the best and most feasible purification solution in the industry.
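As general background on the quantum kernel method: a data point is encoded into a quantum state, and the kernel between two points is the squared overlap of their states. The sketch below uses the simplest possible case, a single qubit rotated by RY(x) from |0>, simulated with ordinary arithmetic. The feature map and function names are assumptions chosen for illustration and do not reflect Paddle Quantum's implementation.

```python
import math


def feature_state(x):
    """Amplitudes of RY(x)|0>, which are real: (cos(x/2), sin(x/2))."""
    return (math.cos(x / 2.0), math.sin(x / 2.0))


def quantum_kernel(x, y):
    """Squared overlap |<phi(x)|phi(y)>|^2 between two encoded points."""
    a, b = feature_state(x), feature_state(y)
    overlap = a[0] * b[0] + a[1] * b[1]
    return overlap ** 2


def gram_matrix(xs):
    """Kernel matrix that a classical method such as an SVM could consume."""
    return [[quantum_kernel(x, y) for y in xs] for x in xs]


samples = [0.0, 0.5, math.pi]
K = gram_matrix(samples)
```

For this feature map the kernel reduces to cos²((x − y)/2), so identical inputs give 1 and inputs that differ by π give 0.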
Another 1.5 billion yuan will be given out
Beyond the six major releases, Baidu PaddlePaddle kept handing out "sweets" at WAVE SUMMIT 2021.
This time the "sweets" were very tangible: money, 1.5 billion yuan of it.
Besides the "Set Sail" plan for university AI talent training launched at the end of last year, the Baidu PaddlePaddle "Great Voyage" plan also includes:
- "Great Voyage" Escort Plan
- "Great Voyage" Pilot Plan
"Great Voyage" Escort Plan
The Escort Plan will invest 1 billion yuan over the next three years.
To whom?
100,000 enterprises and one million industrial AI talents.
How?
Broadly, the support falls into three areas: technology, talent, and ecosystem.
For enterprises, the goal of the Escort Plan is intelligent upgrading: shortening the path from technological innovation to commercial deployment through technology empowerment, market promotion, and resource introduction, including the PaddlePaddle Technology Partner Program, PaddlePaddle Enterprise Edition (Gravity), and the PaddlePaddle China Tour.
For talent, it offers AI private forums, the AI fast track, and the AICA chief AI architect training program.
△ Liu Qian, General Manager of Baidu AI Technology Ecology Department
"Great Voyage" Pilot Plan
This plan targets core developers. Its goal is to build an open source ecosystem together with community developers and to explore cutting-edge technologies.
Its organizational forms include PPDE (PaddlePaddle Developer Technical Expert Program), PPSIG (PaddlePaddle Community Special Interest Group), the PaddlePaddle Pilot Group, and the Doctoral Association.
It also involves working with outstanding open source communities and projects in the industry to establish research and development directions systematically, including cutting-edge areas such as biological computing and quantum computing.
So far, 120 PPDEs have reportedly been certified, and the PaddlePaddle city and university pilot groups cover 150 cities.
"AI Talent Industry-Education Integration Training Program" officially released
In fact, before the Pilot and Escort plans were announced, Baidu PaddlePaddle had already launched the "Set Sail" plan of the "Great Voyage" series at WAVE SUMMIT+ 2020 at the end of last year:
Over the next three years, PaddlePaddle will invest a total of 500 million yuan in funds and resources to support 500 universities across the country, train 5,000 university AI teachers, and jointly train 500,000 AI students.
Nearly half a year on, what has the plan achieved?
Drawing on extensive industrial practice, PaddlePaddle has added more than 50 practical cases, covering all technical directions of AI, to practice-oriented AI courses at colleges and universities. The number is expected to exceed 100 by the end of July.
PaddlePaddle has held 14 deep learning training sessions for college teachers, trained more than 2,000 teachers from 570 colleges and universities, and helped 226 institutions offer credit courses.
It also hosts competitions such as the China University Computer Competition and provides internship programs and career guidance for students, aiming to cultivate versatile talent that meets industry needs.
The summit also featured a signing ceremony for cooperation between PaddlePaddle and the innovation and entrepreneurship laboratories of three universities: the Tsinghua University Basic Industrial Training Center, the Jilin University Innovation and Entrepreneurship Laboratory, and the Zhengzhou University Artificial Intelligence Engineering Application Laboratory.
Together with PaddlePaddle, they will promote the integrated development of industry, education, research, and application, build a talent reserve for industrial intelligence, and open a new era of industry-education integration.
Finally, beyond the six major releases and three ecosystem plans, PaddlePaddle and the China Academy of Information and Communications Technology jointly released the PaddlePaddle Open Source Ecosystem Report at this WAVE SUMMIT.
The report notes that the AI industry has entered a window of rapidly expanding engineering applications. Open source frameworks can make intelligent upgrading easier across the whole industry and increase its breadth and depth.
According to the report, PaddlePaddle has opened up a new Chinese open source ecosystem marked by regional, specialized, and large-scale development, accelerated cross-industry collaborative innovation, and built a talent training system.
The Open Source Framework Frontier Model Reproduction Competition was also officially announced at the event.
It is a sub-track of the Artificial Intelligence Innovation Application Competition organized by the China Academy of Information and Communications Technology, and Baidu will run it. The aim is to discover and cultivate more talent, accumulate more cutting-edge models, and advance AI as a whole.
Integration is for better innovation
Integrated innovation was the main theme of the summit from beginning to end.
So what is the logic behind Baidu PaddlePaddle's push for "integrated innovation"?
First, integrated innovation is what the times demand.
Unlike the algorithm-first approach of the past, AI has entered the stage of large-scale industrial production, which requires algorithms, data, and computing power to work together to have an impact and create new value.
The detailed technical upgrades Baidu PaddlePaddle made across the development, training, and deployment stages follow this principle.
For example, the four open source Wenxin ERNIE pre-trained models do not each go it alone technically; they create more value in a "1+1>2" manner.
Second, once an enterprise has grown to a certain level, technological development alone cannot break through inherent bottlenecks under fierce industry competition.
Only cross-border integration and model innovation allow it to adapt to increasingly intense competition.
Beyond technical and cross-border integration, there is one more indispensable element.
That is the integrated innovation of the deep learning platform's open source ecosystem, spanning industry, the developer community, and talent training.
This corresponds to Baidu PaddlePaddle's "Great Voyage" series of plans.
So far, PaddlePaddle has brought together 3.2 million developers, served 120,000 companies, and produced 360,000 models, across fields such as medical care, finance, entertainment, environment, energy, and industrial manufacturing.
It has reached this scale because integrated innovation across technology, models, talent, and cross-border collaboration has greatly lowered the threshold for AI development and generated more value.
It can not only create a flexible and comprehensive modeling method, but also meet the needs of customized scenarios.
So what is the route for carrying the value of AI, under integrated innovation, into industrial production?
Baidu Group Vice President Wu Tian summarized it as a three-stage route:
- Pioneer exploration stage: to support rapid verification and deployment, PaddlePaddle provides an industrial-grade model library polished in real scenarios so that industries can trial AI, and it addresses the "last mile" of AI deployment with inference engines that deploy conveniently across devices and platforms.
- Workshop application stage: to help teams apply AI, Baidu PaddlePaddle lowers the threshold so that small teams do not have to reinvent the wheel. It supports the whole progression from porting and reuse, to targeted rewriting, to fully in-house development.
- Industrial mass production stage: to support multi-person, multi-task collaboration, PaddlePaddle improves end-to-end efficiency through efficient management of computing resources and an integrated development environment for developers. Open source and support for a variety of hardware enable collaborative production across multiple enterprises.
PaddlePaddle has thus covered every stage of industrial AI application and laid out a workable path that others can follow.
With a 520 like this from Baidu PaddlePaddle, is that sincere enough for you?
-over-
This article is original content from QbitAI, a contracted account of the NetEase News featured content incentive program. Reproduction without the account's authorization is prohibited.