
Exclusive | Multi-modal large model startup "Sophon Engine" completes a 10 million yuan angel round

Latest update time:2023-05-22

"Large models entered a 'hundred schools of thought' phase in April and May. The battle is growing fiercer, and new players are still entering the market one after another."


Author | Huang Nan
Editor | Chen Caixian

Leifeng.com has learned that "Sophon Engine", a multi-modal large model startup, recently completed an angel round of financing worth 10 million yuan. Its CEO is Gao Yizhao, who was born in the 1990s and is a doctoral student at Renmin University of China under Lu Zhiwu. Lu Zhiwu serves as a consultant to "Sophon Engine" and is also the chief AI scientist of iSoftStone.

Before ChatGPT became popular, the Beijing Zhiyuan Artificial Intelligence Research Institute took the lead in China with a large model research project called "Wudao" ("Enlightenment"). At that time it brought together four major forces, led respectively by Tang Jie, Liu Zhiyuan, and Huang Minlie of Tsinghua University and Wen Jirong of Renmin University of China (for details, see Leifeng.com's follow-up in-depth report "Behind-the-scenes details of Zhiyuan's development of China's large models"; interested readers are welcome to add the author's WeChat: Fiona190913).
Among them, Wen Jirong led scientists from Renmin University's Gaoling School of Artificial Intelligence in developing multi-modal large models under the name "Wenlan". Lu Zhiwu was the main force behind model development on the team, and his student Gao Yizhao also took part and completed core research work. After Wudao, Tang Jie, Liu Zhiyuan, and Huang Minlie each founded companies based on large model technology. The Renmin University team's entry now raises the curtain on the full entrepreneurial lineup of the "Four Heavenly Kings" of Zhiyuan's large models.
According to Leifeng.com, Lu Zhiwu’s team is also the first team in China to study multi-modal large models and achieve outstanding technical results.

01

Lu Zhiwu and Gao Yizhao

Lu Zhiwu and Gao Yizhao started working on multi-modal large models in 2020.
In May 2020, OpenAI's GPT-3 made waves in the field of artificial intelligence and drew the attention of practitioners in China, Lu Zhiwu among them, to pre-trained large models.
Lu Zhiwu studied in the Department of Information Science, School of Mathematical Sciences, Peking University in his early years. After graduating with a master's degree, he obtained a PhD from the Department of Computer Science, City University of Hong Kong in 2011. His main research directions include machine learning, computer vision, etc.

Lu Zhiwu

At that time, most people in China focused on the field of NLP, but few people paid attention to the multi-modal large models that expanded from text to images and videos.

During this period, the Gaoling School of Artificial Intelligence at Renmin University of China established a multi-modal large model R&D team specializing in image-text multi-modal pre-training models. Led by Wen Jirong, with Song Ruihua, Lu Zhiwu, and others as core members, it was also the first team in China to engage in multi-modal large model research.
In the same year, Gao Yizhao entered the Gaoling School of Artificial Intelligence at Renmin University of China to pursue a doctoral degree under Lu Zhiwu.

Gao Yizhao


02

"Sophon Engine" will launch multi-modal large models

In fact, as early as three years before ChatGPT was born, the Beijing Zhiyuan Artificial Intelligence Research Institute had taken the lead in China by starting research on a large model called "Wudao". As part of that effort, scientists from Renmin University's Gaoling School of Artificial Intelligence, led by Wen Jirong, formed the "Wudao·Wenlan" team to research multi-modal large models, with Lu Zhiwu as the main force in model development.
In March 2021, the first-generation "Wenlan" model, the image-text retrieval model BriVL, was officially launched after pre-training on a dataset of 30 million image-text pairs. This very large-scale multi-modal pre-training model uses a dual-tower structure that encodes images and text separately and learns the similarity between them through self-supervised tasks.
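To make the dual-tower idea concrete, here is a minimal sketch of how retrieval could work once each tower has produced an embedding: images are ranked by cosine similarity to the text embedding. The toy vectors, the file names, and the helper names (`l2_normalize`, `cosine_similarity`, `rank_images`) are illustrative assumptions and are not taken from BriVL itself.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def rank_images(text_embedding, image_embeddings):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [
        (name, cosine_similarity(text_embedding, emb))
        for name, emb in image_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical outputs of the image tower (assumed values, 3 dimensions).
image_embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "beach.jpg": [0.0, 0.2, 0.9],
    "city.jpg": [0.1, 0.9, 0.1],
}
# Hypothetical output of the text tower for a query about a cat.
text_embedding = [0.8, 0.2, 0.1]

ranking = rank_images(text_embedding, image_embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the two towers run independently, image embeddings can be computed once and stored, so only the query has to be encoded at search time.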
Based on the image-text retrieval model, the research team also developed an H5 mini-app, "AI Mood Radio": give the AI genie a picture, and the model matches a suitable piece of music to it.
Three months later, Lu Zhiwu's Wenlan team released "Wenlan 2.0" (BriVL-2).
Starting from the hypothesis that images and their accompanying text are only weakly correlated, the research team designed an efficient cross-modal contrastive learning strategy and proposed a DeepSpeed-based distributed multi-modal training framework to improve the model's expressiveness and generalization.
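As a rough illustration of what cross-modal contrastive learning optimizes, the sketch below computes a generic symmetric InfoNCE-style loss over a batch similarity matrix. It is not the Wenlan team's actual strategy, which the article does not spell out; the function names, the toy matrices, and the temperature of 0.07 are all assumptions chosen for the example.

```python
import math


def log_softmax_row(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - log_z for v in row]


def contrastive_loss(similarity, temperature=0.07):
    """Symmetric InfoNCE-style loss.

    similarity[i][j] is the similarity between image i and text j;
    the matching pairs are assumed to sit on the diagonal.
    The temperature value is an assumed default, not a published setting.
    """
    n = len(similarity)
    if n == 0 or any(len(row) != n for row in similarity):
        raise ValueError("similarity must be a non-empty square matrix")
    logits = [[s / temperature for s in row] for row in similarity]
    # Image-to-text direction: each row should pick its diagonal entry.
    i2t = -sum(log_softmax_row(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: same computation on the transposed matrix.
    transposed = [list(col) for col in zip(*logits)]
    t2i = -sum(log_softmax_row(transposed[i])[i] for i in range(n)) / n
    return 0.5 * (i2t + t2i)


# Toy batches (assumed values): matching pairs score high in `aligned`,
# while in `shuffled` the high scores sit off the diagonal.
aligned = [[1.0, 0.1], [0.2, 1.0]]
shuffled = [[0.1, 1.0], [1.0, 0.2]]

print("aligned loss: ", contrastive_loss(aligned))
print("shuffled loss:", contrastive_loss(shuffled))
```

The loss falls as matching image-text pairs become more similar than mismatched ones, which is the behavior any contrastive objective of this family is designed to reward.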
Pre-trained on 650 million weakly correlated image-text pairs, Wenlan 2.0 has 5 billion parameters. It is currently the largest Chinese general-purpose image-text pre-training model, covers multiple fields and scenarios, and has achieved excellent performance on retrieval and generation tasks such as image retrieval, image captioning, and visual question answering.
During this period, Gao Yizhao was also deeply involved in the image-text pre-training of Wenlan 1.0 and 2.0, mainly responsible for data processing, model training, and evaluation.
Amid the ChatGPT boom, Lu Zhiwu and Gao Yizhao saw new opportunities for multi-modal research in the era of large models and founded the multi-modal large model company "Sophon Engine". Drawing on their experience developing the Wenlan models, the "Sophon Engine" team officially launched a self-developed multi-modal dialogue large model on March 8 this year and released the first application-level multi-modal ChatGPT product, "Yuanchengxiang ChatImg".
"Yuanchengxiang ChatImg" has tens of billions of parameters. It is trained mainly on image-text pair data and VQA data, and it trains on multiple tasks simultaneously, including image-text matching, image-text retrieval, image description generation, and text description generation. Given pictures or text from the user, "Yuanchengxiang ChatImg" can chat, tell stories, write advertising copy, and more.
Since April and May, the large models unveiled one after another have generated a great deal of noise and excitement, with big tech companies and start-ups giving no ground to each other. Academia's entry into the large model field is a major trend. How academic teams find their competitive edge and their position in a contest that increasingly resembles engineering is a question they must answer in a race against time.