Li Hao, Senior Algorithm Expert at Alibaba
As early as 2010, Google spent 100 million US dollars to acquire the image shopping search website Like.COM, setting off a global image search trend. Text search covers only limited scenarios, whereas an image can describe a product more precisely. Microsoft, Amazon, and Baidu all made moves, and Alibaba also invested in an image search shopping website (now called "Tao Tao Search"), which identifies physical objects in images and indexes the corresponding store links on the Internet.
Unfortunately, with the advent of the mobile Internet era, the image search trend quickly died down. As real-life photos taken with mobile phones became common, search results grew increasingly unpredictable, the image search experience suffered badly, and many startups were on the verge of bankruptcy.
"Comparing real-life photos is much harder than comparing the original product photos used on PC. Traditional image search technology simply cannot handle it," said Li Hao.
Since traditional image search technology was no longer viable, could deep neural networks, which had performed so impressively in computer vision, do the job? Li Hao spent the entire National Day holiday testing this idea.
"He was very excited and kept showing it to everyone, promoting it vigorously," Li Hao said, recalling the moment he handed the demo to his supervisor. That earned the team a chance to present it to the then CEO of Taobao. This time the demonstration ran directly on a mobile phone: a photo was taken with the phone, compared against the existing images in the library, and similar images were found and displayed. Performance was double that of traditional algorithms.
Soon, the "Image Search" project was officially launched in 2014, with the goal of landing on the Taobao Mobile (mobile Taobao application) platform. Pan Pan, who had just been at Alibaba for three months, was appointed as the person in charge, taking care of the coordination of algorithms, engineering, and products, and the team was full of strength. Pan Pan graduated with a Ph.D. from the University of Illinois at Chicago. Previously, he worked in the field of vision research and development at Mitsubishi Boston Research Institute in the United States and Fujitsu R&D Center in Beijing.
Pan Pan is a senior algorithm expert in the field of visual intelligence research at DAMO Academy
Continuing the technical path promoted by the previous team, "Image Search" adopted deep learning, becoming the first consumer-facing (C-end) product in Alibaba's history to use deep learning and go online.
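The core of this kind of image search, comparing a phone photo against images already in the library and returning the most similar ones, can be sketched as a nearest-neighbor lookup over feature vectors. The following is only a minimal illustration, not Alibaba's implementation: the three-dimensional vectors stand in for features that a deep network would produce, and the names `cosine_similarity`, `search_similar`, and the sample library are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def search_similar(query_vec, library, top_k=3):
    """Return the top_k (image_id, score) pairs most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), image_id)
        for image_id, vec in library.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(image_id, score) for score, image_id in scored[:top_k]]


# Hypothetical library: image id -> feature vector (made-up values).
library = {
    "red_dress": [0.9, 0.1, 0.0],
    "red_skirt": [0.8, 0.3, 0.1],
    "blue_jeans": [0.0, 0.2, 0.9],
}

# Features of a photo taken with the phone (also made up).
results = search_similar([1.0, 0.2, 0.0], library, top_k=2)
for image_id, score in results:
    print(image_id, round(score, 3))
```

A production system would replace the linear scan with an approximate nearest-neighbor index, since a shopping catalog holds far too many images to compare one by one.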
Unlike most Internet companies, which start from strategy, Alibaba did not approve sweeping projects in the early stages of technological exploration. Instead, it looked for niches within its existing core businesses, ran restrained and cautious experiments, and only then pushed for implementation.
"When an organization has fewer algorithm and R&D people and is made up mostly of business and product people, everyone's understanding of technological uncertainty will be very limited," Pan Pan said. "For an Internet company, if you take on a project, you must complete it and show results."
This was a difficult yet fortunate process. When action precedes understanding, difficulties such as a lack of resources, a lack of trust, and an inability to demonstrate results follow. This is dictated by the profit-seeking nature of commercial companies, and it is also a test that new things must pass in their embryonic stage.
Fortunately, both top-down idealism and bottom-up innovation were preserved, and they were spared the fate of becoming short-lived slogans and ideas.
As long as the spark is there, it can start a prairie fire.
2. Sitting on a mountain of gold and eating steamed buns
"Sitting on a mountain of gold and eating steamed buns" is what Ma Yun said when Qi Yuan joined iDST. The mountain of gold is the rich data owned by Alibaba. But even if you sit on a mountain of gold and eat steamed buns, it is difficult to become a big fat man in one bite. "If the value of data cannot be mined, it is just ordinary soil."
With the popularization and application of deep learning algorithms and models, "parameter adjustment" has become a daily routine for most algorithm engineers, and the search teams of Taobao and Tmall were no exception at the beginning.
Due to the unexplainable nature of deep learning algorithms, many solutions based on this technology are like a "black box". The selection and adjustment of parameters in the model have become an elusive task, which often means tedious and clueless work with no technical content.
In Qi Yuan's view, parameter adjustment alone is far from enough to establish a technical system. "Although it is an engineering job, it still requires scientific guidance - the best engineering guidance is science, otherwise you can only be a parameter adjustment engineer."
Jin Rong shares Qi Yuan's view. "We used to do some parameter adjustment work, until Jin Rong came and put us on the right track," Li Hao said. "He often asked us: why does deep learning work? Can you explain it theoretically?"
After the "Image Search" project, Li Hao moved to the Search Technology Department, one of Alibaba's core algorithm departments. There he met Jin Rong, who had come to the front line of the business.
Li Hao's main job at the time was to compress and accelerate deep learning models. The general approach was to apply existing models, but Jin Rong would usually provide new ideas. "He gave us a bunch of formulas and asked us to try them," but this trial lasted for three months and no results were produced.
When Li Hao and his colleagues approached Jin Rong with trepidation, he did not blame them, but encouraged them, saying, "If you can do it in three months, it's too easy. Keep doing it!" It was not until the fourth month that the algorithm finally worked. This algorithm combines embedding technology with deep learning and introduces it into the search business, significantly increasing the GMV of Taobao's main search.
Li Hao recalled that Jin Rong also wrote a very long theoretical proof that the algorithm converges and shared it internally. "The theoretical guidance he gave us at the time was exactly what we lacked," said Li Hao, who remains very grateful for it.
Qi Yuan, who had moved to Ant Financial, took on the intelligent customer service project, which aimed to solve Alipay's customer service problems with intelligent interactive robots. This time things went much more smoothly. After winning the support of Dai Shan, then head of the group's customer service department and one of the 18 Arhats who founded Alibaba, he quickly obtained the funds and resources to verify the technology.
In the early days of Alibaba Technology’s development, a driving force of idealism was formed, represented by Alibaba partners.
During the 2015 Double 11 shopping festival, Alipay customer service, the first to adopt deep learning technology, achieved 94% voice self-service, meaning that 94% of incoming calls no longer needed to be transferred to a human agent. The following year, this figure rose to 97%. Even after deducting the artificial intelligence team's salaries and computing resource costs, the intelligent customer service project saved the company more than 100 million yuan.
As the saying goes, "know people and use them well, and make the best use of their talents." The same is true of technical tools: only by understanding AI can we use it well.
It is not easy to establish awareness and belief in new technologies in an Internet company. This sets up one obstacle after another for scientists and even inevitably causes staff turnover.
But looking back, perhaps it was the experience of working together "up and down the mountains" that truly opened up the dialogue system between "R&D" and "business", allowing the highbrow and the lowbrow to blend together.
After technology comes the advanced challenge of product engineering.
Even with the support of senior management, nothing was guaranteed. If anything, the support brought greater pressure. In its first year, the Image Search project set a clear goal of more than one million daily active users. "From the beginning, it was no longer an experiment."
Unlike the initial exploration of deep learning algorithms, the challenges in the later stages were like a bottomless pit that could never be filled.
"The key is that we are not making an independent app, but putting it on Taobao Mobile," Pan Pan said, "and it is Alibaba's core business platform." Landing on Taobao Mobile meant that image search had to call Taobao Mobile's underlying interfaces and required additional customization of Taobao's internal link architecture. The biggest challenge was getting these links to work together end to end.
In the field of vision, the compression of large-scale images consumes a lot of computing power, which poses a hidden danger to large-scale image search and access. Pan Pan still remembers an unexpected alarm.
One day, the image search server suddenly crashed and an alarm sounded in the background.
After an emergency investigation, the team discovered that the default compression function for image uploads in the Taobao backend had brought down the server. The default compression is designed mainly for low-frequency, low-volume media uploads and does not account for the special case of image search, which involves large data volumes and real-time recognition and for which compression had therefore already been built into the front end. In other words, Taobao's default image compression was a redundant burden for image search.
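The article does not say how the team resolved the problem. One common way to avoid this kind of redundant work is to let an upload declare that it is already compressed, so the backend can skip its default step. The sketch below assumes exactly that: the `Upload` type, its `precompressed` flag, and `prepare_for_storage` are hypothetical, and `zlib` is only a stand-in for whatever image codec a real pipeline would use.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Upload:
    payload: bytes
    precompressed: bool  # hypothetical flag set by the client (front end)


def prepare_for_storage(upload, compress=zlib.compress):
    """Apply the default backend compression only when the client has not
    already compressed the payload."""
    if upload.precompressed:
        # Image search path: the front end already did the work, so
        # compressing again would only burn server CPU.
        return upload.payload
    # Default path for ordinary, low-volume media uploads.
    return compress(upload.payload)


raw = b"pixel" * 200

ordinary = prepare_for_storage(Upload(raw, precompressed=False))
from_search = prepare_for_storage(Upload(zlib.compress(raw), precompressed=True))

print(len(raw), len(ordinary), len(from_search))
```

Making the bypass an explicit property of the request keeps the default behavior intact for every other caller of the upload interface.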
Before the alarm was raised, everyone overlooked such a subtle interface. Pan Pan said, "This is often the case. Even if we consider it well, problems will still occur if we connect it to a larger system."