Lenovo CTO Rui Yong: How is Lenovo transforming and planning for artificial intelligence?

Last updated: 2018-03-21


Text | Yi Xin

Report from Leiphone.com (leiphone-sz)

Dr. Rui Yong officially resigned as Executive Vice President of Microsoft Research Asia in November 2016 and joined Lenovo Group as Chief Technology Officer and Senior Vice President. He is responsible for planning and setting Lenovo Group's technology strategy and R&D direction, and for leading Lenovo Research Institute. At the end of 2017, Dr. Rui Yong was elected an ACM Fellow for his contributions to image, video, and multimedia analysis, understanding, and retrieval.

In the little more than a year since he took office as Lenovo's CTO, Lenovo has, in Dr. Rui Yong's words, been transforming "from device/infrastructure only to device + cloud and infrastructure + cloud, powered by AI". What advantages does Lenovo Research Institute have in developing artificial intelligence, and how will Dr. Rui Yong's expertise in multimedia computing be combined with Lenovo's products and businesses?

Leiphone.com's AI Technology Review recently conducted an exclusive interview with Dr. Rui Yong. This article is compiled from an interview that the Association for Computing Machinery (ACM) conducted with Dr. Rui Yong, and has been edited and abridged without changing the original meaning.

How did you get into multimedia computing research?

When I was studying for my undergraduate and master's degrees, I majored in control theory and large-scale system optimization. This background played an important role in my later multimedia research, for example in "relevance feedback", neural networks, and deep learning.

I started working on multimedia analysis and retrieval during my PhD at the University of Illinois at Urbana-Champaign. The Internet was still in its infancy then: web browsers had just appeared, and search engines had not yet been invented. The idea of image search was well ahead of its time.

I also encountered a great opportunity. The National Science Foundation of the United States established and funded the "Digital Library" project, and I was fortunate to be involved in it. I combined three fields (control theory, information retrieval, and computer vision) and conducted in-depth interdisciplinary research. Eventually, I became one of the first researchers to implement image search based on relevance feedback, creating a completely new model for image search. "Relevance feedback" is a method of improving search results by analyzing a user's previous search results and behavior patterns.
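To make the idea of relevance feedback concrete, here is a minimal sketch. It uses the classical Rocchio update from text retrieval, which is an assumption made for illustration and not necessarily the method Dr. Rui Yong used in his image-search work. The two-dimensional feature vectors, the image names, and the weights `alpha`, `beta`, and `gamma` are all hypothetical.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Sequence[float]], dim: int) -> Vector:
    """Component-wise mean; a zero vector if no examples were marked."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio_update(
    query: Sequence[float],
    relevant: Sequence[Sequence[float]],
    non_relevant: Sequence[Sequence[float]],
    alpha: float = 1.0,
    beta: float = 0.75,
    gamma: float = 0.15,
) -> Vector:
    """Move the query toward results marked relevant and away from the rest."""
    dim = len(query)
    pos = mean_vector(relevant, dim)
    neg = mean_vector(non_relevant, dim)
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]


def rank(query: Sequence[float], images: Dict[str, Vector]) -> List[str]:
    """Return image names sorted by squared Euclidean distance to the query."""
    def sq_dist(v: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(query, v))
    return sorted(images, key=lambda name: sq_dist(images[name]))


# Hypothetical features, e.g. (amount of warm color, amount of green texture).
IMAGES: Dict[str, Vector] = {
    "sunset_a": [0.9, 0.2],
    "sunset_b": [0.7, 0.1],
    "forest_a": [0.1, 0.8],
    "forest_b": [0.3, 0.9],
}

if __name__ == "__main__":
    query = [0.4, 0.6]
    print("before feedback:", rank(query, IMAGES))
    # The user marks one result as relevant and one as not relevant.
    refined = rocchio_update(query, [IMAGES["sunset_b"]], [IMAGES["forest_b"]])
    print("after feedback: ", rank(refined, IMAGES))
```

In this toy setup, one round of feedback is enough to move the sunset images to the top of the ranking, which is the behavior the definition above describes: the system learns what the user wants from their reaction to earlier results.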

After receiving my PhD, I began an 18-year career at Microsoft, where I continued to work in the areas of multimedia analysis, understanding and retrieval, machine learning, computer vision, and pattern recognition.

Now, as Lenovo's Chief Technology Officer and leader of Lenovo Research Institute, I will continue to lead the team to advance the development of multimedia computing and incorporate the most cutting-edge multimedia research results into Lenovo's products and services.

What progress and applications has Lenovo made in artificial intelligence research? What difficulties and challenges are there?

Lenovo Research Institute has established a company-level artificial intelligence platform to support research in areas such as computer vision, speech, and natural language understanding.

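In computer vision, we have made significant progress. For example, at Lenovo Tech World 2017 we demonstrated E-Health, an intelligent medical-image assisted-diagnosis solution for the medical field. It integrates cutting-edge deep learning algorithms, relies on the powerful computing capacity of the Lenovo cloud platform, and draws on the comprehensive diagnostic and treatment experience of many medical experts. On the one hand, it reduces doctors' workload and helps avoid misdiagnoses caused by factors such as doctor fatigue; on the other hand, it can intelligently analyze medical images and automatically provide doctors with auxiliary diagnostic opinions.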

In the 2017 Liver Tumor Segmentation Challenge, the E-Health team beat all other competitors and took first place.

We have also developed Lenovo's first-generation Morningstar AR glasses and an AR platform that integrate advanced SLAM and computer vision technology, and we are committed to combining AR with vertical industries such as industrial maintenance, education, and training.

In natural language understanding, Lenovo released its first smart speaker, which supports operations such as choosing songs, checking the weather, and listening to the radio through voice interaction. Its natural language understanding and conversation engine are research results of Lenovo Research Institute. The institute provides the smart speaker with a multi-turn, multi-domain, context-aware conversation engine whose interactive experience and accuracy are among the best in China.

In the field of human-computer interaction, the automatic speech recognition platform developed by Lenovo Research Institute enables users to complete basic operations such as making phone calls, as well as Internet-based services such as checking the weather and calling a taxi through voice. Currently, the platform has been applied to pre-installed services such as the intelligent voice assistant and application store of Moto mobile phones.

In addition, the Lenovo Xiaole intelligent customer service solution we developed organically combines artificial intelligence customer service robots and human customer service representatives to provide services to customers at any time and any place using the multimedia methods (text, pictures, voice) that customers prefer.

Judging from the papers accepted at ACM MM 17, vision is still an important research direction in the multimedia field. In recent years, work combining computer vision and NLP has become increasingly common. Given this convergence of research fields, what R&D advantages does Lenovo have compared to other companies?

Yes, the combination of computer vision and natural language understanding is becoming more and more diverse. Lenovo Research Institute has achieved a great deal in this area, such as E-Health, which I just mentioned and which we demonstrated at Lenovo Tech World 2017. It can intelligently analyze medical images and automatically provide doctors with auxiliary diagnostic opinions.

From a technical perspective, artificial intelligence algorithms represented by deep learning are driving multimedia research and will continue to do so. In particular, deep learning has recently produced better multimodal algorithm frameworks, making it possible to effectively integrate, use, and retrieve multimedia data across domains.

Take image and video captioning as an example. A few years ago, captioning just meant automatically tagging images or videos. Deep learning, however, has established a connection between computer vision and natural language processing, turning scattered tags into a coherent natural language description based on visual content. This is a typical cross-domain application that requires not only understanding vision, but also knowing how to model natural language.

As related fields and hardware develop further, image and video description will be able to describe content in a full paragraph (or multiple sentences) of natural language, and will support more natural user interaction. The supported modalities will also go beyond computer vision and natural language processing; for example, voice features, spatial depth information, and text features can be incorporated.

Lenovo has invested heavily in artificial intelligence algorithms. The number of researchers in Lenovo Research Institute's artificial intelligence laboratory has grown to more than 100, and the lab has attracted top talent from around the world.

In addition to algorithms, Lenovo has many advantages in developing artificial intelligence, including big data, computing power, and device-to-cloud integration.

Big data: Lenovo has also invested heavily in big data. We operate the largest data cluster of any manufacturing enterprise in China, with more than 12 PB of data, 30 TB of data added every day, and more than 15 billion pieces of information processed.
Computing power: Lenovo has strong computing power. 87 of the world's top 500 supercomputers are Lenovo systems. Lenovo has ranked first in China and second in the world on the HPC TOP 500 list for the fourth consecutive time, and has become the fastest-growing HPC manufacturer in the world, with a growth rate of 17%.

Lenovo actually has a very good understanding of vertical industries. No matter how good an algorithm is, it must be combined with vertical industries, which is also Lenovo's advantage.

In addition, we have unique advantages from device to cloud. Lenovo has devices as the entry point, which allows devices and services to be better integrated, and a cloud at the back end. Through the cloud and artificial intelligence technology, we can better understand user needs and therefore provide better, more considerate, and more personalized services. The three elements of devices, services, and cloud combine to form a mutually reinforcing positive feedback loop.

In terms of the R&D team, we are vigorously building our innovation team. At the end of last year, I was elected a Fellow of the Association for Computing Machinery (ACM), a very prestigious organization, for my contributions in the fields of image, video, and multimedia analysis, understanding, and retrieval. I am also the first ACM Fellow from a company in mainland China. In addition, the head of Lenovo Research Institute's AI Lab is Dr. Xu Feiyu, formerly of the German Research Center for Artificial Intelligence (DFKI), and Dr. Hans Uszkoreit, a member of the European Academy of Sciences, is our chief AI consultant. I believe that with the efforts of many outstanding people, Lenovo's innovation capabilities will be greatly enhanced.

You joined Lenovo as CTO in November 2016. Under your leadership, in just over a year, artificial intelligence has become an important pillar of Lenovo's strategic transformation toward "device + cloud" and "infrastructure + cloud". From an R&D perspective, which artificial intelligence technologies is Lenovo focusing on at present? What are its policies and plans?

In terms of technological research and development, Lenovo Research Institute is currently increasing research and development in key artificial intelligence technology areas such as computer vision, speech, natural language understanding, situational awareness, and knowledge graphs.

In terms of strategic planning, Lenovo and Lenovo Research Institute will focus on three directions: smart devices, smart cloud platforms, and smart services.

Lenovo is a very powerful device company. We will continue to develop new types of smart devices, not only devices in the traditional sense, but also some devices that can be closely connected with people and can be held in the hands and worn on the body.

We will also vigorously develop software-defined data centers and cross-platform intelligent cloud management platforms to build more intelligent data centers.

In addition, we have also established a company-level artificial intelligence platform, through which we connect devices and services to create vertical field solutions, such as the smart healthcare mentioned earlier, to empower industry transformation and development.

Lenovo currently has three major business groups: PC, mobile phone, and data center. Beyond using PCs and mobile phones as device entry points and data as the basis for multimedia content algorithms, what other innovations and possibilities do you see for driving the application and adoption of multimedia content across these businesses?

First of all, PCs and mobile phones will change in the future. With the rapid development of 5G, we are focusing on the research of the next generation of PCs and mobile phones. I believe that they will support richer multimedia content and experience.

In addition, Lenovo Research Institute is committed to researching and developing the new smart devices of the future, including wearable devices and AR devices, and we will integrate multimedia technology into them. For example, SmartCast+, which Lenovo demonstrated at Lenovo Tech World 2017, is the world's first smart speaker prototype with object recognition capabilities and an AR experience. It extends artificial intelligence from sound to the higher levels of image, interaction, and recognition, greatly enriching the user experience.

Speaking of AR, it is now entering a period of great development. In the future, AR may have more diverse forms, such as transparent display overlay, projected display, and more augmented senses.

In addition, multimedia content will also have broad application prospects in vertical industries. For example, the Lenovo Morningstar AR I mentioned earlier is very useful in industries such as industrial maintenance and repair, and education.

From a technical perspective, the back-end training platform is the key to training multimedia content models more efficiently. For example, the company-level artificial intelligence platform we built is a distributed deep learning platform that supports multiple open source frameworks and distributed task scheduling, and uses multi-node parallelism to accelerate experiments, algorithm research, and model iteration across multiple AI applications. It has sufficient and effective training data, including both public industry datasets and Lenovo's own accumulated big data.
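As a rough illustration of what multi-node parallel training means, the sketch below simulates synchronous data-parallel gradient descent in a single process. It is not a description of Lenovo's platform: the "workers" are plain Python lists, the model is a one-variable linear regression, and names such as `shard`, `local_gradient`, and `train` are hypothetical. In a real system, each shard would live on a separate node and the averaging step would be a network all-reduce.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y)


def shard(data: Sequence[Sample], num_workers: int) -> List[List[Sample]]:
    """Split the dataset round-robin so each worker holds one shard."""
    return [list(data[i::num_workers]) for i in range(num_workers)]


def local_gradient(w: float, b: float, samples: Sequence[Sample]) -> Tuple[float, float]:
    """Gradient of mean squared error for y ~ w * x + b on one worker's shard."""
    n = len(samples)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in samples) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in samples) / n
    return grad_w, grad_b


def train(
    data: Sequence[Sample],
    num_workers: int = 4,
    learning_rate: float = 0.1,
    steps: int = 2000,
) -> Tuple[float, float]:
    """Synchronous data parallelism: every step averages the workers' gradients."""
    shards = shard(data, num_workers)
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Each worker computes a gradient on its own shard (in parallel on a real cluster).
        grads = [local_gradient(w, b, s) for s in shards]
        # Stand-in for an all-reduce: average the gradients across workers.
        grad_w = sum(g[0] for g in grads) / len(grads)
        grad_b = sum(g[1] for g in grads) / len(grads)
        # Every worker applies the same update, so the model replicas stay identical.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy dataset generated from y = 3x + 1 (an assumption for the demo).
DATA: List[Sample] = [(i / 10.0, 3.0 * (i / 10.0) + 1.0) for i in range(20)]

if __name__ == "__main__":
    w, b = train(DATA)
    print(f"learned w={w:.4f}, b={b:.4f}")
```

Because the shards here are equal in size, averaging the per-worker gradients gives the same update as computing the gradient on the full dataset, so adding workers changes how the work is divided without changing the result.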

Smartphones are one of the main channels for people to consume multimedia content, and Lenovo is also a smartphone manufacturer. Based on the current research and product development progress, what do you think the smartphones of the future will be like?

From a technological perspective, in the future, developments such as artificial intelligence, VR/AR, 5G, real-time translation, new battery technology, and holographic technology will profoundly change smartphones and user experience.

Specifically, bezel-less screens, neural network processors (NPUs), and more sensors may appear on smartphones. In terms of sensors, mobile phones will integrate biometric sensors, depth cameras, multiple cameras, and better computer vision technology. In addition, the development of 5G will bring 10 times the bandwidth and zero latency to smartphone users.

The form factor of smartphones may also change significantly. One possibility is foldable phones. For example, in 2016, Lenovo Research Institute developed the industry's first true foldable phone prototypes, CPlus and Folio. CPlus can transform between a phone and a watch, while Folio can switch between a tablet and a phone at will.

Lenovo is moving into the AR/VR space with products like Lenovo VR Classroom and Lenovo Mirage, a Star Wars Jedi Challenges AR device jointly launched by Disney and Lenovo. AR/VR technologies have been around for decades, so why are they going mainstream now?

Yes, AR/VR technology has been around for decades. However, recent technological breakthroughs, such as optical lenses, computer vision, and SLAM (simultaneous localization and mapping), have accelerated the development of AR/VR technology, and its huge potential has begun to emerge. In addition, AR/VR can help solve many pain points in the industry and bring users a new entertainment experience.

I personally think that, compared to VR, AR is likely to become a larger and more promising platform in the future, especially when it is combined with vertical industries such as education, training, and industrial maintenance. At Lenovo Tech World 2017, we demonstrated a prototype of the Morningstar AR glasses developed by Lenovo Research Institute, along with our AR platform. An engineer demonstrated on site how to use these AR devices and the platform to repair a faulty aircraft engine, vividly illustrating the broad application prospects of AR technology in vertical fields.
