Sun Jian: Megvii does not need championships to prove itself; optimistic about automated machine learning
Guo Yipu, reporting from the Charles Kao Auditorium
A QbitAI report | Official account: QbitAI
"Megvii has been established for eight years and does not need these championships or competitions to prove itself."
At the Charles Kao Auditorium in Hong Kong Science Park, Sun Jian, director of Megvii Research Institute, expressed his views on Megvii winning six championships at this year's CVPR.
Whether during his earlier years at MSRA or now as chief scientist at Megvii, Sun Jian has always put practicality first in his research.
We asked Sun Jian six questions during the AI and Vision Summit in Hong Kong.
First, a brief overview of the main points Professor Sun Jian made:
1. The Greater Bay Area has a lot of top AI talent, and Shenzhen, a manufacturing center, exposes AI companies to a wide range of practical needs.
2. He is optimistic about automated machine learning (AutoML), which can improve efficiency on the one hand and lead to new discoveries on the other.
3. There is only one way for newcomers to grow: keep tackling very difficult problems that they were not good at before.
The following content has been lightly edited by QbitAI without changing the original meaning.
What is the difference between MSRA and Megvii in conducting research?
Sun Jian: People have a misunderstanding about MSRA, thinking that it is all about research. It is true that many people in Microsoft Research are doing research, but there are also many people working on technology and applying it to products.
So the work at MSRA and at Megvii is very similar, and the principles and ways of doing things are basically the same. My own style is fairly practical, so I want the research I do to generate value as soon as possible.
I joined Megvii because deep learning has a huge impact and can change the world in a big way. Startups are closest to the front line and can apply these technologies the fastest. Deep learning also has a wide variety of application scenarios. Large companies may not pay attention to many of them because they have their main business to look after, but small companies will work on them. In doing so they see the essential problems the industry needs solved, which brings a different understanding.
Whether at MSRA or Megvii, the general principles are basically the same. Research requires finding the core problem, investing energy and time, and persisting in doing it. The quality of "persistence" is very important no matter where you are. Image recognition is a long-term task and we have to invest a lot of energy in it.
We are the company's R&D platform. Much of the time we are working to improve accuracy, which is the same as the research process: striving for the best possible result. So we need to solve problems with research methods, focus on the core issues of vision, keep investing people and energy, and keep moving forward so that our product technology continues to improve. This is why we can combine research and product technology well.
The scale of MSRA and Megvii is different. In the past, Microsoft did not have that many people working on this. There were probably 50 computer vision researchers in all of Microsoft Research, which was already very large by world standards at the time. Now we have nearly 500 computer vision researchers at Megvii Research Institute. We can devote more energy and more people to solving computer vision problems.
In addition, Megvii develops its own underlying training systems and Brain++, its foundational deep learning framework. We have a dedicated engineering team building our own deep learning engine.
How do you view Megvii's successive world championships?
Not long ago, QbitAI reported that Fan Haoqiang, the algorithm director of Megvii Research Institute, won the fourth world championship at CVPR. Previously, Megvii also won several MS COCO project championships.
Sun Jian: We are called a research institute, but we have always been a research institute that puts product technology first. In other words, our research results can be applied to products, directly or indirectly. This is also a characteristic of computer vision: once something is developed, it can be widely used. We do not put the cart before the horse by trying to win championships to prove ourselves.
Megvii has been established for 8 years and does not need these championships or competitions to prove itself. Instead, it needs stronger, best-in-class, differentiated products to prove itself.
This year they won 6 championships at CVPR, and I didn't even know beforehand that they had entered the competitions.
As for publishing papers, I neither encourage nor oppose it. Many papers are written by interns, and we will also guide them.
What exactly is Brain++, the "championship-winning weapon", and how is it different from open source frameworks?
Sun Jian: At Megvii, Brain++ has two meanings. In the narrow sense, Brain++ refers to our core training engine. When we started working on it, there was no TensorFlow. TensorFlow came out after the first version of Brain++ was released, and at that time it was not very mature. Ours was better than TensorFlow, so we have kept using our own Brain++.
Its comparative advantage is that Brain++ can make many optimizations specific to computer vision.
TensorFlow is a large codebase. Although it is open source, some core parts are still controlled by Google and are updated regularly. The various applications we build, however, require us to make the improvements we want to the deep learning training engine quickly. As the martial arts saying goes, speed is the one thing that cannot be beaten, and market competition demands that we be fast. With our own Brain++, if we want a feature, it may be ready and in the system by next week, so we can use it as soon as possible, which speeds up development.
As R&D has expanded, Megvii has built its own AI technology ecosystem around the core training engine, and this is Brain++ in the broad sense. As a team collaboration platform and algorithm factory, Brain++ includes not only the original training framework but also a data management platform and a computing platform.
Today, when Megvii talks about Brain++, it mostly means Brain++ in the broad sense. As a company-level AI training platform, it needs to manage tens of thousands of GPUs and let many people work together while using these computing resources efficiently. The amount of data is very large, and standard open source systems cannot handle these things.
One feature of Megvii's Brain++ platform is that everyone can log in to it as if it were a virtual machine. At other large companies, people debug first and then submit jobs. Megvii's virtual machine approach offers a desktop-like experience while running on a large-scale system, so training and debugging can happen at the same time. This is something others cannot do, and it greatly improves researchers' efficiency.
In addition, when many people share computing resources, idle resources can be picked up automatically by other users. This efficiency machinery is also managed by Brain++.
The Brain++ platform also supports open source frameworks such as TensorFlow and PyTorch. At present our engine is very complete, so people choose our own Brain++ first of their own accord. The tool's learning curve is very gentle, and newcomers can pick it up quickly.
What are the differences between AI development in Hong Kong and Beijing?
Sun Jian: Not only Hong Kong: governments across the entire Greater Bay Area strongly encourage the development of AI and are creating good conditions for it. Hong Kong's universities have trained high-quality students, and the student talent pool is very good. I am here to take part in the discussion of AI in the Greater Bay Area, and I hope we can cover the Bay Area.
In Hong Kong's industry and academia, everyone knows the product Face++ but may not be familiar with Megvii. The product is better known than the company.
In addition, we have long-term collaborations with several computer vision professors in Hong Kong, and we run the Megvii-HKUST Joint Laboratory with Quan Long (Professor at the Hong Kong University of Science and Technology). This laboratory mainly focuses on combining 3D and recognition. We also train talent jointly.
Comparing the AI talent environments of Beijing and Hong Kong: Beijing is characterized by high talent density and a large total number, and it is the city with the most universities in the country. The Guangdong-Hong Kong-Macao Greater Bay Area has many top AI people, rapid development, a good environment, and a strong pull on talent. One advantage is that Shenzhen, the world's manufacturing center, is here, so there is demand from many different industries. Starting a business, or doing anything else, has to begin with demand, so this is a great advantage: you can be closer to customers.
What new AI technologies are currently promising?
Sun Jian: Automated machine learning (AutoML) is a promising direction. It is not limited to network architectures. In fact, it has already become a general concept. In the R&D pipeline, loss functions, training data sampling and augmentation, and hyperparameters can all be searched. It is not limited to simply searching for a network structure; it opens a door to integrating many new ideas. This set of search tools and methods opens up many research opportunities.
It will bring about many changes in the future. On the one hand, it will improve efficiency and eliminate the need for manual tuning; on the other hand, it may really be able to discover things that cannot be discovered manually.
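To make the idea of searching the whole pipeline concrete, here is a minimal sketch of random search over a joint space of loss function, augmentation policy, learning rate, and network depth. It is not Megvii's method or any part of Brain++: the search space, the option names, and the toy `evaluate` function (a stand-in for actually training a model and measuring validation accuracy) are all assumptions made for illustration.

```python
import random

# Hypothetical search space: every name and value is illustrative only.
SEARCH_SPACE = {
    "loss": ["cross_entropy", "focal", "label_smoothing"],
    "augmentation": ["none", "flip", "flip+crop"],
    "learning_rate": [0.001, 0.01, 0.1],
    "depth": [18, 34, 50],
}

# Toy per-choice bonuses so the sketch runs without any training.
TOY_BONUS = {
    "loss": {"cross_entropy": 0.00, "focal": 0.03, "label_smoothing": 0.02},
    "augmentation": {"none": 0.00, "flip": 0.02, "flip+crop": 0.04},
    "learning_rate": {0.001: 0.01, 0.01: 0.03, 0.1: 0.00},
    "depth": {18: 0.00, 34: 0.02, 50: 0.03},
}


def sample_config(rng):
    """Draw one candidate by picking a value for every pipeline component."""
    return {name: rng.choice(choices) for name, choices in SEARCH_SPACE.items()}


def evaluate(config):
    """Stand-in for 'train a model and return validation accuracy'.

    A real system would run training here; this toy score is an assumption.
    """
    return 0.5 + sum(TOY_BONUS[name][value] for name, value in config.items())


def random_search(num_trials, seed=0):
    """Sample num_trials configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(num_trials):
        config = sample_config(rng)
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


if __name__ == "__main__":
    config, score = random_search(num_trials=50)
    print(config, round(score, 3))
```

Random search is only the simplest possible strategy; the point of the sketch is that the loss and the augmentation policy sit in the same search space as the architecture and the hyperparameters, which is the broader sense of AutoML described above.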
The choice of CV newcomers
Sun Jian: What young people who come to Megvii want most is growth. At least in the first 3 to 5 years after graduation, the top priority is how to grow quickly.
We find them tasks that are difficult and challenging enough to help them grow. During research and development we care not only about the projects but also about each person, and about how to motivate them so that they grow faster.
We have also organized people into teams, and the leaders of these teams have grown a lot since I first arrived. There is only one way to grow, and that is to keep solving very difficult problems that you were not good at before. This is the only way to keep growing. Once these tiers of talent are in place, they will continue to grow alongside more new young colleagues.
Anxiety during growth is a necessary part of the process. Without anxiety, you may grow very slowly. You need to go through this anxious phase, just as there is a dark, confusing period in doctoral study when you cannot see any hope. After this dark period, you come out stronger.
When facing anxiety, first, our institute emphasizes the courage to grow, an open mind, and a growth mindset. Many things can be changed. Second, we value the courage to do difficult and challenging things, and we must accept challenges bravely.
For newcomers in the field of computer vision, I hope they can do things in a down-to-earth manner, understand the basic principles of things, have a growth mindset, and constantly improve themselves.
If you study for a doctorate, the pace is relatively slow over those 3 to 5 years, which lets you think deeply about a problem and really work through it in one direction. If you go to a company, there are many people doing high-level research. Compared with many laboratories doing computer vision, we are at the forefront, which can help you grow faster.