Technology entrepreneur Xia Fen: Using AI to create AI
Text | Liu Fangping
Report from Leiphone.com (leiphone-sz)
In recent years, many AI scientists have left Baidu to start their own businesses, and Xia Fen is one of them. In June last year, he founded Zhiyuan Technology, which focuses on developing an automated machine learning platform (AutoML). The company's product is called Ebrain, and it has received two rounds of financing so far.
The purpose of AutoML is to automate machine learning modeling or, in layman's terms, to use AI to create AI. More precisely, the aim is not to automate the entire process of applying artificial intelligence, but to lower the barrier to using the technology so that more people can use it. In an interview with Leifeng.com, Xia Fen explained this in detail.
As a technical entrepreneur, Xia Fen also expressed to Leifeng.com the challenges he encountered in the process of transforming into an entrepreneur. He said,
Entrepreneurship is different from scientific problems. Scientific problems have clear boundaries and are either zero or one. But running a startup involves many factors, and sometimes they cannot be resolved by scientific methods. They may call for more artistic and ambiguous methods.
About Xia Fen
Dr. Xia Fen graduated from the Institute of Automation, Chinese Academy of Sciences, where he studied under Professor Wang Jue, a leading figure in machine learning. He is the founder and CEO of Zhiyuan Technology Co., Ltd., which focuses on automated machine learning platform products.
With more than 15 years of research and application experience in machine learning, he was previously a senior scientist at Baidu, where he led the company's ultra-large-scale machine learning team. He developed Pulsar, an automated machine learning platform built on an ultra-large-scale discrete sparse architecture, which covered more than 80% of the company's business lines, including Baidu's core commercial monetization system Phoenix Nest (Fengchao), Finance, and Nuomi. It ranked first in number of users among the company's internal machine learning platforms.
He has published many papers in top machine learning journals and conferences such as JMLR, ICML, and NIPS.
Entrepreneurship: The passion of technical people and the olive branch of capital
"Technical people have ideals. They hope their research and development will be recognized, that they can build world-class technology, and that its influence will be as great as possible," Xia Fen told Leifeng.com.
Looking back on his time as a PhD student in machine learning, Xia Fen said that one way to gain recognition for a technology was to publish papers at top conferences. Later he realized that technology needs to be put into practice and to influence others, so he joined Baidu and developed a set of AutoML technologies in Network Alliance, Baidu's largest advertising business line.
But in a large enterprise, everyone is a cog, and each person's goals are confined to a box. Xia Fen's box was the continuous improvement of the click-through rate (CTR) in the Network Alliance's click prediction system. He wanted a bigger platform, so he moved to the Big Data Laboratory (BDL) of the Baidu Research Institute. There, Xia Fen launched the industry's first commercial online learning system based on a trillion-scale deep learning network, along with the fully automated machine learning platform Pulsar. Pulsar was widely adopted inside the company, covering most business lines, including Phoenix Nest, Network Alliance, Finance and Nuomi, and was very well received.
"Among the internal platforms we ranked first, and we were used by 30 business lines within two years," Xia Fen told Leifeng.com.
During this process, he discovered that his influence could be further expanded, so he thought of going beyond Baidu and applying technology to various industries.
In addition to his passion as a technologist, Xia Fen was also inspired by the national policy of "mass entrepreneurship and innovation". Moreover, he told Leifeng.com that investors were putting money in front of him at the time. "Some investors keep asking whether you want to start a business. If you do, they say, my funds are right here waiting for you."
The right timing, conditions and people finally prompted Xia Fen to take the step of starting his own business.
He expressed his gratitude to his former employer:
Baidu is a company that attaches great importance to technology, and technical staff enjoy a special status there. Baidu gave me a large number of scenarios. No matter how good technical people are at studying technology, without a scenario and practical problems to work on they cannot accumulate experience or find the problems that would improve their technology. Baidu offers abundant shared resources in data and computing power, and problems of very large scale, so you get good practical training there.
Product: Automated machine learning lowers the barrier to entry for AI
Technological progress must eventually be implemented in actual economic production. This is why AI+ has become something that various industries and even countries have vigorously promoted after the popularity of artificial intelligence in recent years. Andrew Ng said that artificial intelligence is the water and electricity of the future, which means that it needs to have a low enough threshold so that people from all walks of life can easily use it.
However, as the saying goes, different trades are like different mountains. It is not easy to deeply integrate a computer science technology into another industry. There are several ways to solve this problem. One is to train more artificial intelligence experts and let them learn professional knowledge from different industries. Many companies, governments and universities are working hard in this regard, including the AI MOOC Academy under Leifeng.com.
However, the training cycle of artificial intelligence talents is very long, and the shortage of AI talents has long plagued the industry. According to the "Institutions of Higher Education Artificial Intelligence Innovation Action Plan" issued by the Ministry of Education, China's artificial intelligence talent gap exceeds 5 million. Such a huge demand will definitely not be met in a short period of time.
Another direction is to lower the threshold of machine learning, which is exactly what Xia Fen's startup team is doing. Ebrain, Zhiyuan Technology's product, is an automated modeling platform for machine learning. It uses AI to replace the parts of the machine learning modeling process that require a lot of manual work, so that ordinary enterprise technicians can use machine learning easily without having to master it themselves.
Regarding Ebrain, Leiphone.com discussed several key issues with Xia Fen:
Leifeng.com: What kind of market pain points prompted you to choose AutoML?
Xia Fen: From a professional perspective, I have seen firsthand how hard engineers work at tuning parameters, and it is exhausting. I believe engineers must be freed from this repetitive work (high-end talent should focus on forward-looking research).
From the enterprise's perspective, it improves efficiency and saves R&D and labor costs.
For business personnel, it turns the impossible into the possible (we are committed to providing tools that give non-professionals AI capabilities).
Leifeng.com: What are the advantages of AutoML and what are the key problems it solves?
Xia Fen:
Automated model parameter tuning, which saves work and lowers the threshold;
Automatic feature extraction, transformation and combination, to find the effective features that affect the results;
Automatic model structure design, such as how many layers a neural network has and how the layers relate to one another.
Leifeng.com: What are the limitations of AutoML?
Xia Fen: To make it general across scenarios, it may consume somewhat more computing resources, but that will always be cheaper than people.
Leifeng.com: What do you think about the current competition in AutoML in China?
Xia Fen: What we do is similar to Google AutoML, but we also support private deployment by enterprises. Zhiyuan Technology is the first company in this field in China.
Leifeng.com: The goal of machine learning is still to solve specific problems, and to apply it to all walks of life requires a deep understanding of the problems in each industry. Generally, companies that provide customized machine learning services will also be equipped with professionals in the field to help understand the problems, formulate corresponding solutions, and develop corresponding ML models to solve them. To what extent can the current level of AutoML replace this process, and which parts are difficult to replace?
Xia Fen: The business-related parts are difficult to replace with automated machine learning and require the participation of business personnel, such as digitization, data collection, problem definition, and goal setting. Of course, machine learning scientists can master these problems through short-term learning.
The feature extraction, modeling, and optimization steps can be automated.
Leifeng.com: At this stage, AutoML can effectively solve model optimization problems such as model architecture design and hyperparameter selection. There are other requirements in commercial solutions, such as front-end data collection, data preprocessing, and long-term maintenance and evolution of models after they are launched. Do you have targeted technologies for these requirements? Do you have long-term plans?
Xia Fen: Zhiyuan Technology can currently automate preprocessing, feature extraction, modeling, and optimization for enterprises. In the future, ETL and online model evolution will also be integrated into the product.
Leifeng.com: What are the current application cases? Can you introduce one in detail? During the cooperation, what does Zhiyuan Technology provide, what does the company need to do, and what effect is ultimately achieved?
Xia Fen: Take content recommendation as an example. Pharmaceutical companies push content (that is, articles) to doctors through WeChat, email and other channels, and after the push doctors read or like the articles. The task is to predict which content doctors are interested in, based on their characteristics and their history of reads and likes, so that content can be recommended accurately.
The conventional approach is to extract a large number of features from doctors and texts, perform feature selection and transformation, select appropriate algorithms and corresponding hyperparameters, and train models. The optimal features, algorithms, and hyperparameters are selected based on the results on the validation set. All the selection processes are done manually, which consumes a lot of manpower and computing resources.
In response, Zhiyuan builds on structured processing of the text and uses the massive computing power of cloud computing to build a customer interest model automatically through Ebrain in a very short time, providing the core content recommendation capability. Content is ultimately recommended according to each doctor's interests, and by industry standards customer content visits are estimated to increase by more than 50%.
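The validation-based selection described in this answer can be illustrated with a minimal sketch. It is not Ebrain's implementation: the synthetic data, the k-nearest-neighbor model, the candidate list and all function names are assumptions chosen only to show how a loop over candidates, scored on a validation set, replaces manual trial and error.

```python
import random

def make_data(n, rng):
    # Hypothetical synthetic data: the label is 1 when x > 0.5, with 10% label noise.
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.1:
            y = 1 - y
        data.append((x, y))
    return data

def knn_predict(train, x, k):
    # Majority vote among the k nearest training points.
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(y for _, y in nearest)
    return int(votes * 2 > k)

def accuracy(train, valid, k):
    # Fraction of validation points predicted correctly for a given k.
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)

def select_k(train, valid, candidates):
    # Score every candidate hyperparameter on the validation set and keep the best.
    scores = {k: accuracy(train, valid, k) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores

rng = random.Random(0)
train = make_data(200, rng)
valid = make_data(100, rng)
best_k, scores = select_k(train, valid, [1, 3, 5, 9, 15, 25])
print(best_k, scores[best_k])
```

An exhaustive loop like this is only practical for a handful of candidates. Once features, algorithms and hyperparameters are searched jointly, the number of combinations grows quickly, which is the cost the interview attributes to the manual approach.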
Leifeng.com: What does Ebrain mean to the development of artificial intelligence?
Xia Fen: It lowers the threshold for machine learning and lets ordinary engineers and business personnel use machine learning easily, so that everyone can become a data scientist.
Leifeng.com: Currently, large cloud service providers all provide artificial intelligence cloud services, offering strong computing power and software services, on which enterprises can build and train models. As a non-large cloud service provider, will Ebrain encounter problems in deployment, such as computing power, data, interfaces, etc.?
Xia Fen: Our sales model is private deployment plus cloud SaaS services, with customized solutions for large customers. Everything uses standard interfaces, so there will be no problems.
Leifeng.com: If large cloud platforms also launch AutoML, how can Ebrain maintain its competitive advantage?
Xia Fen: We are quite confident in our accumulated technology and algorithms, and we can do private deployment.
We are not just doing machine learning, but machine learning automation plus productization; only with automation can machine learning be turned into a product. Automated machine learning has a high technical threshold. The difficulty lies in the "automation", which requires deep experience in both algorithms and practice.
The most difficult part of automated machine learning is the optimization problem. Given an objective function, we need to find a point that minimizes it. There are many research approaches and many ways to solve for that minimum. The difficulty in automated machine learning is that the objective function is not differentiable, the feedback mechanism is unclear, and the computational complexity is high, so trying every option is very costly. Turning the non-differentiable problem into a differentiable optimization problem requires approximation.
It is reported that artificial intelligence defeated chess masters in the 1980s through brute-force search, evaluating each move and selecting the one with the best score, but that does not work for Go. The complexity is too high for exhaustive search to work at all, so we have to solve an approximate problem: approximate the unsolvable problem with a solvable one, find an objective function that covers every solution with high probability, and at the same time reduce the complexity of solving it. We have developed many new algorithms in this area. (Neither humans nor machines can find the optimal solution, but machines cover a larger range with higher efficiency, so their results are better than those of humans.) In the past, chess required 200 million searches per move, but now, thanks to optimization, it needs only 30 million.
The biggest breakthrough in automated machine learning lies in algorithm design: you need to find a way to approximate problem A with problem B. For example, Google AutoML uses reinforcement learning. It works over a finite set of values, with a generation probability defined over that set. Suppose I have several candidates, any of which may be the optimal solution. I assign a probability distribution over these candidates, randomly draw a point according to that distribution, and try it out. The feedback from the trial changes the shape of the distribution. Eventually the distribution shifts so that the most likely optimal solution carries most of the probability.
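The sample-and-update loop described above can be sketched in a few lines. This is a simple gradient-bandit stand-in, not Google AutoML's method or Ebrain's algorithm: the candidate scores, the noise level, the learning rate and the function names are all hypothetical. It only shows how drawing a candidate from a probability distribution and feeding back its score gradually concentrates the probability on the best candidate, without needing a differentiable objective.

```python
import math
import random

def softmax(prefs):
    # Convert preference values into a probability distribution.
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def search(evaluate, n_candidates, steps, lr, rng):
    prefs = [0.0] * n_candidates   # start from a uniform distribution
    baseline = 0.0                 # running average of observed scores
    for t in range(1, steps + 1):
        probs = softmax(prefs)
        # Randomly draw one candidate according to the current distribution.
        choice = rng.choices(range(n_candidates), weights=probs)[0]
        reward = evaluate(choice)  # black-box feedback from trying the candidate
        baseline += (reward - baseline) / t
        advantage = reward - baseline
        # Shift probability toward the drawn candidate if it beat the baseline,
        # and away from it otherwise.
        for i in range(n_candidates):
            indicator = 1.0 if i == choice else 0.0
            prefs[i] += lr * advantage * (indicator - probs[i])
    return softmax(prefs)

rng = random.Random(0)
true_scores = [0.2, 0.5, 0.9, 0.4]  # hypothetical quality of four candidates

def noisy_score(i):
    # Each trial returns the candidate's score plus a little evaluation noise.
    return true_scores[i] + rng.gauss(0.0, 0.05)

final_probs = search(noisy_score, len(true_scores), steps=2000, lr=0.5, rng=rng)
best = max(range(len(final_probs)), key=final_probs.__getitem__)
print(best, [round(p, 3) for p in final_probs])
```

In a real AutoML setting each call to the evaluation function would mean training and validating a model, so the number of trials, not the update rule, dominates the cost.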
Leifeng.com: What is the company’s main work at present?
Xia Fen: Polishing the product.
From Technician to Entrepreneur: Managing a Machine Learning Company Using Machine Learning Methods
For Xia Fen, the transition from a technician to an entrepreneur is a huge change, and it also brings many new challenges. In his opinion, there is a big difference between being an academic and being an entrepreneur, and the issues involved are much more complicated:
First, when doing academic research, you may just focus on one problem to study, but when running a business, there are many problems to be solved, and each problem requires different abilities and skills.
Second, in the past, you only needed to take care of yourself to solve problems, but as an entrepreneur, it is different. There are many people behind you and you need to be responsible for them. "It used to be very simple. I just did one thing as a scientist. Now I have to deal with these people as well."
Third, he used to need to learn only one thing, but now he has to learn many. "I have also observed some companies that are doing well. In fact, I have been learning constantly, from starting the business through to operating the company."
In the course of managing the company, Xia Fen worked out a business management method modeled on machine learning, with three parts: input, output, and the middle. For a company, the input is money and manpower; after the middle steps, the output should be as close to the target as possible. The middle is the complicated part.
How should people be managed? How should money be spent? How should customers be retained? What should the pace of development be? The middle is the parameter tuning process, and it is the same as in AutoML. Where is the difficulty? In conventional machine learning, the input is easy to identify, and there is a residual between the training result and the final target, which is fed back to adjust the parameters. One problem with AutoML is that this residual is not readily available, so you need to define the residual yourself and then fit it.
The same is true for running a business. After establishing a mission, what should we do in the next stage? We need to set a sub-goal, and this goal must be quantified. After achieving the sub-goal, we can move forward according to the goal and then turn it into a new goal.
However, in the process of adjusting to the role of entrepreneur, difficulties are inevitable. "I think behind every entrepreneurial venture is a very sad process. Even if you see that the entrepreneur is very successful, he may secretly wipe away tears behind the scenes many times." Xia Fen told Leifeng.com.
Zhiyuan Technology now has more than a dozen employees and will soon have more than 20, over half of them technical staff. Xia Fen said the company has also run into the AI talent shortage, and his solution is not only to recruit but also to train talent in-house. Xia Fen previously worked as a teacher and trained many AutoML specialists while at Baidu.
In addition to talent, there are many other challenges, "for example, exploring the direction and negotiating with customers. We had never encountered these before and found them difficult along the way. But fortunately we overcame them step by step," said Xia Fen.
But this is also a process of growth. Xia Fen said that entrepreneurship is a process of tempering. When you have tempered yourself to a certain extent, your mentality will become stronger and stronger, and you can also see your own growth in this process.
And I am more and more certain that our company will succeed. Why? Because we really create value for society. Many companies have reduced costs and increased profits because of us. The rest is how we make things happen.