Federated learning offers a new learning paradigm with broad applications: data never leaves the local device, yet models still enjoy the benefits of training on big data.
It’s only been two years since it was proposed.
Text | Jia Wei
Leifeng.com AI Technology Review: Recently, Blaise Agüera y Arcas, one of the originators of the concept of federated learning, gave an online workshop on federated learning from South Korea to a worldwide audience.
Blaise Agüera y Arcas joined Google in 2014. Before that, he was a Distinguished Engineer at Microsoft. At Google, Blaise has led the on-device machine intelligence project and has also been responsible for basic research and new product development.
The concept of federated learning was first proposed by Blaise and his colleagues in a post published on the Google AI Blog in 2017. Although only two years have passed since then, research on it has become very active, with at least one related paper appearing almost every day. At the end of 2018, federated learning even became the subject of an IEEE international standards project, promoted by Professor Qiang Yang of HKUST and others.
The main reason federated learning has turned from an idea into a discipline in such a short time is that, as a learning paradigm, it can solve the "data island" (data silo) problem while protecting user data privacy.
However, unlike the focus in China on federated learning across enterprises to bridge "data islands", Blaise and his colleagues (perhaps also representing Google to some extent) are more concerned with federated learning on devices, which was also the application scenario for which the concept was first proposed.
Blaise began researching federated learning shortly after joining Google five years ago. It was not until 2017, when the team had achieved some results, that they published them in a blog post.
At first, federated learning was just a concept, but it quickly developed into a discipline within artificial intelligence: thousands of articles already discuss it, and NeurIPS, the top machine learning conference held in Vancouver this December, will host a dedicated session on federated learning. Many companies, meanwhile, are building their models on it. All of this shows that the entire AI community has begun to take the technology seriously.
So why has federated learning been taken seriously by the entire community so quickly?
As you all know, artificial intelligence has now reached a point where we hope to do more work with less data. This is one of the core topics of current artificial intelligence.
Neural networks can perform a great deal of cognitive work, such as language processing, speech synthesis, image recognition, and even playing Go, at or beyond human level. This is what we have achieved in the past few years. Compared with humans, however, current neural networks still lack one thing: learning efficiency. They need large amounts of data for training. Therefore, when large companies such as Google, Microsoft, and Amazon began to provide artificial intelligence services, they had to collect a great deal of data to train large neural networks. This is what the entire community has been doing.
For intelligent applications on the device side (such as mobile phones), the usual model works like this: data generated by the user on the device is uploaded to the server; a neural network model deployed on the server is trained on the large amount of collected data; and the service provider then serves users with the resulting model. As the data on users' devices is continuously updated and uploaded, the server updates the model accordingly. This is clearly a centralized approach to model training.
However, this approach has several problems: 1) the user's data privacy cannot be guaranteed, since all data generated while using the device is collected by the service provider; 2) it is hard to overcome the lag caused by network delays, which is especially noticeable in services that require real-time responses (such as input methods).
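The centralized flow described above can be sketched as follows. This is a minimal illustration, not Google's actual pipeline: the synthetic data and the logistic-regression model are stand-ins for real user data and a real neural network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Raw data generated on three user devices (features X, labels y).
device_data = [
    (rng.normal(size=(20, 4)), rng.integers(0, 2, size=20))
    for _ in range(3)
]

# Step 1: every device uploads its raw data; the server pools it.
X = np.concatenate([x for x, _ in device_data])
y = np.concatenate([t for _, t in device_data])

# Step 2: the server trains one shared model on the pooled data.
w = np.zeros(X.shape[1])
for _ in range(100):
    p = 1 / (1 + np.exp(-X @ w))        # model predictions
    w -= 0.1 * X.T @ (p - y) / len(y)   # gradient descent step

# Step 3: the trained weights are served back to every device.
```

Note that in step 1 the server sees every user's raw data, which is exactly the privacy problem described above.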
Blaise and others wondered whether they could create a large-scale distributed neural network model training framework so that users could get the same service experience while keeping their data local (training on their own devices).
The solution is: upload weights, not data.
We know that neural network models are made up of connections between neurons in different layers. The connections between layers are realized through weights. These weights determine what the neural network can do: some weights are used to distinguish between cats and dogs; another group can distinguish between tables and chairs. Everything from visual recognition to audio processing is determined by weights. Training a neural network model is essentially training these weights.
Take the input method, a typical intelligent-recommendation application, as an example. When people use Gboard, the Google keyboard, to message family and friends, the traditional approach uploads your typing data to Google's servers, where a large amount of collected data is used to train recommendations that better fit user habits. With federated learning, however, the keyboard data always stays on the device. A continuously updated model on the user's phone learns from this data, and only the updated weights, in encrypted form, are uploaded to the server. After receiving models from a large number of users, the server trains on them comprehensively and feeds the result back to users for the next round of model updates.
It is worth emphasizing that the on-device model is compressed, unlike the large neural network model on the server, so the energy consumed by model training is very small, almost undetectable. Blaise also offered a vivid metaphor: just as people update their brain's cognitive system by dreaming while they sleep, the device can train and update the model while it is idle. Overall, then, this has no impact on the user experience.
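The device-side step just described, learning locally and uploading only a small weight update, can be sketched like this. The `local_update` function and the logistic-regression model are hypothetical stand-ins, not Gboard's actual architecture, and encryption of the update is omitted.

```python
import numpy as np

def local_update(global_w, X, y, lr=0.1, epochs=5):
    """Train the downloaded model on data that never leaves the
    device, and return only the (small) weight delta, not the data."""
    w = global_w.copy()
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ w))       # predictions on local data
        w -= lr * X.T @ (p - y) / len(y)   # gradient step on local data
    return w - global_w                    # the update sent to the server

rng = np.random.default_rng(1)
global_w = np.zeros(4)                     # model downloaded from server
X_local = rng.normal(size=(30, 4))         # typing data: stays on the phone
y_local = rng.integers(0, 2, size=30)

delta = local_update(global_w, X_local, y_local)
```

Only `delta`, a vector the size of the model's weights, leaves the device; the 30 local examples never do.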
Let’s summarize the process of on-device federated learning: 1) The device downloads the current version of the model; 2) Improve the model by learning from local data; 3) Summarize the improvements to the model into a relatively small update; 4) The update is encrypted and sent to the cloud; 5) Instantly integrate with other users’ updates as an improvement to the shared model.
The whole process has three key links:
1) Each mobile phone makes personalized improvements to the model locally based on user usage;
2) Forms an overall model modification plan;
3) Apply it to the shared model. This process will continue to cycle.
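This cycle is essentially the Federated Averaging (FedAvg) scheme: in each round the server combines the clients' locally improved models, here by simple averaging, into the shared model. A minimal sketch, with synthetic data and a logistic-regression model standing in for real users and the real network, and with encryption and update compression omitted:

```python
import numpy as np

def client_update(w, X, y, lr=0.1, epochs=5):
    """Steps 1-2: download the current model, improve it locally."""
    w = w.copy()
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

rng = np.random.default_rng(2)
# Five clients, each with its own private local dataset.
clients = [(rng.normal(size=(25, 4)), rng.integers(0, 2, size=25))
           for _ in range(5)]

global_w = np.zeros(4)
for _ in range(10):  # steps 3-5, repeated in a continuing cycle
    local_models = [client_update(global_w, X, y) for X, y in clients]
    global_w = np.mean(local_models, axis=0)  # merge into the shared model
```

In practice the server would weight each client's contribution by its amount of data and aggregate the encrypted updates securely; plain averaging keeps the sketch short.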
The advantages are obvious.
First, we don’t have to upload data to the cloud, so service providers can’t see the user’s data, which can improve the privacy of user data. Therefore, in this way, we don’t have to make a trade-off between privacy and functionality, but can have both. This is particularly important in the current situation where data privacy is increasingly valued.
Second, it reduces latency. Although the 5G era is approaching, network speed is not guaranteed in every location under all circumstances. If all user data were uploaded to the cloud and the service itself were fed back from the cloud, then in an environment with slow connectivity, network latency would greatly degrade the user experience. This does not happen with services backed by federated learning, because the service itself runs locally on the device.
Of course, perhaps another benefit is that under the traditional method, users are just spectators of artificial intelligence - I use it, but I don't participate. In the federated learning scenario, everyone is a "dragon tamer" and everyone is a participant in the development of artificial intelligence.
In fact, the idea of federated learning applies beyond protecting and updating device users' data. The device user can be abstracted as any data owner: a phone holder, a company, a hospital, or a bank, while the server or cloud is regarded as a comprehensive model-sharing platform.
Therefore, federated learning is a new learning paradigm with the following characteristics:
Under the framework of federated learning, all participants have equal status and can achieve fair cooperation;
Data is retained locally to avoid data leakage and meet user privacy protection and data security requirements;
All parties can exchange information and model parameters in encrypted form while remaining independent and growing together;
The modeling effect is not much different from that of traditional deep learning algorithms;
Federated learning is a “closed-loop” learning mechanism, and the model effect depends on the contribution of data providers.
These characteristics address exactly the dilemmas currently facing the development of artificial intelligence.
Currently, most application fields have the problem of limited data and poor quality. In some highly professional sub-fields (such as medical diagnosis), it is even more difficult to obtain labeled data sufficient to support the implementation of artificial intelligence technology.
At the same time, there are insurmountable barriers between different data sources. Except for a few "giant" companies with massive users and product and service advantages, most companies find it difficult to cross the data gap in the implementation of artificial intelligence in a reasonable and legal way, or they need to pay huge costs to solve this problem.
In addition, with the development of big data, paying attention to data privacy and security has become a global trend. The introduction of a series of regulations such as the EU General Data Protection Regulation (GDPR) has further increased the difficulty of data acquisition, which has also brought unprecedented challenges to the implementation of artificial intelligence.