Is privacy computing a new key to open the door to AI data circulation?
AI applications are facing a development bottleneck due to data fragmentation.
Editor | Wu Wenliang
Not long ago, Dr. Qi Lu, founder and CEO of Qi Ji Chuang Tan, was asked in a Q&A column about his current views on AI. He said, "My enthusiasm for and attention to AI come mainly from the prospects it can bring to our society." In Lu's view, the core of AI is a "general ability to acquire knowledge and use knowledge to achieve goals." This is the most broadly applicable and powerful general ability humans have invented so far, because knowledge is power, a power that can be applied to anything we want to do.
Over the past few years, driven by data, China's AI has moved out of the laboratory and into fields such as finance and security. There is even a saying in the industry that "those who get data get artificial intelligence." In 2020, the State Council listed data as the fifth factor of production, after land, labor, capital, and technology, to encourage data to circulate and realize its value. Unexpectedly, though, AI companies have had fewer channels for obtaining data in the short term. On the one hand, now that data is a factor of production, individuals and governments are more aware of its value and of the need to protect it, and companies are more selective about which types of data they open up and how they share them. On the other hand, because data is one of the factors driving business growth, enterprises keep tight control over it, both because of legal restrictions and to protect their own interests.
The further development of AI companies requires more data, but data cannot flow as smoothly as in the past. AI applications are facing a development bottleneck caused by data fragmentation.
On how to balance AI development and privacy protection, Professor Zhang Bo of the Department of Computer Science at Tsinghua University offered two lines of thinking: first, how to prevent private data from being misused and abused; second, how to use technology to protect the privacy of individuals or groups, including data security.
The former is a governance issue of artificial intelligence, while the latter is a technical issue.
At the same time, some have found that privacy-secure computing, which makes data "available but invisible", may help AI companies get out of the data dilemma and open the door to data circulation.
1
What is privacy-secure computing?
Privacy-secure computing is a collection of technologies that ensure data providers do not leak data to outside parties while the data is being processed, analyzed, and computed on, and that unauthorized parties cannot maliciously attack or obtain the data. It thereby enables data to be circulated and used safely.
A classic question: two millionaires meet on the street. Each wants to know which of them is richer, but neither wants to reveal his wealth to the other. How can they find out who is richer without the help of a third party?
This is the "millionaires' problem" proposed in 1982 by Yao Qizhi (Andrew Chi-Chih Yao), winner of the 2000 Turing Award. The problem and Yao's solution opened up a major direction in cryptography and promoted the development and application of privacy-secure computing technology.
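To make the puzzle concrete, here is a minimal Python sketch in the spirit of Yao's original public-key solution. It simulates both millionaires inside one function, with wealth limited to whole numbers from 1 to 10. The toy RSA key, the list of masking primes, and all function and variable names are illustrative assumptions, not part of the article, and the parameters are far too weak for real use.

```python
import random

# Toy RSA key pair for Alice, built from two Mersenne primes.
# These parameters are illustrative assumptions and far too weak for real use.
P, Q = 2**31 - 1, 2**61 - 1
MODULUS = P * Q
PUBLIC_EXP = 65537
PRIVATE_EXP = pow(PUBLIC_EXP, -1, (P - 1) * (Q - 1))  # needs Python 3.8+

# Small primes Alice may use to shrink her decrypted values.
MASK_PRIMES = [524287, 131071, 8191]


def alice_at_least_as_rich(alice_wealth, bob_wealth, max_wealth=10, seed=None):
    """Return True if alice_wealth >= bob_wealth (both in 1..max_wealth)."""
    rng = random.Random(seed)

    # Bob: encrypt a random x with Alice's public key, then hide his wealth
    # in the offset of the number he sends.
    x = rng.randrange(2, MODULUS)
    k = pow(x, PUBLIC_EXP, MODULUS)
    to_alice = (k - bob_wealth + 1) % MODULUS

    # Alice: decrypt every candidate offset. Only the one at Bob's wealth
    # decrypts to x, but Alice cannot tell which one that is.
    ys = [pow((to_alice + u - 1) % MODULUS, PRIVATE_EXP, MODULUS)
          for u in range(1, max_wealth + 1)]

    # Alice: reduce modulo a small prime, requiring all residues to differ
    # by at least 2 so the +1 shift below cannot be detected.
    for p in MASK_PRIMES:
        zs = [y % p for y in ys]
        if all((a - b) % p not in (0, 1, p - 1)
               for i, a in enumerate(zs) for b in zs[i + 1:]):
            break
    else:
        raise RuntimeError("no suitable prime; retry with a fresh x")

    # Alice: leave the first alice_wealth residues alone and add 1 to the rest.
    to_bob = [z if u <= alice_wealth else (z + 1) % p
              for u, z in enumerate(zs, start=1)]

    # Bob: his own position is unshifted exactly when alice_wealth >= bob_wealth.
    return to_bob[bob_wealth - 1] == x % p
```

Calling `alice_at_least_as_rich(7, 3)` answers the comparison while the only values exchanged are one shifted ciphertext and a list of masked residues. A real deployment would need proper key sizes and protection against a party that deviates from the protocol.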
In the past two years, privacy-secure computing has become a new industry that investors are optimistic about. According to statistics from the Economist Intelligence Unit, 71 new companies were established in China's privacy-secure computing industry in 2020, a year-on-year increase of 33.96%.
In the 12 months since May 2021, eight companies in this sector have completed eight financing rounds totaling more than 1 billion yuan, an average of over 100 million yuan per round.
It is worth noting that most companies’ financing events occurred between 2020 and 2021, which also reflects that more and more investors have discovered the value of privacy-secure computing.
The rapid development of privacy-secure computing owes much to advances in algorithms and substantial improvements in computer performance, and it is also tied to policy.
In the past decade or so, privacy computing algorithms have made great progress, including breakthroughs in differential privacy, federated learning, homomorphic encryption, and zero-knowledge proofs. Thanks to the development of computer systems and hardware, the heavy demands that privacy computing places on computing power and communication bandwidth can now be met far better than before. Privacy-secure computing can finally begin to solve practical tasks, not just purely theoretical problems in computer science.
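Of the techniques listed above, differential privacy is the simplest to show in a few lines. The sketch below uses the standard Laplace mechanism on a counting query: the true count is computed privately and only a noisy version is released. The function name, the `epsilon` value, and the sample ages are hypothetical and chosen purely for illustration.

```python
import random


def private_count(records, predicate, epsilon, rng=random):
    """Count matching records, then add Laplace noise with scale 1/epsilon."""
    true_count = sum(1 for r in records if predicate(r))
    # A counting query has sensitivity 1: adding or removing one person
    # changes the answer by at most 1, so the noise scale is 1 / epsilon.
    # The difference of two exponential samples with rate epsilon is
    # Laplace-distributed with scale 1 / epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


# Hypothetical patient ages; the analyst only ever sees the noisy count.
ages = [34, 71, 65, 29, 80, 52, 68, 45]
noisy = private_count(ages, lambda age: age >= 65, epsilon=0.5)
```

A smaller `epsilon` adds more noise and gives stronger privacy, while a larger one gives a more accurate answer. Choosing that trade-off is a policy decision as much as a technical one.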
In terms of policy, with the successive implementation of the Cybersecurity Law, the Data Security Law, and the Personal Information Protection Law, companies have had to pay attention to and strengthen data protection at every stage of data collection, processing, use, and circulation, and the privacy-secure computing industry has benefited as a result.
The "14th Five-Year Plan for Digital Economy Development" issued by the State Council in January this year clearly stated: "Key industries are encouraged to innovate data development and utilization models, and on the premise of ensuring data security and protecting user privacy, industry associations, research institutes, enterprises and other parties are mobilized to participate in data value development."
The publication of this document may further accelerate the development and industry application of privacy-secure computing technology.
In the past few years, privacy-secure computing has been continuously extended from the medical industry to different fields such as finance and government affairs, and the entire industry has become increasingly lively.
Ma Lan, vice president of Boiling Point Capital, shared with Leifeng.com her observations on the application of privacy-secure computing in recent years from an investor's perspective.
Ma Lan noted that in 2018, many financial institutions put compliance first, so many companies that used regulation as an entry point grew. After the government officially proposed treating data as a factor of production in 2019, data security was elevated to a position of equal importance to compliance.
However, people discovered that holding data as an asset while also trading it raises a huge data security issue, and some introduced privacy-secure computing to help solve this problem.
With greater market demand emerging, existing companies in the privacy-secure computing industry stepped up their efforts in 2020, new startups appeared, and capital followed. As a result, from 2020 to 2021, privacy-secure computing entrepreneurs found one new application scenario after another and even began to generate some revenue.
In Ma Lan's view, although the privacy-secure computing industry is in a state of dynamic change, it is generally moving towards a positive and more secure state.
2
The collision of AI and privacy-secure computing
Yang Qiang, executive director of the Association for the Advancement of Artificial Intelligence (AAAI), once told Leifeng.com that since 2019 he has clearly felt that problems such as AI being difficult to implement, application models not generalizing, and AI products not being general-purpose enough have become more frequent.
In recent years, many countries have come to treat data as a core asset. When data cannot be shared, data islands form, which further hinders the implementation of AI. He believes that data barriers exist in all walks of life, and that only by breaking through them and increasing the liquidity of data can the AI ecosystem develop better.
Under the requirements of laws and policies, leading technology companies can obtain large amounts of data from multiple channels because of their mature products and large number of users. However, small and medium-sized enterprises do not have such conditions and find it difficult to break through the data bottleneck.
Privacy-secure computing is one way to break through industry data barriers. It ensures the security of data during cooperation, so data naturally flows more smoothly.
At present, many entities that hold large amounts of data must keep it strictly confidential and cannot find a suitable way to process it, so the data sits idle and its value goes unrealized.
For example, a local government has detailed data on local residents and hopes to build an intelligent infectious disease prevention and control system to contain an epidemic. Without technical support, however, it is difficult for the government to build the system on its own, and if an outside company selected through bidding helps, residents' personal data may leak. To avoid that risk, the government leaves the data unused, and the data cannot play its due role.
If a third party that provides privacy-secure computing services is introduced, data no longer circulates directly between the two parties. The data owner is less likely to use its data to dominate the cooperation, and data circulation becomes relatively safer.
Specifically, the privacy-secure computing company provides a platform, and data providers authorize their data and import it into the platform for model evaluation and optimization. After completion, only the value of the data and the calculation results are output to the data demander. Throughout the process, the original data never leaves the privacy-secure computing platform and is only authorized for use within it.
In a cooperation between two parties, a privacy-secure computing company can prevent data leakage. But how can one ensure that the privacy-secure computing company itself will not leak or abuse the data?
Zhang Lintao, chief scientist of the privacy-secure computing company Yifang Jianshu, said that privacy-secure computing is still a technology in its early stages of development and there is still much room for improvement in all aspects. However, the industry has already taken a number of measures to protect the privacy of data information.
Taking Yifang Jianshu as an example: first, the data used for training and optimization on its platform is encrypted, and the keys are held by the data owner, so Yifang Jianshu cannot obtain the data; second, Yifang Jianshu's three mainstream secure computing methods (multi-party secure computing, federated learning, and trusted execution environments) have all passed certification by the China Academy of Information and Communications Technology, an official endorsement of its data security.
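As a rough illustration of how multi-party secure computing can keep a platform or partner from seeing raw inputs, the sketch below uses additive secret sharing to compute a joint sum. It is a generic textbook construction, not Yifang Jianshu's implementation. It simulates all parties in one process and assumes they follow the protocol and do not collude. The names `FIELD`, `split_into_shares`, and `secure_sum` are hypothetical.

```python
import secrets

FIELD = 2**61 - 1  # all arithmetic is done modulo this prime


def split_into_shares(value, n_parties):
    """Split value into n random-looking shares that sum to it modulo FIELD."""
    shares = [secrets.randbelow(FIELD) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % FIELD)
    return shares


def secure_sum(private_values):
    """Simulate parties jointly summing their inputs without revealing them."""
    n = len(private_values)
    # Each party splits its input and sends the i-th share to party i.
    all_shares = [split_into_shares(v, n) for v in private_values]
    # Each party adds up the shares it received and publishes only that subtotal.
    subtotals = [sum(all_shares[sender][receiver] for sender in range(n)) % FIELD
                 for receiver in range(n)]
    # The published subtotals reveal the total but not any single input.
    return sum(subtotals) % FIELD
```

For example, three hospitals could call `secure_sum([120, 45, 300])` to learn their combined case count. Each share taken alone is a uniformly random number, so no single recipient learns another party's input from it.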
Having seen the value of privacy-secure computing, many companies, including Alibaba, WeBank, Ant Group, and Ping An Technology, have actively invested in it and promoted its application. According to survey data from the China Academy of Information and Communications Technology, about 44% of privacy-secure computing products had entered the implementation stage in 2021, a further increase, while the share of products still in research and development fell to 19%.
In the foreseeable future, privacy-secure computing may be deeply integrated with AI to help AI companies develop faster.
3
Yifang Jianshu's AI-based approach to the problem
As Zhang Lintao said, there are still many problems waiting to be solved in privacy-secure computing technology.
First, privacy-secure computing faces the problem of ecosystem barriers.
The technologies of different companies in the privacy-secure computing industry are not interoperable. A data model produced on one platform cannot be reused on another company's platform, which creates new "data islands".
Second, neither the willingness to trade data nor the market for doing so is mature yet, which leads many companies to regard privacy-secure computing as a cost item for security and compliance. Only by deeply combining privacy-secure computing with concrete scenarios, so that business parties benefit from it, can the cost item be turned into a revenue item and business parties be motivated to keep participating.
In fact, in the past many institutions have worked hard to promote national data transactions, but due to technical limitations, the results have been less than ideal.
If combined with privacy-secure computing, data transactions may become more efficient.
Leifeng.com learned that Yifang Jianshu is planning to launch an "AI Taobao" based on privacy-secure computing. Its Chief Marketing Officer, Liu Shuo, explained that the platform can connect AI demanders with AI suppliers and data demanders with data suppliers, so that companies with different capabilities along the AI industry chain can play to their respective strengths and meet different needs.
Specifically, the platform integrates mainstream domestic AI tools, and AI companies and data participants within the platform can access AI capabilities. The biggest difference from other platforms is that the platform protects all data and AI models of the data source.
"Yifang Jianshu is a data intelligence company with zero data. It does not own data, but only provides tools to manage data. It also allows customers to process and handle data with authorization to obtain data value," Zhang Lintao introduced to Leifeng.com.
Yifang Jianshu's plan stems from its many years of deep involvement in the industry and its long-term observation of how businesses in different fields develop.
Since its establishment in 2016, Yifang Jianshu has been engaged in the research, development, and application of privacy-secure computing. Its business has since expanded from medical care to government affairs, finance, marketing, science, and other fields. In past cases, Yifang Jianshu has used privacy-secure computing technology to solve practical problems in different scenarios:
Using privacy-secure computing technology, Yifang Jianshu helps companies that hold "drug-cell-gene" databases, such as Gewu Zhihe, establish supply-and-demand cooperation with AI pharmaceutical companies and biomedical R&D technology companies, such as Suikun Intelligence. It helps data owners separate the right to use data from its ownership, so that they can authorize external use with confidence. For bidding scenarios, Yifang Jianshu has built an AI verification platform that protects both the data of the party inviting bids and the models of the bidding AI companies. Beyond helping AI demanders select among bids, the platform can also be used in technical competitions to achieve true "technical scoring" for AI.
Due to the complexity of implementation and delivery, huge computing volume, low customer acceptance, and the need for confidentiality at all stages, privacy-secure computing technology is only beginning to be applied to more and more fields. As time goes by and technology advances, there are still many scenarios where privacy-secure computing has the opportunity to flourish.
Take the automotive industry as an example. In the intelligent connected car industry that has emerged in recent years, many autonomous driving companies provide assisted-driving capabilities to carmakers, for example Baidu Apollo to BYD and Momenta to SAIC. While the automotive industry takes on a new look, some people worry that autonomous driving companies serving carmakers may collect large amounts of user and road data through mass-produced cars, creating a risk of privacy leakage. If privacy-secure computing is introduced into the cooperation between the two parties, it may prevent autonomous driving companies from obtaining sensitive user information.
4
Summary
Professor Song Xiaodong, known as the "godmother of computer security", once publicly stated that all computing in the future will be privacy computing.
With the acceleration of digital transformation and upgrading in various industries, the driving role of data in industry development will become increasingly obvious, and at the same time, the flow of data will be subject to more restrictions.
Currently, many companies have proposed different technical routes to improve the security and compliance of privacy-secure computing. As privacy-secure computing is gradually applied to more scenarios and its current shortcomings are addressed, it may usher in a brighter future.