Decoding Tencent’s data migration method based on machine learning-EEWORLD

In application scenarios such as intelligent customer service, the service can analyze user characteristics, such as interests, habits, and language patterns, from the conversation records already stored on the terminal side.

However, the volume of conversation records accumulated on the terminal side is very small, too small to train the intelligent customer service, so it cannot interact with users in a way that suits their characteristics. This gave rise to migration technology, which migrates suitable data from the server side to the terminal side so that the model can be trained on the terminal.

In the existing approach, the terminal-side data of all terminals is anonymously aggregated on the server, and models are trained on the server side to obtain multiple general models. A model that meets a terminal's needs is then selected from these general models by manual matching, and the corresponding data is migrated to the terminal based on that model. This is meant to solve the problem that the terminal has too little data to train the intelligent customer service.

However, because the server-side models are trained on data accumulated from a large number of terminals, a general model cannot be fully adapted to any specific terminal, so the data migrated to the terminal through it is inaccurate. In addition, manually determining which model matches a terminal is labor-intensive and inefficient.

Therefore, on July 15, 2019, Tencent applied for a patent entitled "Migration data determination method, device, equipment and medium based on machine learning" (application number: 201910637116.9), and the applicant was Tencent Technology (Shenzhen) Co., Ltd.

Based on the information currently disclosed in the patent, let us take a look at this method for determining migration data.

The above figure is a structural block diagram of a data migration system based on machine learning. This data migration system includes a terminal 110 and a data migration platform 140. The terminal is connected to the data migration platform via a wireless network or a wired network. The terminal installs and runs an application that supports data migration.

The data migration platform can consist of a single server, multiple servers, a cloud computing platform, or a virtualization center. It mainly provides background services for applications that support data migration. The platform and the terminal can each undertake data processing independently, or they can share the work cooperatively for greater efficiency.

The invention mainly concerns migration data, that is, data provided as a service to terminal users. Taking the intelligent customer service scenario as an example, a migration model is determined on the cloud server, and data matching the terminal is migrated to the terminal based on that model. The terminal-side data and the migrated data are then combined to train the terminal's intelligent customer service through machine learning.

This enables customized intelligent customer service for each user based on their interests, habits, and language patterns. For example, when a user starts a conversation with the intelligent customer service on the terminal, it responds with content the user is likely to be interested in, in a manner consistent with the user's habits and language patterns.

The above figure is a schematic diagram of migrating data to the (N+1)th terminal based on the migration process data of the first N terminals. The data to be migrated is stored on the cloud server side, where the data each terminal needs is selected and then migrated to that terminal. The cloud server also analyzes the migration process data of migrations that have already been completed and uses it to train a model on the cloud server, so that the trained model can select the appropriate data for a specific terminal.

First, the migration process data for data migrations to N terminals is obtained, where N is a positive integer, and the cloud server model is trained on these N sets of migration process data. The trained model is then applied when the server migrates data to the (N+1)th terminal.

After the migration process data for the server's migrations to multiple terminals is obtained, it is analyzed to determine a data migration performance index for each set of migration process data. The model to be trained on the server is then trained on these performance indices to obtain the migration model. In response to a terminal's migration request, the migration model determines, from the general data on the server side, which data should be migrated to the target terminal, and that data is then migrated.

Next, let’s take a closer look at the flowchart of the method.

The figure above is a flowchart of the machine-learning-based method for determining migration data. First, the computer device obtains the migration process data for the server's data migrations to multiple terminals. For each terminal, the migration process data includes the personalized data that was migrated and that terminal's terminal-side data.

For example, in an image recognition scenario where knowledge learned from cat images is transferred to dog images, the data with common features between the two are the image data of the eyes, the image data of the nose, and so on.

Second, the computer device determines the similarity between the personalized data migrated to each terminal and that terminal's terminal-side data. In an ideal migration, the migrated personalized data shares common features with the terminal-side data, so similarity is used to judge the migration performance of the migration process data: the higher the similarity between the migrated personalized data and the terminal-side data, the better the migration performance of the corresponding migration process data. These data can then serve as the training basis for training and optimizing the migration model on the cloud server side.
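The similarity step can be illustrated with a small sketch. The article does not say which similarity measure the patent uses, so cosine similarity over feature vectors is an assumption here, and the function names `cosine_similarity` and `migration_similarity` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector shares no features with anything
    return dot / (norm_a * norm_b)


def migration_similarity(migrated_items, terminal_items):
    """Average, over migrated items, of each item's best match on the terminal.

    Assumed scoring rule, not taken from the patent: a migrated item counts
    as useful if at least one terminal-side item resembles it.
    """
    if not migrated_items or not terminal_items:
        return 0.0
    best = [
        max(cosine_similarity(m, t) for t in terminal_items)
        for m in migrated_items
    ]
    return sum(best) / len(best)
```

Under this rule, a migration in which only half of the migrated items resemble anything on the terminal scores about 0.5, which marks that migration record as a weaker training example than one scoring close to 1.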

Next, the computer device determines the data migration performance indicator of each terminal based on the similarity, and trains the model to be trained on these indicators to obtain the migration model. Finally, in response to a terminal's migration request, the computer device determines the data to be migrated to that terminal based on the migration model.

The trained migration model contains multiple trained neural network layers. The terminal's terminal-side data is input into the migration model and analyzed by these layers to extract the terminal's characteristic data, which is then matched against the general data on the cloud server side. This determines the data to migrate and ensures that it is the data the terminal needs, so that a customized model can be generated for the user.
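The matching step can be sketched as a top-k selection over the server's general data. In this sketch the trained neural network is replaced by a pluggable scoring function, with a plain dot product as a stand-in; `select_migration_data`, `dot_score`, and the example item names are hypothetical and not taken from the patent.

```python
import heapq


def dot_score(profile, features):
    """Stand-in for the trained model: dot product of two feature vectors."""
    return sum(p * f for p, f in zip(profile, features))


def select_migration_data(terminal_profile, general_data, score_fn=dot_score, k=2):
    """Return the ids of the k general-data items that score highest.

    terminal_profile: feature vector summarizing the terminal-side data.
    general_data: mapping from item id to that item's feature vector.
    """
    scored = (
        (score_fn(terminal_profile, features), item_id)
        for item_id, features in general_data.items()
    )
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


# Hypothetical server-side data sets and a terminal leaning toward gaming.
general_data = {
    "sports_faq": [1.0, 0.0, 0.0],
    "billing_faq": [0.0, 1.0, 0.0],
    "gaming_faq": [0.9, 0.0, 0.4],
}
selected = select_migration_data([1.0, 0.0, 0.5], general_data)
```

Passing the scorer as a parameter keeps the selection logic separate from the model, so a trained network could be swapped in without changing how the data to migrate is chosen.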

Finally, let's take a look at how the migration model is obtained here.

As shown in the figure above, first, the computer device compares the data migration performance indicators of each terminal with the corresponding migration parameters to be trained in the model to be trained, and determines the degree of difference between the data migration performance indicators of multiple terminals and the migration parameters to be trained.

Second, the computer device minimizes each degree of difference to obtain the corresponding migration configuration parameters, for example by training the server-side model on the N sets of migration process data obtained earlier.

Finally, the computer device assigns the migration configuration parameters to the corresponding parameters of the model to be trained, which yields the migration model. A migration model trained by machine learning can be regarded as storing the know-how of migration (transfer) learning, that is, what kind of knowledge should be migrated from the server for what kind of user terminal data.
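The three training steps (measure the difference, minimize it, write the result back as the model's parameters) can be illustrated with a minimal sketch. The article does not describe the patent's actual loss function or optimizer, so a linear model fitted by batch gradient descent on the mean squared difference is an assumption, and all names and numbers below are hypothetical.

```python
def mean_squared_difference(weights, features, indicators):
    """Degree of difference between predicted and observed indicators."""
    total = 0.0
    for x, y in zip(features, indicators):
        pred = sum(w * xi for w, xi in zip(weights, x))
        total += (pred - y) ** 2
    return total / len(features)


def train_migration_parameters(features, indicators, lr=0.1, epochs=500):
    """Minimize the mean squared difference by batch gradient descent.

    features: one feature vector per terminal's migration record.
    indicators: the data migration performance indicator of each record.
    Returns the fitted weights, playing the role of the migration
    configuration parameters.
    """
    n = len(features)
    dim = len(features[0])
    weights = [0.0] * dim  # parameters to be trained
    for _ in range(epochs):
        grad = [0.0] * dim
        for x, y in zip(features, indicators):
            err = sum(w * xi for w, xi in zip(weights, x)) - y
            for j in range(dim):
                grad[j] += 2.0 * err * x[j] / n
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


# Hypothetical records from N = 3 terminals.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
indicators = [0.8, 0.2, 1.0]
params = train_migration_parameters(features, indicators)
```

In this toy data set the indicators are an exact linear function of the features, so the difference can be driven close to zero; with real migration records some residual difference would remain.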

The above is Tencent's machine-learning-based method for determining migration data. Using the migration process data from the server's migrations to multiple terminals, the server-side model is trained through machine learning to obtain a migration model customized for the terminal. The data a terminal needs is then migrated to it efficiently based on this model, which improves the accuracy and efficiency of data migration and saves considerable manpower.
