Baidu Translate's 10 years: The number of languages translated has exceeded 200 for the first time, the quality has improved by 30 percentage points, and more than 100 billion characters are translated every day
Jin Lei from Aofei Temple
Quantum Bit Report | Public Account QbitAI
How much can a machine translation system change in ten years?
In 2011, it knew only one skill: Chinese-English translation. It then spent ten years honing its translation craft.
Today, this is what it can do:
It released the world's first Internet neural machine translation system and improved translation quality by 30 percentage points within 10 years (measured by BLEU, a widely used international evaluation metric); an improvement of even 1 percentage point is usually considered significant.
It became the first in the world to support translation among more than 200 languages, a 100-fold increase in the number of languages in 10 years.
It is no longer limited to text translation: it has also mastered cross-modal translation of images, video, and documents, and even simultaneous interpretation.
It is no longer limited to typing text into an input box: its products now include a translation app, an AI simultaneous interpretation conference edition, a simultaneous interpretation assistant, a mini program, and an open translation platform.
And it has become extremely busy. It translates more than 100 billion characters every day, equivalent to 2,000 copies of the Encyclopedia Britannica and 100,000 times the volume of 10 years ago.
Even Gartner made this comment:
It is a benchmark institution for neural network machine translation and the only shortlisted institution in China.
It is an important force in global AI translation services.
…
It, of course, is Baidu Translate.
But if you still think of it as just a translator, that view may be a bit one-sided.
Because today's Baidu Translate has changed in character.
What does Baidu Translate look like at 10 years old?
Ten years ago, Baidu Translate started out as a website that translated only between Chinese and English.
Today, it has taken translation to a new level.
First, in terms of language coverage, as mentioned earlier, Baidu Translate is the first system in the world that can translate among more than 200 languages.
The achievement lies not only in the number of languages but also in the difficulty of the translation.
For example, it even covers some "unpopular" languages. Take classical Chinese, the quintessence of Chinese literature, as an example, and input a passage from "Learning Chess":
Yiqiu is a chess expert in the country. Let Yiqiu teach two people how to play chess. One of them is concentrating on listening to Yiqiu's instructions. The other one, although listening, thinks that a swan is coming and wants to shoot it with a bow and arrow. Although he is learning with Yiqiu, he is not as good as him. Is it because his intelligence is not as good as his? Answer: It is not true.
Baidu Translate instantly renders the obscure ancient text in plain modern language.
However, this is not easy for machines: apart from the major languages, parallel resources between most language pairs are scarce, so there is not enough data for the AI to learn from.
But Baidu Translate is not satisfied with translating text broadly and precisely; it has also spent ten years working hard on convenience.
The Baidu Translate app was recently updated to version 10.0, which shows off its many ways of translating.
It is no longer a simple routine of typing in text and translating it: voice, pictures, video, documents, and other forms of input are now integrated as well.
In other words, translation no longer requires typing text.
Say a few words, take a photo, or even import a complete document, and the translation is done.
Beyond that, Baidu Translate can also easily handle demanding tasks such as simultaneous interpretation.
Baidu Translate even won first place in Chinese-English translation at WMT (Workshop on Machine Translation), the world's top machine translation competition.
Clearly, Baidu Translate has spent ten years not only expanding horizontally but also refining each product vertically. Its product family is now flourishing.
So how did Baidu Translate advance to this level in ten years?
The Evolution of Baidu Translate
Let’s first take a brief look at the development of machine translation.
The concept of "machine translation" was first proposed by Warren Weaver, an American scientist and pioneer of information theory, in 1947, one year after the birth of the first computer, ENIAC.
From then on, machine translation entered its first era, that of rule-based methods.
This approach essentially writes down experts' translation knowledge as rules, which software then applies to carry out the translation.
Its disadvantages are obvious, however: construction and maintenance costs are too high, and the entire program may need to be rewritten at any time.
In the late 1980s and early 1990s, IBM proposed another approach, statistical machine translation, which opened the door to the second era of machine translation.
Unlike rule-based machine translation, statistical machine translation no longer requires manually written translation rules; it switches to a data-driven machine learning approach.
Its biggest advantage is that the machine can learn by itself from manually defined features, whereas rule-based methods require hands-on guidance from human experts.
When Baidu Translate first launched, it relied mainly on statistical machine translation. It also developed a multi-strategy model that integrated existing methods to cope with the complex and diverse translation requests on the Internet.
In 2010, Baidu Translate established its own R&D team, and just one year later it launched the web version.
However, statistical machine translation had been developing for more than 20 years, and its bottlenecks were becoming increasingly obvious. After a series of technical iterations such as phrase-based and syntax-based methods, it gradually hit a ceiling, and translation quality became hard to improve further, especially in long-distance reordering and fluency.
Even when crossing the river by feeling for the stones, be the leader
In 2013, a research paper titled "Recurrent Continuous Translation Models" came out.
With the new methods researchers proposed, machine translation entered the era of neural machine translation (NMT).
Although the neural network approach was indeed an ideal alternative, the Baidu Translate team also faced very practical problems.
First, there was nothing to refer to: the modeling approach was completely new, and there was no experience to follow.
Second, at the technology level of the time, machine translation with neural network models was still very resource-intensive.
Better translations came at the cost of large amounts of computing resources, and translating a single sentence often took more than ten seconds.
Fast forward to 2015. Even against this backdrop, the Baidu Translate team made a bold, pioneering decision:
to launch machine translation based on neural networks.
Technically, the team incorporated features from the previous generation, statistical machine translation, to address the shortcomings of NMT.
Specifically, it integrated an n-gram language model, phrase table features, length features, and so on into the NMT model.
Experimental results show that this "combination of the new and the old" method significantly improves the performance of NMT in Chinese-English translation.
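One simple way to picture this kind of combination is a weighted sum of scores over candidate translations. The sketch below is only an illustration under stated assumptions, not Baidu's actual method: the article says the features were integrated into the NMT model itself, whereas this toy reranks finished candidates. The weights, the toy bigram probabilities, the phrase table, and all function names are hypothetical.

```python
import math

# Hypothetical feature weights; a real system would tune them on held-out data.
WEIGHTS = {"nmt": 1.0, "lm": 0.3, "phrase": 0.5, "length": 0.2}


def bigram_logprob(tokens, bigram_probs, floor=1e-4):
    """Log probability under a bigram model; unseen bigrams get `floor`."""
    padded = ["<s>"] + list(tokens)
    return sum(math.log(bigram_probs.get(pair, floor))
               for pair in zip(padded, padded[1:]))


def phrase_matches(source_tokens, target_tokens, phrase_table):
    """Number of source words whose phrase-table translation is in the target."""
    target = set(target_tokens)
    return sum(1 for word in source_tokens if phrase_table.get(word) in target)


def combined_score(source_tokens, candidate, bigram_probs, phrase_table,
                   weights=WEIGHTS):
    """Weighted sum of the NMT score and three SMT-style features."""
    tokens = candidate["tokens"]
    features = {
        "nmt": candidate["nmt_logprob"],              # score from the NMT model
        "lm": bigram_logprob(tokens, bigram_probs),   # n-gram language model
        "phrase": phrase_matches(source_tokens, tokens, phrase_table),
        "length": len(tokens),                        # counters short outputs
    }
    return sum(weights[name] * value for name, value in features.items())


def rerank(source_tokens, candidates, bigram_probs, phrase_table):
    """Return the candidate with the highest combined score."""
    return max(candidates, key=lambda c: combined_score(
        source_tokens, c, bigram_probs, phrase_table))


# Toy data (all values invented for illustration).
source = ["wo", "xihuan", "fanyi"]
phrase_table = {"wo": "i", "xihuan": "like", "fanyi": "translation"}
bigram_probs = {("<s>", "i"): 0.5, ("i", "like"): 0.4,
                ("like", "translation"): 0.3}
candidates = [
    {"tokens": ["i", "like"], "nmt_logprob": -1.0},                 # drops a word
    {"tokens": ["i", "like", "translation"], "nmt_logprob": -1.2},  # complete
]
best = rerank(source, candidates, bigram_probs, phrase_table)
print(" ".join(best["tokens"]))
```

In this toy setup the NMT score alone prefers the shorter candidate that drops a word, while the phrase table and length features pull the combined score toward the complete translation. That is the kind of NMT weakness the older features can help with.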
From project kickoff to the release of the world's first Internet neural machine translation system, Baidu Translate took less than half a year.
That put it a full 16 months ahead of Google Translate.
But Baidu Translate was not satisfied with this.
△
Bruno Pouliquen, Head of Machine Translation at the World Intellectual Property Organization, MTSUMMIT-2017
We need to be the “leader” in more directions
To extend translation to more languages, Baidu Translate also proposed "Multi-Task Learning for Multiple Language Translation".
In this study, Baidu Translate proposed a multi-task learning neural translation model with a shared encoder and established a unified neural-network-based framework for multi-language translation.
△
Translation model diagram based on shared encoder
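The shared-encoder idea can be sketched structurally: one encoder for the source language feeds a separate decoder per target language, so every translation direction reuses the same source representation. The toy below is a forward-pass sketch only, with untrained random weights. The class names, the averaging encoder, and the dot-product scoring are illustrative assumptions, not the architecture from the paper.

```python
import random


class SharedEncoder:
    """One encoder for the source language, shared by all target languages."""

    def __init__(self, vocab, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.embeddings = {w: [rng.uniform(-1, 1) for _ in range(dim)]
                           for w in vocab}

    def encode(self, tokens):
        """Average the embeddings of known tokens into one sentence vector."""
        vectors = [self.embeddings[t] for t in tokens if t in self.embeddings]
        if not vectors:
            return [0.0] * self.dim
        return [sum(column) / len(vectors) for column in zip(*vectors)]


class LanguageDecoder:
    """A separate output layer for each target language."""

    def __init__(self, target_vocab, dim, seed):
        rng = random.Random(seed)
        self.weights = {w: [rng.uniform(-1, 1) for _ in range(dim)]
                        for w in target_vocab}

    def score_words(self, sentence_vector):
        """Dot-product score of every target word against the sentence vector."""
        return {w: sum(a * b for a, b in zip(vec, sentence_vector))
                for w, vec in self.weights.items()}


class MultiTaskTranslator:
    """Shared encoder plus one decoder per target language."""

    def __init__(self, source_vocab, target_vocabs, dim=8):
        self.encoder = SharedEncoder(source_vocab, dim)
        self.decoders = {
            lang: LanguageDecoder(vocab, dim, seed=index + 1)
            for index, (lang, vocab) in enumerate(sorted(target_vocabs.items()))
        }

    def translate_scores(self, tokens, target_lang):
        # The same encoding is used regardless of the target language.
        sentence_vector = self.encoder.encode(tokens)
        return self.decoders[target_lang].score_words(sentence_vector)


model = MultiTaskTranslator(
    source_vocab=["hello", "world"],
    target_vocabs={"fr": ["bonjour", "monde"], "es": ["hola", "mundo"]},
)
print(sorted(model.translate_scores(["hello", "world"], "fr")))
print(sorted(model.translate_scores(["hello", "world"], "es")))
```

In a trained system, gradients from every language pair would update the single encoder, which is how lower-resource target languages can benefit from data in higher-resource ones.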
This is also the key reason why Baidu Translate can now support translation between 203 languages.
In 2017, Baidu Translate's AI simultaneous interpretation feature made a stunning debut.
Specifically, the team proposed a machine simultaneous interpretation model driven by semantic units, which addressed the difficult problem of balancing translation quality against interpretation delay.
The Baidu Translate team also developed a high-quality, low-latency machine simultaneous interpretation system with a translation accuracy of over 80% and an average delay of 3 seconds.
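The quality-versus-delay trade-off can be illustrated with a toy stream segmenter. It buffers incoming words and releases a chunk for translation either at a unit boundary or once a maximum wait is reached. This is a sketch under stated assumptions: the punctuation-based boundary rule, the `max_wait` parameter, and the delay measure are hypothetical stand-ins, not the semantic-unit model described in the article.

```python
def segment_stream(tokens, is_boundary, max_wait=6):
    """Yield chunks of tokens, flushing at a boundary or after max_wait tokens."""
    buffer = []
    for token in tokens:
        buffer.append(token)
        if is_boundary(token) or len(buffer) >= max_wait:
            yield buffer
            buffer = []
    if buffer:  # flush whatever remains when the stream ends
        yield buffer


def average_delay(units):
    """Average number of later tokens each token waits for before its chunk is released."""
    waits = [len(unit) - 1 - position
             for unit in units
             for position in range(len(unit))]
    return sum(waits) / len(waits) if waits else 0.0


def at_punctuation(token):
    """Hypothetical boundary rule: treat punctuation as the end of a unit."""
    return token in {",", "."}


speech = "we welcome you , and we thank all the guests today".split()
short_wait = list(segment_stream(speech, at_punctuation, max_wait=6))
long_wait = list(segment_stream(speech, at_punctuation, max_wait=100))
print(short_wait, average_delay(short_wait))
print(long_wait, average_delay(long_wait))
```

Smaller chunks lower the delay but give the translator less context per chunk, while larger chunks do the opposite. A boundary detector that finds meaningful units is one way to get both.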
Because Baidu leads in machine translation technology and achieves high translation accuracy, many international conferences and events have chosen Baidu Translate for technical support. Its AI simultaneous interpretation has even been used at major events such as the China International Fair for Trade in Services and the China International Import Expo.
…
Then a question that arises is:
Why is Baidu putting so much effort into translation?
Translation is more than just a tool
First, one point should be made clear and agreed upon: machine translation is one of the ultimate goals of artificial intelligence and one of the most challenging applications of AI technology.
This is why Baidu continues to innovate in the field of machine translation.
But from another perspective, what Baidu Translate sets out to do has never been as simple as translation itself.
Judging from its ten-year history, Baidu Translate has changed:
It is not just a tool but also a bridge, a window, and a sensor of world culture.
How should this be understood?
We can understand this change of character by looking at what Baidu Translate has brought about.
It is the translation assistant at the user's side
For example, during traffic police enforcement, officers once encountered foreign (Russian) visitors.
Since the visitors did not speak Chinese, communication became a big problem.
In the end, the traffic police successfully rescued the foreign crew members with the help of Baidu Translate.
At work, too, language barriers get in the way of obtaining information and communicating.
With Baidu Translate, users can communicate across languages more smoothly.
Such services and experiences should be available to everyone, including people with disabilities.
To this end, Baidu Translate also helps visually impaired developers build software that blind people can operate, giving a large number of blind users free access to translation services.
It is these true stories that make Baidu Translate more than a translation tool and give it the meaning of a bridge, a window, and a sensor.
It helps the world fight the epidemic
But beyond these changes in user experience, Baidu Translate is gradually taking on a deeper and grander mission.
For example, it has also played its part in the fight against the epidemic.
A French instruction manual for 3M masks, an English instruction manual for protective clothing, a Russian commodity inspection certificate for three-layer masks: anti-epidemic materials like these all required translation.
As everyone knows, fighting the epidemic is not only a heavy task but also a race against time.
Baidu Translate took on the heavy burden of translation during the epidemic. In just two days, it built an efficient, easy-to-use customized translation tool and quickly opened it to volunteer teams for free.
△
Multilingual epidemic prevention video
It serves national needs and paves the way for cross-language communication
Moreover, Baidu Translate's work is in line with national needs.
At the second Belt and Road Forum for International Cooperation, China proposed:
The key to jointly building the Belt and Road is connectivity. We should build a global connectivity partnership to achieve common development and prosperity.
Cross-language communication has become the key to achieving this goal.
Over the past ten years, Baidu Translate has gradually expanded cross-language translation to cover the countries along the Belt and Road, increasing the number of supported languages 100-fold.
Clearly, this is also in line with the general trend toward connectivity in national and even global development.
Today, Baidu Translate is still changing in character, working to turn translation into a form of productivity.
But translation remains a long and arduous task. Even Baidu Translate, with its many "world firsts", still has a long way to go.
As for what improvements in technology and value Baidu Translate will bring in the future, we will wait and see.
-over-
This article is original content from QbitAI (量子位), a signed account of the NetEase News•NetEase special content incentive plan. Reproduction without the account's authorization is prohibited.