For DuerOS, commercialization is not Baidu's goal
XGIMI has released three new laser screenless TVs equipped with DuerOS: the T1, the A1 Pro, and the A1. The flagship T1 is priced at nearly 80,000 yuan. When that price appeared on the presentation slides, it drew the loudest "wow" of the event. I believe every invited guest instantly understood why the theme of the press conference was "Redefining the Winner in Life."
Leiphone.com took part in the media interviews that followed. When several outlets repeatedly asked about the price and about how partners are selected, Jing Kun had to stress many times that DuerOS is "not picky" about its customers or the people it serves:
We will support big customers, we will support individual developers, we will support 80,000 yuan TVs, we will support 8,000 yuan TVs, and we will support 1,888 yuan TVs.
Jing Kun also said that since the Baidu AI Developer Conference, the DuerOS open platform has been very popular, receiving inquiries and expressions of interest from dozens of companies, large and small, as well as from individual developers. Leiphone.com also learned that he pays special attention to individual developers, because they write to him hoping to do more with DuerOS.
Baidu is not selective about its customers because it hopes AI can become a "cheap" technology or capability, open to all partners for free, and a standard feature. However, the free model will inevitably affect the commercialization of DuerOS. Behind this lies a more fundamental question: whether Baidu can become profitable with AI and return to the camp of the Big Three. On this point, Jing Kun said bluntly:
Commercialization is not our goal now. The biggest problem now is one for the entire speaker industry: how to let ordinary consumers know that voice interaction is a standard feature.
Jing Kun firmly believes that as long as there is a leap forward in human-computer interaction, commercialization will not be a problem at all. But for Baidu now, this is not the focus.
The following is the transcript of Jing Kun's interview. Leiphone.com has made some edits without changing the original meaning:
The only one in China that can solve the problem of understanding is Baidu
Reporter: Speaking specifically of XGIMI's T1, it projects a 120-inch image, which means the living room will be relatively large. Does that pose a challenge for far-field voice recognition?
Jing Kun: Far-field voice on a TV is challenging for current technology, because the TV's own speaker units and audio characteristics still pose difficulties across the industry. But I think we are basically at the forefront of the technology in overcoming these challenges, and we will release solutions of this kind later. The cooperation with XGIMI relies mainly on the voice button on the remote control and on near-field recognition.
In fact, when we talk with people in the industry, especially the TV industry, everyone wants to throw away the remote control. However, there are several types of TV users, and many frequent viewers are simply killing time. We may sit down wanting to watch "In the Name of the People", but many users who are killing time just want to browse and see which poster looks better. In that case, voice alone cannot fully meet their needs, so the remote control is still a fairly necessary device for them.
Reporter: So different options are provided for these two types of people?
Jing Kun: That’s right. As technology develops, far-field voice recognition will definitely become more and more common.
Reporter: In terms of technology, what difficulties are you trying to overcome?
Jing Kun: In terms of technology, I think near-field speech recognition is relatively mature, while far-field speech recognition has to be conquered gradually, one environment at a time. What everyone is focusing on now is far-field voice for speaker products. Far-field voice for the TVs and refrigerators I just mentioned, and even in-car voice at slightly closer range, is still a challenge.
Baidu is a particularly technology-driven company. I think the AI era is an especially good opportunity for Baidu, because many problems need technology to support them, and once the technical problem is solved, that support is substantial. So in the AI era we particularly hope to invest more in technical capabilities, acquire good technology companies, and maximize our technological advantages.
Reporter: What are Baidu's advantages and disadvantages in terms of understanding?
Jing Kun: First of all, I am personally quite proud to say that I think Baidu may be the only (manufacturer) in China that can solve the problem of understanding.
If we take a longer view and use the movie "Her" as an analogy, the artificial intelligence in that movie understands what we say. There is another movie called "Ex Machina", and one detail in it is that the person who created the AI is the founder of a search engine. That movie is very good, and I think everyone can go back and watch it.
This is because search engines have the biggest advantage in solving the problem of understanding: the advantage of data. Only when you have seen enough expressions can you know what the user means. For example, my son is now three years old. Often when I tell him a noun or say a sentence to him, he doesn't understand it the first time. I tell him what a durian is, and then I show him how it is expressed. Having learned it the first time, he understands it the second time, so people have the ability of transfer learning.
The same is true for computers. Asking computers to solve a problem they have never seen is actually a big challenge for them. The greatest ability of machine learning is to train on a limited set and solve problems on a set they have never seen. In the process of human-computer interaction, it is actually the same as when people used Baidu to search. From keywords to natural language, it is a match between expression and demand. In this regard, search engines have a very big advantage.
Commercialization is not Baidu's goal
Reporter: Xiaoyu Zaijia and XGIMI are both cooperating with DuerOS. Is there a priority order for selecting partners? What are the decision-making mechanisms and screening criteria?
Jing Kun: This is a good question. At the Baidu Developer Conference on July 5, I made an analogy: we hope DuerOS can become the Android of the AI era. In its time, Android started from the bottom layer and lowered the basic barrier to entry, making it easier for many people to enter the industry and develop their own mobile phones. They could define their own phones, and while meeting some basic application needs, Android also enriched the entire market.
We are actually playing the same role. At present, we do not pick and choose customers. For example, at the previous developer conference there was Du Zhipeng, an individual developer with no corporate background. He simply wrote us an email saying that he wanted to realize his dream, and we supported him. So we now offer many different kits and solutions on our official website. We hope that every small business, large enterprise, and individual developer can build such devices.
Compared with large companies, small companies' press conferences attract a different level of attention. So whether it is Xiaoyu Zaijia or XGIMI, we hope to help promote them, because they are excellent companies in their respective fields. They have done very well in their own vertical categories, and I believe that with artificial intelligence technology added, they can achieve the best. We hope to create high-end cases together, on the one hand to make their products better, and on the other hand so that we can cover more and more of the long tail.
I want to share with you something very interesting. After the recent developer conference, the DuerOS open platform has become very popular, whether in large enterprises, small enterprises or individual developers. I pay special attention to the group of individual developers. They will write to us and hope to use our platform to do many things. We will also share them with you in the future.
There is one example that I find particularly interesting. This individual developer is a high-voltage line worker. He works mainly on high-voltage wires, in places we would not normally imagine. He has to wear gloves on the high-voltage lines, so in his normal working environment he cannot do anything with his hands and cannot hold a device such as a mobile phone. But he actually has many information needs, so he especially hopes to have a watch or bracelet that can talk to him or let him make calls.
In all walks of life, there are ordinary developers with needs like these. We hope to build a basic open platform so that XGIMI, Xiaoyu Zaijia, industry giants, and even individual developers all have the opportunity to obtain the solutions they want.
Reporter: The long-tail user group is actually quite small and may not have complete software and hardware capabilities. Has Baidu considered how to solve this problem?
Jing Kun: The openness we emphasize is about empowering everyone. No one has served small users before, yet they have strong demands, and serving them lets us see what those demands really are. Although the number of orders is small, each one reflects a very real niche need. We also have an example: one user whose eyesight is not very good and who previously had difficulty obtaining information. This empowerment can improve his situation. Although the case is small, it meets his needs to a great extent. So I think this is very meaningful for the basic ecosystem. But it is necessary to launch an integrated software and hardware solution, so we need a shell.
Reporter: After the AI Developers Conference, have any new cooperation intentions emerged?
Jing Kun: The other day, I went with our product manager to meet a very important international customer who is in charge of the speaker category. Having seen Alexa's solution, he took the initiative to contact us after the developer conference and confirmed that demand in this area is exploding. So the AI Developer Conference was clearly very effective. Whether for speakers, TVs, refrigerators, or even in-car devices and watches, customers of this kind came in like snowflakes. In particular, regarding the openness of our platform, after we acquired KITT.AI, open voice wake-up hit their pain points. Demand from this type of customer is particularly high.
Now we have a problem: we have slightly fewer development kits than we initially estimated. Across the whole market, demand is particularly large. Many people want to add voice interaction to new devices, even treadmills and massage chairs. We feel our own imagination falls short. After the developer conference, they especially want to put voice interaction on their devices.
Reporter: After the developer conference ended, how many manufacturers came to you?
Jing Kun: At least dozens, a lot.
Reporter: The customization business on Baidu's open platform has profit potential. To what extent does Baidu customize, and down to what level of detail? How willing are your current customization customers to pay?
Jing Kun: Commercialization is not our goal now. The biggest problem now is one for the entire speaker industry: how to let ordinary consumers know that voice interaction is a standard feature. Look back at this year's smart TV launches: we partnered with Guoan Guangshi in November last year, and since that launch, any TV released this year without voice interaction feels as if it is missing something. That is the basic situation. So I think the main problem now is not commercialization, but making ordinary users realize that this kind of interaction is a more convenient way.
As for customization, we don't do much of it. We are an open platform. Just as with Android, where many apps are made not by Google but by third-party developers, we hope partners will build on our platform while we provide them with voice recognition and semantic analysis. This lowers the technical barrier for them, so they can do the development themselves.
Baidu knows where its advantages are
Reporter: Baidu has a very strong team in AI technology. How do you decide which technologies urgently need to be acquired and which can be developed in-house?
Jing Kun: This is a very good question. We have an open mindset when it comes to technology. An open mindset lets us see not only our own technology but also the leading technology in the world: where the talent is and where the technology is. Many excellent international startups are leading in one particular area, like KITT.AI. In the United States, Alexa's wake word cannot be defined by others; only "Alexa" can be used. We see strong customer demand in the Chinese market for a wake-up word that is linked to the brand, which is critical. So we need to gather the best technology in the world and put it together. When we look at this industry, we see which companies have similar technologies that complement ours, or where customers have particularly strong needs that we cannot yet meet, and we bring those companies in. That is roughly the idea.
Reporter: Competition in the field of voice interaction is very fierce, and it is in a white-hot state. Amazon, Google, Apple, and Microsoft have all invested. What I want to know is, what role does DuerOS play in this market?
Jing Kun: All the companies you mentioned are from across the Pacific; they are all overseas. They are the four giants at the top of the technology list, the top four technology companies by market value. Everyone is looking in the same direction, which means it is a huge opportunity. I think that is a very good thing. What I am particularly afraid of is focusing on one direction, such as Baidu investing heavily in voice and artificial intelligence, and finding that no one else follows; then we would doubt ourselves. When others follow, we must firmly believe that this is an opportunity.
Baidu is the largest Chinese-language search engine. We know our advantages. In the era of the artificial intelligence revolution, we hope our advantage lies in the most basic ecosystem. We hope that once we have built this basic ecosystem, everyone will develop their own applications on it and become more successful. We hope to be the provider of this most basic service capability for everyone.