What kind of convenience will the development of voice technology bring to our lives?-EEWORLD


While demand in the voice industry is booming, the industry's pace of development is in turn limited by the supply capacity of platform service providers. Looking beyond specific cases, the logic that governs the industry's next stage is whether the input-output ratio at each specific point of application reaches a generally accepted threshold.

The closer the industry gets to this threshold, the closer it is to the tipping point of snowballing growth; otherwise overall growth stays relatively slow. Whether the scenario is the home, hotels, finance, education or something else, if solving the problem requires very high investment and a long lead time, the party bearing the cost will hesitate, because the cost of trial and error is too high. If the investment yields no perceptible new experience or sales uplift, that party will also hesitate, because it becomes hard to judge whether the investment is worthwhile. Ultimately, both problems must be solved by the platform; product and solution providers can do little about them on their own. This follows from the basic technical characteristics of intelligent voice interaction.

From the perspective of core technology, the voice interaction chain consists of five single-point technologies: wake-up, microphone array, speech recognition, natural language processing, and speech synthesis. Dozens of other technologies, such as voiceprint recognition and crying detection, are less universal but become key in specific scenarios. The technology stack already looks fairly complex, yet from a business perspective, having these technologies in hand is still a long way from delivering a product with a good experience.
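To make the five-stage chain concrete, here is a minimal Python sketch. Everything in it is a toy assumption: audio is represented as a plain string, the wake word `hello device`, the `[noise]` marker, and the `Turn` and `run_chain` names are hypothetical, and each stage is a stub rather than a real algorithm. The stage order simply follows the list above; real products may order or merge the stages differently.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Turn:
    """State carried through one voice interaction (all fields are toy stand-ins)."""
    audio: str                  # pretend this string is the captured audio
    text: str = ""              # output of speech recognition
    reply: str = ""             # output of natural language processing
    speech: str = ""            # output of speech synthesis
    trace: List[str] = field(default_factory=list)


Stage = Callable[[Turn], Optional[Turn]]
WAKE_WORD = "hello device"      # hypothetical wake word


def wake_up(turn: Turn) -> Optional[Turn]:
    turn.trace.append("wake_up")
    # Stop the chain if the wake word is absent.
    return turn if turn.audio.lower().startswith(WAKE_WORD) else None


def microphone_array(turn: Turn) -> Optional[Turn]:
    turn.trace.append("microphone_array")
    # Stand-in for front-end processing: strip a fake noise marker.
    turn.audio = turn.audio.replace("[noise]", "").strip()
    return turn


def speech_recognition(turn: Turn) -> Optional[Turn]:
    turn.trace.append("speech_recognition")
    # Drop the wake word and keep the command as text.
    turn.text = turn.audio[len(WAKE_WORD):].strip(" ,").lower()
    return turn


def natural_language_processing(turn: Turn) -> Optional[Turn]:
    turn.trace.append("natural_language_processing")
    turn.reply = "Playing music" if "music" in turn.text else "Sorry, I did not understand"
    return turn


def speech_synthesis(turn: Turn) -> Optional[Turn]:
    turn.trace.append("speech_synthesis")
    turn.speech = f"<speech>{turn.reply}</speech>"
    return turn


CHAIN: List[Stage] = [
    wake_up,
    microphone_array,
    speech_recognition,
    natural_language_processing,
    speech_synthesis,
]


def run_chain(audio: str) -> Optional[Turn]:
    """Run every stage in order; return None if any stage rejects the turn."""
    turn: Optional[Turn] = Turn(audio=audio)
    for stage in CHAIN:
        turn = stage(turn)
        if turn is None:
            return None
    return turn
```

Even in this toy form, a failure at any single stage breaks the whole interaction, which is one way to see why owning the individual technologies does not by itself produce a good product experience.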


All voice interaction products are end-to-end products. If every manufacturer built products directly on these basic technologies, each company would have to keep its own cloud service stable, guarantee response speed, adapt to its chosen hardware platform, and integrate specific content (such as music and audiobooks) item by item. That is unacceptable for product and solution providers. This is where platform service providers emerge: they must solve technology, content access and engineering details at the same time, so that the cost of trial and error is low and the experience is good enough.

China lacks a system platform provider with Amazon's dominance. The current platform providers fall into two camps: one consists of traditional Internet or listed companies, represented by Baidu, Alibaba, iFlytek, Xiaomi, and Tencent; the other consists of emerging artificial intelligence companies, represented by SoundAI. Compared with traditional companies, emerging artificial intelligence companies carry less historical baggage in their products and services, so they can offer more future-oriented and distinctive basic services on their platforms. For example, emerging companies tend to be more thorough about compatibility, which helps a single set of products cover both domestic and overseas markets.

Compared with Android in its day, voice interaction platform providers face greater challenges, and their path of development may be more tortuous. The concept of the operating system, often mentioned in the past, is taking on new meaning in the context of intelligent voice interaction: it is increasingly splitting into two different but closely integrated parts.

In the past, Linux and its variants played the role of functional operating systems, while new systems represented by Alexa play the role of intelligent systems. The former abstract and manage the hardware and resources, while the latter allow that hardware and those resources to be applied in specific ways. Combined, the two deliver an experience the end user can perceive. Functional operating systems and intelligent operating systems are destined to be in a one-to-many relationship. Different AIoT hardware products differ greatly in their sensors (depth cameras, radars, etc.) and displays (with or without screens, small screens, large screens, etc.), which will keep functional systems diverging, much as Linux itself has diverged. This in turn means that a single intelligent system carries a dual responsibility: adapting to different functional systems and supporting different back-end content and scenarios.
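The dual responsibility described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `FunctionalSystem` and `IntelligentSystem` classes, the `has_screen` capability flag, and the `music` skill are hypothetical names, not part of Alexa, Linux, or any real platform.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class FunctionalSystem:
    """Hypothetical description of a device's functional OS and hardware."""
    name: str
    has_screen: bool


class IntelligentSystem:
    """One intelligent layer serving many functional systems and many content back ends."""

    def __init__(self) -> None:
        self._skills: Dict[str, Callable[[str], str]] = {}

    def register_skill(self, name: str, handler: Callable[[str], str]) -> None:
        # Responsibility 2: support different back-end content and scenarios.
        self._skills[name] = handler

    def handle(self, device: FunctionalSystem, skill: str, query: str) -> Dict[str, str]:
        result = self._skills[skill](query)
        # Responsibility 1: adapt the same result to each device's capabilities.
        output = {"voice": result}
        if device.has_screen:
            output["display"] = result
        return output


# Usage: the same skill served to two differently equipped devices.
assistant = IntelligentSystem()
assistant.register_skill("music", lambda query: f"Now playing: {query}")

speaker = FunctionalSystem(name="screenless speaker", has_screen=False)
panel = FunctionalSystem(name="wall panel", has_screen=True)

speaker_out = assistant.handle(speaker, "music", "jazz")
panel_out = assistant.handle(panel, "music", "jazz")
```

Here the screenless speaker receives only a voice response while the panel also receives display content, although both requests pass through the same skill.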

The two sides differ greatly in how they are operated. Solving the former means taking part in the traditional product manufacturing chain, while solving the latter is more like being an app store developer. This brings both huge challenges and huge opportunities. In the past, when functional operating systems were being built, domestic programmers were mostly users. Intelligent operating systems can also draw on others' work, but this time a complete system must be built from scratch. (Foreign giants are in fact very weak in both Chinese-language technology and content integration, so they have little prospect of breaking into the domestic market.)

As platform service providers solve the problems on both sides better and better, the basic computing model will gradually change, and the way people consume data will differ from today. Personal computing devices (currently mainly mobile phones, notebooks, and tablets) will be further differentiated by location and by task: in the car, at home, in the hotel, at work, on the road, or when handling business. At the same time, the services behind them will be unified. Everyone will be able to move freely between devices as the scenario changes. The services will be optimized for each scenario, but they will remain unified with respect to personal preferences.

Today the interface between people and the digital world is increasingly unified in a specific product form (such as the mobile phone), but with the emergence of intelligent systems, that unity will shift more and more into the system itself. This will lead to ever deeper digitization, bringing us closer and closer to a 100% digital world.
