AI finds its voice in the audio industry chain

Publisher: EEWorld News. Latest update: 2020-03-15. Source: EEWORLD. Keywords: AI

Translated from - embedded

 

The advent of silicon-based microphones has reshaped the audio landscape. Market research firm Yole Développement believes that artificial intelligence will drive the market's evolution and transformation in the coming years.

 

Voice interaction is natural, which is why it is becoming the main interface for human-computer interaction. Voice-based virtual personal assistants (VPAs) are becoming more and more popular in smartphones, smart speakers, smart watches, wireless headphones, cars, smart TVs and their remote controls. Even trash cans now integrate voice recognition. The real value lies in high audio quality and in understanding the microphone's surroundings.

 

According to Yole Développement, audio is the next field that artificial intelligence (AI) will penetrate.

 

How AI speaks

 

Voice-based virtual personal assistants (VPAs) are a major driver in today’s audio industry. Traditional components of audio systems, such as audio codecs, microphones, speakers, and audio amplifiers, are using AI to compute and analyze voice data. This computation supports complex audio functions such as speech recognition and source localization, and it can be performed in the cloud or at the edge on consumer devices. Analyses that require high processing power and access to large amounts of data are performed in the cloud.

 

“The added value of AI is for natural language processing, and voice is a more natural way to interact with machines. You don’t need to use a keyboard and your hands, you just speak,” said Dimitrios Damianos, technology and market analyst at Yole’s Photonics and Sensing division. “However, a lot of processing is required to understand what the user is saying, their language, and what they mean. AI is adding value in decoding and helping us communicate with our devices.”

 

When asked about the rapid penetration of VPAs, Damianos attributed it to their convenience and efficiency. He added: "What we believe and see is that large technology companies like Google, Apple, Facebook, Amazon and Microsoft (collectively known as GAFAM) are trying to push these VPAs because there is real value in the data they extract."

 

Alexis Debray, technology and market analyst at Yole’s MEMS and sensors division, said that audio is more acceptable to users than images. They consider audio “less intrusive, so it’s a good way for GAFAM to collect data from people, and GAFAM’s main business is data.” Some companies build their business on data, while others use privacy settings and technologies to protect users’ privacy. For example, Apple preaches privacy and makes it a strong marketing asset.

 

Damianos said that for large technology companies, the real value lies in extracting as much information as possible from the environment. This means VPAs will listen not only to the user's voice but also to the surroundings in order to understand the context. For example, "if you are in the kitchen, the microphone can hear the sound of knives on the counter and immediately know that you are in the kitchen and give a recipe." This is contextual artificial intelligence.

 

The next stage after conversational AI will most likely be full awareness, where a virtual assistant, whether it’s a smart speaker or a smartwatch, can communicate with the user like a human. Damianos said full awareness is conceptual and comes with a question mark. “We don’t know the timeline yet, but it will probably be 5 to 6 years after conversational AI. It will depend on the development of AI and the growth of companies in the field.”

 

While these always-listening systems can save lives in automotive human-machine interfaces, they also raise concerns about user privacy protection. To prevent possible misuse, Debray stressed that data processing should be done as quickly as possible and as close to the microphone as possible. "The closer you are to the microphone, the less likely it is that there will be a privacy breach."

 

Privacy encompasses multiple dimensions, as users may wish to hide their gender, age or emotions. Looking ahead, Debray said he is confident that players in the microphone, ASIC or application processor space will develop technology that ensures user privacy. Microphones can remove emotions from voices and present only audio data.

 

 

Yole analysts expect GAFAM to continue to dominate, as their role in analytics is currently essential, but sensor manufacturers are clearly eager to add AI at the edge and offload audio analytics from the cloud. Damianos said: "Sensor manufacturers want to increase their revenues and get a piece of the audio market. This is not a battle from the big companies' side. This is a battle from the sensor companies."

 

Debray added that sensor companies are pursuing diversification strategies, “trying to move up the value chain and become more integrated.”

 

In a recent interview, Vesper Technologies Inc. CEO Matt Crowley said that Vesper is looking to make its piezoelectric MEMS microphones smarter. "We believe that in the future, we will have some sensors paired with artificial intelligence embedded in the sensor. It will be able to learn how humans and animals use their senses - not just vision, hearing, taste, smell and touch, but also motion or temperature - to understand their environment. Our long-term vision is that these objects will use multiple biomimetic sensors to understand their environment and respond in the best way possible."

 

In addition, Infineon Technologies AG has changed its business model, moving from supplying microphone components to companies such as Goertek and AAC to selling complete MEMS microphones. It has gone from being a MEMS microphone manufacturer to a comprehensive player that handles manufacturing, packaging, testing and sales. "This is a strategic change... It may mean that they see the trend of VPA and hope to find their own position in this market."

 

Likewise, Knowles Electronics, the leader today with a 39% share of the MEMS microphone market, recently acquired the MEMS microphone ASIC design division from Ams AG. This was a way to bring in mixed-signal circuit design intellectual property, but also a way to counter growing competition from Chinese companies such as Goertek and AAC.

 

MEMS microphones provide strong support for sound quality

 

The global audio market continues its growth trajectory. VPAs require a better signal-to-noise ratio (SNR) to accurately capture human voices in noisy environments, which opens new market opportunities for MEMS microphones.
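For readers unfamiliar with the term, SNR is usually expressed in decibels as 20 times the base-10 logarithm of the ratio between signal and noise amplitudes. The short Python sketch below illustrates that standard definition only; the function name and the sample amplitudes are hypothetical and are not taken from Yole's report or from any microphone datasheet.

```python
import math


def snr_db(signal_rms: float, noise_rms: float) -> float:
    """Return the signal-to-noise ratio in dB from RMS amplitudes.

    Uses the standard amplitude definition: 20 * log10(signal / noise).
    """
    if signal_rms <= 0 or noise_rms <= 0:
        raise ValueError("RMS amplitudes must be positive")
    return 20.0 * math.log10(signal_rms / noise_rms)


# Hypothetical amplitudes, chosen only to show how the ratio behaves:
# the same voice level against a quiet and a noisy background.
quiet_room = snr_db(signal_rms=1.0, noise_rms=0.001)
noisy_room = snr_db(signal_rms=1.0, noise_rms=0.1)
print(f"quiet room: {quiet_room:.1f} dB, noisy room: {noisy_room:.1f} dB")
```

The higher the figure, the more cleanly the voice stands out from the background, which is why a noisy kitchen or car cabin is harder for a voice assistant than a quiet room.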

 

According to Yole, the global consumer market for microphones, microspeakers and audio chips will grow at a compound annual growth rate (CAGR) of 6.6%, from $14.1 billion in 2018 to $20.8 billion in 2024. Cheap, small, and easy to integrate, microphones are widely adopted and ship in very high volumes. "We use about 6 billion microphones," Damianos said. The microphone market is currently worth $1.7 billion and is expected to grow at a CAGR of 3% to $2 billion by 2024.

 

The MEMS microphone market currently accounts for about 70% of the total microphone market and will grow from $1.2 billion in 2018 to $1.6 billion in 2024. The main driving markets include smartphones, smart speakers, and hearables (such as wireless headphones). Damianos said: "In the past few years, the market for smart speakers and hearables has experienced explosive growth. MEMS microphones in smart speakers will grow at a compound annual growth rate of 13%, reaching 1.2 billion units by 2024. Wireless headphones will grow at a compound annual growth rate of 29%, reaching 1.3 billion units by 2024."
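The growth rates quoted above can be sanity-checked with the usual compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The Python sketch below is only an illustration: it assumes 2018 is the base year for every figure (the article says "currently" for the $1.7 billion microphone market) and a six-year span to 2024, and the function and variable names are made up for this example.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by a start value, end value and span."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


YEARS = 2024 - 2018  # assumed span for all forecasts quoted in the article

# Market sizes in billions of US dollars, as quoted in the article.
forecasts = {
    "microphones, microspeakers and audio chips": (14.1, 20.8),
    "all microphones": (1.7, 2.0),   # base year assumed to be 2018
    "MEMS microphones": (1.2, 1.6),
}

for market, (start, end) in forecasts.items():
    rate = cagr(start, end, YEARS)
    print(f"{market}: implied CAGR {rate:.1%}")
```

Under these assumptions the implied rates for the first two markets land within rounding distance of the 6.6% and 3% figures cited by Yole. The article gives no dollar-based CAGR for MEMS microphones, so the third line is purely a derived value.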

 

20% of smartphone users’ interactions with computers will use voice assistants in 2019

 

Major mobile phone platforms are also actively investing in making virtual personal assistants (VPAs) increasingly powerful. Even Apple's Siri can sing PPAP. Research firm Gartner said that related technological advances will prompt users to use VPAs more frequently. It expects that by 2019, 20% of human-computer interactions among smartphone users will go through a VPA.

 

Gartner also released a mobile application survey for the fourth quarter of 2016, which surveyed 3,021 mobile phone users in China, the United Kingdom and the United States. It found that 42% of users in the United States and 32% in the United Kingdom had used the VPA function in the past three months, and 37% of users in the United Kingdom and the United States used it at least once a day.

 

54% of the surveyed users in the UK and US have used Apple's Siri in the past three months, while the usage rate of Google Now in the US and the UK is 48% and 41% respectively. Gartner said that with the emergence of more new features, more language support, and more and more models supporting VPA, the usage rate of this type of interface will gradually grow.

 

Gartner said that the growth of VPAs will go hand in hand with the rapidly developing field of "conversational commerce," which is not centered on voice recognition alone: the conversation features of messaging apps will also play an important role. For example, Facebook Messenger's new business function allows users to order products and call an Uber through conversations. In addition, the payment function developed by Tencent's WeChat has become an important part of that messaging service.

 

Gartner said that China is currently the most mature market for conversational commerce in the world, and that market development centered on messaging platforms has prompted traditional businesses to change. Microsoft's Cortana is also being integrated into Skype, where it acts as a conversational platform between third-party service providers and consumers, assisting users with business activities such as air ticket and hotel reservations.

 

In addition to voice, Gartner predicts that touch, currently the mainstream interface on mobile phones, will no longer be the only interface that consumers rely on. The importance of voice and gesture will increase significantly in consumer devices. It is expected that by 2020, there will be 7 billion personal devices, 1.3 billion wearable devices and 5.7 billion other types of consumer IoT terminal devices, which will have only a basic touch design or none at all (Zero-Touch UI). Using sensors to collect contextual information about the surrounding environment, such as voice, ambient conditions, biometrics, movement and actions, will become an important design basis for new UIs.
