Understanding Artificial Intelligence Voice Chip in One Article
Source: Jiuding Investment; authors: Meng Wei, Feng Zhuo.
Industry Overview
Industry Introduction
Artificial intelligence chips (AI chips for short) are chips that contain modules dedicated to handling the large volume of computing tasks in artificial intelligence applications; they sit at the intersection of integrated circuits and artificial intelligence. Since 2016, Internet giants such as Google, Baidu, Alibaba, and Tencent, as well as many well-known venture capital funds, have rushed into the artificial intelligence industry, vigorously promoting the commercialization of start-up algorithm (solution) companies across multiple application fields. As clear commercial applications of artificial intelligence have continued to emerge in visual recognition and speech recognition, the trend of implementing AI algorithms and solutions directly in chips and hardware has become increasingly obvious, and demand for chips with deep learning acceleration functions has grown rapidly.
The rapid development of artificial intelligence depends on three key factors: research on cutting-edge algorithms, active exploration of real-world application scenarios, and meeting increasingly clear business needs through advanced chip design and manufacturing technologies. Chinese companies such as SenseTime, Megvii, iFlytek, AISpeech, Huawei, and Cambricon are at the forefront of cutting-edge algorithm research, keeping pace with the United States. China leads the world in exploring practical applications and has already produced a number of commercially deployed scenarios, including face recognition, smart security, smart speakers, and smart homes. The new generation of AI chip startups has combined these application characteristics with appropriately advanced chip design and manufacturing technologies, putting them on the same starting line as comparable American companies. The industry generally regards the AI chip race as a competition between Chinese and American companies in which China currently lags behind; however, thanks to its more active exploration of applications, Chinese AI chip makers may yet surpass their competitors and become global leaders.
AI chips can be divided into cloud chips and terminal chips by application scenario, and into training chips and inference chips by function. The cloud refers to the server side; the terminal refers to end products such as mobile phones, computers, surveillance cameras, home appliances, and consumer electronics. Training refers to building a complex deep neural network model from large amounts of input data using methods such as supervised, unsupervised, and reinforcement learning; it places high demands on a chip's combined computing and storage performance. Inference refers to using the trained model on new data to draw conclusions and complete tasks; it places high demands on a chip's speed, energy consumption, security, and hardware cost.
According to the above two dimensions, AI chips can be divided into four quadrants. Among them, terminal/embedded devices are mainly used for inference applications, and the demand for training is not yet clear. However, future terminal devices will gradually have the ability to train and learn.
Figure 1: AI chips are classified according to two dimensions
Source: Tsinghua University, Beijing Future Chip Technology Advanced Innovation Center
Development Path
Since the birth of chips, human beings' continuous exploration of chip design, high-purity silicon technology, ultra-high precision equipment, and physical and chemical processes has promoted the rapid development of chip technology. The computing power of chips has grown exponentially, promoting the birth and development of the personal computer and Internet era. In 2006, a new generation of artificial intelligence marked by deep learning algorithms was born. In 2016 and 2017, Google's artificial intelligence AlphaGo, based on deep learning technology and running on TPU (ASIC), defeated human Go world champions Lee Sedol and Ke Jie successively, triggering a new round of artificial intelligence development boom. Computing power has once again become the core driving force of the artificial intelligence era. AI chips are the concentrated embodiment of the improvement of computing power and have become the foundation of the "brain" of artificial intelligence.
From 2010 to the present, chips with different functions and positioning have begun to be developed for deep learning algorithms, and a prosperous AI chip industry with GPU, FPGA, ASIC, etc. has been initially formed.
Industrial chain and core links
The design and manufacturing process of AI chips is similar to that of other chips, consisting of design, manufacturing, packaging, and testing. Downstream of AI chip design companies are application solution companies, which ultimately apply their overall solutions (including the software and hardware running in the cloud and on terminals) to specific application scenarios. In the current early stage of AI development, Chinese algorithm companies have invested heavily in exploring practical applications, so algorithm companies and application solution companies have in practice merged into one. Specifically:
In the downstream application links, algorithm and solution companies formulate chip deployment plans for the cloud and terminals according to different application scenarios. The cloud trains reliable artificial intelligence algorithms for the application scenarios and undertakes most of the complex inference tasks, while the terminal directly outputs results for the application scenarios. Artificial intelligence has already been rapidly deployed in security, finance, medical care, education, and other fields and has produced scalable cases.
In the upstream links, different artificial intelligence algorithms place different requirements on the acceleration functions of AI chips, so AI chip design companies must design chips with specific acceleration functions according to the characteristics of the algorithms. Divided by inference and training: the training stage requires both forward calculation and backward updates, while the inference stage mainly requires forward calculation. Forward calculation includes matrix multiplication, convolution, and recurrent-layer operations, while the backward update consists mainly of gradient operations, so the computational characteristics of the two stages differ. The cloud focuses on training, where algorithms are more complex and varied and demand greater chip versatility and overall performance; the terminal focuses on inference, where algorithms must be efficient and reliable and demand greater chip specialization and efficiency. At the same time, because a chip's hardware properties are essentially fixed once the design is complete, algorithm companies must take the state of algorithm development into account and choose an appropriate chip architecture and hardware/software division of labor under existing AI chip technology conditions.
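To make the difference in workload concrete, here is a minimal, illustrative sketch in plain NumPy (a toy single-layer network, not any particular chip's workload): the forward half is the work an inference chip must accelerate, while a training chip must additionally handle the gradient computation and weight update.

```python
import numpy as np

# Toy single-layer network: y = relu(x @ W)
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))     # a small batch of inputs
W = rng.standard_normal((16, 4))     # weights
target = rng.standard_normal((8, 4))

# --- Forward calculation (needed for BOTH training and inference) ---
z = x @ W                            # matrix multiplication
y = np.maximum(z, 0)                 # ReLU activation
loss = ((y - target) ** 2).mean()    # mean-squared error

# --- Backward update (needed ONLY for training) ---
grad_y = 2 * (y - target) / y.size   # gradient of the loss w.r.t. y
grad_z = grad_y * (z > 0)            # back-propagate through ReLU
grad_W = x.T @ grad_z                # gradient w.r.t. the weights
W -= 0.1 * grad_W                    # gradient-descent weight update

print(f"loss before the update step: {loss:.4f}")
```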
In general, algorithms and AI chip design influence each other. As an AI chip company, it is necessary to fully understand and master the rules of algorithms and design AI chips that meet development needs; as an algorithm company, it is necessary to realize the evolution of algorithms based on various AI chips, especially dedicated chips.
Figure 2: Schematic diagram of the AI chip industry chain
Source: Compiled by Jiuding Investment
Industry barriers
The AI chip industry is an intersectional industry of artificial intelligence and integrated circuits. It has dual attributes and high industry barriers.
First of all, there are two barriers in the artificial intelligence industry. The first barrier is algorithms. Only by mastering deep learning algorithms and having the ability to continuously update algorithms can one truly enter the door of artificial intelligence. The second barrier is the understanding of application scenarios. Only by implementing algorithms in specific application scenarios and forming a closed loop of data plus algorithms can one gain an advantage over competitors by feeding back algorithms through data.
Secondly, the chip design industry faces comprehensive barriers in technology, talent, experience, and capital. The chip design chain has many links, requiring talent across multiple disciplines and levels as well as rich mass-production engineering experience. AI chips that require advanced process nodes have tape-out costs in the tens of millions and development cycles of 1-2 years, so the entry barrier for new companies is very high.
AI chip landscape and major companies
At present, in the training stage, NVIDIA GPUs occupy a near-monopoly position because training requires highly efficient large-scale parallel computing, while Intel CPU+FPGA solutions and ASIC solutions such as Google's TPU and Cambricon's MLU are accelerating to catch up. In the inference stage, the computing performance of different heterogeneous or dedicated chips is still developing and shifting. For cloud inference, Intel's CPU+FPGA architecture is powerful, and NVIDIA has greatly improved GPU inference performance with the Volta architecture. For terminal inference, which sits closer to the final application, algorithms differ considerably across market segments; ASICs have become the mainstream choice thanks to their strong specialization, high efficiency, and low power consumption, while FPGAs suit terminal fields where algorithm solutions change rapidly.
Overall, the future industry landscape may look like this: GPUs for high-end complex algorithms, high-performance computing, and data centers; ASICs widely used in cloud training, inference, and smart terminals; FPGAs for rapidly changing industry applications and virtualized cloud platforms.
Table 1: Comparison of four types of artificial intelligence chips
Source: Compiled by Jiuding Investment
GPUs and FPGAs
1. GPU
GPU (Graphics Processing Unit) is a microprocessor originally designed for graphics computation. With the development of general-purpose computing technology, the GPU's role is no longer limited to graphics processing, and it is now widely used in high-performance computing such as floating-point and parallel computation. It currently accelerates more than 150 applications in fields including financial engineering, meteorological and ocean modeling, data science and analytics, defense and intelligence, manufacturing (CAD and CAE), imaging and computer vision, medical imaging, electronic design automation, and computational chemistry. However, due to its high power consumption, it is mainly used in the cloud.
The GPU is currently the preferred chip for training deep learning algorithms and holds the largest market share in this field. It has a complete software ecosystem for artificial intelligence computing, and a growing number of standard deep learning libraries support GPU-based acceleration. Compared with the CPU, the GPU is suited to compute-intensive, highly parallel programs, while the CPU excels at logic operations and serial computation.
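As a rough illustration of why deep learning libraries hand dense linear algebra to the GPU, the short sketch below (PyTorch is assumed to be installed; it falls back to the CPU when no CUDA device is present) runs the same large matrix multiplication on whichever device is available. The operation decomposes into millions of independent multiply-accumulates, exactly the kind of work a GPU's many cores execute in parallel.

```python
import torch

# Two large matrices; multiplying them is thousands of independent
# dot products, which is why the work maps well onto a GPU's many cores.
a = torch.rand(4096, 4096)
b = torch.rand(4096, 4096)

device = "cuda" if torch.cuda.is_available() else "cpu"
a, b = a.to(device), b.to(device)

c = a @ b          # the same call runs on CPU or GPU, depending on `device`
print(c.device, c.shape)
```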
(1) NVIDIA
NVIDIA's GPU products mainly include the GeForce PC processors, the Tegra mobile processors, and the Tesla deep learning chips. The Tesla line's core products are chips based on the Pascal and Volta architectures.
At present, NVIDIA's GPU products are mainly used in computing platforms, data center acceleration, and deep learning training, with application fields including medical care, automobiles, smart home appliances, and financial services. Based on the Tegra series of processors, NVIDIA released the DRIVE PX open artificial intelligence vehicle computing platform, which enables automated driving functions such as highway autopilot and high-definition mapping. The Tesla Model S using this platform has entered mass production, and Baidu and Volvo have also reached cooperation agreements with NVIDIA to produce smart-driving cars equipped with DRIVE PX.
The Tesla V100, launched by NVIDIA in May 2017, delivers 1.5 times the floating-point throughput, 12 times the deep learning training speed, and 6 times the inference speed of its predecessor.
(2) ATI (acquired by AMD)
ATI is a graphics card maker as well known as NVIDIA; it was acquired by AMD for $5.4 billion in 2006. In August 2017, AMD officially released a new generation of GPUs with deep learning capabilities, which outperformed NVIDIA's Pascal series in a range of tests and applications. In 2018, AMD publicly demonstrated the world's first 7nm GPU prototype. Overall, AMD trails NVIDIA in product ecosystem and market share, but it remains the world's second-largest GPU maker after NVIDIA.
(3) Jingjiawei
Jingjiawei is the only graphics processing chip company in China with independent intellectual property rights and mature products. The company was founded in April 2006 and listed on the Shenzhen Stock Exchange in March 2016. It currently has more than 400 employees. The company's innovative MPPA architecture provides a single-chip supercomputing solution with high performance, low power consumption, and real-time performance. It can achieve real-time acceleration for cloud computing applications in the fields of video, network, telecommunications, big data, etc., and can also provide embedded high-performance computing capabilities for embedded applications in the fields of aerospace, national defense, and automobiles. However, Jingjiawei has a large technological gap with foreign GPU giants, and it is unlikely to affect the industrial structure of artificial intelligence GPU chips in the short term.
2. FPGA
An FPGA (field programmable gate array) is characterized by high-density computing, high throughput, and low power consumption, giving it broad room for development across industries. In the communications field, FPGAs are mainly used in communications and wireless equipment systems, providing data centers with higher energy efficiency, lower costs, and better scalability, and can also be used for 5G programmable solutions. In the industrial field, FPGAs enable automation, machine vision, and motion control. In the automotive field, FPGAs have become the main processing platform for ADAS, providing real-time image analysis and intelligent transmission. Because FPGAs are programmable, they have great advantages in delivering differentiated products and responding quickly. In addition, hybrid CPU+FPGA structures can also be used for cloud service computing.
The FPGA market is developing rapidly, but the technical threshold is relatively high. Currently, the market is mainly dominated by two companies, Xilinx and Altera, with a combined market share of more than 80%.
(1) Xilinx
Xilinx is the world's leading supplier of complete programmable logic solutions. Founded in 1984, Xilinx pioneered field programmable gate array (FPGA) technology and launched its first commercial product in 1985. Xilinx develops, manufactures, and sells various types of integrated circuits, software design tools, and IP (intellectual property) cores as predefined system-level functions. Xilinx products are widely used in digital electronics, from mobile communication base stations to DVD players. As the inventor of FPGA technology and the industry leader, Xilinx accounts for about 50% of global FPGA shipments and has a significant advantage in the high-end FPGA market (16nm, 20nm, 28nm). The company has more than 7,500 customers worldwide, including well-known companies such as IBM, NEC, Samsung, Siemens, and Sony.
(2) Altera
Altera has long been a leader in the FPGA field and is the other FPGA oligopoly besides Xilinx. Altera's FPGAs fall into two categories: one focuses on low-cost applications, with medium capacity and performance that meets general logic design requirements, such as Cyclone and Cyclone II; the other focuses on high-performance applications, with large capacity for various high-end applications, such as Stratix and Stratix II. Altera's FPGA products are widely used in many fields such as automobiles, consumer electronics, military aviation, medical care, and wireless communications.
Intel spent $16.7 billion to acquire Altera at the end of 2015. Intel plans to integrate Altera's customizable chips with its own standardized semiconductors to create more efficient product solutions for specific tasks such as web search and machine learning.
(3) DeePhi Technology (acquired by Xilinx)
DeePhi Technology provides AI acceleration solutions based on FPGA platforms and was acquired by Xilinx in August 2018. DeePhi leads in deep neural network compression, instruction sets, and computing architecture; its paper on deep compression was named a best paper of ICLR 2016 alongside a Google DeepMind paper. At the 2016 OpenPOWER Summit, most of the technical content in the new deep learning processor methods presented by the world's largest FPGA manufacturer came from DeePhi. DeePhi's FPGA-based DPU products provide deep learning acceleration solutions for multiple industries; compared with general-purpose products such as CPUs and GPUs, they offer higher energy efficiency and have already been applied in security, big data, and other industries.
Other Chinese FPGA chip companies, including Beijing Microelectronics, Gocloud, Anlu, and Zhiduojing, have generally not been able to mass-produce high-performance FPGAs, and are unlikely to affect the industrial landscape of artificial intelligence FPGAs in the short term.
ASIC
ASIC (Application-Specific Integrated Circuit) refers to an integrated circuit designed and manufactured for a specific purpose. Neural network processors are the application of ASIC dedicated circuits in the field of artificial intelligence. At present, competition among leading international chip makers for AI chip applications is fierce in the GPU and FPGA fields. As terminal artificial intelligence applications rise, ASIC chips customized for deep learning algorithms far outperform GPUs and FPGAs in computing speed and power consumption. As artificial intelligence penetrates more industries, ASICs will be widely used in security, smart terminals, finance, and the Internet of Vehicles, and this vast market space makes large-scale ASIC deployment possible. It is foreseeable that dedicated AI chips (ASICs) will become the main battlefield on which new AI chip makers compete with the traditional giants. At the same time, China's dedicated AI chip companies are not far behind the world's leading level, and in some fields are at the global forefront; ASICs may become the key to China's chip industry leapfrogging its rivals.
At present, China has a number of ASIC chip companies for terminal artificial intelligence, which fall roughly into four categories: first, the chip design teams of Internet and communications giants; second, mature chip design companies that have existed for many years; third, newly established AI chip startup teams and companies; and fourth, algorithm companies extending into AI chips.
1. Internet and communications giants
Giant companies represented by Huawei and Baidu have obvious advantages in algorithms and data. In order to extend the implementation of AI applications, they have accelerated their layout on the chip side, but mainly focused on cloud chips.
(1) Google
Google's TPU (Tensor Processing Unit) is a dedicated accelerator chip matched to its deep learning framework TensorFlow. The TPU is tailored specifically for machine learning and requires fewer transistors per operation; it was developed to replace GPUs and achieve more efficient deep learning.
The TPU is not designed for one particular neural network model; it can execute CISC (complex instruction set computer) instructions for a variety of neural networks (CNNs, LSTMs, large fully connected models, and so on). In TOPS/Watt (performance per watt) tests, the TPU is 30 to 80 times more power-efficient than conventional processors, and its processing speed is 15 to 30 times faster than comparable GPU/CPU combinations. More importantly, using the TPU also greatly reduces the amount of code needed for deep neural networks. In an artificial intelligence era in which deep learning technology is developing rapidly and data and computing requirements are growing quickly, Google's alternative will relieve the hardware burden at scale and further reduce the hardware cost of artificial intelligence.
(2) Huawei HiSilicon
As one of the leading companies in China's chip field, Huawei HiSilicon released the world's first AI mobile chip, the Kirin 970, in 2017, seizing the commanding heights of AI chips and attracting widespread industry attention. The Kirin 970 uses TSMC's high-end 10nm process, integrates 5.5 billion transistors, achieves a peak downlink rate of 1.2 Gbps, innovatively integrates a dedicated NPU hardware processing unit, and introduces the HiAI mobile computing architecture.
In September 2018, Huawei HiSilicon released the next-generation Kirin 980. Built around the CPU, GPU, NPU, ISP, and DDR, the product achieves a heterogeneous architecture with whole-system integration and optimization, and set six world firsts, including the first use of TSMC's leading 7nm manufacturing process, the first dual NPU on a mobile chip, and the first commercial implementation of the ARM Cortex-A76 CPU architecture. Its Cambricon-based NPU uses a dual-core structure, and its image recognition speed is 120% faster than that of the Kirin 970.
In addition, Huawei HiSilicon holds the world's largest market share in surveillance SoC chips, and its surveillance SoCs with integrated local AI inference functions are bound to occupy an important position in the market.
(3) Baidu
Baidu and hardware manufacturers have jointly launched the DuerOS smart chip, Baidu's new exploration of integrating artificial intelligence with hardware devices. The DuerOS smart chip's low-cost chips and modules can be embedded in virtually any hardware, allowing it to reach more scenarios faster and more broadly. Clearly, Baidu is using an "algorithm + chip" combination to industrialize artificial intelligence.
In July 2018, Baidu released its first cloud AI chip, "Kunlun", developed through more than 20 iterations and based on Baidu's eight years of experience with CPU-, GPU-, and FPGA-based AI accelerators in large-scale AI computing practice. Compared with Google's TPU, which excels at floating-point computation, Baidu's AI chip is better at mixed-precision computation; in some scenarios its computing performance is 2-3 times higher with lower power consumption. It will be used in fields such as autonomous driving and image recognition.
2. Traditional chip companies
As AI algorithms are gradually open-sourced and popularized, some mature chip design companies that have existed for many years have quickly absorbed and researched AI algorithms and launched terminal AI chips for specific application areas. These mature chip design companies have considerable advantages in cost control, chip definition, and customer channels.
Audio and video SoC chip series with integrated AI functions can be widely used in home appliance and consumer electronics markets such as set-top boxes, digital TVs, smart speakers, and tablets. With so many application areas and such a huge market, this is one of the main battlefields of consumer electronics; all of China's major audio and video SoC chip companies have entered, and future competition will be fierce.
(1) Hangzhou Guoxin
Hangzhou Guoxin, a well-known set-top box SoC chip design company, launched an SoC-level AI chip with an integrated NPU (neural network processor) for voice recognition in 2017. It deeply integrates algorithms, software, and hardware around the characteristics of artificial intelligence and the Internet of Things, yielding a highly intelligent, low-power, fully integrated voice-interaction AI chip. It helps terminal products achieve local offline, low-power, mobile voice recognition and mainly targets hot areas such as smart speakers, smart TVs, and smart toys.
(2) Rockchip
As a developer of digital audio/video and mobile multimedia chips, Rockchip launched its first AI chip in 2018, using a CPU+GPU+NPU hardware design. Its features include high hardware performance and strong platform compatibility, drawing on Rockchip's years of experience in machine vision and voice processing. In early 2019 it released its latest AI chip for the IoT field, supporting voice wake-up and recognition, face detection and recognition, and more. Rockchip's AI chips are already used in Himalaya smart speakers and Alibaba's face-payment products.
(3) Amlogic Semiconductor
Amlogic is a chip design company focused on OTT/IPTV set-top boxes, smart TVs, and smart homes. Amlogic has stated that it will build on its smart TV technology and market advantages by integrating artificial intelligence innovations, actively developing a series of artificial intelligence TV chips with embedded neural network processors and moving toward an intelligent, interconnected ecosystem. In terms of products, Amlogic has launched a 12nm ultra-high-performance six-core artificial intelligence display chip and a semi-general-purpose terminal AI chip with a built-in NN (neural network) processor, which can be used in smart home devices such as smart cameras and smart speakers.
(4) Allwinner Technology
Allwinner Technology is a design company that focuses on intelligent terminal application processor SOC, high-performance analog devices and wireless interconnect chips. Recently, Allwinner Technology has integrated artificial intelligence technologies such as speech recognition and image recognition into multiple series of chip products, and has vision and speech algorithm acceleration modules.
(5) MediaTek
MediaTek is a world-renowned IC design company that focuses on wireless communications and digital multimedia technologies. At the end of 2018, MediaTek released the P70 chip with a built-in multi-core artificial intelligence processor. At the beginning of 2019, MediaTek introduced the AI dedicated core (APU) strategy in the mobile phone field into smart speakers and other smart hardware to support terminal-side AI solutions.
3. Startup chip companies
Some universities, research institutes and overseas returnee teams have founded a number of AI chip companies based on their accumulated AI algorithm and chip technology, and launched customized AI chips to meet the needs of certain specific application areas.
(1) Cambricon
Cambricon Technologies was originally a research group under the Institute of Computing Technology of the Chinese Academy of Sciences. It began to study neural network algorithms and chips as early as 2008, and began to publish research results in 2012. The company's founder and CEO, Professor Chen Tianshi, is an internationally renowned young scientist in the field of processor architecture and artificial intelligence. Cambricon's main products are core processor chips for various intelligent cloud servers, intelligent terminals, and intelligent robots.
In May 2018, Cambricon released its first cloud AI chip, the MLU100. The chip uses the latest MLUv01 architecture and TSMC's 16nm process and can run in a balanced mode (1 GHz) or a high-performance mode (1.3 GHz), with equivalent theoretical peaks of 128 trillion and 166.4 trillion fixed-point operations per second and power consumption of 80 W and 110 W respectively. At the same time, Cambricon released the 1M terminal AI chip, its third-generation machine learning dedicated chip, whose overall performance is ten times that of its predecessor.
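As a quick sanity check (illustrative arithmetic only), the two quoted peak figures are consistent with throughput scaling linearly with the clock frequency:

```python
ops_per_cycle = 128e12 / 1.0e9      # fixed-point ops per clock cycle implied by balanced mode (1 GHz)
peak_high = ops_per_cycle * 1.3e9   # the same datapath clocked at 1.3 GHz
print(peak_high / 1e12)             # -> 166.4 trillion ops/s, matching the quoted figure
```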
(2) Horizon
Horizon Robotics was founded in 2015 by Yu Kai, former head of Baidu's Institute of Deep Learning. The BPU (Brain Processing Unit) is an efficient AI processor architecture IP designed and developed independently by Horizon; it supports ARM/GPU/FPGA/ASIC implementations and focuses on specialized fields such as autonomous driving and face and image recognition. Horizon's embedded AI solutions based on its Gauss architecture have begun to be applied in intelligent driving, intelligent life, and public security. Horizon's first-generation BPU uses TSMC's 40nm process; compared with traditional CPUs/GPUs, its energy efficiency can be improved by two to three orders of magnitude (roughly 100 to 1,000 times). It is currently in the pre-mass-production stage.
(3) Bitmain
Founded in 2013, Bitmain is a company that focuses on the design and development of high-speed, low-power customized digital currency mining chips.
At the 2017 World Artificial Intelligence Conference, Bitmain released the Sophon BM1680, a custom chip for AI applications, along with the SC1 and SC1+ deep learning acceleration cards and the SS1 intelligent video analysis server, officially entering the AI industry. In October 2018, Bitmain released the BM1880, a new-generation terminal AI chip with more than five times the performance of its predecessor, together with the SA3 intelligent server, the SE3 embedded AI mini machine, a 3D face recognition intelligent terminal, and BM1880-based development boards, AI modules, and compute sticks, beginning its move toward dedicated terminal AI chips.
(4) Canaan Technology
Founded in 2013, Canaan Technology is one of the earliest companies focused on digital blockchain computing devices. It has launched the K210 series of artificial intelligence terminal chips, which combine visual recognition and voice recognition. The chip includes a high-speed convolutional neural network accelerator (KPU) and an audio processing accelerator (APU) and can be flexibly combined with IoT technology, software systems, cloud computing platforms, and other basic information technologies. It can be widely used in advertising and big data collection, security monitoring, logistics detection, unmanned stores, fatigue and safety monitoring, power control, toys, robots, and more, and already has successful applications in smart homes, smart factories, face recognition, and other fields.
(5) Westwell (Xijing Technology)
Founded in May 2015, Westwell Technology develops "brain-like AI chips + algorithms". Its chips use FPGA circuits to simulate neurons, and the finished product contains 10 billion simulated neurons, realizing a spiking neural network (SNN) working mode. Its DeepSouth product competes with IBM's TrueNorth. Thanks to this special architecture, the chips have strong computing power and can be used in medical fields such as gene sequencing and simulating brain discharges. Westwell also has a commercial chip with 50 million neurons which, given its small size and low power consumption, can be used in portable medical devices.
(6) Qiying Tailun
Chengdu Qiying Tailun Technology Co., Ltd., established in November 2015, focuses on the design of artificial intelligence terminal chips and the development of matching intelligent algorithm engines. In September 2016 it launched the CI1006, the world's first deep neural network intelligent speech recognition chip. The CI1006 integrates Qiying Tailun's proprietary brain neural network processing unit (BNPU) with ARM's Cortex-M4F MCU core in a dedicated SoC architecture. It offers high performance, low power consumption, a high recognition rate, and low cost, and supports local voice detection, wake-up, and recognition of hundreds of offline command terms.
(7) ThinkForce
Founded in 2017, ThinkForce is a smart chip developer strategically invested in by Yitu Technology. As one of China's four major CV (computer vision) unicorns, Yitu has strong AI algorithm capabilities and face databases. In May 2019, Yitu held a press conference to launch "Qiusuo", a cloud-based SoC chip customized for deep learning inference and jointly developed with ThinkForce. "Qiusuo" adopts a many-core architecture with independent intellectual property rights and is used in Yitu's cloud and edge servers to accelerate various vision workloads; it is suitable for visual inference tasks such as face recognition, vehicle detection, video structuring analysis, and pedestrian re-identification. Mass-producing an AI chip marks Yitu's leap from algorithms to chips, giving it vertical integration from software to hardware and a complete artificial intelligence hardware and software solution.
4. Algorithm Company
Some AI algorithm companies are struggling to find chips that fully meet algorithm requirements and hope to rely on their own capabilities to provide complete software and hardware solutions. Therefore, they have begun to develop fully customized AI chips based on their own needs.
With the rapid implementation of AI algorithms in visual recognition and speech recognition, related algorithm companies have realized that an "algorithm + chip + data" model can effectively achieve scale and reduce costs. In speech recognition, because terminal ASICs that support AI speech recognition algorithms are relatively low in chip complexity, some algorithm companies have developed AI chips dedicated to speech recognition; the main representatives include Unisound and AISpeech.
Figure 3: Distribution of AI chip companies in China
Source: Public information, compiled by Jiuding Investment
Key Areas—AI Speech Recognition Market
Market size
Speech and semantic recognition refers to technology that enables computers to automatically understand human spoken language through speech signal processing and semantic analysis. The main steps of speech recognition are signal collection, noise reduction, feature extraction, and decoding: the extracted features are decoded by a speech model trained in the background on speech big data, and the speech is finally converted into text. Semantic recognition then understands the meaning of the utterance through natural language analysis.
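To make the feature extraction step concrete, here is a minimal sketch in plain NumPy (illustrative only; production systems typically use mel-filterbank or MFCC features and dedicated DSP/NPU hardware). It frames the audio, applies a window, and computes log-magnitude spectra, the kind of representation an acoustic model then decodes into text.

```python
import numpy as np

def log_spectrogram(signal, sample_rate=16000, frame_ms=25, hop_ms=10):
    """Split audio into overlapping frames and return log-magnitude spectra."""
    frame_len = int(sample_rate * frame_ms / 1000)   # e.g. 400 samples at 16 kHz
    hop_len = int(sample_rate * hop_ms / 1000)       # e.g. 160 samples
    window = np.hamming(frame_len)

    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop_len):
        frame = signal[start:start + frame_len] * window
        spectrum = np.abs(np.fft.rfft(frame))        # magnitude spectrum of the frame
        frames.append(np.log(spectrum + 1e-8))       # log compression
    return np.stack(frames)                          # shape: (num_frames, frame_len // 2 + 1)

# One second of synthetic "speech": a 440 Hz tone plus noise, purely for demonstration.
t = np.arange(16000) / 16000
audio = 0.5 * np.sin(2 * np.pi * 440 * t) + 0.05 * np.random.randn(16000)
features = log_spectrogram(audio)
print(features.shape)   # -> (98, 201): (frames, frequency bins)
```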
According to a Research and Markets forecast, the global intelligent voice market will continue to grow rapidly and will reach US$19.17 billion by 2020. At present, the voice recognition application market lies mainly in smart speakers and intelligent voice-interactive home appliances; it is foreseeable that intelligent voice recognition will also be deeply applied in autonomous driving, education, and medical care. The ultimate goal of speech and semantic recognition is multilingual automatic translation technology and equipment; once this goal becomes reality, it will be possible to completely break down the communication barriers between languages, rebuild the "Tower of Babel", and profoundly affect human society.
1. Smart speaker market
The smart speaker field is experiencing explosive growth. Currently, global Internet and mobile phone giants including Amazon, Google, Alibaba, Xiaomi, Baidu, JD.com, and Huawei have successively entered the smart speaker field and promoted it to the level of strategic products. On the one hand, as the entrance to smart homes, smart speakers are expected to drive the rapid growth of other hardware products; on the other hand, by using smart speakers as the entrance to home data, the giants are expected to expand other business models in the future.
In 2018, the global market shipment of smart speakers was about 86.2 million units, a year-on-year increase of 170%, far exceeding market expectations. In 2018, the cumulative shipment of smart speakers in China exceeded 20 million units. In the fourth quarter of 2018, it reached 8.6 million units. Internet giants such as Alibaba, Xiaomi and Baidu occupied the forefront of the market, with market shares of 31%, 29% and 28% respectively.
Table 2: Statistics on shipments of major smart voice speakers and chip solutions used (unit: 10,000 units)
Source: Company website, Zhidongxi, and Jiuding Investment
2. Smart voice interactive home appliance market
In addition to the rapidly growing field of smart speakers, major home appliance manufacturers are also actively integrating voice interaction functions. Voice interaction can not only add a novel and unique function to the product, but also become a home voice entrance and continuously accumulate user behavior information.
(1) Smart TV market
With the gradual popularization of smart TVs represented by Xiaomi TV, the way of watching TV has changed dramatically in the past three years. The younger generation of TV viewers has quickly completed the transition from passively watching live broadcasts to actively watching on demand and even searching for content. Voice interaction has brought great convenience to the content search function of smart TVs, which will strongly drive the growth of revenue from high-quality paid content. At present, Skyworth, Xiaomi, Baofeng, Haier, etc. have all launched smart voice recognition TVs.
In 2017, China's smart TV sales reached 47.365 million units, a year-on-year increase of 13.8%, and 2018 sales were expected to exceed 50 million units. Voice interaction is expected to quickly become a standard feature of smart TVs, making them another huge market for voice interaction technology.
(2) Intelligent voice air conditioner
In 2017, domestic air conditioner sales reached 88.755 million units, up 46.8% year-on-year, and annual domestic sales have remained above 60 million units in recent years. Midea, Gree, Haier, AUX, and Changhong have all launched voice-controlled smart air conditioners, so the market for air conditioners with intelligent voice interaction is huge.
(3) Other voice interaction markets
Since 2018, the shipment volume and growth rate of products such as children's storytelling machines (robots) and automatic clothes drying racks with voice recognition functions have exceeded expectations, bringing a large demand for intelligent voice recognition chips. According to the latest market research, the annual shipment volume of automatic clothes drying racks nationwide exceeds 30 million units, and the annual shipment volume of children's storytelling machines (robots) is expected to reach 40 million units. It is expected that the penetration rate of voice recognition functions will exceed 50% in the next three years.
Development Trend
1. The trend of integrating AI modules into terminal voice recognition chips is clear, but there are different strategies in terms of integration methods and functional positioning.
At present, AI chips used in terminal voice recognition fall into general-purpose, semi-general-purpose, and dedicated types. General-purpose AI chips are similar to CPUs: the AI algorithm is accelerated directly in the computing units of the main control chip, which lets the chip adapt to different application scenarios and is more flexible, but cost and power consumption are relatively high, as with the MediaTek chip used in Tmall smart speakers. Semi-general-purpose AI chips adopt a heterogeneous design, usually a CPU+NN module, in which the NN module accelerates the AI algorithm and the CPU acts as a supplement, aiming for a compromise between flexibility, cost, and power consumption, as with the Amlogic chip used in Xiaomi's Xiaoai smart speaker. Dedicated chips are ASICs designed specifically for voice recognition, achieving lower cost and power consumption at the price of flexibility. As voice recognition applications mature and market demand becomes clearer, high-efficiency, low-power dedicated AI chips for specific scenarios will become the mainstream products.
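One reason dedicated chips can reach lower cost and power is that they typically run inference in low-precision fixed-point arithmetic rather than 32-bit floating point. The sketch below (plain NumPy, assuming simple symmetric per-tensor int8 quantization; illustrative only, not any vendor's actual scheme) shows the basic idea: quantize, multiply-accumulate in integers, then rescale.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization of a float array to int8."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
weights = rng.standard_normal((64, 64)).astype(np.float32)
activations = rng.standard_normal((1, 64)).astype(np.float32)

qw, sw = quantize_int8(weights)
qa, sa = quantize_int8(activations)

# Integer multiply-accumulate (cheap in silicon), then rescale back to float.
int_acc = qa.astype(np.int32) @ qw.astype(np.int32)
approx = int_acc * (sa * sw)

exact = activations @ weights
print("max abs error vs. float32:", np.abs(approx - exact).max())
```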
In addition, among the voice terminal products currently on the market, the complexity of the deployed AI algorithms varies by application market. Some only implement keyword wake-up in the offline state, such as smart speakers; some implement lightweight speech and semantic recognition such as keyword recognition and offline dialogue, such as smart home appliances; and some need to support full-function speech and semantic recognition offline, such as in-vehicle scenarios. Because AI algorithms, especially training algorithms, are complex and still evolving, speech and semantic recognition will continue to rely mainly on the cloud. At the same time, as voice algorithms evolve and terminal chips iterate, terminal AI voice chips will incorporate more AI algorithm acceleration modules to achieve faster response, meet diversified scenarios such as in-vehicle use, complement cloud training and inference, and improve the user experience.
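The division of labor described above can be summarized in a short, deliberately simplified sketch (Python; the stub functions are hypothetical placeholders for the on-chip accelerator routines and a cloud ASR service, not real APIs, and a text string stands in for an audio buffer): the always-on wake-word and lightweight keyword paths stay on the terminal chip, while full speech and semantic recognition is forwarded to the cloud.

```python
# Toy stubs standing in for the on-chip accelerator and the cloud service.
def detect_wake_word(frame):            return "xiaoming" in frame
def recognize_offline_keywords(frame):  return "light on" if "light" in frame else None
def send_to_cloud_asr(frame):           return f"[cloud transcription of {len(frame)} chars]"

def handle_audio_frame(frame, has_network=True):
    """Terminal-side dispatch: cheap always-on work on the chip, heavy work in the cloud."""
    if not detect_wake_word(frame):              # low-power wake-word detection runs locally
        return None
    command = recognize_offline_keywords(frame)  # lightweight offline commands stay local
    if command is not None:
        return command
    if has_network:                              # full speech/semantic recognition goes to the cloud
        return send_to_cloud_asr(frame)
    return "offline: this request needs a network connection"

print(handle_audio_frame("xiaoming, turn the light on"))   # handled on-device
print(handle_audio_frame("xiaoming, what's the weather"))  # forwarded to the cloud
```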
2. The participation of traditional professional chip design companies has accelerated the implementation and mass production of voice recognition chips.
Domestic AI chip companies such as Hangzhou Guoxin and Qiying Tailun have taken the lead in mass-producing AI chips for voice recognition terminals. Audio and video SOC chip giants such as MediaTek, Rockchip, Allwinner, and Amlogic have also gradually launched similar products. Professional chip design companies have cooperated with algorithm companies and, relying on their mature chip design, product definition, and cost control capabilities, have launched low-cost, low-power, offline wake-up and voice recognition AI chips, which are used in terminal products such as smart speakers and smart home appliances. It is expected that shipments will reach tens of millions in 2019 and will maintain rapid growth in the next 3-5 years.
3. Algorithm companies began to extend into the chip design stage, forming a pattern in which algorithm and chip companies are both complementary and competitive.
Among domestic speech recognition algorithm companies, iFlytek occupies the dominant position, alongside well-known AI startups such as Unisound, Mobvoi, AISpeech, ROKID, and Yitu Technology, forming a landscape of one dominant player and several strong challengers. The companies' overall algorithm levels are fairly close, and each has its own niche strengths. To implement algorithm solutions more effectively and reduce costs through scale, some voice algorithm companies have begun developing their own AI chips that accelerate their own algorithms, helping them deploy faster in specific application fields. Some have entered the AI chip field through joint-venture chip companies or co-development with traditional chip companies. By developing chips, algorithm and solution companies can capture their own functional requirements more precisely, but their competitiveness in product definition, cost control, R&D cycles, and supply chain management remains to be tested over time.
(1) iFLYTEK
iFlytek is the only listed company in China's speech recognition field. It has full-chain speech recognition technology, has built an open cloud platform, and licenses voice capabilities to third parties through AI algorithm technology and platform-level services. Its business has expanded from government and education to medical care, automobiles, and consumer electronics. In the medical field, iFlytek cooperated with SenseTime to launch an artificial intelligence medical platform for image diagnosis and automatic consultation; in the automotive field, it pushed aggressively into the original-equipment market, taking share from the traditional giant Nuance; in the smart speaker field, it established the Dingdong joint venture with JD.com, but the cooperation ended in 2018 and the Dingdong speaker became a JD.com subsidiary. iFlytek has a dedicated AI chip R&D team, which has achieved initial results.
(2) AISpeech
AISpeech started out in speech recognition for the education field and has strong local speech recognition technology; it is now making every effort to enter the in-vehicle and home voice markets. In the smart speaker field it provides speech recognition algorithms and solutions for more than half of domestic products, with partners including Xiaomi's Xiaoai, Tmall Genie, Huawei, and NetEase; in the home field it began promoting consumer electronics such as early-education storytelling machines in 2019. In 2018, AISpeech and SMIC jointly established a joint venture, Shencong Intelligent, which deeply integrates AISpeech's algorithms with its chip architecture to produce a low-power terminal AI voice chip targeting the smart home field, including speakers, TVs, and white goods.
(3) Unisound
Unisound has full-chain voice recognition technology, focuses mainly on IoT voice recognition, and is especially strong in the home and automotive fields. Its customers include Gree, Midea, and Changhong, and it serves nearly 100 automotive voice recognition solutions and brands; its business has now expanded to medical care, justice, and other fields. Unisound was the first AI voice algorithm company to establish a chip design team: it mass-produced its first AI chip in 2018 and plans to mass-produce a second-generation AI chip for the IoT field, mainly for white goods, in 2019.
(4) Mobvoi
Mobvoi's strategy is to integrate hardware and software, expanding gradually into car and home scenarios through smart speakers, smart watches, headphones, and other wearable devices. It cooperates deeply with strategic investor Google and ranks among the top four worldwide in smart watch sales. It also cooperates deeply with strategic investor Volkswagen and has entered the automotive front-installed market, launching smart rearview mirrors with voice-input navigation, point-of-interest search, instant messaging, and other functions. In addition, it has cooperated with Hangzhou Guoxin to launch the "Wenxin" AI voice module.
(5) ROKID
In 2017, ROKID launched a full-stack voice open platform in cooperation with Alibaba, providing the industry with a one-stop voice solution and developer platform, and opening up multiple technologies such as voice recognition, speech synthesis, semantic understanding, voiceprint recognition, microphone array, and signal analysis and processing to the entire industry. At the same time, ROKID also launched its own AI chip KAMINO in cooperation with Hangzhou Guoxin in 2018.
(6) Others
In addition, because speech recognition is relatively mature and was among the first artificial intelligence fields to be deployed, Internet and algorithm companies have begun to enter it. Baidu and Sogou mainly develop speech recognition around computer and mobile phone input methods and have little accumulation in the algorithms IoT terminals require, such as sound signal processing, multi-microphone array noise control, and sound source identification. Yitu Technology, a leading computer vision company, released a voice product at the end of 2018 that performed well on an authoritative Mandarin Chinese test set. Orion Star, a startup that independently develops full-chain artificial intelligence technologies such as voice interaction, image recognition, and visual navigation, has built a chip development team of dozens of people and cooperated with Rockchip to launch the OS1000RK speech recognition AI chip, which is currently used in Himalaya speakers.
Table 3: Comparison of the top five speech recognition companies
Source: Public information of enterprises, compiled by Jiuding Investment
Industry Development Trends
1. Currently, AI voice chips are in the early stages of development, and many mature chip design companies, algorithm and solution companies, and start-up chip companies are launching AI voice chips one after another.
Currently, AI voice chip companies can be roughly divided into four categories: first, the chip design teams of Internet and communications giants; second, mature chip design companies that have existed for many years; third, newly established AI chip startup teams and companies; and fourth, algorithm and solution companies that have expanded into AI chips.
Among them, Internet and communication giants have certain advantages in algorithms and data; mature chip design companies have advantages in cost control, chip definition, supply chain management, and customer channels; start-up chip companies have advantages in cutting-edge chip architecture and customization; algorithm and solution companies have advantages in integrating chip downstream algorithms and application solutions. If algorithms and application solutions continue to evolve rapidly, Internet giants and algorithm companies will have certain advantages; after market demand stabilizes, mature chip design companies will gradually gain advantages.
2. In the short and medium term, terminal AI chips will develop rapidly driven by downstream demand.
First, artificial intelligence has been commercialized in many scenarios, most of which require localized inference, so terminal AI chips are indispensable and in great demand. Especially in video surveillance, smart homes, driverless vehicles, and voice interaction, terminal AI chips supporting local inference will be the key factor in bringing artificial intelligence applications to the ground.
Second, terminal AI chips are deeply customized for specific application scenarios with clear requirements. Chip companies can innovate at the architecture level, compressing and optimizing the algorithm implementation to achieve efficient, lightweight artificial intelligence; the algorithm and chip development is also less difficult than for cloud chips.
3. Recently, voice recognition has become an important area for the implementation of terminal AI chips, forming a pattern in which algorithm and chip companies are both complementary and competitive.
Thanks to the joint efforts of traditional chip design companies and algorithm and solution companies, terminal AI chips with speech recognition AI algorithm acceleration functions have gradually matured, especially with the participation of traditional chip companies, which have launched a number of low-cost, low-power AI voice chips. Although there are different strategies in the integration methods and functional positioning of various speech recognition terminal AI chips, the trend of integrating AI speech recognition functions into terminal chips is already very clear. Whether it can better complement the cloud, achieve stronger AI algorithm acceleration capabilities, and achieve a balance in performance, cost, power consumption and other aspects will test the comprehensive strength of terminal AI chip manufacturers, and it is also the most critical competitive embodiment of companies in the industry in the future.
4. In the long run, cloud-based ASIC AI chips have great development potential and are a strategic battleground for artificial intelligence giants.
Since cloud training algorithms are very complex and still in the process of continuous development, GPU/FPGA and other chips with high flexibility will still dominate the market in the short term. Cloud ASIC AI chips will usher in explosive growth only after the AI algorithms are basically stable and the era of large-scale AI arrives.
In the future, as artificial intelligence technology matures, data processing capability will become the most important factor of production, and developing cloud AI chips with powerful training and inference capabilities is a key goal for major artificial intelligence companies. In the long run, therefore, cloud ASIC AI chips will become a strategic battleground for the giants of the artificial intelligence era. And because cloud AI chips are closely tied to information security, this field will also attract the attention of governments around the world.