Comments on Apple's A11 Neural Network Engine: AI acceleration will become standard for high-end mobile phone chips
At its latest launch event, Apple unveiled the iPhone X, the tenth-anniversary iPhone, which attracted a great deal of attention. Among its many features, the replacement of TouchID fingerprint recognition with FaceID facial recognition for screen unlocking and identity authentication is undoubtedly one of the biggest highlights, and may prove to be another step forward in how Apple handles phone interaction.
FaceID builds a 3D model of the face, extracts features from it using artificial intelligence techniques, and matches those features against the enrolled user to recognize the face. In the live demonstration, the FaceID experience was very smooth, and the hero behind that smoothness was the artificial intelligence accelerator integrated on the A11 Bionic SoC, which Apple officially calls the "neural engine."
Let's first look at what an AI accelerator is. To understand the concept, it helps to review the history of GPUs. In the 1990s, with the rise of multimedia applications, especially 3D games, people found that the traditional CPU architecture could not handle applications requiring high-speed graphics rendering. The reason was that a CPU devotes most of its die area to control logic and caches, leaving little room for arithmetic units.
To fill this gap, 3D accelerator cards were designed; chip companies such as 3dfx, Nvidia and ATI were very influential at the time. As the market evolved, 3D accelerator cards gave way to GPUs for general graphics computation and eventually general-purpose parallel computing. Today, the only remaining independent discrete-GPU vendors are Nvidia and AMD (which acquired ATI in 2006).
AI accelerators follow a very similar story, except that the driving applications back then were multimedia and 3D games, while today's driving applications are AI: voice assistants, face recognition, object recognition, and so on. The basic algorithm of this AI wave is the neural network, and neural networks rely heavily on matrix multiplication and convolution. Once again, CPU compute power cannot keep up with these workloads, and although GPUs can run them at high speed, their power consumption is too high for phones (Nvidia's Jetson TX2, designed for mobile applications, draws as much as 10 W). AI accelerators were therefore created so that more mobile devices could run AI.
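To see concretely why these workloads reduce to matrix math, note that a convolution can be rewritten as a matrix product over sliding windows (the "im2col" trick used by many accelerators). The sketch below illustrates the idea in 1-D; it is an illustrative example, not anything from Apple's design:

```python
def conv1d(signal, kernel):
    """Direct 1-D convolution (no padding, stride 1)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def conv1d_as_matmul(signal, kernel):
    """Same result, expressed as a matrix-vector product:
    each row of the im2col matrix is one sliding window of the input."""
    k = len(kernel)
    im2col = [signal[i:i + k] for i in range(len(signal) - k + 1)]
    return [sum(a * b for a, b in zip(row, kernel)) for row in im2col]

signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [1.0, 0.0, -1.0]
assert conv1d(signal, kernel) == conv1d_as_matmul(signal, kernel)
print(conv1d(signal, kernel))  # each output is signal[i] - signal[i+2]
```

Because convolution lowers to matrix multiplication like this, an accelerator built around a fast matrix-multiply unit covers both operations at once.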
An AI accelerator is usually a dedicated hardware unit that exists either as a separate chip or as an IP block on an SoC. Because it is designed specifically for AI workloads, it achieves very high performance at very low power on those operations. Huawei's recently announced Kirin 970 integrates such an AI accelerator, a sign that AI accelerators are rapidly entering the mobile chip market.
Huawei Kirin 970 also integrates an artificial intelligence acceleration module
Apple is naturally unwilling to fall behind in the broad trend toward artificial intelligence, and has long been laying groundwork in AI acceleration.
Looking at the current market, Nvidia's leading position in AI hardware owes much to its open, easy-to-use CUDA interface, which makes it straightforward for programmers to accelerate AI with GPUs. On mobile, fully exploiting the GPU on an SoC for AI has long been a headache for developers. Apple had previously released two interfaces at different levels of abstraction, Metal (low-level GPU programming) and Core ML (high-level machine learning), for developers to accelerate AI applications on the iOS platform.
Another purpose of releasing these software interfaces was to accumulate experience for dedicated hardware, enabling coordinated hardware/software optimization. In May of this year, reports surfaced that Apple's dedicated AI acceleration hardware, called the "neural engine," was essentially complete. Today, Apple has publicly announced it.
Apple announced the following information about the Neural Engine at the event:
It uses a dual-core design. In the absence of further details, we can infer that Apple may support several operating modes — fully off, one core active, or both cores active — to meet the performance and power requirements of different situations.
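A rough sketch of how such a scheme might work is below. This is purely hypothetical — Apple has documented no such mechanism, and the per-core throughput and power numbers are invented for illustration (peak throughput split evenly across the two cores):

```python
# Hypothetical illustration of the three inferred operating modes.
CORE_TOPS = 0.3      # assumed: each of the two cores delivers half of 0.6 TOPS
CORE_POWER_W = 0.25  # assumed active power per core, for illustration only

def pick_mode(demand_tops):
    """Return (cores_on, power_w) for the lowest-power mode meeting demand."""
    for cores in (0, 1, 2):
        if demand_tops <= cores * CORE_TOPS:
            return cores, cores * CORE_POWER_W
    raise ValueError("demand exceeds accelerator peak throughput")

print(pick_mode(0.0))   # (0, 0.0)  -> engine fully off
print(pick_mode(0.2))   # (1, 0.25) -> one core suffices
print(pick_mode(0.5))   # (2, 0.5)  -> both cores needed
```

The point of such a policy is simply that idle or light workloads should not pay the power cost of the full accelerator.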
Performance reaches 0.6 TOPS, enough to run today's mainstream neural network models. It appears that performance on the order of 1 TOPS will become the standard for mobile AI accelerators.
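A back-of-envelope budget shows what 0.6 TOPS buys. Assume (these figures are illustrative assumptions, not measurements) a mobile-class network costing about 1 GMAC per frame, i.e. 2 billion operations counting multiply and add separately:

```python
PEAK_OPS = 0.6e12      # 0.6 TOPS, as announced by Apple
OPS_PER_FRAME = 2.0e9  # assumed: ~1 GMAC/frame network, 2 ops per MAC

frames_per_second = PEAK_OPS / OPS_PER_FRAME
print(frames_per_second)  # 300.0 frames/s at 100% utilization
```

Even at a fraction of peak utilization, that leaves ample headroom for real-time inference, which supports the claim that 0.6 TOPS handles mainstream models.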
It processes in real time. Another criticism of GPUs for AI workloads is their high latency: GPUs are typically driven by batched data, which makes them a poor fit for mobile devices that must respond in real time. Apple's emphasis on real-time processing is clearly meant to distinguish the neural engine from GPUs and to serve real-time applications on mobile devices.
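The latency penalty of batching can be shown with simple arithmetic. If a GPU reaches full utilization only at, say, batch size 16, the first result is delayed by the whole batch, whereas a real-time engine handles each frame as it arrives. All numbers below are illustrative assumptions:

```python
FRAME_OPS = 2.0e9  # assumed per-frame work (~1 GMAC, 2 ops per MAC)
GPU_TPUT = 1.0e12  # assumed GPU throughput at full batch (1 TOPS)
NPU_TPUT = 0.6e12  # accelerator throughput (0.6 TOPS)
BATCH = 16         # assumed batch size the GPU needs for full utilization

# GPU: all 16 frames must finish before the first result is available
gpu_latency_ms = BATCH * FRAME_OPS / GPU_TPUT * 1e3
# Real-time engine: each frame is processed alone, as it arrives
npu_latency_ms = FRAME_OPS / NPU_TPUT * 1e3

print(round(gpu_latency_ms, 2))  # 32.0 ms to the first result
print(round(npu_latency_ms, 2))  # 3.33 ms per frame
```

Under these assumptions the lower-throughput engine still returns an answer roughly ten times sooner, which is exactly the trade-off a phone unlocking with your face cares about.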
In addition, we can surmise that the neural engine is an IP block on the SoC rather than a separate chip.
We can compare it with other chips containing similar acceleration modules. Huawei's Kirin 970 includes an AI acceleration module comparable to Apple's neural engine, with a peak performance of 1.93 TOPS — more than three times Apple's figure — though actual performance never equals the peak and depends on coordinated hardware/software optimization. Qualcomm's Snapdragon line offers the Neural Processing Engine, a software SDK that helps developers exploit the GPU/CPU/DSP on Qualcomm chips for AI acceleration. Qualcomm's approach is more conservative than Huawei's or Apple's (it previously launched the Zeroth AI hardware acceleration effort but later abandoned it), but amid the AI wave Qualcomm is expected to add dedicated AI accelerators to its future chips.
Although a lot of information was released at the event, many unknowns remain that only time will clarify.
The most interesting question is where else the neural engine can be used besides FaceID. FaceID itself does not demand extreme real-time performance (a delay under one second would presumably satisfy users), so provisioning a 0.6 TOPS accelerator for it alone would be extravagant; the neural engine must serve other purposes as well. What are those other use cases? Will it be reserved for Apple's own apps, or opened to third-party apps? None of this is known yet, but Apple's ambitions for the neural engine are surely not limited to FaceID.
How many versions of the A11 Bionic are there? If there is only one version, with the neural engine included, then the engine must be doing something else on the iPhone 8, which lacks FaceID. If there are two versions (an iPhone X version with the neural engine and an iPhone 8 version without it), then one can only say Apple has money to burn, designing two versions of the chip at once!
Today is the 1395th issue of content shared by "Semiconductor Industry Observer" for you, welcome to follow.