Providing higher-performance solutions for AI applications at the network edge

Publisher: EEWorld | Last update: 2019-09-12 | Source: EEWORLD | Keywords: Lattice

Lattice Semiconductor White Paper

 

Edge AI applications such as presence detection and object counting are growing in popularity, but designers are increasingly demanding low-power and small-size edge AI solutions without compromising performance. The latest release of Lattice’s sensAI technology portfolio, available for ECP5 and iCE40 UltraPlus FPGAs, provides designers with the hardware platforms, IP, software tools, reference designs, and design services needed to implement low-power, high-performance AI at the edge.

 

Table of contents

  1. Summary
  2. Taking Advantage of FPGAs
  3. Major Updates
  4. sensAI Design Case
  5. Conclusion

Summary

 

The market for low-cost, high-performance edge solutions is becoming increasingly competitive. Leading market research firms predict that the edge solutions market will explode in the next six years. IHS predicts that by 2025, there will be more than 40 billion devices operating at the edge of the network, while market intelligence firm Tractica predicts that more than 2.5 billion edge devices will be shipped annually by then.

 

With the emergence of a new generation of network edge applications, designers are increasingly looking to develop solutions that combine low power and small size without sacrificing performance. Driving these new AI solutions are a growing number of network edge applications, such as presence detection for smart doorbells and security cameras in home control, object counting for inventory in retail applications, and object and presence detection in industrial applications. On the one hand, the market requires designers to develop solutions with higher performance than ever before. On the other hand, latency, bandwidth, privacy, power consumption, and cost issues limit them from relying on computing resources in the cloud to perform analysis.

 

At the same time, performance, power, and cost constraints vary by application. Even as the data demands of always-on edge applications continue to drive demand for cloud-based services, designers must still address traditional power, board-area, and cost concerns. How can developers meet increasingly stringent power (milliwatts) and size (5 mm² to 100 mm²) requirements? The varied performance requirements alone are difficult to satisfy.

 

Taking Advantage of FPGAs

 

Lattice FPGAs are uniquely positioned to meet the rapidly changing market demands of edge devices. One way designers can quickly provide more computing resources to edge devices without relying on the cloud is to use the inherent parallel processing capabilities of FPGAs to accelerate neural network performance. In addition, by using low-density, small-footprint FPGAs optimized for low-power operation, designers can meet the stringent power and size constraints of new consumer and industrial applications. For example, Lattice's iCE40 UltraPlus™ and ECP5™ product families support the development of edge solutions with power consumption from 1 mW to 1 W and hardware platform sizes from 5.5 mm² to 100 mm². By combining ultra-low power, high performance, and high precision with comprehensive support for traditional interfaces, these FPGAs give edge device developers the flexibility they need to meet changing design requirements.

 

 

Figure 1: Lattice Semiconductor’s low-power, small-footprint FPGAs provide the right combination of performance and features to support AI applications at the network edge.

 

To meet this need and accelerate development, Lattice launched sensAI™, an industry-first technology collection that provides designers with all the tools they need to develop low-power, high-performance network edge devices for smart homes, smart factories, smart cities, and smart cars. sensAI addresses the growing demand for AI-enabled edge devices with a comprehensive hardware and software solution for implementing low-power, always-on AI in smart devices running at the edge of the network. Launched in 2018, it supports both new designs and updates to existing designs, with low-power AI inference optimized for these new application requirements.

 

What's in this comprehensive design ecosystem? First, Lattice's modular hardware platforms, such as the iCE40 UPduino 2.0 with the HM01B0 Shield development board and the ECP5-based Embedded Vision Development Kit (EVDK), provide a solid foundation for application development. The UPduino can be used for AI designs that require only a few milliwatts, while the EVDK supports applications that require more power but typically operate below 1 W.

 

Soft IP can be easily instantiated in FPGAs to accelerate neural network development. The sensAI development kit therefore includes a CNN accelerator IP core that enables designers to implement deep learning applications in iCE40 UltraPlus FPGAs, as well as a fully parameterizable CNN accelerator IP core for Lattice's ECP5 FPGAs. These IP cores support variable quantization, which lets designers trade off data precision against power consumption.

 

Lattice's collection of sensAI technologies allows designers to explore design options and trade-offs through an easy-to-use tool flow. Designers can use industry-standard frameworks such as Caffe, TensorFlow, and Keras for network training. The development environment also provides a neural network compiler that maps trained network models to fixed-point representations, supporting variable quantization of weights and activations. Designers can use the compiler to help analyze, simulate, and compile different types of networks for implementation on Lattice's accelerator IP cores without RTL experience. Designers can then use traditional FPGA design tools such as Lattice Radiant and Diamond to implement the entire FPGA design.
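The mapping from a trained floating-point model to a fixed-point representation can be sketched as follows. This is an illustrative outline of fixed-point weight quantization in general; the function names and bit-widths are assumptions for illustration, not the actual sensAI compiler API.

```python
# Illustrative fixed-point quantization of trained weights. An 8-bit
# signed code with 6 fractional bits is assumed purely for the example.

def quantize_fixed_point(weights, total_bits=8, frac_bits=6):
    """Map floating-point weights to signed fixed-point codes."""
    scale = 1 << frac_bits                 # e.g. 2^6 = 64 steps per unit
    lo = -(1 << (total_bits - 1))          # most negative code, e.g. -128
    hi = (1 << (total_bits - 1)) - 1       # most positive code, e.g. +127
    quantized = []
    for w in weights:
        code = round(w * scale)            # nearest representable step
        code = max(lo, min(hi, code))      # saturate to the signed range
        quantized.append(code)
    return quantized

def dequantize(codes, frac_bits=6):
    """Recover approximate floating-point values for error analysis."""
    return [c / (1 << frac_bits) for c in codes]

weights = [0.5, -0.25, 0.123, -1.9]
codes = quantize_fixed_point(weights)      # [32, -16, 8, -122]
approx = dequantize(codes)                 # close to, not equal to, weights
```

Fewer fractional bits reduce memory traffic and logic activity at the cost of larger rounding error, which is the precision-versus-power trade-off the variable quantization support exposes.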

 

To speed up design implementation, sensAI provides a growing collection of reference designs and demonstrations, including facial recognition, gesture detection, keyword detection, presence detection, face tracking, object counting, and speed sign detection. Finally, design teams often need specialized expertise to complete a design. To meet this need, Lattice has established partnerships with design service companies around the world to support customers with limited in-house AI/ML expertise.

 

 

Figure 2: Lattice sensAI is a complete hardware and software solution for developing AI applications at the edge of the network.

 

Major Updates

 

To meet the rapidly growing performance requirements of edge AI, Lattice released an update to sensAI in 2019, enhancing its performance and optimizing the design process. The updated sensAI has a 10x performance improvement over the previous version, which is facilitated by multiple optimizations, including optimized memory access through updates to CNN IP and neural network compilers, new 8-bit activation quantization, smart layer merging, and dual DSP engines.

 

In the latest version, the memory access sequence has been greatly optimized by the updated neural network compiler, which now supports 8-bit input data. This not only halves accesses to external memory, it also allows higher-resolution images to be used as input, and higher-resolution input naturally improves accuracy.
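The halving of external memory traffic follows directly from the sample width. A back-of-the-envelope sketch (the frame resolution here is illustrative, not tied to any specific demo):

```python
# Bytes of external memory traffic to fetch one input frame at 16-bit
# vs 8-bit samples. Resolution and channel count are illustrative.

def frame_bytes(width, height, channels, bits_per_sample):
    return width * height * channels * bits_per_sample // 8

# Same 128 x 128 RGB frame at two input precisions:
traffic_16bit = frame_bytes(128, 128, 3, 16)   # 98304 bytes
traffic_8bit = frame_bytes(128, 128, 3, 8)     # 49152 bytes
assert traffic_8bit * 2 == traffic_16bit       # half the external accesses
```

Equivalently, at a fixed memory bandwidth an 8-bit input lets the same interface carry a frame with twice the pixel count, which is why higher-resolution input becomes practical.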

 

To further accelerate performance, Lattice optimized the convolutional layers in the sensAI neural network, reducing the time spent on convolution calculations. Lattice doubled the number of convolution engines in the device, reducing convolution time by about 50%.
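The effect of doubling the convolution engines can be sketched with a simple timing model: only the convolution portion scales with engine count, so the overall speedup depends on how dominant convolution is. All figures below are illustrative, not measured sensAI numbers.

```python
# Simple timing model: convolution parallelizes across engines, the
# rest of the per-frame work does not. Millisecond figures are made up.

def frame_time_ms(conv_ms, other_ms, engines):
    """Per-frame processing time with `engines` convolution engines."""
    return conv_ms / engines + other_ms

one_engine = frame_time_ms(conv_ms=160.0, other_ms=40.0, engines=1)   # 200.0
two_engines = frame_time_ms(conv_ms=160.0, other_ms=40.0, engines=2)  # 120.0
```

Because CNN inference time is typically dominated by convolution, halving the convolution term accounts for most of the observed ~50% reduction.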

 

Lattice has increased sensAI performance without increasing power consumption, allowing designers to select lower gate count devices in the ECP5 FPGA portfolio. Optimized demo examples can help achieve this performance improvement. For example, a people detection demo using a CMOS image sensor optimized for low power operation provides a resolution of 64 x 64 x 3 using a VGG8 network. The system runs at 5 frames per second and consumes only 7 mW using an iCE40 UltraPlus FPGA. A second performance-optimized demo, for a people counting application, also uses a CMOS image sensor and provides a resolution of 128 x 128 x 3 using a VGG8 network. This demo runs at 30 frames per second and consumes 850 mW using an ECP5-85K FPGA.
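The quoted power and frame-rate figures imply a per-frame energy budget, which is often the more useful metric for battery-powered always-on devices. The arithmetic uses only the demo numbers stated above:

```python
# Energy per inference implied by the two demos: power / frame rate.

def energy_per_frame_mj(power_mw, fps):
    """Millijoules consumed per processed frame (mW / fps = mJ)."""
    return power_mw / fps

ice40_demo = energy_per_frame_mj(power_mw=7, fps=5)     # 1.4 mJ per frame
ecp5_demo = energy_per_frame_mj(power_mw=850, fps=30)   # ~28.3 mJ per frame
```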

 

 

Figure 3: These reference designs demonstrate the power and performance options available with sensAI

 

At the same time, sensAI brings a seamless design experience to users, supporting more neural network models and machine learning frameworks, thereby shortening the design cycle. New customizable reference designs simplify the development of common network edge solutions such as object counting and presence detection, while the design partner ecosystem is also expanding to provide users with important design services. With these, Lattice can provide developers with all the key tools they need to replicate or adapt their designs. For example, the following block diagram shows a comprehensive range of components provided by Lattice, including training models, training data sets, training scripts, updated neural network IP, and neural network compilers.

 

 

Figure 4: sensAI’s design process includes industry-leading machine learning frameworks, training data and scripts, neural network IP, and other resources necessary to design and train edge AI devices

 

Lattice has also expanded its machine learning framework support to provide a seamless user experience. The initial version of sensAI supported Caffe and TensorFlow; the latest version adds support for Keras, an open-source neural network library written in Python that runs on top of TensorFlow, the Microsoft Cognitive Toolkit, or Theano. Keras is designed to help engineers implement deep neural networks quickly, providing a user-friendly, modular, and extensible environment that accelerates prototyping. Keras was originally conceived as an interface rather than a standalone machine learning framework; its high level of abstraction lets developers accelerate the development of deep learning models.

 

To further simplify use, Lattice has updated the sensAI neural network compiler, which now automatically selects the number of fractional bits that preserves the most accuracy when converting machine learning models to firmware files. The sensAI update also adds a hardware debugging tool that lets users read and write each layer of the neural network. After software simulation, engineers still need to know how their network performs on actual hardware; with this tool, they can see hardware results in just a few minutes.
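One plausible way to select the fractional-bit count automatically is to try each candidate and keep the one with the smallest quantization error over the model's weights. The sketch below illustrates that idea only; it is a generic heuristic, not the actual sensAI compiler algorithm.

```python
# Illustrative auto-selection of fractional bits for an 8-bit signed
# fixed-point format: minimize total squared quantization error.

def quantization_error(weights, total_bits, frac_bits):
    """Sum of squared errors after quantize/dequantize round trip."""
    scale = 1 << frac_bits
    lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    err = 0.0
    for w in weights:
        code = max(lo, min(hi, round(w * scale)))
        err += (w - code / scale) ** 2
    return err

def best_frac_bits(weights, total_bits=8):
    """Pick the fractional-bit count with the smallest error."""
    return min(range(total_bits),
               key=lambda f: quantization_error(weights, total_bits, f))

# Small-magnitude weights favor more fractional bits (finer resolution);
# large-magnitude weights force more integer bits to avoid saturation.
small = best_frac_bits([0.1, -0.05, 0.2])   # 7
large = best_frac_bits([100.0])             # 0
```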

 

In addition, the latest version of sensAI has been supported by a growing number of companies that provide Lattice with design services and product development expertise optimized for low-power, always-on edge devices. These companies help customers build edge AI devices by seamlessly updating existing designs or developing complete solutions for specific applications.

 

sensAI Design Case

 

Lattice's new higher-performance solution can be used in the four accelerator design cases described below. In the first (Figure 5), designers use sensAI to build a standalone solution. This system architecture allows designers to develop an integrated, real-time solution with low latency and high security on a Lattice iCE40 UltraPlus or ECP5 FPGA, with spare FPGA resources available for system control. A typical application is a standalone sensor for people detection and counting.

 

 

Figure 5: sensAI as a standalone edge AI processing solution

 

Designers can also use sensAI to develop two different types of pre-processing solutions. In the first (Figure 6), Lattice sensAI and a low-power iCE40 UltraPlus FPGA pre-process the sensor data, minimizing the cost of transmitting data to an SoC or the cloud for analysis. In a smart doorbell, for example, sensAI first reads the data from the image sensor. If the visitor is determined not to be a person, say a cat, the system does not wake the SoC or connect to the cloud for further processing, minimizing data transmission costs and power consumption. If the pre-processing stage determines that the object at the door is a person, it wakes the SoC for further processing. This greatly reduces the amount of data the system needs to process while reducing power requirements, which is critical for always-on edge applications.
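The wake-the-SoC decision described above can be sketched as a small gating state machine. Everything here is illustrative: the threshold, the persistence count, and the classifier scores are assumptions, not part of the sensAI design.

```python
# Illustrative pre-processing gate: the FPGA's lightweight classifier
# produces a person-likelihood score per frame; the SoC is woken only
# after several consecutive person frames, to avoid one-frame glitches.

class PresenceGate:
    def __init__(self, threshold=0.8, persist=2):
        self.threshold = threshold   # score above which a frame counts
        self.persist = persist       # consecutive frames needed to wake
        self.streak = 0

    def update(self, person_score):
        """Return True when the SoC should be woken for this frame."""
        if person_score >= self.threshold:
            self.streak += 1
        else:
            self.streak = 0          # a cat or background resets the gate
        return self.streak >= self.persist

gate = PresenceGate()
scores = [0.05, 0.92, 0.30, 0.95, 0.91]   # cat, glitch, ..., person stays
wakes = [gate.update(s) for s in scores]  # only the final frame wakes it
```

Because the SoC and any cloud link stay asleep for every gated frame, the always-on power budget is dominated by the FPGA alone.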

 

 

Figure 6: In this case, sensAI pre-processes sensor data to determine whether the data needs to be sent to the SoC for further processing

 

In the second pre-processing application, designers can use an ECP5 FPGA to implement neural network acceleration (Figure 7). Here, the flexible I/O of the ECP5 connects a variety of existing on-board devices (such as sensors) to a low-end MCU for highly flexible system control.

 

 

Figure 7: The second system architecture also uses preprocessing. Designers can use ECP5 and sensAI to preprocess sensor data and enhance the overall performance of the neural network.

 

Designers can also use sensAI accelerators in post-processing systems (Figure 8). Increasingly, companies that have already developed proven MCU-based solutions want to add AI functionality without replacing components or redesigning, but in some cases the existing MCU lacks the necessary performance. A typical example is a smart industrial or smart home application where images must be filtered before analysis. Designers could add another MCU and go through a time-consuming design verification process, or they can insert an accelerator between the MCU and the cloud for post-processing, minimizing the amount of data sent upstream. This approach is particularly attractive to IoT device developers who want to add AI capabilities.

 


Figure 8: This MCU-based design is enhanced with sensAI to enable edge AI capabilities in existing designs

 

Conclusion

 

Clearly, the next few years will be a critical period for the market for always-on smart devices at the network edge.

 

As the industry's complexity increases, designers will need tools that support higher performance at low power. The latest version of Lattice's sensAI technology, combined with ECP5 and iCE40 UltraPlus FPGAs, provides designers with the hardware platforms, IP, software tools, reference designs, and design services to help them outpace the competition and quickly develop successful solutions.
