It has never been easier to get started with AI applications. Thanks to FPGA SoCs such as the Xilinx Zynq UltraScale+ MPSoC, AI can now also run offline and be deployed at the edge. Enclustra's system-on-module boards, combined with the Vitis AI development environment, give users convenient tools for developing and deploying machine-learning applications for real-time inference, making it easy to integrate AI into an application. Image detection and classification, pattern recognition and speech recognition are driving the transformation of industries such as manufacturing, healthcare, automotive and financial services.
Quickly start AI-based FPGA applications
Artificial intelligence is taking over more and more applications and everyday scenarios, such as image detection and classification, translation and recommendation systems. The number of applications based on machine-learning technology is already huge and still growing. With Enclustra system-on-module boards that combine an FPGA and an ARM processor, it is easier than ever to use AI offline and at the edge.
Artificial intelligence (AI) has a long history and has been recognized as a discipline since 1955. AI is the ability of computers to mimic human intelligence, learn from experience, adapt to new information, and perform human-like activities. Applications of AI include expert systems, natural language processing (NLP), speech recognition, and machine vision.
The resurgence of AI
After several waves of optimism and disappointment, there is a renewed and growing interest in AI. Thousands of AI startups have been founded over the past 15 years or so, and the pace is growing. There are several driving factors behind this: Perhaps the most important one is the massive computing power that is now available at an affordable price. Not only is the hardware faster, but now everyone has access to supercomputers in the cloud. This has democratized the hardware platforms needed to run AI, allowing startups to emerge in large numbers.
Figure 1: Simplified view of a feed-forward artificial neural network with 2 hidden layers
Artificial neural networks (Figure 1) now scale to tens or even hundreds of hidden layers (Figure 2), and networks with as many as 10,000 layers have been demonstrated. This evolution increases the abstraction power of neural networks and enables new applications. Today, neural networks can be trained on tens of thousands of CPU or GPU cores, greatly accelerating the development of generalized learning models.
Figure 2: ImageNet recognition challenge winners show increasing numbers of hidden layers in new neural network architectures
Another reason for the increased interest in AI is the breakthroughs in machine learning in recent years. This has helped attract interest from tech investors and start-ups, further accelerating the development and improvement of AI.
How machines learn
An artificial neural network is a computing model inspired by the human brain. It consists of a simple network of interconnected processing units that can learn from experience by modifying their connections (Figure 1). So-called deep neural networks (DNN - neural networks with many hidden layers) currently provide the best solutions to many large computing problems.
The most widely used deep-learning systems are convolutional neural networks (CNNs). They use a feed-forward network of artificial neurons to map input features to outputs, and a feedback process, backpropagation, to learn (i.e., train) the set of weights that calibrates the CNN (Figure 3).
Figure 3: Neural networks need to be trained to learn how to solve a problem or challenge
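As a toy illustration of the forward pass and backpropagation described above (not one of the networks discussed in the article), a minimal two-hidden-layer network can be trained on the XOR problem in a few lines of NumPy. The layer sizes, learning rate and iteration count are arbitrary choices for this sketch:

```python
import numpy as np

# Toy forward pass + backpropagation sketch (cf. Figure 3).
# Network size and hyperparameters are illustrative assumptions.

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# XOR problem: 4 samples, 2 input features, 1 binary label
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# two hidden layers, as in the simplified network of Figure 1
W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)
W2, b2 = rng.normal(0, 1, (8, 8)), np.zeros(8)
W3, b3 = rng.normal(0, 1, (8, 1)), np.zeros(1)

lr, losses = 0.5, []
for _ in range(5000):
    # forward pass: map input features to an output prediction
    h1 = sigmoid(X @ W1 + b1)
    h2 = sigmoid(h1 @ W2 + b2)
    out = sigmoid(h2 @ W3 + b3)
    losses.append(np.mean((out - y) ** 2))

    # backward pass: propagate the error and adjust the weights
    d_out = (out - y) * out * (1 - out)
    d_h2 = (d_out @ W3.T) * h2 * (1 - h2)
    d_h1 = (d_h2 @ W2.T) * h1 * (1 - h1)
    W3 -= lr * h2.T @ d_out; b3 -= lr * d_out.sum(axis=0)
    W2 -= lr * h1.T @ d_h2;  b2 -= lr * d_h2.sum(axis=0)
    W1 -= lr * X.T @ d_h1;   b1 -= lr * d_h1.sum(axis=0)

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The training loop drives the loss down over the iterations; feeding new inputs through the same forward pass afterwards is exactly the inference step described next.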
The most computationally intensive process in machine learning is training the neural network. For a state-of-the-art network, training can take days to weeks and requires billions of floating-point operations and large amounts of training data (gigabytes to hundreds of gigabytes) before the network reaches the desired accuracy. Fortunately, this step is not time-constrained in most cases and can be moved to the cloud.
When the network is trained, it can be fed a new, unlabeled dataset and classify the data based on what it has previously learned. This step is called inference and is the actual goal of developing applications.
Tell me what you saw
Classification of the input can be done in the cloud or at the edge (mostly offline). While processing data through a neural network usually requires a dedicated accelerator (FPGA, GPU, DSP, or ASIC), additional tasks are best handled by a CPU, which can be programmed with conventional programming languages. This is where FPGAs with an integrated CPU, so-called systems on a chip (SoCs), come into play, especially at the edge. SoCs combine an inference accelerator (FPGA array) and a CPU in a single chip. The CPU runs the control algorithms and data flow management. At the same time, FPGAs offer many advantages over GPU- or ASIC-based solutions, including easy integration of multiple interfaces and sensors and the flexibility to adapt to new neural network architectures (Figure 4).
Figure 4: Comparison of different technologies for AI inference applications
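As a rough software analogy for this division of labour (purely illustrative: `accelerator_infer` is a hypothetical stand-in for the inference engine in the programmable logic), the CPU side can be sketched as a control loop feeding frames to an accelerator worker through queues:

```python
import queue
import threading

# Illustrative sketch of the SoC division of labour: the CPU runs
# control and data-flow management while inference is offloaded.
# accelerator_infer() is a hypothetical stand-in for the FPGA fabric;
# here it just labels toy "frames" by their mean pixel value.

def accelerator_infer(frame):
    # placeholder for the programmable-logic inference engine
    return "bright" if sum(frame) / len(frame) > 0.5 else "dark"

frames_in = queue.Queue()
results_out = queue.Queue()

def accelerator_worker():
    while True:
        frame = frames_in.get()
        if frame is None:          # shutdown signal from the CPU side
            break
        results_out.put(accelerator_infer(frame))

worker = threading.Thread(target=accelerator_worker)
worker.start()

# CPU side: acquire frames (toy pixel lists), dispatch, then collect
for frame in ([0.9, 0.8, 0.7], [0.1, 0.2, 0.0]):
    frames_in.put(frame)
frames_in.put(None)
worker.join()

labels = [results_out.get() for _ in range(2)]
print(labels)  # ['bright', 'dark']
```

The queue decouples acquisition and control (CPU) from classification (accelerator), which is the same producer/consumer pattern an SoC-based design uses between the processing system and the programmable logic.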
The inherent reconfigurability of FPGAs also enables them to take advantage of evolving neural network topologies, newer sensor types and configurations, and newer software algorithms. Using SoCs can guarantee low and deterministic latency when needed, for example, for real-time object detection. At the same time, SoCs are also very energy-efficient. The main challenge in getting the best performance from FPGAs is to efficiently map floating-point models to fixed-point FPGA implementations without losing precision (Figure 5), which is where vendor tools come in.
Figure 5: The process of efficiently mapping a floating-point model to a fixed-point FPGA implementation is called compression
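As a simplified illustration of this compression step (not the actual vendor tool flow, which also calibrates activations and can retrain the network), symmetric 8-bit weight quantization can be sketched as:

```python
import numpy as np

# Illustrative sketch of mapping 32-bit floating-point weights to an
# 8-bit fixed-point representation (cf. Figure 5). The weight
# distribution here is a synthetic stand-in for a trained layer.

rng = np.random.default_rng(1)
w_float = rng.normal(0, 0.2, size=1000).astype(np.float32)

# choose a scale so the largest-magnitude weight maps to the int8 limit
scale = np.abs(w_float).max() / 127.0
w_int8 = np.clip(np.round(w_float / scale), -128, 127).astype(np.int8)

# dequantize to measure the precision lost to the 8-bit representation
w_restored = w_int8.astype(np.float32) * scale
max_err = np.abs(w_float - w_restored).max()
print(f"max quantization error: {max_err:.6f} (scale = {scale:.6f})")
```

The rounding error per weight is bounded by half the quantization step, which is why a well-chosen scale lets fixed-point FPGA implementations retain accuracy while cutting memory and arithmetic cost by 4x compared with 32-bit floats.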
Choosing the right tools
There are many tools available today that help lower the barrier to entry for a first AI project. For example, the Vitis AI development tools provide users with everything needed to develop and deploy machine-learning applications for real-time inference on FPGAs. They support many common machine-learning frameworks, such as Caffe and TensorFlow, with PyTorch support coming soon, and enable state-of-the-art neural networks to be efficiently adapted to FPGAs for embedded AI applications (Figure 5).
Figure 6: Enclustra Mars XU3 system-on-module
Combined with a standard SoM (system on module) such as Enclustra's Mars XU3 (Figure 6), which is based on the Xilinx Zynq UltraScale+ MPSoC and plugs into the Mars ST3 base board, AI applications can be realized faster than ever before (Figure 7).
Figure 7: Industry-proven AI application solution based on Xilinx Zynq UltraScale+ MPSoC
To demonstrate the performance and fast time-to-market of this combination, Enclustra developed an AI-based image recognition system in just a few days. The images were captured with a standard USB camera connected to the Mars ST3 base board; for higher performance, the MIPI interface on the base board can be used instead.
The neural network classifies images in a low-latency manner and runs on the Mars XU3 core board module. The system supports popular neural networks such as ResNet-50 and DenseNet, which are used for image classification and real-time face detection respectively.
A single FPGA module can not only run neural network inference but also handle many other tasks in parallel, such as communicating with a host PC and other peripherals. Moreover, simultaneously controlling multiple highly dynamic drives is where FPGA technology shows its strengths: adding the Enclustra Universal Drive Controller IP core to control a brushless DC motor or a stepper motor, for example, is a breeze. It has never been easier to leverage the power of AI at the edge, so start your project now!
Enclustra (Switzerland) is one of the world's leading companies in the FPGA field. Headquartered in Zurich, Switzerland, it has more than 16 years of experience with FPGAs. In 2019 it established a Chinese subsidiary in Shenzhen to provide localized support and services to Chinese customers.