
How to create image processing solutions using HLS capabilities

Source: InternetPublisher:宋元浩 Keywords: Image processing edge detection HLS Updated: 2024/08/06

This project uses high-level synthesis (HLS) to build an image processing IP core that implements Sobel edge detection in programmable logic.

Introduction

High-Level Synthesis (HLS) allows us to work at a higher level of abstraction when developing FPGA applications, which can save development time and, on a commercial project, reduce non-recurring engineering (NRE) costs.

An important application of HLS is image or signal processing, where we may already have a high-level model in C or C++, or where we may want to use an open-source, industry-standard framework such as OpenCV.

In this project we will investigate how to build a Sobel edge detection IP core using HLS and then include it in a Xilinx FPGA of our choice.

The device chosen can be a traditional FPGA, such as a Spartan-7 or Artix-7, or the core can be implemented in the programmable logic of a heterogeneous SoC, such as the Zynq-7000 or Zynq UltraScale+ MPSoC.

Theory

Before we get into the application, I should first give a brief overview of how the Sobel algorithm works. The Sobel algorithm locates edges in an image and emphasizes them so that they stand out. The output is typically a grayscale image in which edges appear as shades of gray or white.

Sobel edge detection works by estimating the intensity gradient of an image in both the horizontal and vertical directions. To do this, two convolution filters are applied to the original image, one per direction, and their results are then combined to determine the magnitude of the gradient.
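
The two filters and the magnitude step can be sketched in plain C++ as a software reference model. This is not the HLS implementation: the function name sobel_magnitude, the row-major 8-bit image layout, the zeroed border and the |Gx| + |Gy| approximation of the magnitude are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Plain C++ reference model (not synthesizable HLS code).
// Applies the 3x3 horizontal and vertical Sobel kernels to an 8-bit
// grayscale image stored row-major. Border pixels are left at 0.
std::vector<uint8_t> sobel_magnitude(const std::vector<uint8_t>& src,
                                     int width, int height) {
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);

    // Horizontal-gradient and vertical-gradient kernels.
    static const int kx[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};
    static const int ky[3][3] = {{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}};

    std::vector<uint8_t> dst(src.size(), 0);
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            int gx = 0;
            int gy = 0;
            for (int j = -1; j <= 1; ++j) {
                for (int i = -1; i <= 1; ++i) {
                    const int p = src[(y + j) * width + (x + i)];
                    gx += kx[j + 1][i + 1] * p;
                    gy += ky[j + 1][i + 1] * p;
                }
            }
            // Assumption: magnitude approximated as |Gx| + |Gy|,
            // then saturated to the 8-bit range.
            const int mag = std::abs(gx) + std::abs(gy);
            dst[y * width + x] = static_cast<uint8_t>(mag > 255 ? 255 : mag);
        }
    }
    return dst;
}
```

On an image with a single vertical step from black to white, this model produces a bright response in the columns next to the step and zero in flat regions.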

Implementation

If we were to implement this in an FPGA using the traditional VHDL/Verilog RTL approach, development would take considerable time. We would need to create the line buffers for the convolution and then implement the magnitude calculation. We would also need to create a testbench to ensure that our code works as expected before moving forward with the implementation.
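
To give a feel for what those line buffers do, here is a minimal software model of a streaming 3x3 window, assuming pixels arrive one at a time in raster order. The class name Window3x3 and its interface are hypothetical and are not part of any Xilinx library; in RTL the two vectors would typically map to block RAM and the window to registers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Software model (not synthesizable HLS code) of the storage a streaming
// 3x3 filter needs: two buffered image lines plus a 3x3 register window.
class Window3x3 {
public:
    explicit Window3x3(int width)
        : width_(width), line0_(width, 0), line1_(width, 0), win_{} {
        assert(width >= 3);
    }

    // Push the pixel at column x of the current row. Afterwards at(r, c)
    // holds the 3x3 neighbourhood whose bottom-right corner is that pixel.
    void push(int x, uint8_t pixel) {
        assert(x >= 0 && x < width_);
        // Shift the window one column to the left.
        for (int r = 0; r < 3; ++r) {
            win_[r][0] = win_[r][1];
            win_[r][1] = win_[r][2];
        }
        // New right-hand column: two buffered rows plus the incoming pixel.
        win_[0][2] = line0_[x];  // pixel from two rows above
        win_[1][2] = line1_[x];  // pixel from the row above
        win_[2][2] = pixel;      // pixel from the current row
        // Age the line buffers for this column.
        line0_[x] = line1_[x];
        line1_[x] = pixel;
    }

    uint8_t at(int r, int c) const { return win_[r][c]; }

private:
    int width_;
    std::vector<uint8_t> line0_;
    std::vector<uint8_t> line1_;
    uint8_t win_[3][3];
};
```

Only two image lines are stored at any time, which is why a streaming filter does not need a full frame buffer.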

Fortunately, when we use HLS, we can skip much of this heavy lifting and let Vivado HLS generate the lower-level Verilog/VHDL RTL implementation.

To work at this higher level of abstraction, we will use Vivado HLS and its HLS_OpenCV and HLS_Video libraries.

The first library, HLS_OpenCV, allows us to use the very popular OpenCV framework. The HLS Video library provides many image processing functions that can be accelerated into programmable logic.

OpenCV functions themselves cannot be synthesized, so they are confined to the test bench. Instead, the HLS Video library provides what we need to create a Sobel IP core, including:

hls::CvtColor - Converts an image between color and grayscale, depending on its configuration.

hls::GaussianBlur - Performs a Gaussian blur on the image to reduce the noise present in it.

hls::Sobel - Performs a Sobel convolution in either the vertical or the horizontal direction, depending on its configuration. We need both directions in our IP core.

hls::AddWeighted - Combines the results of the vertical and horizontal Sobel operators into a single result that approximates the gradient magnitude.

These are not the only HLS functions we will use. We also need a few additional functions that make it easier to apply HLS optimizations and to interface with the rest of a Vivado design.
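
As a rough illustration of the weighted-sum step, the sketch below follows the OpenCV addWeighted definition, dst = saturate(src1 * alpha + src2 * beta + gamma), in plain C++. The function name add_weighted and the round-to-nearest behaviour are assumptions of this model; the exact fixed-point behaviour of the HLS library version may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Plain C++ model (not HLS code) of a weighted sum of two 8-bit images:
// dst = saturate(src1 * alpha + src2 * beta + gamma).
std::vector<uint8_t> add_weighted(const std::vector<uint8_t>& src1, double alpha,
                                  const std::vector<uint8_t>& src2, double beta,
                                  double gamma) {
    assert(src1.size() == src2.size());
    std::vector<uint8_t> dst(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        const double v = src1[i] * alpha + src2[i] * beta + gamma;
        // Assumption: round to nearest, then clamp to the 0..255 range.
        const long r = std::lround(v);
        dst[i] = static_cast<uint8_t>(r < 0 ? 0 : (r > 255 ? 255 : r));
    }
    return dst;
}
```

With alpha = beta = 0.5 and gamma = 0, the result is the average of the two gradient images, which keeps the output within 8 bits without clipping.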

Interface

The most convenient way to move image data through programmable logic is to use AXI Stream interfaces.

This allows the creation of high-performance image processing paths where elements can be easily added or created as needed.

There are several IP blocks in the Vivado IP library that implement conversion between video input and output and AXI streams, as well as other image processing functions such as mixers and color space converters.

Therefore, we want our Sobel IP core to accept an AXI Stream input and generate its output in the same AXI Stream format. To do this, we use the following functions, which convert between AXI Stream and the hls::Mat format used by the HLS functions.

hls::AXIvideo2Mat - Converts an incoming AXI Stream into the hls::Mat format; used on the input side.

hls::Mat2AXIvideo - Converts from the hls::Mat format back to AXI Stream; used on the output side.
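
For readers unfamiliar with how video is framed on AXI Stream, the following plain C++ sketch models the usual convention: TUSER marks the first pixel of a frame and TLAST marks the last pixel of each line. The VideoBeat struct and frame_to_stream function are hypothetical names used only for this model; they are not the types used by the HLS library.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One transfer ("beat") on a video AXI Stream, reduced to the fields that
// matter for framing. Hypothetical type for illustration only.
struct VideoBeat {
    uint32_t data;  // pixel value (TDATA)
    bool user;      // TUSER: set on the first pixel of a frame
    bool last;      // TLAST: set on the last pixel of each line
};

// Serialize a row-major frame into a sequence of stream beats.
std::vector<VideoBeat> frame_to_stream(const std::vector<uint32_t>& pixels,
                                       int width, int height) {
    assert(width > 0 && height > 0);
    assert(pixels.size() == static_cast<std::size_t>(width) * height);

    std::vector<VideoBeat> beats;
    beats.reserve(pixels.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            VideoBeat b;
            b.data = pixels[y * width + x];
            b.user = (x == 0 && y == 0);  // start of frame
            b.last = (x == width - 1);    // end of line
            beats.push_back(b);
        }
    }
    return beats;
}
```

The receiving side uses these two flags to recover the image dimensions and to resynchronize if a frame is cut short.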

C Synthesis and Optimization

Unlike Verilog and VHDL designs, the high-level languages we use to describe designs are untimed. This means that when the HLS tool converts C to Verilog or VHDL, it must go through several stages to create the output RTL:

Scheduling - Determining which operations take place and in which clock cycle each one occurs.

Binding - Assigning those operations to the logic resources available within the device.

Control Logic Extraction - Extracting the control logic and creating control structures, such as state machines, that sequence the behavior of the module.

Since the HLS tool must make trade-offs between performance and logic resources during synthesis, it follows a set of default rules. These defaults can limit the performance of the generated IP core; for example, loops (a common structure in HLS code) are left rolled.

Of course, we may want to override the decisions the HLS tool makes during C synthesis to get better performance. We can do this in our C code using #pragma directives, of which several are available.
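
As a small illustration of how such a directive is written, the sketch below places an UNROLL pragma inside a loop. The function dot9 and the constant TAPS are made up for this example and do not come from the project. An ordinary C++ compiler ignores the pragma, so the same source still runs in C simulation; the effect only appears in HLS synthesis.

```cpp
#include <cassert>

const int TAPS = 9;  // e.g. the nine coefficients of a 3x3 kernel

// Hypothetical example: multiply-accumulate over a small fixed-size array.
int dot9(const int a[TAPS], const int b[TAPS]) {
    assert(a != nullptr && b != nullptr);
    int acc = 0;
    for (int i = 0; i < TAPS; ++i) {
// Without this directive the loop stays rolled and one iteration is
// processed at a time; with it, the tool is asked to replicate the loop
// body so that iterations can execute in parallel, at the cost of logic.
#pragma HLS UNROLL
        acc += a[i] * b[i];
    }
    return acc;
}
```

The trade-off is the one described above: unrolling buys throughput by spending more multipliers and adders.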

For this implementation, we will use the Dataflow pragma to ensure that we can achieve the highest frame rate possible.

To benefit from this pragma, the code must be structured so that the two Sobel operations can run in parallel. The dataflow optimization then lets the stages of the function overlap in time, with each stage starting work as soon as data is available from the previous one. In effect, dataflow optimization is coarse-grained pipelining.

If we perform one Sobel operation first and then the other in sequence, we will not be able to apply this optimization.

Therefore, we need to split the result of the Gaussian blur into two parallel paths and then recombine them in the AddWeighted stage. To do this, we use the function:

hls::Duplicate - Copies the input image into two separate output images, which we can then process in parallel.
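
The reason a duplication stage is needed can be seen in a simple FIFO model: a streamed pixel is consumed when it is read, so one producer cannot feed two consumers directly. The sketch below uses std::queue as a stand-in for the streams; PixelFifo and duplicate are hypothetical names for this model, not the HLS library function itself.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>

// Stand-in for a pixel stream: reading a pixel removes it from the FIFO.
using PixelFifo = std::queue<uint8_t>;

// Read each input pixel exactly once and write a copy to both outputs,
// so that two downstream stages can each consume their own stream.
void duplicate(PixelFifo& in, PixelFifo& out_a, PixelFifo& out_b) {
    assert(out_a.empty() && out_b.empty());
    while (!in.empty()) {
        const uint8_t p = in.front();
        in.pop();
        out_a.push(p);
        out_b.push(p);
    }
}
```

In this model one output would feed the horizontal Sobel stage and the other the vertical one, and both stages would then see the complete blurred image.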

Software

Knowing all this, we can write the code for the Sobel IP core:

#include "cvt_colour.hpp"

void image_filter(AXI_STREAM& INPUT_STREAM, AXI_STREAM& OUTPUT_STREAM) //, int rows, int cols)
{
#pragma HLS INTERFACE axis port=INPUT_STREAM
#pragma HLS INTERFACE axis port=OUTPUT_STREAM
    RGB_IMAGE  img_0(MAX_HEIGHT, MAX_WIDTH);
    GRAY_IMAGE img_1(MAX_HEIGHT, MAX_WIDTH);
    GRAY_IMAGE img_2(MAX_HEIGHT, MAX_WIDTH);
    GRAY_IMAGE img_2a(MAX_HEIGHT, MAX_WIDTH);
    GRAY_IMAGE img_2b(MAX_HEIGHT, MAX_WIDTH);
    GRAY_IMAGE img_3(MAX_HEIGHT, MAX_WIDTH);
    GRAY_IMAGE img_4(MAX_HEIGHT, MAX_WIDTH);
    GRAY_IMAGE img_5(MAX_HEIGHT, MAX_WIDTH);
    RGB_IMAGE  img_6(MAX_HEIGHT, MAX_WIDTH);
#pragma HLS dataflow
    hls::AXIvideo2Mat(INPUT_STREAM, img_0);
    hls::CvtColor

Of course, we want to be able to run both C simulation and co-simulation, so we need a test bench that exercises the algorithm.

When we run C Simulation, we can see the results for the test input image as follows.

Once C simulation and co-simulation give the expected results, we can export the core and add it to the Vivado hardware design.

However, before we do that, it is worth opening the analysis view in Vivado HLS to confirm that the two Sobel functions are running in parallel.

We can export the IP core using the Export RTL option in Vivado HLS and, if we wish, further configure the IP core parameters.

Implementing the core

After exporting the core, you will find a zip file in the directory <project>/solutionX/imp. It contains everything needed to add the newly created Sobel IP core to your Vivado design.

This file can be added to our Vivado IP repository and the core then included in a Vivado block diagram.

With all this integrated, you can build your application and target the development board of your choice.

For the demo video below, I used a Zybo Z7 and HDMI-in and HDMI-out to apply video to the Sobel IP core and display the results.

