In this post we run our first example on the PYNQ framework: image scaling, implemented in two ways. The classic algorithm for image scaling is bilinear interpolation, which is computationally heavy. We first run the algorithm on the ARM core, then offload the computation to the FPGA fabric, and finally compare the results of the two approaches. The first step is to use HLS to generate the hardware IP that accelerates the computation. Open the HLS command-line window and make a small change to the Tcl script, mainly updating the chip model to match your device.
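As a refresher on the algorithm itself, here is a minimal pure-Python sketch of bilinear interpolation on a grayscale image stored as a list of rows. It is not the code used by PIL or by the HLS IP. The function name `bilinear_resize` and the corner-aligned coordinate mapping are assumptions made for illustration, and real libraries often use a pixel-center convention instead.

```python
def bilinear_resize(src, out_w, out_h):
    """Resize a 2-D grayscale image (list of rows) with bilinear interpolation."""
    in_h, in_w = len(src), len(src[0])
    # Map destination coordinates onto the source grid (corner-aligned).
    x_ratio = (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
    y_ratio = (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0

    dst = []
    for j in range(out_h):
        y = j * y_ratio
        y0 = int(y)                    # row above the sample point
        y1 = min(y0 + 1, in_h - 1)     # row below, clamped to the image
        wy = y - y0                    # vertical weight of the lower row
        row = []
        for i in range(out_w):
            x = i * x_ratio
            x0 = int(x)                # column to the left
            x1 = min(x0 + 1, in_w - 1) # column to the right, clamped
            wx = x - x0                # horizontal weight of the right column
            # Interpolate along x on both rows, then along y between them.
            top = src[y0][x0] * (1 - wx) + src[y0][x1] * wx
            bottom = src[y1][x0] * (1 - wx) + src[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        dst.append(row)
    return dst


if __name__ == "__main__":
    for line in bilinear_resize([[0, 10], [20, 30]], 3, 3):
        print(line)
```

Every output pixel needs four source reads and several multiplications, which is why the per-pixel work adds up quickly on a CPU and why the operation is a reasonable candidate for offloading to the PL.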
After making the change, run ./build_ip.sh to generate the acceleration IP. Next, build the Vivado project. Starting from the Vivado project used to port the PYNQ framework, first set the PL output clock frequency to 100 MHz:
Enable two more AXI bus interfaces and set their data width to 128 bits:
The basic hardware connection block diagram is as follows:
Build the BD file according to the block diagram above; the result is as follows:
You can also generate the BD file with the Tcl command provided on GitHub. Then rename the bit file, Tcl file, and hwh file so that they share the same base name, and upload them to the PYNQ development board. Run the PS example first:
First, a description of the basic function it implements:
The function converts a 640×360 image into a 320×180 image, and the work is done mainly with the PIL library. When the example runs, the input image size is reported as 640×360.
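The software path can be sketched roughly as below. This is an illustration under assumptions, not the notebook's actual code: the names `scaled_size` and `resize_on_ps`, the `image_path` argument, and the 0.5 scale factor are hypothetical. Pillow is imported inside the function so that the size helper can be used on its own.

```python
def scaled_size(size, factor):
    """Return the (width, height) obtained by scaling `size` by `factor`."""
    width, height = size
    return (int(width * factor), int(height * factor))


def resize_on_ps(image_path, factor=0.5):
    """Resize an image in software (on the PS) with Pillow's bilinear filter."""
    # Imported lazily so scaled_size() works even without Pillow installed.
    from PIL import Image

    original = Image.open(image_path)          # hypothetical path supplied by the caller
    new_size = scaled_size(original.size, factor)
    # Image.BILINEAR selects bilinear interpolation for the resampling step.
    return original.resize(new_size, Image.BILINEAR)


if __name__ == "__main__":
    print(scaled_size((640, 360), 0.5))
```

Note that Pillow reports `size` as (width, height), so a 640×360 input halves to 320×180.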
The scaled image is shown next. Its size is 320×180, and the best measured computation time is 9.96 ms.
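A "best time" figure like this can be collected with the standard `timeit` module. The sketch below is one way to do it, not necessarily how the notebook measured it. `best_time_ms` and `dummy_workload` are hypothetical names, and the dummy workload stands in for the real resize call.

```python
import timeit


def best_time_ms(func, repeat=7, number=10):
    """Return the best per-call time of `func` in milliseconds."""
    # timeit.repeat returns one total time (in seconds) per repetition,
    # each covering `number` calls of `func`.
    totals = timeit.repeat(func, repeat=repeat, number=number)
    return min(totals) / number * 1000.0


def dummy_workload():
    # Stand-in for the real resize call; replace with the function under test.
    return sum(i * i for i in range(1000))


if __name__ == "__main__":
    print(f"best per-call time: {best_time_ms(dummy_workload):.3f} ms")
```

Taking the minimum over several repetitions reduces the influence of background activity on the board, which makes the PS and PL numbers easier to compare.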
The next example uses the PL for acceleration. The scaling is computed on the PL side and the data is moved by the DMA module, which reduces the computational load on the PS side. Note that the path to the bit file must be changed to match your own setup:
Carry out the setup steps, such as loading the IP and reading the image, and then use xlnk to allocate suitably sized buffers:
Next, write the appropriate control parameters to the IP and read back the scaled image. The result is correct.
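The PL steps above (load the overlay, allocate buffers with xlnk, write the control parameters, run the DMA transfer) can be sketched as follows. This is a hedged outline modeled on the PYNQ resizer examples, not the exact notebook code. The instance names `axi_dma_0` and `resize_accel_0`, the register offsets `0x10` to `0x28`, and the start value `0x81` are assumptions that depend on the block design and the generated HLS driver. The code uses the older `Xlnk` allocator named in the text; newer PYNQ releases provide `pynq.allocate` instead.

```python
def buffer_shapes(in_size, out_size, channels=3):
    """Return (input_shape, output_shape) as (height, width, channels) tuples."""
    (in_w, in_h), (out_w, out_h) = in_size, out_size
    return (in_h, in_w, channels), (out_h, out_w, channels)


def resize_on_pl(bitstream_path, pixels, out_size):
    """Run one resize on the PL. `pixels` is an (H, W, 3) uint8 NumPy array."""
    # Imported lazily: pynq is only available on the board itself.
    import numpy as np
    from pynq import Overlay, Xlnk

    overlay = Overlay(bitstream_path)   # path must match your own bit file
    dma = overlay.axi_dma_0             # assumed DMA instance name
    resizer = overlay.resize_accel_0    # assumed name of the HLS resize IP

    in_h, in_w = pixels.shape[:2]
    in_shape, out_shape = buffer_shapes((in_w, in_h), out_size)

    # Physically contiguous buffers that the DMA engine can access.
    xlnk = Xlnk()
    in_buffer = xlnk.cma_array(shape=in_shape, dtype=np.uint8, cacheable=1)
    out_buffer = xlnk.cma_array(shape=out_shape, dtype=np.uint8, cacheable=1)
    in_buffer[:] = pixels

    # Assumed register map: source and destination dimensions.
    resizer.write(0x10, in_h)
    resizer.write(0x18, in_w)
    resizer.write(0x20, out_shape[0])
    resizer.write(0x28, out_shape[1])

    # Queue both DMA channels, start the IP, then wait for completion.
    dma.sendchannel.transfer(in_buffer)
    dma.recvchannel.transfer(out_buffer)
    resizer.write(0x00, 0x81)           # assumed: start with auto-restart
    dma.sendchannel.wait()
    dma.recvchannel.wait()

    result = np.array(out_buffer)       # copy out before releasing the buffers
    xlnk.xlnk_reset()
    return result


if __name__ == "__main__":
    print(buffer_shapes((640, 360), (320, 180)))
```

Check the offsets against the driver header generated by HLS for your IP before using them, since a different top-level function signature produces a different register map.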
The computation on the PL side takes only 4.59 ms, compared with 9.96 ms on the PS side, so the PL does accelerate the computation, by roughly a factor of 2.2:
That completes the experiment. It walks through the basic process of PS and PL co-development and introduces the basic idea of using the PYNQ framework to accelerate an algorithm.