#AI挑战营末站# Preliminary implementation of handwritten digit recognition on LuckFox Pico Pro/Max

undo110

This post was last edited by undo110 on 2024-5-31 20:08

1. Light up the board

Refer to the official tutorial: Getting Started | LUCKFOX WIKI

Main steps:

1) Install RK Driver Assistant

2) Download and flash the image. Get the Pro Max image from the official Baidu Netdisk and flash it with SocToolkit, the tool provided in the documentation. Note: press and hold the BOOT key, plug in the USB cable, and only then release the key. Then select all the files in the folder for the matching version and download them to the board.

3) SSH login. Plug in the USB cable and log in using the static IP. The Pro Max has SSH enabled by default, so the default account, password, and static IP are all you need. In PowerShell, run

ssh root@172.32.0.93

and enter the password luckfox to log in.

4) Connect the camera. The documentation provides an installation diagram. Note: gently lift the plastic latch on the ribbon-cable socket, insert the cable, and press the latch down firmly. When the camera is recognized, the file rkipc.ini is generated in /userdata/ on the board.

Then configure the RNDIS virtual network interface and use VLC media player to view the camera stream.

The official documentation is very detailed, and I ran into no unusual problems.

2. Deploy SDK and compile

Refer to the official documentation for SDK deployment. At first I wanted to try the Docker deployment, but I ran into a series of odd problems that I could not track down, so I built the image on Ubuntu 22.04 instead.

1) Obtain the official SDK:

git clone https://gitee.com/LuckfoxTECH/luckfox-pico.git

2) The Buildroot image supports both TF card boot and SPI NAND FLASH boot. In 'BoardConfig-EMMC-Ubuntu-RV1106_Luckfox_Pico_Pro_Max-IPC.mk', set:

export LF_TARGET_ROOTFS=buildroot

3) Install the cross-compilation toolchain: go to tools/linux/toolchain/arm-rockchip830-linux-uclibcgnueabihf, which contains the script env_install_toolchain.sh. Open a terminal in this directory and run

source env_install_toolchain.sh

You can enter the following command and the following information will appear to confirm that it has been configured:

4) Start compiling the image:

cd luckfox-pico

# Build busybox/buildroot

./build.sh lunch

./build.sh

5) Handwritten digit recognition

Since I am a beginner, I started by learning from the official example code. Here I use the code shared by the expert knv and compile it according to its README, which outputs the executable luckfox_rtsp_opencv. Then I deploy it on the board.

mkdir build
cd build
cmake ..
make && make install

3. Transfer files to the board

In this step, sftp is used to transfer the executable file, the bin folder, and the rknn model converted in the previous round of the challenge to the board.

1) First, log in with ssh root@172.32.0.93, then enter exit to close the session.

2) Log in again with sftp: run sftp root@172.32.0.93 and enter the password.

3) On the local machine, collect luckfox_rtsp_opencv, the bin folder, and the converted rknn file in a folder named project. Upload it to the board with put -r project, use the ls command to check that it has arrived, and then exit.

4. Grant permission and start

Log in to the board through ssh and switch to the folder that was just transferred.

First stop the default program:

RkLunch-stop.sh

Then start:

./luckfox_rtsp_opencv my_model.rknn

If the message "Permission denied" appears, the file needs execute permission.

Use the command:

chmod 777 luckfox_rtsp_opencv

to grant it, and then start again.

Use VLC to view the camera stream.

You can see that the camera successfully captures images and performs preliminary recognition of handwritten digits.

5. (To be completed) Planned improvements:

The test above shows that when several digits appear in the frame, the current code locks on to only one of them and recognizes it. The next goal is therefore to recognize multiple digits in a single frame.

Because my background is still weak, I cannot modify the code right away. Next I will study the official example code and the improved versions written by more experienced members, and make changes once I understand them clearly.
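
One possible starting point for the multi-digit plan, offered as a hedged sketch and not as the method used in knv's code or in anyone else's: first split the frame into connected blobs of ink, then pass each blob's bounding box to the model in turn. The sketch assumes the frame has already been thresholded into a rectangular binary grid. The names Box and find_digit_boxes are made up for illustration, and on the board OpenCV's contour functions would probably do the same job.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

// Bounding box of one connected blob of foreground pixels (inclusive coordinates).
struct Box {
    int min_x, min_y, max_x, max_y;
};

// Finds 4-connected foreground regions in a binary image (non-zero = ink) and
// returns one bounding box per region, in the order a row-major scan meets them.
// Assumes every row has the same width.
std::vector<Box> find_digit_boxes(const std::vector<std::vector<int>>& img) {
    std::vector<Box> boxes;
    if (img.empty()) {
        return boxes;
    }
    const int h = static_cast<int>(img.size());
    const int w = static_cast<int>(img[0].size());
    for (const std::vector<int>& row : img) {
        assert(static_cast<int>(row.size()) == w);  // rectangular input only
    }

    std::vector<std::vector<bool>> seen(h, std::vector<bool>(w, false));
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};

    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            if (img[y][x] == 0 || seen[y][x]) {
                continue;
            }
            // Breadth-first flood fill from this unvisited foreground pixel.
            Box box{x, y, x, y};
            std::queue<std::pair<int, int>> pending;
            pending.push(std::make_pair(x, y));
            seen[y][x] = true;
            while (!pending.empty()) {
                const std::pair<int, int> cur = pending.front();
                pending.pop();
                box.min_x = std::min(box.min_x, cur.first);
                box.max_x = std::max(box.max_x, cur.first);
                box.min_y = std::min(box.min_y, cur.second);
                box.max_y = std::max(box.max_y, cur.second);
                for (int k = 0; k < 4; ++k) {
                    const int nx = cur.first + dx[k];
                    const int ny = cur.second + dy[k];
                    if (nx < 0 || ny < 0 || nx >= w || ny >= h) {
                        continue;
                    }
                    if (img[ny][nx] == 0 || seen[ny][nx]) {
                        continue;
                    }
                    seen[ny][nx] = true;
                    pending.push(std::make_pair(nx, ny));
                }
            }
            boxes.push_back(box);
        }
    }
    return boxes;
}
```

Each returned box could then be cropped, resized to the model's input size, and classified separately. Digits whose strokes touch would end up in one box, so this is only a first approximation.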

Analysis of the official example:

In main.cc, some basic declarations come first:

    // Coordinates and image width/height
    int sX, sY, eX, eY;
    int width = 2304;
    int height = 1296;
    // Initialize the frame-rate text buffer
    char fps_text[16];
    float fps = 0;
    memset(fps_text, 0, 16);
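
To make the buffer size implied by these declarations concrete, here is a small sketch. It assumes the frame is tightly packed with 3 bytes per pixel, which is what the CV_8UC3 type and the width * height * 3 copy in the loop suggest. frame_bytes is a hypothetical helper and is not part of the official example.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for one tightly packed frame with `channels` bytes per pixel.
// CV_8UC3 means three unsigned 8-bit channels, so channels = 3 in that case.
std::size_t frame_bytes(int width, int height, int channels) {
    assert(width > 0 && height > 0 && channels > 0);
    return static_cast<std::size_t>(width) * height * channels;
}
```

For the 2304 x 1296 frame, frame_bytes(2304, 1296, 3) gives 8,957,952 bytes, which is about 8.5 MiB per frame.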

The while loop appears to do the per-frame video processing, and it is the part that needs to be studied and analyzed:

while (1) {
    // Get a frame from the VPSS channel
    s32Ret = RK_MPI_VPSS_GetChnFrame(0, 0, &stVpssFrame, -1);
    if (s32Ret == RK_SUCCESS) {
        void *data = RK_MPI_MB_Handle2VirAddr(stVpssFrame.stVFrame.pMbBlk);

        // Wrap the buffer in a cv::Mat for OpenCV processing (no copy is made)
        cv::Mat frame(height, width, CV_8UC3, data);

        // Draw the frame rate on the frame; snprintf keeps the text inside fps_text
        snprintf(fps_text, sizeof(fps_text), "fps = %.2f", fps);
        cv::putText(frame, fps_text,
                    cv::Point(40, 40),
                    cv::FONT_HERSHEY_SIMPLEX, 1,
                    cv::Scalar(0, 255, 0), 2);

        // Copy the processed data back to the original buffer
        // (frame already wraps data, so source and destination are the same memory)
        memcpy(data, frame.data, width * height * 3);
    }

    // Send the frame to the encoder
    RK_MPI_VENC_SendFrame(0, &stVpssFrame, -1);

    // Get the encoded stream from the encoder
    s32Ret = RK_MPI_VENC_GetStream(0, &stFrame, -1);
    if (s32Ret == RK_SUCCESS) {
        // If the RTSP server and session are valid, send the video data
        if (g_rtsplive && g_rtsp_session) {
            // Get the virtual address of the encoded data
            void *pData = RK_MPI_MB_Handle2VirAddr(stFrame.pstPack->pMbBlk);

            // Send the video data over RTSP
            rtsp_tx_video(g_rtsp_session, (uint8_t *)pData, stFrame.pstPack->u32Len,
                          stFrame.pstPack->u64PTS);

            // Handle RTSP events
            rtsp_do_event(g_rtsplive);
        }

        // Get the current time
        RK_U64 nowUs = TEST_COMM_GetNowUs();

        // Compute the frame rate
        fps = (float)1000000 / (float)(nowUs - stVpssFrame.stVFrame.u64PTS);
    }

    // Release the VPSS channel frame
    s32Ret = RK_MPI_VPSS_ReleaseChnFrame(0, 0, &stVpssFrame);
    if (s32Ret != RK_SUCCESS) {
        RK_LOGE("RK_MPI_VPSS_ReleaseChnFrame fail %x", s32Ret);
    }

    // Release the stream obtained from the encoder
    s32Ret = RK_MPI_VENC_ReleaseStream(0, &stFrame);
    if (s32Ret != RK_SUCCESS) {
        RK_LOGE("RK_MPI_VENC_ReleaseStream fail %x", s32Ret);
    }
}
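
The fps line near the end of the loop divides one million by the number of microseconds between the frame's timestamp and the current time taken after encoding. It therefore seems to reflect how long one frame takes to pass through the loop, not the interval between consecutive frames. A minimal sketch of that formula follows, assuming both timestamps come from the same microsecond clock. fps_from_timestamps is a hypothetical helper, and the guard against a zero interval is an addition that the original loop does not have.

```cpp
#include <cstdint>

// Same formula as the loop: 1e6 / (now - frame timestamp), with both values in
// microseconds. Returns 0 when the interval is not positive, which avoids the
// division by zero and unsigned wrap-around that the raw expression allows.
float fps_from_timestamps(std::uint64_t now_us, std::uint64_t frame_pts_us) {
    if (now_us <= frame_pts_us) {
        return 0.0f;
    }
    return 1000000.0f / static_cast<float>(now_us - frame_pts_us);
}
```

A frame stamped 40,000 microseconds before the current time gives 25 fps.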

(To be completed) Analysis of knv's example:

(To be completed) Proposed improvement ideas:

Jacktang

Take the problems one at a time, and don't be too modest.

LitchiCheng

"Recognize multiple numbers in one picture" can be seen in my post, #AI挑战营最终站#RV1106Use rknn to perform real-time recognition of MNIST multiple numbers - Embedded Systems - Electronic Engineering World - Forum (eeworld.com.cn)
