Design method of high-speed graphics frame storage using DSP+FPGA architecture
Frame memory is the data channel between the graphics processor and the display device: all graphic data is first written to the frame memory and then read out for display. Its design is therefore central to the design of a graphics display system. Several kinds of memory device can be used for frame memory, including DRAM, VRAM, SDRAM and SRAM. DRAM, VRAM and SDRAM are dynamic memories that offer large capacity at low cost, but they are slower than SRAM and must be refreshed periodically. When the graphics processor has no dedicated external refresh interface, a separate refresh circuit has to be designed, which complicates the system. SRAM is fast and has a simple interface, but it is more expensive and offers less capacity. As SRAM capacity has grown and prices have fallen in recent years, high-speed SRAM has become an increasingly common choice for frame memory in graphics systems that require high-speed real-time display. This article describes a high-speed frame memory design, already applied in a real project, in which two SRAM frame memories are used alternately. It explains in detail how an FPGA is used to implement the frame memory controller, frame memory alternation and power-on clearing, and how the principle of the movie projector shutter is borrowed to scan each frame twice.
1. Introduction to Graphic Display System
Figure 1 is a block diagram of a special-purpose graphics display system built on a DSP+FPGA architecture. The graphics processor is an Analog Devices ADSP21061. The AMLCD is Korry's KDM710, a 5×5-inch full-color LCD module with 600×600 resolution and a 24-bit digital RGB input. The two frame memories, A and B, use IDT's 71V424L10V high-speed asynchronous static RAM, which has a 10ns read/write access time. The system operates the two frame memories alternately: while the DSP writes pixels to one frame memory, the FPGA-based frame memory controller reads the pixels of the other in sequence and sends them to the AMLCD for display, and then the roles are swapped. The system receives display information from the host through an IDT 71V04 dual-port RAM. The frame memory controller and video controller in Figure 1 are implemented in a Xilinx Spartan-II XC2S50. The video controller generates the timing control signals required by the KDM710: the line synchronization signal /HSYNC, the field synchronization signal /VSYNC, the data enable signal DATA_EN and the pixel clock DCLK. The frame memory controller supplies the 24-bit RGB color data, which together with the video controller's timing signals produces a stable image on the LCD screen.
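The figures above fix the size of each frame memory. The following Python sketch works through that arithmetic. The resolution and color depth come from the article. The 50Hz field rate and the 20MHz pixel clock are taken from the examples given later in the article, and the one-address-per-pixel layout is an assumption made only for illustration.

```python
# Display parameters stated in the article.
WIDTH, HEIGHT = 600, 600        # KDM710 resolution in pixels
BITS_PER_PIXEL = 24             # digital RGB input width

# Values used as examples elsewhere in the article.
FIELD_RATE_HZ = 50              # example field rate
PIXEL_CLOCK_HZ = 20_000_000     # pixel clock from the timing analysis


def frame_pixels(width=WIDTH, height=HEIGHT):
    """Number of pixels in one frame."""
    return width * height


def frame_bytes(width=WIDTH, height=HEIGHT, bpp=BITS_PER_PIXEL):
    """Storage needed for one frame, in bytes."""
    return frame_pixels(width, height) * bpp // 8


def address_bits(width=WIDTH, height=HEIGHT):
    """Address width needed if every pixel occupies one address
    (a hypothetical linear layout, not confirmed by the article)."""
    return (frame_pixels(width, height) - 1).bit_length()


def blanking_clocks_per_field(clock_hz=PIXEL_CLOCK_HZ, field_hz=FIELD_RATE_HZ):
    """Pixel clocks per field that are left over for blanking."""
    return clock_hz // field_hz - frame_pixels()
```

Under these assumptions one frame holds 360,000 pixels (1,080,000 bytes), a linear layout needs 19 address bits, and a 20MHz pixel clock at 50Hz gives 400,000 clocks per field, leaving 40,000 clocks for blanking.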
2. Design of frame memory controller
2.1 Bus switching module
Figure 2 is a block diagram of the bus switching module of the frame memory controller. The address buses are switched through multiplexers (MUX), and every data source is connected to the SRAM data buses through tri-state gates. Three sources can drive a frame memory's data bus: the DSP data bus, the FPGA data bus, and the background register used for power-on clearing. Immediately after power-on the frame memories hold random values, which would appear as a random pattern on the screen, so background data must first be written to both frame memories. Bus switching is controlled by the bank-select signal Sel and the power-on clear signal Clear. At power-on, the frame memory controller writes the background color into both frame memories using the power-on clear sequence, during which Clear is high. While Clear is high, both address multiplexers select the FPGA address bus, so the FPGA addresses both frame memories, and both data buses are driven by the background data register: tri-state gates 1, 2, 3 and 4 are disabled (high impedance) and gates 5 and 6 are enabled. Once the power-on clear sequence has finished, the frame memory buses are controlled by Sel. While the DSP writes frame memory A, the FPGA-generated bus reads frame memory B, and vice versa. As shown in Figure 2, when Sel is high the DSP address bus selects frame memory A, with gate 1 enabled and gates 3 and 5 disabled, while the FPGA address bus selects frame memory B, with gate 4 enabled and gates 2 and 6 disabled. The color stored in the background register can be defined by the user.
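The switching rules above can be summarized as a small truth table. The Python sketch below is a behavioral model only, not the FPGA logic itself. The Clear-high and Sel-high rows follow the description of Figure 2. The Sel-low row is not spelled out in the text and is assumed here to be the mirror image of the Sel-high row. The names `BusState` and `bus_state` are hypothetical.

```python
from typing import FrozenSet, NamedTuple


class BusState(NamedTuple):
    addr_a: str                 # source driving frame memory A's address bus
    addr_b: str                 # source driving frame memory B's address bus
    enabled_gates: FrozenSet[int]   # tri-state gates that are enabled


def bus_state(clear: bool, sel: bool) -> BusState:
    """Return the bus routing for a given Clear/Sel combination."""
    if clear:
        # Power-on clear: the FPGA addresses both memories and the
        # background register drives both data buses (gates 5 and 6).
        return BusState("FPGA", "FPGA", frozenset({5, 6}))
    if sel:
        # Sel high: the DSP writes A through gate 1,
        # the FPGA reads B through gate 4.
        return BusState("DSP", "FPGA", frozenset({1, 4}))
    # Sel low (assumed mirror case): the DSP writes B through gate 2,
    # the FPGA reads A through gate 3.
    return BusState("FPGA", "DSP", frozenset({2, 3}))
```

Clear takes priority over Sel in this model, which matches the statement that Sel takes over only after the power-on clear sequence has finished.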
The control module of the frame memory controller generates the bank-select signal Sel and the power-on clear signal Clear. Its block diagram is shown in Figure 3, where /VSYNC is the field synchronization signal. /VSYNC passes through an edge-detection (differentiating) circuit that produces an enable pulse one pixel clock period wide, which gates the count enable of a counter. The counter is a 2-bit (modulo-4) counter and Sel is taken from its upper bit, so Sel runs at one quarter of the /VSYNC frequency. The frame memories are therefore swapped after every two field synchronization pulses, giving the usage order AABBAA... This control method resembles a movie projector shutter: each picture is shown twice on the screen, so a 50Hz field rate is obtained from a 25Hz frame rate, which effectively doubles the system video bandwidth. For example, at a 50Hz field rate the graphics processor has 40ms to process one frame of graphics data. Figure 4 is the frame memory control timing diagram. The Clear signal is generated as follows. At power-on, RST is held high for a period (system logic reset) and then goes low. On the falling edge of RST, ClearA goes high. Because the active-low field synchronization signal has not yet arrived, ClearB is high, so Clear is high and the system starts the screen-clearing sequence. When both frame memories have been cleared, /VSYNC becomes active and latches a "0" at the output: ClearB goes low, Clear goes low, and the system starts operating under Sel control. As the control module block diagram shows, Clear goes high only at the end of the power-on reset (the falling edge of RST). After one field period Clear stays low and control passes to the bank-select signal Sel. The VHDL code of the control module and the corresponding timing simulation diagram are shown in Figure 5 (simulated with Modelsim5.5FSE).
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sel_gen is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;
    vsync : in  std_logic;
    sel   : out std_logic;
    clear : out std_logic
  );
end sel_gen;

architecture rtl_sel_gen of sel_gen is
  signal clken     : std_logic;
  signal cleartemp : std_logic;
  signal inputrega : std_logic;
  signal inputregb : std_logic;
  signal qn        : unsigned(1 downto 0);
  signal seltemp   : std_logic;
begin
  -- Power-on clear latch: set on the falling edge of rst,
  -- cleared as soon as the active-low vsync arrives.
  process (rst, vsync)
  begin
    if vsync = '0' then
      cleartemp <= '0';
    elsif rst'event and rst = '0' then
      cleartemp <= '1';
    end if;
  end process;
  clear <= cleartemp;

  -- Assumed edge detector (this part of the source listing is not legible):
  -- two registers sample vsync so that clken is high for one clk period
  -- after each falling edge of vsync.
  process (rst, clk)
  begin
    if rst = '1' then
      inputrega <= '1';
      inputregb <= '1';
    elsif clk'event and clk = '1' then
      inputrega <= vsync;
      inputregb <= inputrega;
    end if;
  end process;
  clken <= inputregb and not inputrega;

  -- 2-bit counter that advances once per field; sel is its upper bit.
  process (rst, clk)
  begin
    if rst = '1' then
      qn <= (others => '0');
    elsif clk'event and clk = '1' then
      if clken = '1' then
        if qn = 3 then
          qn <= (others => '0');
        else
          qn <= qn + 1;
        end if;
      end if;
    end if;
  end process;
  seltemp <= qn(1);
  sel     <= seltemp;
end rtl_sel_gen;
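To see why taking Sel from the upper bit of a 2-bit counter yields the AABB order, the counter can be modeled in a few lines of Python. This is a behavioral sketch, not a replacement for the VHDL. It assumes that the counter starts at zero after reset, that it advances once per field, and that Sel low means frame memory A is the one being displayed (the text only states the Sel-high case explicitly). The function names are hypothetical.

```python
def sel_sequence(num_fields: int) -> str:
    """Return which frame memory is displayed in each successive field."""
    qn = 0                       # 2-bit counter, cleared by reset
    displayed = []
    for _ in range(num_fields):
        sel = (qn >> 1) & 1      # Sel is the upper counter bit
        # Sel high: DSP writes A, FPGA displays B.
        # Sel low is assumed to be the mirror case.
        displayed.append("B" if sel else "A")
        qn = (qn + 1) % 4        # one count per field sync pulse
    return "".join(displayed)


def frame_period_ms(field_rate_hz: float, scans_per_frame: int = 2) -> float:
    """Time the DSP has to draw one frame when each frame is scanned
    scans_per_frame times."""
    return scans_per_frame * 1000.0 / field_rate_hz
```

For eight fields the model gives "AABBAABB", and at a 50Hz field rate `frame_period_ms(50)` gives 40.0, matching the 40ms quoted in the text.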
3. Timing analysis
For the high-speed frame memory to work properly, certain delay requirements must be met. The AMLCD latches data on the falling edge of the pixel clock. The delay T from the rising edge of the pixel clock to valid RGB data appearing on the AMLCD data bus must therefore be less than 25ns (the pixel clock period is 50ns, so the half period is 25ns), as shown in Figure 6. The DLL (delay-locked loop) in the figure is built into the Spartan-II chip. The DLL divides Clk_top (40MHz) by two to obtain the 20MHz pixel clock, which serves as the system working clock that drives the FPGA address counter and is also fed directly to the AMLCD as the pixel clock. As Figure 6 shows, T is made up of the following delays: T1 is the delay from Clk_top to a new address appearing on the frame memory SRAM address bus (each bus signal has a different delay; T1 is the maximum); T2 is the delay from the address change to valid data appearing on the SRAM data bus; T3 is the delay for the FPGA to read the data on the frame memory data bus and output it to the AMLCD; T4 is the delay from Clk_top through the DLL to the pixel clock at the AMLCD. Because the pixel clock itself arrives T4 after Clk_top, T=T1+T2+T3-T4. The frame memory controller is implemented in a Xilinx Spartan-II XC2S50-6, synthesized with FPGA Express 3.7 and placed and routed with Xilinx ISE 4.2i. The post-route delays are T1=10.994ns, T3=10.691ns and T4=7.784ns. T2 is determined by the timing parameters of the IS61LV5128 chip, T2≤10ns, so T≤23.901ns<25ns, which meets the timing requirement of the system. The timing report produced by standard development tools gives worst-case delays, so the delay in the actual system will be smaller than the reported values.
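The timing budget above can be checked with a short calculation. The Python sketch below uses the delay values quoted in the text; the function names are hypothetical and the script only reproduces the arithmetic, it does not replace the tool's timing report.

```python
# Delays quoted in the article, in nanoseconds.
T1_NS = 10.994    # Clk_top to SRAM address valid (worst bus signal)
T2_NS = 10.0      # SRAM address to data valid (upper bound)
T3_NS = 10.691    # FPGA data-bus input to AMLCD output
T4_NS = 7.784     # Clk_top through the DLL to the pixel clock at the AMLCD

PIXEL_CLOCK_HZ = 20e6


def data_delay_ns(t1=T1_NS, t2=T2_NS, t3=T3_NS, t4=T4_NS):
    """Delay from the pixel clock rising edge to valid RGB data."""
    return t1 + t2 + t3 - t4


def half_period_ns(clock_hz=PIXEL_CLOCK_HZ):
    """Time between the rising and falling pixel clock edges."""
    return 0.5e9 / clock_hz


def slack_ns():
    """Margin left before the AMLCD latches data on the falling edge."""
    return half_period_ns() - data_delay_ns()


if __name__ == "__main__":
    print(f"T = {data_delay_ns():.3f} ns, slack = {slack_ns():.3f} ns")
```

With the quoted values the delay is about 23.901ns against a 25ns half period, leaving roughly 1.1ns of worst-case margin.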
The design method described here uses high-speed SRAM as the graphics frame memory and an FPGA to implement the frame memory controller, which greatly reduces circuit board size and improves the reliability and design flexibility of the system. Alternating between two frame memories and scanning each frame twice increases the system video bandwidth, improves real-time performance and reduces flicker. Designing the FPGA in VHDL keeps the method simple and makes the design easy to read and reuse. The high-speed graphics frame memory has been implemented in a Xilinx Spartan-II XC2S50 device and applied in the cockpit graphics display system of a certain type of aircraft.