Improving the performance of DSP-intensive applications using high-performance SRAM

Publisher: genius6  Latest update time: 2014-06-06  Source: Internet  Keywords: DSP, SDR, FPGA
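  Military and defense applications benefit greatly from digital signal processors (DSPs), which are widely used in radar, software-defined radio (SDR), smart munitions and target detection systems, electronic warfare applications, aircraft imaging, and many other applications. The precise processing that DSPs deliver through their purpose-built architecture can significantly improve performance. Key DSP capabilities include real-time signal processing, very high throughput, and reprogrammability. This article describes how using high-performance Quad Data Rate (QDR) SRAM can improve overall DSP system performance by at least a factor of two compared with the traditional approach based on SDRAM.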

  Digital Signal Processing

  Digital signal processing encompasses methods for processing signals after converting them into digital form; radar processing is one example. Radar systems generate pulses that are transmitted through a directional antenna. These signals travel at the speed of light, and any object in their path reflects a small portion of the transmitted energy back to the radar receiving antenna. Comparing the transmitted and received signals yields the distance of the object (from the round-trip delay) and its speed (from the Doppler frequency shift).
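  The comparison of transmitted and received signals can be made concrete with the standard relations for a radar whose transmitter and receiver are co-located. The following is a minimal sketch, not taken from the original article: the function names, the 100 µs delay, the 2 kHz Doppler shift and the 10 GHz carrier are all illustrative assumptions.

```python
C = 299_792_458.0  # speed of light in m/s


def target_range(round_trip_delay_s: float) -> float:
    """Range in metres from the echo delay; the pulse travels out and back."""
    return C * round_trip_delay_s / 2.0


def radial_velocity(doppler_shift_hz: float, carrier_hz: float) -> float:
    """Radial velocity in m/s from the Doppler shift of the echo.

    A positive shift means the target is approaching the radar.
    """
    return doppler_shift_hz * C / (2.0 * carrier_hz)


if __name__ == "__main__":
    # Assumed example values: echo after 100 microseconds, 2 kHz shift at 10 GHz.
    print(f"range    = {target_range(100e-6):.1f} m")
    print(f"velocity = {radial_velocity(2_000.0, 10e9):.2f} m/s")
```

  With these assumed inputs the target is roughly 15 km away and closing at about 30 m/s.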

  DSPs are fundamental to radar systems and are used for a variety of functions such as pulse compression, signal filtering, and pulse modulation. Without DSPs, radar systems cannot accurately detect objects at long distances. DSPs differ from general-purpose microprocessors in that they are optimized for the fast mathematical operations (multiplication and addition) that dominate algorithms such as the FFT and FIR filters. In general, the FFT is used for domain conversion, from the time domain to the frequency domain or vice versa, while FIR filters are used for signal separation and recovery. Both are commonly found in radar designs.
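  To show why multiplication and addition dominate, here is a minimal sketch of a direct-form FIR filter written as explicit multiply-accumulate (MAC) steps. It is an illustration only: the function name and the 4-tap moving-average coefficients are assumptions, and a real DSP would execute each inner-loop step in dedicated hardware rather than in software.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR filter: every output is a sum of products.

    Samples before the start of the input are treated as zero.
    """
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += h * samples[n - k]  # one multiply-accumulate per tap
        out.append(acc)
    return out


if __name__ == "__main__":
    # Assumed example: 4-tap moving average over a short ramp.
    taps = [0.25, 0.25, 0.25, 0.25]
    print(fir_filter([4.0, 8.0, 12.0, 16.0, 20.0], taps))
    # prints [1.0, 3.0, 6.0, 10.0, 14.0]
```

  Each output sample costs one MAC per tap, which is why hardware that completes a MAC every cycle matters so much for filter throughput.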

  There are two main hardware approaches to implementing DSP: programmable DSP processors and field-programmable gate arrays (FPGAs). In both approaches, the architecture is well suited to DSP algorithms.

  DSP Processor

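  DSP processors, such as the TI multicore DSP shown in Figure 1, use dedicated hardware to complete a multiplication every cycle. The instruction sets of modern DSP processors allow programmers to specify several parallel operations in a single instruction, typically one or more data fetches from memory that proceed alongside the main arithmetic operation. In addition, to significantly improve DSP performance per clock cycle, DSP architectures now include additional multipliers and adders that run in parallel, so that parallel operations can be encoded in a single instruction.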


  FPGA-based DSP

  FPGAs such as the Xilinx Virtex, shown in Figure 2, use dedicated DSP blocks to implement DSP algorithms efficiently. Each DSP block contains dedicated hardware functions such as multiplication, multiply-accumulate, addition, shift, comparison, bitwise logic, and pattern detection. A wider range of mathematical functions can be implemented by cascading multiple DSP blocks.

  DSP Memory Requirements

  Executing DSP functions every cycle requires the ability to fetch instructions and data from memory efficiently. The key to sustaining DSP performance is therefore high memory bandwidth. DSP processors and FPGA DSP blocks rely on established internal cache memory architectures (L1/L2) to support multiple memory accesses per cycle. By using separate memory banks for instructions and data, a Super Harvard architecture can be implemented. With this arrangement, the processor can fetch instructions and data operands in parallel every cycle. In addition, memory accesses in DSP algorithms generally follow predictable patterns. For example, FIR filter coefficients are accessed in a sequential loop. For deeper external storage, a hardware-based external memory interface (EMIF) that supports various SDRAM-class memories (DDR2/3, RLDRAM) is generally used.
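  A rough calculation illustrates why memory bandwidth becomes the limiting factor. The sketch below assumes, for simplicity, that every multiply-accumulate of a FIR filter reads one coefficient and one sample from memory with no reuse from registers or cache; the function name and the figures used (64 taps, 100 MS/s, 16-bit words) are hypothetical and do not come from the article.

```python
def fir_memory_bandwidth(taps: int, sample_rate_hz: float, bytes_per_word: int) -> float:
    """Bytes per second fetched if each MAC reads one coefficient and one sample.

    This is a worst-case model: it ignores any operand reuse in registers or cache.
    """
    reads_per_output = 2 * taps  # one coefficient plus one sample per tap
    return reads_per_output * sample_rate_hz * bytes_per_word


if __name__ == "__main__":
    # Hypothetical workload: 64 taps, 100 million samples per second, 16-bit words.
    bandwidth = fir_memory_bandwidth(64, 100e6, 2)
    print(f"{bandwidth / 1e9:.1f} GB/s")
```

  Even this modest filter would need tens of gigabytes per second under the no-reuse assumption, which is why on-chip banks serve the inner loop and external memory is reserved for deeper storage.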

  To achieve a two-fold increase in DSP performance, a new approach to external storage based on QDR SRAM can be used.

  Quad Data Rate (QDR) Architecture

  SRAMs such as the Cypress QDR-IV SRAM shown in Figure 3 are high-performance memory devices optimized for high throughput. This type of memory has multiple independent data ports, each with a double data rate (DDR) interface. The ports can be accessed simultaneously and independently of each other. The address bus is shared and operates at single or double data rate depending on the configuration. At the time of writing, the highest-density product on the market is 144 Mb, available in x18 and x36 configurations.

  The architectural features of QDR-IV SRAM are highly advantageous for digital signal processing flows that require high throughput, low latency, and true random access.

  Comparison between the traditional method (SDRAM) and the new method (QDR-IV)

  The overall setup of the test environment is illustrated in Figure 4. The highest data throughput of different memory types was compared using FPGA-based DSP functions.

  Table 1 compares the key performance parameters of QDR-IV SRAM and DDR3 SDRAM memory technologies.

Reference address of this article: http://www.eepw.com.cn/article/247534.htm

  Table 1 shows that QDR-IV can provide more than twice the bandwidth of DDR3 SDRAM when running at the same frequency. In addition, the two independent ports of QDR-IV SRAM allow output data to be delivered while input data is still being captured, which readily meets the data requirements of real-time DSP functions. The bottleneck of moving data into and out of memory is thereby alleviated.
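  The effect of the second data port on peak bandwidth can be sketched with simple arithmetic. The example below is an assumption-laden illustration, not a reproduction of Table 1: the 800 MHz clock and 36-bit data width are hypothetical, both memories are given the same width, and the model counts only ports and transfers per clock. It ignores effects such as bus turnaround and refresh on the SDRAM side.

```python
def peak_bandwidth_gbps(clock_mhz: float, bus_width_bits: int, ports: int,
                        transfers_per_clock: int = 2) -> float:
    """Peak data bandwidth in Gb/s for a DDR-style interface.

    transfers_per_clock is 2 for double data rate (both clock edges).
    """
    return clock_mhz * 1e6 * transfers_per_clock * bus_width_bits * ports / 1e9


if __name__ == "__main__":
    CLOCK_MHZ = 800.0  # hypothetical common clock
    WIDTH = 36         # hypothetical data width in bits

    qdr_iv = peak_bandwidth_gbps(CLOCK_MHZ, WIDTH, ports=2)  # two independent DDR ports
    ddr3 = peak_bandwidth_gbps(CLOCK_MHZ, WIDTH, ports=1)    # one shared bidirectional bus
    print(f"QDR-IV: {qdr_iv:.1f} Gb/s, DDR3: {ddr3:.1f} Gb/s, ratio: {qdr_iv / ddr3:.1f}")
```

  In this simplified model the ratio is exactly two and comes purely from the port count; any further advantage would have to come from effects the sketch does not model.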

  SAR radar perspective

  SAR radars that observe the Earth's surface at high resolution require transposed memory accesses, in which the range and azimuth directions are swapped for reconstruction processing. This transposition takes place between the FFT and IFFT (DSP) operations used for range compression and azimuth compression. The architectural advantages of QDR SRAM can improve the performance of SAR radars by providing fast and consistent memory access times. Figure 5 illustrates the transposition problem associated with SAR image reconstruction.

  With traditional SDRAM memory, writing SAR image data (as shown in the figure) results in accesses to a discontinuous address space, which reduces processor performance (by an estimated factor of about 5 in this case). Because the independent read and write ports of QDR-IV support parallel operations and random memory access, the impact on processing power can be reduced.
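  The cost of a discontinuous address pattern can be illustrated with a toy model of a DRAM that keeps one row open at a time and must activate a new row whenever an access falls outside it. This sketch is an assumption, not the article's measurement: the single-bank model, the function names, the 256 x 1024 image size and the 1024-word DRAM row are all hypothetical, and the model does not try to reproduce the factor of about 5 quoted above.

```python
def row_activations(addresses, words_per_dram_row):
    """Count accesses that land in a different DRAM row than the open one."""
    activations = 0
    open_row = None
    for addr in addresses:
        row = addr // words_per_dram_row
        if row != open_row:
            activations += 1
            open_row = row
    return activations


def range_order(n_azimuth, n_range):
    """Traverse a row-major image along range: consecutive addresses."""
    return (a * n_range + r for a in range(n_azimuth) for r in range(n_range))


def azimuth_order(n_azimuth, n_range):
    """Traverse the same image along azimuth: a stride of n_range words."""
    return (a * n_range + r for r in range(n_range) for a in range(n_azimuth))


if __name__ == "__main__":
    N_AZ, N_RG, ROW_WORDS = 256, 1024, 1024  # hypothetical sizes
    print(row_activations(range_order(N_AZ, N_RG), ROW_WORDS))    # prints 256
    print(row_activations(azimuth_order(N_AZ, N_RG), ROW_WORDS))  # prints 262144
```

  In this model the transposed traversal opens a new row on every single access, whereas a memory with uniform random access latency, such as SRAM, pays no such penalty for the strided pattern.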

  QDR SRAM provides a high-performance alternative to traditional SDRAM for off-chip data storage in DSP-based applications. The density limitations of QDR SRAM can be overcome by cascading multiple devices. This approach is ideal for applications that require higher random access throughput, because faster memory access improves DSP performance.
