How to Design Programmable Baseband Devices for Multiple Wireless Standards
Two important buzzwords in 2008 - mobility and convergence - are impacting the wireless industry in a variety of ways.
Customers want to stay connected wherever they are, at the fastest possible connection speeds. For everyone from those customers to the developers of the emerging 4G standards that serve them, it is hard to settle on the ideal product design and on which wireless standard to implement, because the choice has to suit every part of the industry chain.
Although the technologies and standards driving 4G (space-time diversity, space-division multiplexing, beamforming, CDMA and OFDMA, HSPA, LTE, WiMAX and IMT-Advanced) can coexist, as can 802.11b/g/n, the differences between the standards and protocols that a transceiver must handle are growing.
At the same time, countries and regions have erected barriers, each backing a different digital broadcasting standard (DVB-T/H, ISDB-T, DMB-T, T-DMB or DAB) and fighting to make it the definitive protocol in its territory. As a result, multiple implementations are inevitable, depending on the context for which a product is developed (see Figure 1).
Figure 1: As shown in the center of the diagram, the convergence of different standards means they must be supported across many different applications.
Mobility and convergence mean that users move through different communication environments every day and must be able to switch between protocols to stay connected to the media server. In other words, their devices must support multi-mode operation.
Multimode requirements
The baseband processor must therefore support several different modes and be able to switch between them. At the semiconductor device level, this gives manufacturers the opportunity to build a programmable solution, ideally integrated into the main application processor, that opens a path to new value-added devices well beyond the current scope of communication equipment.
Typical baseband processing solutions have tackled the requirements of multi-standard modems by simply scaling up their capabilities to handle the additional data processing.
The major flaw in this approach is that it tries to handle the additional data traffic without paying attention to how that data flows through the system or, worse, how the system is actually to be programmed.
However, with each new generation of communication standards, baseband processing becomes more complex, and more standards must be supported on a single baseband device. Simply scaling up the device's data processing capacity is no longer a feasible design approach: doing so increases power consumption and shortens battery life.
Traditional programmable baseband solutions focus only on improving data processing and ignore the equally important effects of data throughput and programmability.
Because the true bottlenecks of the design are never identified, such systems end up far more complex than necessary. That complexity ultimately shortens battery life, which matters most when the device is used on the move (users of 3G phones already know this reality). The 4G standards are more complex still, and if nothing changes, the situation will only get worse.
New Path
Can a different approach meet the challenges of multi-mode operation and programmability? The answer is yes, provided that all three key design points (data processing, data flow and programming efficiency) are addressed by a single architecture that offers flexibility while optimizing power consumption and minimizing implementation time and cost.
A new processing architecture has been developed by a Swedish company, Coresonic AB, to provide a programmable baseband solution that overcomes the limitations of traditional DSP architectures.
Where those architectures focus only on the data processing problem, the new one also targets the cost and power consumption requirements of handheld devices. It is delivered as IP (intellectual property) that can be integrated into other companies' devices to create attractive value-added products.
The new architecture, called Single Instruction Multiple Task (SIMT), can achieve the performance of a very long instruction word architecture, but with lower control overhead, and lower program and memory usage. Its instruction set is optimized for baseband processing tasks, which can significantly reduce firmware code size, even for complex standards.
Carefully chosen operations that are not suitable for software implementation can be accommodated by hardware acceleration in the architecture, allowing very efficient execution while still ensuring sufficient flexibility so that the hardware can be reused between different standards.
An innovative interconnect scheme and memory architecture support a high degree of parallelism and efficient communication between processor cores, memories, accelerators and I/O interfaces, while minimizing data memory requirements and keeping memory access efficient.
Programmable solutions must find a compromise between flexibility and performance for each function. The modem requires a high degree of flexibility in baseband processing, whereas FEC (forward error correction) and digital front-end processing are usually better suited to less flexible accelerator modules. To obtain the high computing power that baseband processing requires, very long instruction word (VLIW) and single instruction multiple data (SIMD) architectures are typically used.
The drawback of VLIW-based architectures is their poor power efficiency, since a wide instruction must be fetched every clock cycle. Pure SIMD-based DSPs, on the other hand, cannot perform different operations in parallel, which results in low utilization of the data path.
SIMT Architecture
The SIMT architecture utilizes the characteristics of baseband algorithms, reduces control overhead compared to baseband processors based on VLIW/SIMD architecture, and enhances memory utilization.
The processor architecture uses vector instructions to operate on large data sets in SIMD execution units. The key is to issue only one instruction per clock cycle while still allowing several operations to execute in parallel, because a vector instruction keeps running on its SIMD unit for several clock cycles.
This approach yields a degree of parallelism equivalent to that of a VLIW processor, but without the overhead of a wide control path. Because modem processing consists largely of operations on long vectors, the execution units achieve high utilization with low overhead.
For example, while the CMAC (complex multiplier-accumulator) performs one layer of an FFT, the integer data path can run operating system tasks; while the Viterbi decoder accelerator runs in parallel at its maximum throughput, the CALU (complex arithmetic logic unit) can extract pilots.
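The overlap in this example can be illustrated with a toy issue model. This is a minimal sketch, not Coresonic's implementation: the `Instr` type, the `simulate` function and all cycle counts are invented for illustration, and only the unit names (CMAC, CALU, integer data path) come from the text. It assumes in-order issue of at most one instruction per cycle, with each vector instruction occupying its unit for several cycles.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    unit: str    # execution unit the instruction occupies
    cycles: int  # cycles the unit stays busy (hypothetical values)


def simulate(program):
    """In-order issue, at most one instruction per clock cycle.

    An instruction stalls until its target unit is free. Returns the
    per-instruction (name, unit, start, end) list and the total cycle count.
    """
    busy_until = {}  # unit -> first cycle at which the unit is free again
    next_issue = 0   # earliest cycle at which the next instruction may issue
    schedule = []
    for ins in program:
        start = max(next_issue, busy_until.get(ins.unit, 0))
        end = start + ins.cycles
        busy_until[ins.unit] = end
        schedule.append((ins.name, ins.unit, start, end))
        next_issue = start + 1
    total = max(end for _, _, _, end in schedule)
    return schedule, total


# Hypothetical workload: two long vector instructions plus two integer steps.
program = [
    Instr("fft_layer", "CMAC", 16),
    Instr("pilot_extract", "CALU", 12),
    Instr("os_task_step_1", "INT", 1),
    Instr("os_task_step_2", "INT", 1),
]

schedule, total_cycles = simulate(program)
serial_cycles = sum(ins.cycles for ins in program)

for name, unit, start, end in schedule:
    print(f"{name:15s} {unit:5s} cycles {start:2d}..{end - 1:2d}")
print(f"overlapped: {total_cycles} cycles, serial: {serial_cycles} cycles")
```

With these made-up durations, the single instruction stream keeps three units busy at once, so the work finishes in the time of the longest vector instruction rather than the sum of all four.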
To fully exploit the SIMT architecture, several key components are required: efficient vector execution units, a matching memory mechanism, a parallel memory addressing system, and a control core capable of managing multiple threads.
The SIMT architecture uses multiple complex-valued SIMD execution clusters, such as a 4-way complex multiply-accumulate (MAC) unit and a 4-way complex arithmetic logic unit (ALU). Each SIMD cluster can process a task independently of the other execution units.
To support these vector operations, a distributed memory system is used. The memory is divided into several banks, each with its own address generation unit; together with the on-chip network, this improves the power efficiency of the memory subsystem.
The on-chip network is implemented as a restricted crossbar switch under direct software control. This allows the software tools to schedule transfers statically, so no arbiter is required and performance is fully predictable.
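One way to picture why a statically scheduled network needs no arbiter: if the tools know at build time which unit is connected to which memory bank in which cycle, conflicting connections can be rejected before the program ever runs. The sketch below only illustrates that idea. The `Transfer` record, the bank names, the cycle numbers and the `find_conflicts` helper are hypothetical and do not describe Coresonic's actual tools.

```python
from typing import NamedTuple


class Transfer(NamedTuple):
    unit: str    # execution unit that owns the connection
    bank: str    # memory bank it is connected to
    start: int   # first cycle of the connection
    length: int  # number of cycles the connection is held


def find_conflicts(transfers):
    """Return (cycle, bank, first_unit, second_unit) for every cycle in
    which two different units would be connected to the same bank."""
    owner = {}      # (cycle, bank) -> unit that claimed it first
    conflicts = []
    for t in transfers:
        for cycle in range(t.start, t.start + t.length):
            key = (cycle, t.bank)
            if key in owner and owner[key] != t.unit:
                conflicts.append((cycle, t.bank, owner[key], t.unit))
            else:
                owner[key] = t.unit
    return conflicts


# A schedule in which every bank has at most one user per cycle.
good = [
    Transfer("CMAC", "bank0", start=0, length=8),
    Transfer("CALU", "bank1", start=1, length=8),
    Transfer("CALU", "bank0", start=9, length=4),  # bank0 is free from cycle 8
]
# The same schedule plus an integer-path access that overlaps the CMAC transfer.
bad = good + [Transfer("INT", "bank0", start=6, length=2)]

print(find_conflicts(good))
print(find_conflicts(bad))
```

Because such a check runs before execution, a schedule that passes it has fixed, known timing, which matches the predictability the text attributes to static scheduling.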
The processor is controlled by a RISC core, which contains instruction dispatch logic and multi-context support. The RISC core executes all control flow functions and integer instructions.
The SIMT processor handles all complex-valued processing between the ADC/DAC and the FEC units. It combines the RISC core with two SIMD units (a 4-way CMAC and a 4-way CALU) and a digital front-end accelerator.
A SIMT processor has been implemented in a lab environment. The chip contains a total of 1.5 Mbit of memory, divided into 43k words of complex memory, 4k words of integer memory and 2k words of program memory. Program memory is used very efficiently because a single vector instruction can perform an entire computation, such as a complex vector dot product or a complete FFT layer.
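To make the code-density argument concrete, the sketch below expresses a radix-2 FFT as a short sequence of layer operations, each standing in for what the text describes as a single vector instruction. It is plain Python for illustration, not SIMT code: `bit_reverse`, `fft_layer` and `fft` are hypothetical names, and the decimation-in-time formulation is an assumption, since the article does not say which FFT variant the processor uses.

```python
import cmath


def bit_reverse(x):
    """Reorder x (length a power of two, at least 2) into bit-reversed order."""
    bits = len(x).bit_length() - 1
    return [x[int(format(i, f"0{bits}b")[::-1], 2)] for i in range(len(x))]


def fft_layer(x, half):
    """One butterfly layer: merge transforms of size `half` into size 2*half.

    On a vector machine, this whole function would correspond to one
    vector operation applied to the full data set.
    """
    out = list(x)
    for base in range(0, len(x), 2 * half):
        for k in range(half):
            w = cmath.exp(-2j * cmath.pi * k / (2 * half))  # twiddle factor
            a = x[base + k]
            b = w * x[base + k + half]
            out[base + k] = a + b
            out[base + k + half] = a - b
    return out


def fft(x):
    """Radix-2 decimation-in-time FFT built from log2(N) calls to fft_layer."""
    data = bit_reverse(x)
    half, layers = 1, 0
    while half < len(x):
        data = fft_layer(data, half)
        half *= 2
        layers += 1
    return data, layers


def dft(x):
    """Direct O(N^2) DFT, used only to check the result."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]


samples = [complex(i % 3, -i) for i in range(8)]
spectrum, layers = fft(samples)
max_error = max(abs(a - b) for a, b in zip(spectrum, dft(samples)))
print(f"{layers} layers, max error vs direct DFT: {max_error:.2e}")
```

The control program for an N-point transform is then only log2(N) layer operations long, regardless of how many butterflies each layer contains, which is the sense in which long vector instructions keep program memory small.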
The architecture can implement the complete DVB-T/H protocol in a typical 2k-word program memory, and a complete WiMAX protocol stack in less than 8k words.
Programmability
Programmability enables hardware reuse not only between different wireless communication standards, but also between different parts of the processing flow. Through hardware reuse, a programmable solution can occupy less silicon area than a hard-wired one, even a hard-wired solution that implements only a single standard. Smaller silicon area also lowers power consumption, because both leakage and on-chip communication power are reduced.
Processors for mobile WiMAX and DVB-T/H based on the SIMT architecture described in this article have been implemented in a complete receiver. Compared with leading-edge hardware solutions, the SIMT-based processor running a 31.67 Mb/s DVB-T service is estimated to use 18% less silicon area and 21% less power; compared with other programmable solutions, the size advantage typically ranges between 50% and 70%.
WiMAX support has been shown to be achievable through algorithm mapping, scheduling, simulation and testing on actual hardware. Compared with other leading-edge solutions, the SIMT-based solution has proven more efficient in both area and power.
Apart from a low-power physical design flow based on modern synthesis and back-end tools, low power consumption is achieved through architecture-level design rather than through special low-power process technologies or devices.
The SIMT architecture reduces control overhead by using vector instructions and a decentralized memory system to enhance data and control locality. Memory access power consumption is reduced by using only small single-port memories and reducing the number of memory accesses.
Without any optimization or modern power management techniques, a fully programmable DVB-T/H baseband processor prototype was implemented in the laboratory as an 11 mm² chip in 0.12 µm CMOS, containing 1.5 Mbit of single-port memory and 200k logic gates.
The DVB-T/H baseband prototype consumes 70 mW when carrying the maximum data rate of 31.67 Mb/s while running at 70 MHz. The work on the prototype shows that this architecture outperforms previous non-programmable DVB-T/H solutions in both size and power consumption, even leaving aside the considerable optimization at the structural design level.
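The quoted prototype figures can be turned into per-bit and per-cycle numbers with simple arithmetic. The sketch below uses only the three values stated in the text (70 mW, 31.67 Mb/s, 70 MHz); the derived metrics are back-of-the-envelope estimates, not measurements reported in the article.

```python
# Figures quoted in the text for the DVB-T/H prototype.
power_w = 70e-3         # 70 mW
bit_rate_bps = 31.67e6  # 31.67 Mb/s
clock_hz = 70e6         # 70 MHz

# Derived estimates (not stated in the article).
energy_per_bit_nj = power_w / bit_rate_bps * 1e9
cycles_per_bit = clock_hz / bit_rate_bps
energy_per_cycle_nj = power_w / clock_hz * 1e9

print(f"energy per bit:   {energy_per_bit_nj:.2f} nJ")
print(f"cycles per bit:   {cycles_per_bit:.2f}")
print(f"energy per cycle: {energy_per_cycle_nj:.2f} nJ")
```

On these numbers the prototype spends roughly 2.2 clock cycles, and about 2.2 nJ, per delivered bit, or about 1 nJ per clock cycle.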
Putting SIMT into practice
Until now, the SIMT architecture and the designs described here were laboratory work. They are now available as complete hardware solutions, which wireless semiconductor manufacturers are using to integrate full WiMAX baseband functionality into personal portable WiMAX devices based on Coresonic's LeoCore processor (see Figure 2).
Figure 2: Coresonic's complete WiMAX solution for personal portable devices, from the RF interface up to the MAC layer interface running on the CPU.
The device supports Mobile WiMAX (802.16e-2005, Mobile System Profile version 1.4) as well as other modes such as 802.16d and 802.16j. It performs all tasks from the ADC/DAC interface to FEC, including digital front-end signal conditioning, synchronization, MIMO channel estimation and compensation, error correction and convolutional coding.
Together, the building blocks in the figure form a complete solution from the RF interface to the MAC layer running on the CPU; the processing-intensive work is done in hardware to minimize the load on the MAC CPU. Firmware supporting a variety of standards is provided along with the hardware, further reducing development time and risk.
Conclusion
In summary, the new SIMT architecture described in this article provides a way to build efficient, complex baseband processors. It offers a large number of core building blocks, around which additional accelerators, interfaces and memory blocks are added to build the required solution.
The architecture addresses the challenges of data processing, data flow and programmability to deliver a highly refined 4G baseband solution. The result is a complete solution as small as the high-speed instruction cache alone in other solutions, running at a very low clock rate yet providing higher power efficiency.
With an architecture specially optimized for multi-mode wireless baseband processing, programmable solutions will be able to support multiple wireless standards, including 4G standards such as WiMAX, with power and area efficiency equal to or better than that of hardware solutions.
SIMT-based processors can perform parallel processing in a single instruction stream - eliminating the need for multiple DSPs to support multiple standards - and can be combined with the designer's own unique product design to provide a high value-added component.
About the Author
Professor Dake Liu has 16 years of experience in university research and teaching, and another 6 years of R&D experience in Swedish industry. He was CTO and co-founder of FreeHand DSP AB, and later Chief Scientist of VIA Technologies Sweden. He was previously a senior member of Ericsson Microelectronics and Ericsson UAB. He is also a professor of computer engineering at Linköping University.
Principal Systems Engineer and Co-founder of Coresonic AB. He studied at Linköping University in Sweden and obtained a Master's degree in Applied Physics and Electrical Engineering, and a PhD in Multi-standard Baseband Processor Design. His research interests include high-speed wireless mobile connectivity, radio technology and baseband processor design; he has 3 US patents (2 pending) and is a co-author of Radio Design in Nanometer Technologies and Handbook of WiMAX.