Low-power design of embedded 64kb SRAM based on DBL structure-EEWORLD

The capacity of embedded memory and the area it occupies in the system chip keep growing, and the dynamic power consumed by memory operation has become a significant share of total chip power. Effective low-power design techniques are therefore needed to reduce the impact of embedded memory on the whole system. Techniques such as word line segmentation, hierarchical word line decoding and word line pulse generation have already been adopted and greatly reduce the dynamic power consumption of memory. Another technique that effectively reduces dynamic power is bit line segmentation, the divided bit line (DBL) structure. To meet the system requirements, the author uses the DBL structure together with a block decoding structure for the storage array to design a 64 kb embedded memory module.

Parameter correction and formula re-derivation

Principle of DBL structure

The DBL structure merges two or more SRAM storage cells so that fewer transistors are connected to the bit line, which lowers the bit line capacitance and therefore the dynamic power consumption of the memory. Figure 1w shows a circuit schematic in which four SRAM cells are connected together and then connected to the bit line through pass transistors. Compared with the bit line structure of the general layout, the number of pass transistors connected to the bit line in the DBL structure of Figure 1w is reduced by 3/4.

There are two key points in the DBL structure. The first is to determine the optimal relationship between the number of memory array rows N and the number of merged cells M, where optimal means that the dynamic power consumption of the merged memory is minimized. The corresponding formulas are given in the literature [1]:
$$p_{nor} = \left(\frac{1}{M} + 0.1\right) + 2 \times \frac{M + 1}{N\,(\Delta V / V)} \tag{1}$$
$$M_{opt} = \left(\frac{N}{2} \times \frac{\Delta V}{V}\right)^{1/2} \tag{2}$$
where ΔV is the voltage swing on the bit line and V is the power supply voltage. The second is to determine the width-to-length ratio of each transistor after merging. Both issues are discussed below.

Correction of DBL power consumption formula

Formulas (1) and (2) are derived under the following assumptions: in an SRAM, the bit line capacitance consists mainly of the drain capacitance of the pass transistors in the storage cells and the metal interconnect capacitance of the bit line, and the parasitic capacitance of the metal line is 10% of the total drain capacitance C of the transistors connected to the bit line. The parasitic capacitances C1 and C2 in Figure 2 can then be expressed as
$$C_1 = \frac{CM}{N}, \qquad C_2 = \frac{C}{M} + 0.1C.$$

However, these assumptions do not reflect the real composition of the bit line capacitance, which includes the source/drain capacitance CBS of the pass transistors in the storage cells, the coupling capacitance CBB between bit lines, the coupling capacitance CWW between the bit line and the horizontal word lines, the coupling capacitance CBSS between the bit line and the ground line, the coupling capacitance CBDD between the bit line and the power line, the metal interconnect capacitance CW of the bit line, and so on. In deep submicron processes, the source/drain capacitance CBS of the pass transistors accounts for only 60% to 70% of the total bit line capacitance, and the other components account for the remaining 30% to 40%. Designing a circuit from formulas (1) and (2) in this situation introduces large errors. In addition, the approximation made for C1 is too coarse, which also introduces a large error and must be corrected. The author re-derives the formulas as follows.

Assume that the storage array has N rows and that M storage cells are merged in the DBL structure. In the general layout structure (N rows), let C be the total drain capacitance of all pass transistors connected to a bit line, and assume that the other parasitic capacitances on the bit line amount to 30% of this drain capacitance. The capacitances C1 and C2 in Figure 2 can then be expressed as
$$C_1 = \frac{C\,(1.3M + 1)}{N}, \qquad C_2 = \frac{C}{M} + 0.3C.$$

Assume that the sub-bit line is not precharged during read and write operations and that its voltage can reach the power supply voltage, and let ΔV denote the voltage swing on the bit line. The dynamic power consumption of the DBL memory in Figure 2, as a function of M, can then be expressed as

$$p = (C_2 \times \Delta V \times V + 2 \times C_1 \times V^2) \times f = \left[\left(\frac{C}{M} + 0.3C\right) \times \Delta V \times V + 2 \times \frac{C\,(1.3M + 1)}{N} \times V^2\right] \times f \tag{3}$$
where f is the operating frequency.

Normalizing equation (3) by the power consumption of the standard memory cell,
$$p_{stan} = (C \times \Delta V \times V) \times f, \tag{4}$$
yields
$$p_{nor} = \left(\frac{1}{M} + 0.3\right) + 2 \times \frac{1.3M + 1}{N \times (\Delta V / V)}. \tag{5}$$
The value of M that minimizes the power consumption is therefore
$$M_{opt} = \left(\frac{N}{2.6} \times \frac{\Delta V}{V}\right)^{1/2}. \tag{6}$$
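The step from the normalized power expression (5) to the optimum (6) is a stationary-point calculation. A short sketch, under the assumption that M is treated as a continuous variable (in a real array M is an integer):

```latex
\begin{align*}
\frac{\partial p_{nor}}{\partial M}
  &= -\frac{1}{M^{2}} + \frac{2.6}{N\,(\Delta V / V)} = 0 \\
\Rightarrow\quad
M_{opt}^{2} &= \frac{N}{2.6} \times \frac{\Delta V}{V},
\qquad
M_{opt} = \left(\frac{N}{2.6} \times \frac{\Delta V}{V}\right)^{1/2} \\
\frac{\partial^{2} p_{nor}}{\partial M^{2}} &= \frac{2}{M^{3}} > 0
  \quad \text{for } M > 0
\end{align*}
```

The positive second derivative indicates that this stationary point is a minimum. The same steps applied to formula (1) give the factor 2 in formula (2) in place of 2.6.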
If the storage array has N = 1024 rows and the relative bit line voltage swing is ΔV/V = 0.11, then Mopt ≈ 6 and pnor ≈ 0.164. Calculated with formulas (1) and (2) instead, Mopt ≈ 8 and pnor ≈ 0.140. The following design is based on the corrected formulas (5) and (6).
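The two optimum formulas can be compared numerically. The following is a minimal sketch, not the author's tool: the function names are hypothetical, M is treated as a continuous value before any rounding, and `pnor_corrected` is used only to check that formula (6) sits at the minimum of formula (5).

```python
import math


def mopt_original(n_rows, swing):
    """Formula (2): Mopt = sqrt((N / 2) * (dV / V))."""
    return math.sqrt(n_rows / 2.0 * swing)


def mopt_corrected(n_rows, swing):
    """Formula (6): Mopt = sqrt((N / 2.6) * (dV / V))."""
    return math.sqrt(n_rows / 2.6 * swing)


def pnor_corrected(m, n_rows, swing):
    """Formula (5): (1/M + 0.3) + 2 * (1.3M + 1) / (N * dV/V)."""
    return (1.0 / m + 0.3) + 2.0 * (1.3 * m + 1.0) / (n_rows * swing)


if __name__ == "__main__":
    swing = 0.11  # dV / V, the value used in the article
    for n_rows in (1024, 512):
        m_old = mopt_original(n_rows, swing)
        m_new = mopt_corrected(n_rows, swing)
        # Formula (5) should be no larger at m_new than at nearby values of M.
        is_minimum = all(
            pnor_corrected(m_new, n_rows, swing)
            <= pnor_corrected(m_new + step, n_rows, swing)
            for step in (-0.5, 0.5)
        )
        print(f"N={n_rows}: formula (2) -> {m_old:.2f}, "
              f"formula (6) -> {m_new:.2f}, minimum of (5): {is_minimum}")
```

For N = 1024 the continuous optima come out near 7.5 for formula (2) and near 6.6 for formula (6), alongside the article's quoted Mopt ≈ 8 and Mopt ≈ 6. The value actually used in a design still has to be an integer.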

Selection of transistor width-to-length ratio

In a six-transistor storage cell, the width-to-length ratio of each transistor must satisfy certain constraints for read and write operations to work correctly. These constraints are usually characterized by the pull-up ratio PR and the cell ratio CR. For the storage cell shown in Figure 3@,
$$CR = \frac{W_{N2}/L_{N2}}{W_{N4}/L_{N4}}, \qquad PR = \frac{W_{P1}/L_{P1}}{W_{N3}/L_{N3}}.$$
To read without a "read flip", CR must be greater than 1.8 (VDD = 3.3 V, Vt = 0.5 V) [7], so N2 must conduct better than N4. To write correctly, PR must be less than 1 (VDD = 3.3 V, Vtp = 0.5 V and μP/μN = 0.5), that is, N3 must conduct better than P1.

In the DBL structure, if the W/L of each transistor in the storage cell is kept the same as in the general structure, the series connection of N4 and N6 (N3 and N5) makes the CR condition easier to meet but the PR condition harder to meet, so the write operation becomes more difficult. The width-to-length ratios of N4 and N6 (N3 and N5) must therefore be chosen carefully. They can be estimated by approximating N4 and N6 (N3 and N5) as series resistors, as shown in Figure 3w. For convenience of analysis, N4 and N6 are assumed to have the same structure. To preserve the read and write behavior of the original storage cell, the width-to-length ratio of N4 and N6 should then be doubled, while the width-to-length ratios of the other transistors remain unchanged.
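The series-resistor estimate can be checked with a few lines of arithmetic. This is a minimal sketch assuming that each transistor's on-resistance is inversely proportional to its W/L, a first-order model that is not taken from the article. The function name and the example W/L value are hypothetical.

```python
def series_equivalent_wl(wl_values):
    """Equivalent W/L of transistors in series, assuming R_on ~ 1 / (W/L)."""
    return 1.0 / sum(1.0 / wl for wl in wl_values)


# Hypothetical W/L of the access transistor in the general (non-DBL) cell.
wl_access = 1.5

# Reusing the same sizing in the DBL cell: N4 and N6 in series
# halve the effective W/L seen between the cell and the bit line.
unchanged = series_equivalent_wl([wl_access, wl_access])

# Doubling the W/L of both N4 and N6 restores the original effective W/L.
doubled = series_equivalent_wl([2 * wl_access, 2 * wl_access])

print(f"same sizing: {unchanged:.3f}, doubled sizing: {doubled:.3f}")
```

Under this model the halved effective W/L of the unchanged sizing raises CR and PR by a factor of two, which corresponds to the easier read condition and harder write condition described above.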

DBL structure with block decoding

The previous analysis shows that for storage arrays with very long bit lines, adopting the DBL technique with a reasonable value of M reduces the dynamic power consumption. However, that analysis does not account for factors such as different transistor sizes and different layout styles. In addition, the DBL structure requires extra control logic, which consumes power as well, so the actual power consumption cannot be calculated exactly from formula (5). To reduce the power consumption of the memory further, the author designed a block decoding structure for the memory array on top of the DBL structure. First, to give the layout the required shape, the 64 kb SRAM is divided into eight 8 kb sub-arrays, one of which is selected by decoding the address signals A1, A2 and A3. This both satisfies the layout requirements and reduces the power consumption of the memory. The entire layout is shown in Figure 4v. The DBL structure with block decoding is designed mainly for each 8 kb storage sub-array. As shown in Figure 4w, each 8 kb sub-array consists of a left and a right storage array module. The column address line A0 and its complement control the output of the row decoder so that only one of the left and right storage arrays is selected in any read/write cycle. As a result, only 1/16 of the 64 kb SRAM is active at a time, which reduces the dynamic power consumption caused by word line charging and discharging.
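The block selection described here can be sketched as a small decoder model. The bit order (A3 as the most significant bit of the sub-array index) and the polarity of A0 (0 selects the left array) are assumptions made for illustration only, and the function name is hypothetical.

```python
from itertools import product


def select_block(a3, a2, a1, a0):
    """Return (sub_array_index, half) activated by the given address bits."""
    # Assumed bit order: A3 is the most significant bit of the index.
    sub_array = (a3 << 2) | (a2 << 1) | a1
    # Assumed polarity: A0 = 0 selects the left array, A0 = 1 the right one.
    half = "left" if a0 == 0 else "right"
    return sub_array, half


# Enumerate every combination of the four block-select address bits.
all_blocks = {select_block(*bits) for bits in product((0, 1), repeat=4)}

# Exactly one block is active per access, so this is the active fraction.
active_fraction = 1 / len(all_blocks)

print(len(all_blocks), active_fraction)
```

Eight sub-arrays with two halves each give sixteen selectable blocks, which matches the 1/16 active fraction stated in the text.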

The specific structure of the control logic in Figure 4w is shown in Figure 4x, and the structure of the sub-array "sub DBL memory array i" (i = 0~7) is shown in Figure 4y. Each sub-array has 512 rows, that is, N = 512. According to formula (6), the number of merged storage cells is M = 4.

Based on the DBL structure with block decoding, the embedded 64 kb SRAM module was designed in the Chartered 0.35 μm double-poly, triple-metal (aluminum) n-well CMOS process, with a layout area of 1.4 mm × 4.7 mm (the layout area of the general structure is 1.3 mm × 4.3 mm). Starsim simulation results show that the average current of the memory using the block decoding DBL structure is about 37 mA, while that of the general structure memory is about 65 mA.

Conclusion

This article has discussed the low-power design of an embedded 64 kb SRAM. By adopting the DBL structure and the block decoding structure for the memory array, the power consumption of the memory is reduced by 43%, while the area increases by only 18%. Simulation results show that the minimum access cycle of both designs is about 15 ns. Judged by the AT²P metric (A is the area, T is the access cycle, and P is the power consumption), this low-power design method is therefore feasible. As the capacity of embedded memory increases and deep submicron technology develops, the static power consumption caused by subthreshold leakage current can no longer be ignored, and finding effective low-power design techniques remains a topic worth exploring.
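The quoted percentages and the AT²P comparison follow from the reported figures. A minimal sketch of the arithmetic, assuming that power is proportional to average current because both designs run from the same supply voltage (the article reports currents, not power); the variable and function names are hypothetical.

```python
# Reported layout dimensions in mm, average currents in mA, access cycle in ns.
area_dbl = 1.4 * 4.7
area_std = 1.3 * 4.3
current_dbl, current_std = 37.0, 65.0
cycle_dbl = cycle_std = 15.0

# Assumption: same supply voltage, so power scales with average current.
power_reduction = 1 - current_dbl / current_std
area_increase = area_dbl / area_std - 1


def at2p(area, cycle, power):
    """Area x (access cycle)^2 x power figure of merit; lower is better."""
    return area * cycle ** 2 * power


ratio = at2p(area_dbl, cycle_dbl, current_dbl) / at2p(area_std, cycle_std, current_std)

print(f"power reduction: {power_reduction:.1%}")
print(f"area increase:   {area_increase:.1%}")
print(f"AT2P ratio (DBL / general): {ratio:.2f}")
```

With equal access cycles the T² term cancels, so the ratio reduces to the area ratio times the power ratio, and a value below 1 favors the block decoding DBL design.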
