Non-numeric values (NaN) are treated in the same way as infinity. The IEEE 754 standard defines five rounding modes, of which the C28x+FPU supports only two:
- Truncation (round toward zero): the fractional part is discarded regardless of its value.
- Round to nearest, ties to even: if the fractional part is less than 0.5 it is discarded, if it is greater than 0.5 the value is rounded up, and if it is exactly 0.5 the value is rounded to the nearest even number.
Table 2 shows the effect of the different rounding modes on the same data. The C28x+FPU compiler configures the CPU to round-to-nearest-even mode by default [1].

Table 2: Examples of different rounding modes

Mode                                  +11.5    +12.5    -11.5    -12.5
Round to nearest, ties to even        +12.0    +12.0    -12.0    -12.0
Round to nearest, ties away from 0    +12.0    +13.0    -12.0    -13.0
Truncation                            +11.0    +12.0    -11.0    -12.0
Round up (toward +infinity)           +12.0    +13.0    -11.0    -12.0
Round down (toward -infinity)         +11.0    +12.0    -12.0    -13.0
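The two supported modes can also be tried out on an ordinary PC. The following is a minimal host-side C sketch, not C2000-specific code; it assumes a standard C99 toolchain with fenv.h (some compilers additionally require the STDC FENV_ACCESS pragma or a flag such as GCC's -frounding-math). It reproduces the "round to even" and "truncation" rows of Table 2:

#include <stdio.h>
#include <fenv.h>
#include <math.h>

int main(void)
{
    const double samples[] = { 11.5, 12.5, -11.5, -12.5 };
    int i;

    /* Round to nearest, ties to even (the C28x+FPU compiler default) */
    fesetround(FE_TONEAREST);
    for (i = 0; i < 4; i++)
        printf("round to even: %6.1f -> %6.1f\n", samples[i], rint(samples[i]));

    /* Truncation (round toward zero) */
    fesetround(FE_TOWARDZERO);
    for (i = 0; i < 4; i++)
        printf("truncation:    %6.1f -> %6.1f\n", samples[i], rint(samples[i]));

    fesetround(FE_TONEAREST);   /* restore the default mode */
    return 0;
}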
2. Floating-point C2000 chip calculation skills and points for attention

The precision of a floating-point number is determined by its mantissa. Most numbers cannot be represented exactly as floating-point numbers, so small errors are introduced. These errors are negligible in most cases, but after many calculations the accumulated error may become too large to be acceptable. The following code defines a float variable and adds 11.7 to it 20001 times, on CPU1 and on CLA1 of TI's latest Delfino chip, the F28379D.
float CLATMPDATA = 0;          /* single-precision accumulator */
int index = 20001;
while(index--)
{
    CLATMPDATA = CLATMPDATA + 11.7;   /* exact result would be 234011.7 */
}
The results are as follows: CLATMPDATA1 is the result of adding 11.7 to itself 20001 times on the CLA, and CLATMPDATA2 is the result of the same accumulation on the CPU. The two results are different, and both deviate noticeably from the correct value of 234011.7.
The difference between the CPU and CLA results is caused by their different floating-point rounding modes. As mentioned above, the C28x+FPU compiler configures the CPU for round-to-nearest-even mode by default, whereas the CLA defaults to truncation mode [2]. In the CLA code, we can add the following line:
__asm(" MSETFLG RNDF32=1");//1为就近舍入向偶舍入,0为截断舍入
to change the CLA rounding mode to round-to-nearest-even. Running the code again then produces the same result as the CPU.
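To see how the two rounding modes lead to different accumulated sums without a C2000 board, the following host-side C sketch simulates the single-precision accumulation under each mode. It is illustrative only: it assumes an IEEE-754 host where the double-to-float conversion honors the current fenv rounding direction, and it may need floating-point optimizations disabled (for example, GCC's -frounding-math).

#include <stdio.h>
#include <fenv.h>

static float accumulate(int rounding_mode)
{
    const float step = 11.7f;   /* the same constant the C28x code adds */
    float sum = 0.0f;
    int i;

    fesetround(rounding_mode);
    for (i = 0; i < 20001; i++)
    {
        /* The exact sum of two floats always fits in a double, so the only
         * rounding happens in the final double-to-float conversion, which
         * follows the currently selected rounding mode. */
        sum = (float)((double)sum + (double)step);
    }
    fesetround(FE_TONEAREST);   /* restore the default mode */
    return sum;
}

int main(void)
{
    printf("round to nearest even (CPU default): %f\n", accumulate(FE_TONEAREST));
    printf("truncation (CLA default):            %f\n", accumulate(FE_TOWARDZERO));
    printf("correct result:                      234011.7\n");
    return 0;
}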
2. Why do both the CPU and CLA results have a large error, and how can this be solved?
When 11.7 is represented as an IEEE 754 single-precision floating-point number it is encoded as 0x413b3333, whose actual value is 11.69999980926513671875. The error of a single value is tiny, but after many accumulations and roundings the error in the result becomes significant. To address this, we can define CLATMPDATA as a long double (64-bit) variable and run the same code again; the resulting error is then small enough to be ignored.
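Both points can be checked on a PC with a short sketch (again host-side and illustrative only; double is used here for the 64-bit accumulator, which has the same 64-bit IEEE-754 format as long double in the C28x compiler, and the bit pattern of 11.7f is read out with memcpy):

#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void)
{
    float    f32  = 0.0f;   /* 32-bit accumulator, as in the original code */
    double   f64  = 0.0;    /* 64-bit accumulator, standing in for C28x long double */
    float    step = 11.7f;
    uint32_t bits;
    int i;

    /* 11.7 stored as a single-precision float: 0x413B3333,
     * i.e. 11.69999980926513671875 */
    memcpy(&bits, &step, sizeof bits);
    printf("11.7f is encoded as 0x%08X\n", (unsigned)bits);

    for (i = 0; i < 20001; i++)
    {
        f32 += step;   /* accumulates a visible error */
        f64 += 11.7;   /* error stays negligible */
    }

    printf("32-bit accumulation: %f\n", f32);
    printf("64-bit accumulation: %f\n", f64);
    printf("correct result:      234011.7\n");
    return 0;
}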
It should be pointed out that the existing C28x CPU only supports single-precision (32-bit) floating-point operations in hardware. Operations on 64-bit double-precision floating-point numbers are implemented entirely in software, so they are much slower. In addition, the CLA does not support 64-bit numbers.
For this example, we can compare the assembly code generated for the float variable and for the long double variable:
C code: CLATMPDATA2=CLATMPDATA2+11.7;
If CLATMPDATA2 is a float type variable, the corresponding assembly code is:
00c08d: E80209D8   MOVIZ  R0, #0x413b         1 cycle
00c08f: E2AF0112   MOV32  R1H, @0x12, UNCF    1 cycle
00c091: E8099998   MOVXI  R0H, #0x3333        1 cycle
00c093: E7100040   ADDF32 R0H, R0H, R1H       2 cycles
00c095: 7700       NOP                        1 cycle
00c096: E2030012   MOV32  @0x12, R0H          1 cycle
If CLATMPDATA2 is a long double type variable, the corresponding assembly code is:
00c08b: 7680005A   MOVL XAR6, #0x00005a       1 cycle
00c08d: 8F00005A   MOVL XAR4, #0x00005a       1 cycle
00c08f: 8F40C26A   MOVL XAR5, #0x00c26a       1 cycle
00c091: FF69       SPM #0                     1 cycle
00c092: 7640C0C9   LCR FD$$ADD                4 cycles (jump time)
                                             +25 cycles (required inside the FD$$ADD function)
It can be seen that it takes 7 cycles for the CPU to perform an addition on a float type number and 33 cycles for an addition on a long double type number.
3. Conclusion
1. The default rounding modes of the C2000 CPU and CLA are different, so floating-point calculations may produce different results. The rounding mode can be changed in code so that both produce the same result.
2. Single-precision floating-point numbers may accumulate a large error over many calculations. The precision problem can be solved by defining the variable as a 64-bit long double.
3. The C28x CPU only supports single-precision (32-bit) floating-point operations in hardware; 64-bit double-precision operations are implemented entirely in software and are therefore much slower. The next generation of C2000 products will add hardware support for 64-bit double-precision floating-point operations.