Non-numeric values (NaN) are treated in the same way as infinity. The IEEE 754 standard defines five rounding modes, of which the C28x+FPU supports only two:
- Truncation (round toward zero): the fractional part is discarded regardless of its value.
- Round to nearest, ties to even: if the fractional part is less than 0.5 it is discarded, if it is greater than 0.5 the value is rounded up, and if it is exactly 0.5 the value is rounded to the nearest even number.
Table 2 shows the effect of the different rounding modes on the same data. The C28x+FPU compiler configures the CPU to round-to-nearest-even mode by default [1].

Table 2: Examples of different rounding modes

Mode                                  +11.5    +12.5    -11.5    -12.5
Round to nearest, ties to even        +12.0    +12.0    -12.0    -12.0
Round to nearest, ties away from 0    +12.0    +13.0    -12.0    -13.0
Truncation                            +11.0    +12.0    -11.0    -12.0
Round up (toward +infinity)           +12.0    +13.0    -11.0    -12.0
Round down (toward -infinity)         +11.0    +12.0    -12.0    -13.0
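The two supported modes can also be tried out on an ordinary PC. The following is a minimal host-side C sketch, not C2000-specific code; it assumes a standard C99 toolchain with fenv.h (some compilers additionally require the STDC FENV_ACCESS pragma or a flag such as GCC's -frounding-math). It reproduces the "round to even" and "truncation" rows of Table 2:

#include <stdio.h>
#include <fenv.h>
#include <math.h>

int main(void)
{
    const double samples[] = { 11.5, 12.5, -11.5, -12.5 };
    int i;

    /* Round to nearest, ties to even (the C28x+FPU compiler default) */
    fesetround(FE_TONEAREST);
    for (i = 0; i < 4; i++)
        printf("round to even: %6.1f -> %6.1f\n", samples[i], rint(samples[i]));

    /* Truncation (round toward zero) */
    fesetround(FE_TOWARDZERO);
    for (i = 0; i < 4; i++)
        printf("truncation:    %6.1f -> %6.1f\n", samples[i], rint(samples[i]));

    fesetround(FE_TONEAREST);   /* restore the default mode */
    return 0;
}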
2. Floating-point C2000 chip calculation skills and points for attention

The precision of a floating-point number is determined by its mantissa. Most numbers cannot be represented exactly as floating-point numbers, so small errors are introduced. These errors are negligible in most cases, but after many calculations the accumulated error may become too large to be acceptable. The following code defines a float variable and adds 11.7 to it 20001 times, on CPU1 and on CLA1 of TI's latest Delfino chip, the F28379D.
float CLATMPDATA = 0;          /* single-precision accumulator */
int index = 20001;
while(index--)
{
    CLATMPDATA = CLATMPDATA + 11.7;   /* exact result would be 234011.7 */
}
The results are as follows: CLATMPDATA1 is the result of adding 11.7 to itself 20001 times on the CLA, and CLATMPDATA2 is the result of the same accumulation on the CPU. The two results are different, and both deviate noticeably from the correct value of 234011.7.
The difference between the CPU and CLA results is caused by their different floating-point rounding modes. As mentioned above, the C28x+FPU compiler configures the CPU for round-to-nearest-even mode by default, whereas the CLA defaults to truncation mode [2]. In the CLA code, we can add the following line:
__asm(" MSETFLG RNDF32=1");//1为就近舍入向偶舍入,0为截断舍入
to change the CLA rounding mode to round-to-nearest-even. Running the code again then produces the same result as the CPU.
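To see how the two rounding modes lead to different accumulated sums without a C2000 board, the following host-side C sketch simulates the single-precision accumulation under each mode. It is illustrative only: it assumes an IEEE-754 host where the double-to-float conversion honors the current fenv rounding direction, and it may need floating-point optimizations disabled (for example, GCC's -frounding-math).

#include <stdio.h>
#include <fenv.h>

static float accumulate(int rounding_mode)
{
    const float step = 11.7f;   /* the same constant the C28x code adds */
    float sum = 0.0f;
    int i;

    fesetround(rounding_mode);
    for (i = 0; i < 20001; i++)
    {
        /* The exact sum of two floats always fits in a double, so the only
         * rounding happens in the final double-to-float conversion, which
         * follows the currently selected rounding mode. */
        sum = (float)((double)sum + (double)step);
    }
    fesetround(FE_TONEAREST);   /* restore the default mode */
    return sum;
}

int main(void)
{
    printf("round to nearest even (CPU default): %f\n", accumulate(FE_TONEAREST));
    printf("truncation (CLA default):            %f\n", accumulate(FE_TOWARDZERO));
    printf("correct result:                      234011.7\n");
    return 0;
}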
2. Why do both the CPU and CLA results have a large error, and how can this be solved?
When 11.7 is represented as an IEEE 754 single-precision floating-point number it is encoded as 0x413b3333, whose actual value is 11.69999980926513671875. The error of a single value is tiny, but after many accumulations and roundings the error in the result becomes significant. To address this, we can define CLATMPDATA as a long double (64-bit) variable and run the same code again; the resulting error is then small enough to be ignored.
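Both points can be checked on a PC with a short sketch (again host-side and illustrative only; double is used here for the 64-bit accumulator, which has the same 64-bit IEEE-754 format as long double in the C28x compiler, and the bit pattern of 11.7f is read out with memcpy):

#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void)
{
    float    f32  = 0.0f;   /* 32-bit accumulator, as in the original code */
    double   f64  = 0.0;    /* 64-bit accumulator, standing in for C28x long double */
    float    step = 11.7f;
    uint32_t bits;
    int i;

    /* 11.7 stored as a single-precision float: 0x413B3333,
     * i.e. 11.69999980926513671875 */
    memcpy(&bits, &step, sizeof bits);
    printf("11.7f is encoded as 0x%08X\n", (unsigned)bits);

    for (i = 0; i < 20001; i++)
    {
        f32 += step;   /* accumulates a visible error */
        f64 += 11.7;   /* error stays negligible */
    }

    printf("32-bit accumulation: %f\n", f32);
    printf("64-bit accumulation: %f\n", f64);
    printf("correct result:      234011.7\n");
    return 0;
}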
It should be pointed out that the existing C28x CPU only supports single-precision (32-bit) floating-point operations in hardware. Operations on 64-bit double-precision floating-point numbers are implemented entirely in software, so they are much slower. In addition, the CLA does not support 64-bit numbers.
For this example, we can compare the assembly code generated for the float variable and for the long double variable:
C code: CLATMPDATA2=CLATMPDATA2+11.7;
If CLATMPDATA2 is a float type variable, the corresponding assembly code is:
00c08d: E80209D8   MOVIZ  R0, #0x413b         1 cycle
00c08f: E2AF0112   MOV32  R1H, @0x12, UNCF    1 cycle
00c091: E8099998   MOVXI  R0H, #0x3333        1 cycle
00c093: E7100040   ADDF32 R0H, R0H, R1H       2 cycles
00c095: 7700       NOP                        1 cycle
00c096: E2030012   MOV32  @0x12, R0H          1 cycle
If CLATMPDATA2 is a long double type variable, the corresponding assembly code is:
00c08b: 7680005A   MOVL XAR6, #0x00005a       1 cycle
00c08d: 8F00005A   MOVL XAR4, #0x00005a       1 cycle
00c08f: 8F40C26A   MOVL XAR5, #0x00c26a       1 cycle
00c091: FF69       SPM #0                     1 cycle
00c092: 7640C0C9   LCR FD$$ADD                4 cycles (jump time)
                                             +25 cycles (required inside the FD$$ADD function)
It can be seen that it takes 7 cycles for the CPU to perform an addition on a float type number and 33 cycles for an addition on a long double type number.
3. Conclusion
1. The default rounding modes of the C2000 CPU and CLA are different, so floating-point calculations may produce different results. The rounding mode can be changed in code so that both produce the same result.
2. Single-precision floating-point numbers may accumulate a large error over many calculations. The precision problem can be solved by defining the variable as a 64-bit long double.
3. The C28x CPU only supports single-precision (32-bit) floating-point operations in hardware; 64-bit double-precision operations are implemented entirely in software and are therefore much slower. The next generation of C2000 products will add hardware support for 64-bit double-precision floating-point operations.