DSP Basics--Fixed-point Decimal Operations

fish001


In the DSP world, because of the limitations of DSP chips, fixed-point decimal operations are often used. A so-called fixed-point decimal is really a fractional number handled with integer arithmetic. This article first introduces some theory of fixed-point decimals, and then shows how to carry out fixed-point decimal operations, using C as the example language. In the TI C5000 DSP series, 16 bits are the smallest storage unit, so we use 16-bit integers for fixed-point decimal operations.

Let's start with integers. A 16-bit storage unit can represent up to 65,536 states, from 0x0000 to 0xffff. If it holds an unsigned integer in C, it can represent 0 to 65,535. If negative numbers need to be represented, the highest bit is the sign bit, and the remaining 15 bits can represent 32,768 states, so the signed range is -32,768 to 32,767. It can be seen here that computers and DSP chips have no special storage for the sign; it is stored together with the number. So that the same addition and subtraction rules work for both unsigned and signed numbers, negative signed numbers are represented as the two's complement of the corresponding positive number.

We all know that -1 + 1 = 0, and 0x0001 represents 1, so how can -1 be represented so that -1 + 1 = 0? The answer is simple: 0xffff. Now you can open the Windows calculator and calculate 0xffff+0x0001 in hexadecimal, and the result is 0x10000. So are 0x10000 and 0x0000 equivalent? We just said that 16 bits are used to express integers. The highest bit 1 is the 17th bit, which is the overflow bit. This bit is not stored in the operation register, so the result is the lower 16 bits, which is 0x0000. Now we know how to express negative numbers. For example: -100. First, we need to know the hexadecimal of 100. Convert it with a calculator and you will know that it is 0x0064, so -100 is 0x10000 - 0x0064, and the result is 0xff9c when calculated with a calculator. Another simple way to convert the sign is to negate and add one: write the number x in binary format, change each 0 to 1, 1 to 0, and finally add 1 to the result to get -x.
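The negate-and-add-one rule and the discarded 17th bit can be checked with a minimal C sketch. The function names negate16 and add16 are hypothetical, chosen only for this illustration, and uint16_t is used so that the wrap-around at 0x10000 is well defined.

```c
#include <stdint.h>

/* Hypothetical helper: negate a 16-bit value by inverting every bit
   and adding one. The cast keeps only the low 16 bits. */
uint16_t negate16(uint16_t x)
{
    return (uint16_t)(~x + 1u);
}

/* Hypothetical helper: 16-bit addition in which the carry out of
   bit 15 (the "17th bit") is discarded by the cast. */
uint16_t add16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}
```

With these helpers, negate16(0x0064) gives 0xff9c, the pattern for -100 derived above, and add16(0xffff, 0x0001) gives 0x0000.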
Okay, after reviewing the knowledge about integers, we will enter the fixed-point decimal operation. The so-called fixed-point decimal means that the position of the decimal point is fixed. We want to use integers to represent fixed-point decimals. Since the position of the decimal point is fixed, there is no need to store it (if the position of the decimal point is stored, it is a floating point number). Since the position of the decimal point is not stored, the computer certainly does not know the position of the decimal point, so the position of the decimal point is what we, the programmers, need to remember.
Let's take the decimal system as an example. If we can calculate 12+34=46, then of course we can also calculate 1.2+3.4=4.6 or 0.12+0.34=0.46. So addition and subtraction of fixed-point decimals work exactly as they do for integers, wherever the decimal point is, as long as both operands have it in the same position. Multiplication is different: 12*34=408, while 1.2*3.4=4.08. Counting from the right, the decimal point of 1.2 sits before the first digit, while that of 4.08 sits before the second digit, so the decimal point has moved. Does that mean we must adjust the position of the decimal point after every multiplication? But in fixed-point arithmetic the position of the decimal point cannot move. The way out of this contradiction is to drop the lowest digit of the product. Keeping one digit after the decimal point, 1.2*3.4 becomes 4.1 if the dropped digit is rounded (or 4.0 if it is simply truncated), and that is the correct result of the fixed-point operation. So when doing fixed-point decimal operations, we need to remember not only the position of the decimal point, but also the number of significant digits used to express the fixed-point decimal. In the above example, there are 2 significant digits, one of them after the decimal point.
Now let's go into binary. Our fixed-point decimal is expressed in 16-bit binary, and the highest digit is the sign bit, so the number of significant digits is 15. There can be 0 - 15 digits after the decimal point. We call the number with n digits after the decimal point Qn. For example, the number with 12 digits after the decimal point is called a fixed-point number in Q12 format, and Q0 is what we call an integer.
The maximum value of a positive number in Q12 is 0 111.111111111111 in binary: the first 0 is the sign bit, followed by 3 integer bits and 12 fractional bits, all of them 1. So what is that in decimal? It is easy to calculate: 0x7fff / 2^12 = 7.999755859375. The value expressed by a fixed-point number in Qn format is its integer value divided by 2^n. Inside the computer, the calculations are still done on integers; we perform this division only when we want to read the number as the actual value it expresses.
Conversely, when an actual value x to be expressed is converted into a fixed-point number of Qn type, it is x*2^n. For example, the Q12 fixed-point decimal of 0.2 is: 0.2*2^12 = 819.2. Since this number is stored as an integer, it is 819, or 0x0333. Because the decimal part is discarded, 0x0333 is not the exact 0.2. In fact, it is 819/2^12 = 0.199951171875.
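The two conversions can be sketched in C as follows. This is a minimal illustration that assumes Q12, a 16-bit container, and an input that fits the Q12 range; the names to_q12 and from_q12 are hypothetical.

```c
#include <stdint.h>

#define Q 12  /* number of fractional bits (Q12 assumed for this sketch) */

/* Real value -> Q12 integer. The cast truncates toward zero, so the
   fractional part of x * 2^12 is discarded, as in the 0.2 example. */
int16_t to_q12(double x)
{
    return (int16_t)(x * (1 << Q));
}

/* Q12 integer -> real value: divide the stored integer by 2^12. */
double from_q12(int16_t q)
{
    return (double)q / (1 << Q);
}
```

Here to_q12(0.2) yields 819 (0x0333), and from_q12(819) yields 0.199951171875, matching the figures above.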
Let's summarize it with a mathematical expression:
x represents the actual number (a floating-point number), and q represents its Qn fixed-point decimal (an integer).
q = (int)(x * 2^n)
x = (float)q / 2^n
From the above formulas, we can quickly derive the +-*/ algorithms for fixed-point decimals:
Assume that the values expressed by q1, q2, and q3 are x1, x2, and x3 respectively.
q3 = q1 + q2          if x3 = x1 + x2
q3 = q1 - q2          if x3 = x1 - x2
q3 = q1 * q2 / 2^n    if x3 = x1 * x2
q3 = q1 * 2^n / q2    if x3 = x1 / x2
We can see that addition and subtraction are the same as ordinary integer operations, but for multiplication and division the value is shifted so that the decimal point of the result does not move.
The multiplication of fixed-point numbers written in C language is:
short q1, q2, q3;
....
/* widen to long before multiplying so the product is not cut off at 16 bits */
q3 = (short)(((long)q1 * (long)q2) >> n);
Since / 2^n and * 2^n can be done with simple shifts, fixed-point operations are much faster than floating-point operations on a chip without floating-point hardware. Let's use an example to verify the above formulas.
Use Q12 to calculate 2.1 * 2.2. First convert 2.1 and 2.2 to Q12 fixed-point decimals (rounded to the nearest integer here):
2.1 * 2^12 = 8601.6, stored as 8602
2.2 * 2^12 = 9011.2, stored as 9011
(8602 * 9011) >> 12 = 18923
The value that 18923 expresses is 18923/2^12 = 4.619873046875, which differs from the exact result 4.62 by 0.000126953125. That is accurate enough for general calculations.
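The worked example can be reproduced with a short C sketch of Q12 conversion, multiplication and division. The function names are hypothetical, the conversion rounds to nearest to match the 8602 and 9011 used above, and overflow is not checked: every input and result is assumed to fit the Q12 range.

```c
#include <stdint.h>

#define Q 12  /* fractional bits; Q12 assumed */

/* Real value -> Q12, rounded to the nearest integer. */
int16_t q12_from_real(double x)
{
    return (int16_t)(x * (1 << Q) + (x >= 0 ? 0.5 : -0.5));
}

/* q3 = q1 * q2 / 2^n, with a 32-bit intermediate product.
   Right-shifting a negative value is implementation-defined in C;
   an arithmetic shift is assumed here. */
int16_t q12_mul(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * (int32_t)b) >> Q);
}

/* q3 = q1 * 2^n / q2; the caller must make sure b is not zero. */
int16_t q12_div(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * (1 << Q)) / b);
}
```

q12_mul(q12_from_real(2.1), q12_from_real(2.2)) returns 18923, the value computed by hand above. As a check on division, q12_div(4096, 8192), that is 1.0 / 2.0, returns 2048, which is 0.5 in Q12.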

////********************************************** ************/////

Many DSPs are fixed-point DSPs, which process fixed-point data very quickly but floating-point data very slowly. You can use the Q format to convert floating-point data to fixed-point data and save CPU time. In practical applications, values often have both an integer part and a fractional part, so you need to choose an appropriate Q format (scaling) to handle the operation well.

　　The Q format is written Qm.n, which means that the data uses m bits for the integer part and n bits for the fractional part. A total of m+n+1 bits are required to represent the data; the extra bit is the sign bit. The decimal point is assumed to lie to the left of the nth bit (counting from right to left), which determines the precision of the fraction.

　　For example, Q15 means that the fractional part has 15 bits. A short occupies 2 bytes: the highest bit is the sign bit, and the following 15 bits are fractional bits. With the decimal point assumed to be to the left of the 15th bit, the range that can be represented is -1 <= X <= 0.9999695 (that is, -32768/2^15 to 32767/2^15).

　　To convert floating-point data to Q15, multiply the data by 2^15; to convert Q15 data to floating-point data, divide the data by 2^15.

　　For example: assuming the data is stored in 2 bytes, 0.333×2^15 = 10911.744, which is stored as the integer 10911 = 0x2A9F, so 0.333 is represented by 0x2A9F in all subsequent operations. Conversely, 10911×2^(-15) = 0.332977294921875. It can be seen that a floating-point value picks up an error when it is converted through the Q format.
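The Q15 conversions and the range stated earlier can be sketched in C as follows. The function names are hypothetical, and real_to_q15 assumes its input already lies within the Q15 range; an input of 1.0 or more would not fit in 16 bits.

```c
#include <stdint.h>

/* Real value -> Q15; the cast truncates the fractional part.
   The input is assumed to satisfy -1 <= x < 1. */
int16_t real_to_q15(double x)
{
    return (int16_t)(x * 32768.0);   /* 32768 = 2^15 */
}

/* Q15 -> real value. */
double q15_to_real(int16_t q)
{
    return q / 32768.0;
}
```

real_to_q15(0.333) gives 10911 (0x2A9F) and q15_to_real(10911) gives 0.332977294921875. The two ends of the range are q15_to_real(INT16_MIN) = -1 and q15_to_real(INT16_MAX) = 0.999969482421875.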

　　Example: Multiply two decimals, 0.333*0.414=0.137862

　　0.333*2^15 = 10911.744, stored as 10911 = 0x2A9F; 0.414*2^15 = 13565.952, stored as 13565 = 0x34FD

　　short a = 0x2A9F;
　　short b = 0x34FD;
　　// Multiplying two Q15 values gives a Q30 product, so shift right by 15 bits to get a Q15 result.
　　// The casts to long keep the product from overflowing where int is only 16 bits wide.
　　short c = (short)(((long)a * (long)b) >> 15);

　　The result c is 0x11A4, or 0001000110100100 in binary. This value is also in Q15 format: with the decimal point to the left of the 15th bit it reads as the binary fraction 0.001000110100100, which is 0.1378173828125 in decimal (equivalently, 0x11A4 / 2^15 = 0.1378173828125). That is very close to the exact result 0.137862.
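The Q15 multiplication can be wrapped in a small function, sketched below. The name q15_mul is hypothetical. The saturation step is not part of the example above; it is added here as an assumption, to handle the one product that does not fit in Q15: -1 * -1 = +1.

```c
#include <stdint.h>

/* Multiply two Q15 values and return a Q15 result. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;  /* Q30 product in 32 bits */
    p >>= 15;                             /* back to Q15 (arithmetic shift assumed) */
    if (p > INT16_MAX) {
        p = INT16_MAX;                    /* saturate: only -1 * -1 reaches this */
    }
    return (int16_t)p;
}
```

q15_mul(0x2A9F, 0x34FD) returns 0x11A4, the result obtained above, and q15_mul(INT16_MIN, INT16_MIN) returns 0x7FFF instead of wrapping around to a negative value.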