Software Anti-Interference Methods for Single-Chip Microcomputer Systems

Introduction: Alongside efforts to harden hardware against interference, software anti-interference techniques are attracting growing attention because they are flexible to design, consume no additional hardware, and are reliable. This article uses an MCS-51 single-chip microcomputer system as an example to examine software anti-interference methods.

1 Research on software anti-interference methods

In engineering practice, software anti-interference work addresses two main problems: 1. eliminating noise on analog input signals (for example, by digital filtering); 2. bringing a program back on track after it has run away. This article presents several effective software methods for the second problem.

1.1 Instruction Redundancy

The CPU fetches an instruction by reading the opcode first and then the operands. When interference corrupts the program counter (PC), the program leaves its normal path and "runs away". If the runaway PC lands on the operand of a two-byte instruction, the operand is mistaken for an opcode and the program misbehaves. If it lands inside a three-byte instruction, the probability of error is even greater, because two of the three bytes are operands.

Instruction redundancy means deliberately inserting single-byte instructions at key places, or repeating valid single-byte instructions. Usually, two or more NOP instructions are inserted after two-byte and three-byte instructions. Then, even if the runaway program lands on an operand, the NOPs absorb the misdecoding: the instructions that follow are not consumed as operands, and the program returns to the correct instruction boundary by itself.

In addition, inserting two NOPs before instructions that are critical to program flow, such as RET, RETI, LCALL, LJMP, and JC, also brings a runaway program back on track and ensures that these instructions are executed.
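
The effect of NOP padding can be illustrated with a small host-side simulation in C. This is only a sketch under stated assumptions: it knows the byte lengths of a handful of MCS-51 opcodes, treats every other opcode as one byte long, and decodes linearly without following branches. The function names `instr_len` and `fetched_as_opcode` are hypothetical and do not come from the original article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte length of a few MCS-51 instructions. Opcodes not listed here are
   treated as one byte long, which is a simplification for this sketch. */
static size_t instr_len(uint8_t opcode)
{
    switch (opcode) {
    case 0x02: /* LJMP addr16      */
    case 0x12: /* LCALL addr16     */
    case 0x75: /* MOV direct,#data */
    case 0x90: /* MOV DPTR,#data16 */
        return 3;
    case 0x40: /* JC rel           */
    case 0x74: /* MOV A,#data      */
    case 0x80: /* SJMP rel         */
        return 2;
    default:   /* NOP (00H), RET (22H), RETI (32H), ... */
        return 1;
    }
}

/* Decode linearly from `start` (branches are not followed) and report
   whether the byte at `target` is fetched as an opcode (1) or is skipped
   over as an operand of a misdecoded instruction (0). */
int fetched_as_opcode(const uint8_t *code, size_t len,
                      size_t start, size_t target)
{
    assert(code != NULL);
    size_t pc = start;
    while (pc < len && pc < target)
        pc += instr_len(code[pc]);
    return pc == target;
}
```

For example, take the sequence 74 75 22 (MOV A,#75H followed by RET). If the runaway PC lands on the operand 75H, that byte is decoded as a three-byte MOV and the RET is swallowed as an operand. With two NOPs inserted (74 75 00 00 22), the same misdecoding consumes only the NOPs, and the RET is fetched as an opcode again.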

1.2 Interception Technology

Interception means steering a runaway program to a designated location and then handling the error there. Software traps are the usual means of interception. The traps must first be designed properly and then placed at appropriate locations.

1.2.1 Design of software traps

When a runaway program enters a non-program area, redundant instructions no longer help. A software trap intercepts the runaway program and leads it to a designated location, where the error is handled. Here the trap leads the captured program to the reset entry address 0000H. The unused (non-program) area of the EPROM is usually filled with the following instructions as a software trap:

NOP

NOP

LJMP 0000H

Its machine code is 0000020000 (00H 00H for the two NOPs, followed by 02H 00H 00H for LJMP 0000H).

1.2.2 Trap Arrangement

Unused EPROM space is usually filled with 0000020000, and the end of the region should be filled with 020000. When a runaway program falls into this area, it is automatically brought back on track. Free cells between modules in the user program area can also be filled with trap instructions. In addition, an interrupt that the application does not use may be enabled by interference, so a software trap should be placed in the corresponding interrupt service routine to catch the spurious interrupt promptly. For example, even if an application system does not use external interrupt 1, the service routine for external interrupt 1 can take the following form:

NOP

RETI

The return instruction can be either "RETI" or "LJMP 0000H". If the fault diagnosis program and the system self-recovery program are reliable and complete, using "LJMP 0000H" enters the fault diagnosis program directly, so the fault is handled and normal operation is restored as soon as possible.

Given the limited capacity of program memory, two or three software traps per 1 KB of space are generally sufficient for effective interception.
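
Filling an unused region with the trap pattern can be sketched as a host-side C routine applied to a ROM image before programming. This is an illustration, not the author's tool: the function name `fill_traps` and the idea of operating on an in-memory image are assumptions. The pattern is aligned to the end of the region, so the region always ends in 02 00 00 as described above, and any leftover bytes at the start form a valid tail of the pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trap pattern from the text: NOP, NOP, LJMP 0000H. */
static const uint8_t TRAP[5] = {0x00, 0x00, 0x02, 0x00, 0x00};

/* Fill rom[start..end) with the trap pattern, working backwards from the
   end of the region so that the final three bytes are always 02 00 00. */
void fill_traps(uint8_t *rom, size_t start, size_t end)
{
    assert(rom != NULL && start <= end);
    for (size_t i = 0; i < end - start; i++)
        rom[end - 1 - i] = TRAP[4 - (i % 5)];
}
```

Because the fill runs backwards, a region whose length is not a multiple of five starts with a shortened copy of the pattern (for example 02 00 00 for three spare bytes) rather than ending with a truncated one.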

1.3 Software “Watchdog” Technology

If a runaway program enters an infinite loop, "watchdog" technology is usually used to break it out. The running time of the program loop is monitored continuously; if it exceeds the maximum allowed loop time, the system is considered to be stuck in an infinite loop and error handling is invoked.

"Watchdog" technology can be implemented in hardware or in software. In industrial applications, severe interference sometimes corrupts the interrupt control word and disables interrupts. The system then can no longer "feed the dog" on schedule, and the hardware watchdog circuit no longer works as intended. A software watchdog can effectively solve this problem.

In practice, the author uses a ring-shaped interrupt monitoring scheme: timer T0 monitors timer T1, timer T1 monitors the main program, and the main program monitors timer T0. A software "watchdog" with this ring structure has good anti-interference performance and greatly improves system reliability. In measurement and control systems where T1 is tied up with serial communication, the T1 interrupt cannot be used, and the serial port interrupt can do the monitoring instead (on a 52-series microcontroller, T2 can also replace T1 for monitoring).

The monitoring principle is as follows. A run-observation variable is assigned to each of the main program, the T0 interrupt service routine, and the T1 interrupt service routine; call them MWatch, T0Watch, and T1Watch. Each pass of the main loop increments MWatch by 1. Likewise, each execution of the T0 and T1 service routines increments T0Watch and T1Watch by 1. The T0 service routine checks whether T1Watch has changed to determine whether T1 is running normally; the T1 service routine checks MWatch to determine whether the main program is running normally; and the main program checks T0Watch to determine whether T0 is working normally. If an observation variable changes abnormally, for example it should have been incremented but was not, control passes to the error handling program.

Of course, the maximum cycle time of the main loop and the timing periods of T0 and T1 must be chosen carefully so that each monitor allows its target enough time to run. The details are omitted here for reasons of space.
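
The ring structure can be sketched in C as a host-side model. The variable names MWatch, T0Watch, and T1Watch come from the text; everything else is an assumption made for illustration, including the `watch_t` type, the function names, and the tolerance `MAX_MISSES` (how many consecutive checks may see no change before a fault is reported). The article leaves the timing details open, so the tolerance stands in for them here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { MAX_MISSES = 3 };  /* assumed tolerance; depends on the real timing */

typedef struct {
    uint8_t count;      /* incremented by the monitored task           */
    uint8_t last_seen;  /* value of count at the monitor's last check  */
    uint8_t misses;     /* consecutive checks that saw no change       */
} watch_t;

static watch_t MWatch, T0Watch, T1Watch;

static void watch_kick(watch_t *w) { w->count++; }

/* Returns true while the monitored task still appears to be alive. */
static bool watch_check(watch_t *w)
{
    assert(w != NULL);
    if (w->count != w->last_seen) {
        w->last_seen = w->count;
        w->misses = 0;
        return true;
    }
    if (w->misses < MAX_MISSES)
        w->misses++;
    return w->misses < MAX_MISSES;
}

/* Each task bumps its own counter, then checks the next one in the ring.
   A false return means "go to the error handling program". */
bool main_loop_once(void) { watch_kick(&MWatch);  return watch_check(&T0Watch); }
bool t0_isr(void)         { watch_kick(&T0Watch); return watch_check(&T1Watch); }
bool t1_isr(void)         { watch_kick(&T1Watch); return watch_check(&MWatch);  }
```

In this model, if the T1 routine stops being called, `t0_isr` starts returning false after a few invocations, while the other two links of the ring continue to report normal operation.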

2 System Fault Handling and Self-Recovery Program Design

A reset of the microcontroller system caused by interference or by a power failure is an abnormal reset. The system should diagnose the fault and automatically restore the state it was in before the abnormal reset.

2.1 Identification of abnormal reset

Program execution always starts from 0000H. There are four possible reasons for starting from 0000H: 1. power-on reset of the system; 2. reset caused by a software fault; 3. hardware reset caused by a watchdog timeout; 4. power-on reset after power was lost while a task was in progress. All of these except the first are abnormal resets and need to be identified.

2.1.1 Identification of hardware reset and software reset

Here, hardware reset refers to power-on reset and watchdog reset. A hardware reset affects the registers: after reset, PC=0000H, SP=07H, and PSW=00H. A software reset has no effect on SP or PSW. Therefore, in a microcomputer measurement and control system, either set SP to a value greater than 07H during normal operation, or set bit 5 of PSW (the user flag, PSW.5) to 1 during normal operation. After a reset, checking the PSW.5 flag or the SP value is then enough to determine whether the reset was a hardware reset. Figure 1 is a program flow chart that uses PSW.5 as the power-on flag to distinguish hardware resets from software resets.
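
The PSW.5 test can be written as a short C sketch. It assumes the start-up code samples PSW at 0000H before anything else modifies it and passes the value in; on real hardware this check would normally be a few assembly instructions. The names `classify_reset`, `mark_running`, and `reset_kind_t` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PSW_F0 0x20u  /* bit 5 of PSW, the user flag */

static_assert(PSW_F0 == (1u << 5), "the user flag is PSW bit 5");

typedef enum { RESET_HARDWARE, RESET_SOFTWARE } reset_kind_t;

/* psw: the PSW value sampled at 0000H. A hardware reset clears PSW to
   00H, so a flag that is still set means execution reached 0000H
   without a hardware reset. */
reset_kind_t classify_reset(uint8_t psw)
{
    return (psw & PSW_F0) ? RESET_SOFTWARE : RESET_HARDWARE;
}

/* Value to write back to PSW once normal operation begins. */
uint8_t mark_running(uint8_t psw)
{
    return (uint8_t)(psw | PSW_F0);
}
```

After a hardware reset the program would call `mark_running` once initialization completes, so that any later arrival at 0000H with the flag still set is classified as a software reset.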

In addition, the contents of on-chip RAM are random after a hardware reset, whereas a software reset preserves the contents that on-chip RAM held before the reset, so one or two on-chip RAM units can serve as a power-on mark. Suppose unit 40H is the power-on mark and the mark word is 78H. If the content of 40H is not equal to 78H after a reset, the reset is considered a hardware reset; otherwise it is considered a software reset, and control passes to error handling. Using two units as power-on marks makes this judgment more reliable.
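
A two-unit power-on mark might look like the following C sketch, which models on-chip RAM as a 128-byte array. The first mark (40H holding 78H) is taken from the text. The second unit (41H) and its value (87H, the bitwise complement of 78H) are assumptions chosen for illustration, as is the function name `hardware_reset_detected`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MARK_ADDR    0x40u  /* from the text                      */
#define MARK_VALUE   0x78u  /* from the text                      */
#define MARK2_ADDR   0x41u  /* assumed address of the second unit */
#define MARK2_VALUE  0x87u  /* assumed value: complement of 78H   */

/* iram models the 128 bytes of on-chip RAM. Returns true if the marks
   are missing (treated as a hardware reset) and writes them so that a
   later software reset can be recognised. */
bool hardware_reset_detected(uint8_t iram[128])
{
    assert(iram != NULL);
    bool marked = iram[MARK_ADDR] == MARK_VALUE &&
                  iram[MARK2_ADDR] == MARK2_VALUE;
    if (!marked) {
        iram[MARK_ADDR] = MARK_VALUE;
        iram[MARK2_ADDR] = MARK2_VALUE;
    }
    return !marked;
}
```

Requiring both units to match lowers the chance that random RAM contents after power-on happen to look like a valid mark.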

2.1.2 Identification of power-on reset and watchdog fault reset

Power-on reset and watchdog fault reset are both hardware resets, so distinguishing them generally requires non-volatile RAM or EEPROM. Set up an observation unit whose content survives power loss. During normal operation, the interrupt service routine that feeds the watchdog on a timer keeps the observation unit at its normal value (say, AAH), and the main program clears the unit. Because the observation unit survives power loss, checking whether it holds the normal value after the system starts up shows whether the reset came from the watchdog.

2.1.3 Identification of Normal Power-On Reset and Abnormal Power-On Reset

Distinguishing a normal power-on reset from a power-on reset caused by an unexpected event such as a power failure is particularly important for process control systems. For example, suppose a time-based measurement and control system needs 1 hour to complete a task, and an abnormal supply voltage causes a reset 50 minutes into the task. If the system restarts the task from the beginning after the reset, time is wasted unnecessarily. To avoid this, a monitoring unit can record the current operating state and system time, and the control process can be divided into several steps or time periods. After each step or time period completes, the monitoring unit is set to a "shutdown permitted" value; different tasks, and different stages of a task, use different values. While the system is carrying out a task or is within a time period, the monitoring unit is set to an "abnormal shutdown" value. After a reset, the system can then determine its previous operating state from this unit and jump to the error handling program to restore that state.
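
One possible encoding of the monitoring unit is sketched below in C. The article does not specify the values, so everything here is an assumption: the high nibble distinguishes "shutdown permitted" (A0H) from "abnormal shutdown" (50H), the low nibble holds the stage number, and the byte is assumed to live in memory that survives power loss. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_SAFE    0xA0u  /* assumed "shutdown permitted" tag         */
#define TAG_BUSY    0x50u  /* assumed "abnormal shutdown" tag          */
#define TAG_MASK    0xF0u
#define STAGE_MASK  0x0Fu  /* low nibble carries the stage number 0-15 */

/* Call when a stage starts: a reset from here on is abnormal. */
void stage_begin(uint8_t *monitor, uint8_t stage)
{
    assert(monitor != NULL && stage <= STAGE_MASK);
    *monitor = (uint8_t)(TAG_BUSY | stage);
}

/* Call when a stage completes: shutting down is now permitted. */
void stage_done(uint8_t *monitor, uint8_t stage)
{
    assert(monitor != NULL && stage <= STAGE_MASK);
    *monitor = (uint8_t)(TAG_SAFE | stage);
}

/* After a reset: returns true if a stage was interrupted, and stores the
   stage to resume in *resume_stage. */
bool interrupted_stage(uint8_t monitor, uint8_t *resume_stage)
{
    assert(resume_stage != NULL);
    if ((monitor & TAG_MASK) != TAG_BUSY)
        return false;
    *resume_stage = (uint8_t)(monitor & STAGE_MASK);
    return true;
}
```

With this layout, a system that was reset 50 minutes into a one-hour task divided into stages would find the "abnormal shutdown" tag and the stage number after start-up, and could resume from that stage instead of from the beginning.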
