Prevention of Program Out of Control Based on MCS-96 Single Chip Microcomputer Control System-EEWORLD

1 Introduction

When a single-chip microcomputer is used to build a control system, meeting the functional requirements is not enough: for the system to be practical, its reliability must also be high. Industrial environments are harsh, and the system is routinely exposed to disturbances such as the starting and stopping of electromagnetic equipment and distortion of the power waveform, so interference is unavoidable. Error-avoidance design alone cannot meet the requirement, and hardware measures cannot be guaranteed to be foolproof, so the system must also be fault tolerant. Anti-interference design and fault-tolerant design (including fault detection and diagnosis) have therefore become as indispensable to single-chip system design as functional design.

At industrial sites, interference rarely damages the hardware of a single-chip system; its main effect is on the running software. Instruction codes or data codes are corrupted, and the program then executes incorrectly. The most typical errors are: 1) The value of the program counter (PC) in the CPU jumps, so the program runs away and executes a meaningless or wrong program segment, leaving the system confused or out of control; in serious cases this can damage equipment or even endanger personal safety; 2) The runaway program operates the output ports illegally, causing the control quantity to fluctuate or the system to "freeze"; 3) The RAM area is disturbed and data is destroyed, so the system runs abnormally and produces wrong outputs. Taking a real-time control system built on the MCS-96 series of single-chip microcomputers as an example, this article proposes some effective and practical measures to prevent the program from running out of control.

2 Methods for capturing the runaway program

2.1 Instruction Redundancy

The part of the microcontroller most susceptible to interference is the internal program counter (PC). Under strong interference the PC value changes, and the new value is random. The CPU then leaves the correct location and continues executing from an arbitrary address in ROM. When the PC lands inside the user's working program area in ROM, instruction redundancy can be used to put the program back on track. The specific measures are: 1) Insert several NOP instructions before instructions that decide the flow of the program, such as SJMP, LJMP, LCALL, CALL, etc.; 2) Insert several NOP instructions before instructions that are critical to the operation of the system, such as interrupt and stack instructions; 3) Insert a NOP instruction after every few instructions in the program; 4) Insert one or two NOP instructions before multi-byte instructions.
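The effect of the NOP padding can be illustrated with a small simulation. This is only a sketch on a made-up toy encoding, not the MCS-96 instruction set: the opcode values, the function `resyncs_at` and the two byte arrays are all assumptions chosen for illustration. It shows a runaway PC that lands on an operand byte and checks whether decoding is back on an instruction boundary when it reaches the protected jump.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy encoding (NOT MCS-96): opcode 0x03 is 3 bytes long, 0x02 is 2 bytes,
 * everything else, including NOP (0x00), is 1 byte. */
static size_t insn_length(uint8_t opcode)
{
    if (opcode == 0x03) return 3;
    if (opcode == 0x02) return 2;
    return 1;
}

/* Decode linearly from `entry` and report whether decoding lands exactly
 * on `target`, the first byte of the instruction being protected. */
bool resyncs_at(const uint8_t *code, size_t len, size_t entry, size_t target)
{
    size_t pc = entry;
    while (pc < target && pc < len)
        pc += insn_length(code[pc]);
    return pc == target;
}

/* A 3-byte instruction followed directly by a critical 3-byte jump. */
const uint8_t unpadded[] = { 0x03, 0x03, 0x55, 0x03, 0x10, 0x20 };

/* The same code with two NOPs inserted before the critical jump. */
const uint8_t padded[] = { 0x03, 0x03, 0x55, 0x00, 0x00, 0x03, 0x10, 0x20 };
```

In the unpadded array, entering at offset 1 treats an operand as a 3-byte opcode and swallows the first byte of the jump. In the padded array the same entry point ends on a NOP, so the jump at offset 5 is decoded correctly. With a longest instruction of three bytes, two NOPs are enough in this toy model.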

Most microcontroller instructions are single-byte instructions, and a program contains only a limited number of interrupt and stack instructions, so this method does not increase the program size by much.

2.2 Software Trap

When the PC lands in an area that holds no program code, such as unused EPROM space or a data table area inside the program, the software trap method is commonly used to get the program back on track.

A software trap is a short guiding instruction sequence that forces the program to a specified address where a dedicated error-handling routine resides. Assuming that the entry label of the error-handling routine is ERROR, the software trap consists of the following three instructions:

NOP
NOP
LJMP ERROR

Besides the unused user EPROM area, software traps are commonly placed in the unused interrupt vector area, at the end of table areas, and after program breakpoints (a breakpoint here means an instruction after which execution does not continue sequentially, such as LJMP, SJMP or RET).
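Filling an unused region with traps is usually done when the EPROM image is built. The following is a minimal sketch of such a fill routine, not the author's tool. The function `fill_traps` is hypothetical, and the opcode values (0FDH for NOP, 0E7H for LJMP) and the assumption that LJMP takes a 16-bit displacement relative to the address of the next instruction are quoted from memory of the 8096 instruction set and should be checked against the data sheet before use.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed opcode values; check them against the instruction set reference. */
#define OP_NOP    0xFDu
#define OP_LJMP   0xE7u
#define TRAP_SIZE 5u    /* NOP, NOP, LJMP + 16-bit displacement */

/* Fill image[start..end) with traps and return the number of traps written.
 * `base` is the address of image[0]; `error_addr` is the address of ERROR.
 * Bytes left over at the end of the region are not modified. */
size_t fill_traps(uint8_t *image, uint16_t base, size_t start, size_t end,
                  uint16_t error_addr)
{
    size_t count = 0;
    while (start + TRAP_SIZE <= end) {
        /* Displacement is assumed relative to the address after LJMP. */
        uint16_t next = (uint16_t)(base + start + TRAP_SIZE);
        uint16_t disp = (uint16_t)(error_addr - next);
        image[start]     = OP_NOP;
        image[start + 1] = OP_NOP;
        image[start + 2] = OP_LJMP;
        image[start + 3] = (uint8_t)(disp & 0xFFu);  /* low byte first */
        image[start + 4] = (uint8_t)(disp >> 8);
        start += TRAP_SIZE;
        count++;
    }
    return count;
}
```

Because the jump is assumed to be relative, every trap gets its own displacement even though all of them target the same ERROR entry.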

2.3 Watchdog (WATCHDOG TIMER)

When the runaway program neither falls into a software trap nor meets redundant instructions, but jumps back and forth within the user program or in address space that the user does not use at all, it forms an endless loop by itself. The solution is to have the software start the microcontroller's monitoring timer, commonly known as the "watchdog", and let it reset the system when this situation occurs. The method is simple and direct: the system returns to normal within at most 64K state times (about 16 ms with a 12 MHz crystal). In normal operation, however, the software must clear the WATCHDOG TIMER at regular intervals shorter than this timeout (for example every 15 ms).
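The timing figures can be checked with a short calculation and a simulated watchdog. This is a sketch under stated assumptions: one state time is taken as three oscillator periods, which is what the 16 ms figure at 12 MHz implies, and `watchdog_t`, `wdt_clear` and `wdt_advance` are hypothetical names that model the timer in software rather than access the real hardware register.

```c
#include <stdbool.h>
#include <stdint.h>

#define WDT_OVERFLOW_STATES 65536u  /* 64K state times, as stated in the text */

/* Timeout in microseconds, assuming one state time = 3 oscillator periods. */
uint32_t wdt_timeout_us(uint32_t osc_hz)
{
    uint64_t periods = (uint64_t)WDT_OVERFLOW_STATES * 3u;
    return (uint32_t)(periods * 1000000u / osc_hz);
}

typedef struct {
    uint32_t count;  /* elapsed state times since the last clear */
    bool     reset;  /* set once the counter overflows */
} watchdog_t;

/* What the main loop does at regular intervals while running normally. */
void wdt_clear(watchdog_t *w)
{
    w->count = 0;
}

/* Advance the simulated watchdog by `states` state times. */
void wdt_advance(watchdog_t *w, uint32_t states)
{
    w->count += states;
    if (w->count >= WDT_OVERFLOW_STATES)
        w->reset = true;
}
```

At 12 MHz the function gives 16384 microseconds. A 15 ms clearing interval corresponds to 60000 state times, which stays below the overflow count, while two such intervals without a clear trigger the reset.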

3 Undisturbed recovery

The measures above only address how to detect that the system has been disturbed and how to capture the runaway program. That is sufficient for ordinary single-chip applications such as patrol inspection and display. In some critical industrial control systems, however, the working process and the production process are logical and sequential. When the program runs out of control, the system should be guided back to the program module in which the runaway occurred; restarting the program from its entry point is undesirable and sometimes not permitted at all. More importantly, a runaway program often writes indiscriminately, which not only destroys important information but also performs illegal operations on the output ports. In such cases the methods above are far from sufficient. Restoring the system's important information and re-entering the normal working state with as little disturbance as possible is therefore a problem that must be solved, and it is a comparatively difficult one.

3.1 How to select the startup mode using software

There are two kinds of reset: the initial reset and a repeated reset. The former is usually called a "cold start" and the latter a "hot start". On a cold start, all system state is invalid and a thorough initialization is required. A hot start only repairs and selectively initializes the current state of the system so that it returns to normal as quickly as possible. When the system is powered on and put into operation for the first time, the start must be a cold start. Resets caused by anti-interference measures during operation are generally hot starts. So that the system can decide correctly which startup method to use, the software usually relies on a "power-on flag" to tell the two apart. The design strategy of the system entry program is shown in Figure 3-1.

In order to ensure a smooth "hot start", you must first disable interrupts, reset the stack, set all I/O ports to a safe state, block I/O operations to prevent the situation from escalating, and then restore information and re-enter the state.
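The power-on flag test can be sketched as follows. Figure 3-1 is not reproduced here, so this is only one plausible reading of the entry logic, and the names `system_ram_t`, `system_entry` and the flag pattern `POWER_ON_FLAG` are assumptions made for illustration. Disabling interrupts, resetting the stack and forcing the I/O ports to a safe state are hardware-specific and are left out of the sketch.

```c
#include <stdint.h>
#include <string.h>

#define POWER_ON_FLAG 0x5AA5u  /* hypothetical flag pattern */

typedef struct {
    uint16_t power_on_flag;  /* survives a watchdog reset, not a power cycle */
    uint16_t setpoint;       /* example of state worth keeping on a hot start */
    uint8_t  scratch[8];     /* temporary data, rebuilt on every start */
} system_ram_t;

typedef enum { COLD_START, HOT_START } start_mode_t;

/* Decide the startup mode from the flag and initialize accordingly. */
start_mode_t system_entry(system_ram_t *ram)
{
    if (ram->power_on_flag != POWER_ON_FLAG) {
        /* Flag missing: RAM content is invalid, initialize everything. */
        memset(ram, 0, sizeof *ram);
        ram->power_on_flag = POWER_ON_FLAG;
        return COLD_START;
    }
    /* Flag present: keep important state, rebuild only temporary data. */
    memset(ram->scratch, 0, sizeof ram->scratch);
    return HOT_START;
}
```

A single 16-bit flag can match random power-up RAM content by chance, so a longer pattern, or several flags in separate locations, reduces the risk of mistaking a cold start for a hot one.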

3.2 Methods for quickly restoring the disturbed program to normal operation

System software consists of programs that each perform a particular function, so it can be divided into functional modules. For the program to re-enter the normal operating state quickly, the system software must first be written with a modular structure and subdivided into functional modules as finely as possible. While it runs, each functional module must keep a record: it sets the valid flag of the RAM area, records its module number and entry address, and records critical data that cannot be regenerated. The module should also send pulses to the operation monitoring system. To determine whether the program has run away, the end of each functional module compares the flag saved in the designated unit with the flag preset in that module. If they differ, the program has run away, and execution is restored to the functional module that corresponds to the saved flag so that it runs again; if they are the same, operation is normal. For a runaway that stays within a functional module, the plausibility of the result can be analyzed and judged according to the specific situation. If the result is unreasonable, the module is executed again; if it is reasonable, the program enters the next functional module. The flow chart of a program with this function is shown in Figure 3-2.
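The flag comparison at the end of each module can be sketched in a few lines. Figure 3-2 is not available here, so the structure is an assumption: `run_record_t`, `module_enter` and `module_exit` are hypothetical names, modules are identified by small integers, and they are assumed to run in a fixed cyclic order.

```c
#include <stdint.h>

typedef struct {
    uint8_t running_module;  /* number recorded by the module on entry */
} run_record_t;

/* Called at the top of each functional module. */
void module_enter(run_record_t *rec, uint8_t module_id)
{
    rec->running_module = module_id;
}

/* Called at the end of each functional module. Returns the module to run
 * next: the recorded module if the flags disagree (the PC arrived here
 * without passing this module's entry), otherwise the successor. */
uint8_t module_exit(const run_record_t *rec, uint8_t module_id,
                    uint8_t module_count)
{
    if (rec->running_module != module_id)
        return rec->running_module;
    return (uint8_t)((module_id + 1u) % module_count);
}
```

If module 1 recorded its number and the PC then jumps into the tail of module 2, the check in module 2 sees the mismatch and sends execution back to module 1. In a real system the record itself would be one of the values protected by redundancy.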

3.3 Method of realizing RAM content self-rescue by using data redundancy technology

In order to ensure that the system can return to normal operation without disturbance, the correctness of important data must be guaranteed. The method to achieve this goal is to use data redundancy technology.

During real-time control, interference can destroy data in RAM. There are generally three cases: 1) The entire RAM area is destroyed; 2) A block of data in RAM is destroyed; 3) Individual data items are destroyed. Since RAM holds the original data, flags, variables and so on, their destruction causes the system to malfunction or stop running. In almost all single-chip real-time control systems, however, most of the content of RAM is held only temporarily for analysis, calculation and comparison, and the data that must not be lost makes up a very small part of it. Apart from that small part, most of the content can tolerate being destroyed for a short time; at worst the system fluctuates briefly and then quickly returns to normal. Real-time software therefore only needs to protect the small amount of data that must not be lost.

The commonly used methods are the "verification method" and the "marking method". Each has its strengths. The verification method is more cumbersome but detects errors with high confidence; the marking method is simple but cannot detect the destruction of individual items inside a data table. The two should be combined in programming, as follows: 1) set a flag code "0" or "1" at the beginning and end of each important region of the RAM working area; 2) set a check word for each fixed data table in RAM.

While the program runs, a previously prepared error-checking routine checks at regular intervals whether each flag code is normal. If one is not, the anti-interference handling routine corrects the data using data redundancy. The general principle of the redundant design is to keep three copies of the data in regions of RAM that are as far apart as possible and away from the stack area. When the data is read, the three copies are compared and a 2-out-of-3 vote determines the correct value.
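The 2-out-of-3 vote and the check word can be sketched as follows. This is a minimal illustration rather than the author's routine: `redundant_u16_t`, `redundant_write`, `redundant_read` and `check_word` are hypothetical names, the three copies are reached through pointers so that they can be placed in widely separated RAM regions, and a simple 16-bit additive sum is assumed for the check word because the text does not specify one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Pointers to three copies of one value, placed far apart in RAM. */
typedef struct {
    uint16_t *copy[3];
} redundant_u16_t;

/* Store the value into all three copies. */
void redundant_write(const redundant_u16_t *v, uint16_t value)
{
    for (int i = 0; i < 3; i++)
        *v->copy[i] = value;
}

/* Vote 2 out of 3. On success the majority value is returned through
 * `out` and the odd copy is repaired; if all three differ, return false. */
bool redundant_read(const redundant_u16_t *v, uint16_t *out)
{
    uint16_t a = *v->copy[0], b = *v->copy[1], c = *v->copy[2];
    uint16_t winner;
    if (a == b || a == c)
        winner = a;
    else if (b == c)
        winner = b;
    else
        return false;
    redundant_write(v, winner);
    *out = winner;
    return true;
}

/* Assumed check word: 16-bit additive sum over a fixed data table. */
uint16_t check_word(const uint8_t *table, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint16_t)(sum + table[i]);
    return sum;
}
```

The vote repairs a single corrupted copy on every read. When all three copies disagree, the caller has to fall back to a safe default or a cold start, since no majority exists.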

3.4 How to lock the output port

To prevent a runaway program from operating the output ports abnormally, which would make the control quantity fluctuate and endanger the system, every operation on an output port must be strictly checked. The solution is to use a locking controller in hardware and a function-block flag together with a password in software.

The locking controller is implemented by two D flip-flops, as shown in Figure 3-3.

Normally, the outputs Q1 and Q2 of the two flip-flops in the locking controller are both low, and as long as either Q1 or Q2 is low, the output channel is blocked. The channel opens only when Q1 and Q2 are both high. To prevent illegal writes to the output channel, the program normally keeps the channel closed by using the port control signals to set Q1 and Q2 low. Only when an output is required does the program open the channel by using the port control signals to set Q1 and Q2 high. Before the program outputs, it must first issue an output command. The flow chart of the output module is shown in Figure 3-4.
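The combined hardware and software lock can be modelled in software to show the intended sequence. Figures 3-3 and 3-4 are not reproduced here, so this is an assumed sequence rather than the author's flow chart: `output_channel_t`, `raw_port_write`, `guarded_output` and the value of `OUTPUT_PASSWORD` are hypothetical, and the two flip-flops are represented by two boolean fields.

```c
#include <stdbool.h>
#include <stdint.h>

#define OUTPUT_PASSWORD 0xA5u  /* hypothetical password value */

typedef struct {
    bool    q1, q2;  /* outputs of the two D flip-flops */
    uint8_t port;    /* value actually present on the output channel */
} output_channel_t;

/* A write reaches the port only while both latches are high. */
bool raw_port_write(output_channel_t *ch, uint8_t value)
{
    if (!(ch->q1 && ch->q2))
        return false;
    ch->port = value;
    return true;
}

/* Output module: check the function-block flag and the password, open the
 * channel, write the value, then close the channel again. */
bool guarded_output(output_channel_t *ch, uint8_t running_module,
                    uint8_t expected_module, uint8_t password, uint8_t value)
{
    if (running_module != expected_module || password != OUTPUT_PASSWORD)
        return false;
    ch->q1 = true;
    ch->q2 = true;
    bool ok = raw_port_write(ch, value);
    ch->q1 = false;
    ch->q2 = false;
    return ok;
}
```

A stray write issued while the channel is closed leaves the port unchanged, and a call that arrives with the wrong module flag or password never opens the latches.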

4 Conclusion

The above measures can effectively improve the reliability of system operation and obtain satisfactory control effects. With slight modifications, they can be used in other types of single-chip microcomputer control systems, and are highly practical and versatile.
