Functional safety is a concept that cannot be avoided in the development of automotive ECUs. For engineers working on automotive software, the functional safety level of an ECU largely determines how demanding its design and development will be. The functional safety standard ISO26262 also defines requirements for each stage of ECU development. The module introduced today is closely related to functional safety: it implements the monitoring and protection mechanisms for Timing required by ISO26262.
1. ISO26262 requirements for Timing
Annex D of ISO26262-6:2018 addresses freedom from interference between software elements in three areas: Timing and execution, Memory, and Exchange of Information. This article focuses on the protection mechanisms related to Timing. For Timing, the standard lists the following faults to consider: blocking of execution, deadlocks, livelocks, incorrect allocation of execution time, and incorrect synchronization between software elements.
In CP AUTOSAR, the timing monitoring and protection required by ISO26262 are implemented mainly by two functional blocks: the SC4-level timing protection mechanism of the OS, and the WdgM watchdog stack.
The OS timing protection works at the Task level: it supervises the execution time of Tasks and of interrupts, including nested interrupts. It will be covered in a later article on the OS.
This article introduces the time monitoring and protection mechanism of WdgM. WdgM provides three kinds of supervision: Alive Supervision, Deadline Supervision and Logical Supervision (program flow monitoring).
2. Overview of WdgM functions
WdgM supervises the timing and the flow of program execution. In the CP AUTOSAR architecture, a monitored unit is called a Supervised Entity (SE), and a specific location inside an SE that is reported to WdgM is called a Checkpoint.
Monitoring covers three aspects: the execution frequency of an SE, so that it runs neither too often nor too rarely; the time elapsed between two Checkpoints (the deadline); and the program flow, that is, the order in which multiple Checkpoints are executed.
The SE monitored by WdgM can be a Function related to functional safety, a SWC, a CDD or even a BSW module. The monitoring modules provided in CP AUTOSAR are WdgM, WdgIf, the internal Wdg Driver and the external Wdg Driver.
Depending on the system definition and the software architecture design, you select the internal or external driver that backs the monitoring of the SEs. The following figure shows the overall logic of WdgM monitoring. Checkpoints are placed in the relevant SWC or CDD, and the WdgM service is triggered periodically to evaluate them. This monitors the program flow and its timing; a deviation in either causes the corresponding Reaction to be triggered through the Wdg Driver.
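As a rough illustration of placing Checkpoints in a SWC, the sketch below reports two Checkpoints from a periodic runnable. WdgM_CheckpointReached is the AUTOSAR service name, but the body here is only a stand-in stub that counts calls. The types are simplified, and the IDs SE_SPEED_CTRL, CP_START, CP_END and the runnable name SpeedCtrl_Runnable are hypothetical.

```c
#include <stdint.h>

/* Simplified stand-ins for the AUTOSAR types; a real project takes them
 * from Std_Types.h and the generated WdgM headers. */
typedef uint8_t  Std_ReturnType;
typedef uint16_t WdgM_SupervisedEntityIdType;
typedef uint16_t WdgM_CheckpointIdType;
#define E_OK     ((Std_ReturnType)0u)
#define E_NOT_OK ((Std_ReturnType)1u)

/* Hypothetical IDs, normally generated from the WdgM configuration. */
#define SE_SPEED_CTRL ((WdgM_SupervisedEntityIdType)0u)
#define CP_START      ((WdgM_CheckpointIdType)0u)
#define CP_END        ((WdgM_CheckpointIdType)1u)
#define CP_COUNT      2u

/* Stub state: how often each Checkpoint of the SE has been reported. */
static uint32_t reached_count[CP_COUNT];

/* Stub with the same shape as the AUTOSAR service; it only counts calls. */
Std_ReturnType WdgM_CheckpointReached(WdgM_SupervisedEntityIdType seid,
                                      WdgM_CheckpointIdType cpid)
{
    (void)seid; /* a single SE is enough for this sketch */
    if (cpid >= CP_COUNT) {
        return E_NOT_OK;
    }
    reached_count[cpid]++;
    return E_OK;
}

/* Hypothetical periodic runnable of a safety-related SWC. */
void SpeedCtrl_Runnable(void)
{
    (void)WdgM_CheckpointReached(SE_SPEED_CTRL, CP_START);
    /* ... the safety-related work would run here ... */
    (void)WdgM_CheckpointReached(SE_SPEED_CTRL, CP_END);
}
```

In a real stack the counting and evaluation happen inside WdgM; the stub only makes the call pattern visible.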
The OS of a CP AUTOSAR ECU extends the OSEK operating system. The extended functions are grouped into scalability classes, SC1 to SC4: SC1 provides only the scheduling functions; SC2 adds timing protection; SC3 adds memory protection; SC4 provides scheduling, timing protection and memory protection together. The operating system can also support multi-core processors.
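The mapping between classes and protection features can be written down as a small lookup. This is only a sketch of the description above; the enum and function names are made up for illustration and are not part of any OS API.

```c
#include <stdbool.h>

/* The four OS classes described in the text (names are hypothetical). */
typedef enum { OS_SC1 = 1, OS_SC2, OS_SC3, OS_SC4 } OsScalabilityClass;

/* Timing protection is part of SC2 and SC4. */
bool os_has_timing_protection(OsScalabilityClass sc)
{
    return (sc == OS_SC2) || (sc == OS_SC4);
}

/* Memory protection is part of SC3 and SC4. */
bool os_has_memory_protection(OsScalabilityClass sc)
{
    return (sc == OS_SC3) || (sc == OS_SC4);
}
```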
3. Alive Supervision
Alive Supervision monitors periodic functions or tasks and detects when they run too frequently or too rarely. It is configured mainly through the following parameters:
WdgMExpectedAliveIndications:
The number of alive indications expected from the Checkpoint within one reference cycle
WdgMSupervisionReferenceCycle:
The length of the reference cycle over which the alive indications of the SE are counted, expressed in WdgM supervision cycles
WdgMMinMargin, WdgMMaxMargin:
How far the number of Checkpoint executions may fall below (Min) or exceed (Max) the expected number
Every time the SE calls WdgM_CheckpointReached, the Alive Counter of the corresponding Checkpoint is incremented by 1. At the end of each WdgMSupervisionReferenceCycle, the WdgM main function checks the value of the Alive Counter.
As long as the Alive Counter lies between (Expected - Min Margin) and (Expected + Max Margin), inclusive, the SE is considered to be running normally. If the Alive Counter is less than (Expected - Min Margin), the SE is considered to be executing too slowly. Conversely, if the Alive Counter is greater than (Expected + Max Margin), the SE is considered to be executing too fast.
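The margin check can be sketched in a few lines of C. This is a minimal illustration rather than the WdgM implementation: the type and function names are hypothetical, the struct fields simply mirror the three configuration values, and the boundaries of the window are treated as inclusive.

```c
#include <stdint.h>

typedef enum {
    ALIVE_OK,
    ALIVE_TOO_SLOW,
    ALIVE_TOO_FAST
} AliveResult;

/* Hypothetical container for the configured values. */
typedef struct {
    uint32_t expected_indications; /* expected alive indications per cycle */
    uint32_t min_margin;           /* WdgMMinMargin */
    uint32_t max_margin;           /* WdgMMaxMargin */
} AliveConfig;

/* Compare the Alive Counter collected over one reference cycle with the
 * window [expected - min_margin, expected + max_margin]. */
AliveResult alive_evaluate(const AliveConfig *cfg, uint32_t alive_counter)
{
    /* A signed 64-bit difference avoids unsigned wrap-around. */
    int64_t diff = (int64_t)alive_counter - (int64_t)cfg->expected_indications;

    if (diff < -(int64_t)cfg->min_margin) {
        return ALIVE_TOO_SLOW;
    }
    if (diff > (int64_t)cfg->max_margin) {
        return ALIVE_TOO_FAST;
    }
    return ALIVE_OK;
}
```

With 10 expected indications, a Min Margin of 1 and a Max Margin of 2, counter values from 9 to 12 are accepted, 8 is reported as too slow and 13 as too fast.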
4. Deadline Supervision
Deadline Supervision monitors SEs that do not run periodically. It checks that a Checkpoint of the SE is reached within a defined time window after an event occurs. The window is defined by a minimum and a maximum time after the event. If the Checkpoint is reached inside this window, execution is considered normal. If it is reached earlier than the minimum time or later than the maximum time, it is considered an error. The following parameters are configured:
WdgMDeadlineStartRef:
The reference event (start Checkpoint) at which Deadline Supervision begins
WdgMDeadlineStopRef:
The final Checkpoint of the Deadline Supervision, reported through WdgM_CheckpointReached
WdgMDeadlineMin:
The minimum time allowed from the reference event to the final Checkpoint. A shorter time is considered an error.
WdgMDeadlineMax:
The maximum time allowed from the reference event to the final Checkpoint. A longer time is also considered an error.
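The window check for a deadline can be sketched as follows. This is an illustration under assumptions, not the WdgM implementation: the names are hypothetical, time is taken from a free-running 32-bit tick counter, and the configured minimum and maximum are assumed to be already converted to ticks.

```c
#include <stdint.h>

typedef enum {
    DEADLINE_OK,
    DEADLINE_TOO_EARLY,
    DEADLINE_TOO_LATE
} DeadlineResult;

/* Hypothetical container for the configured window, in timer ticks. */
typedef struct {
    uint32_t min_ticks; /* from WdgMDeadlineMin */
    uint32_t max_ticks; /* from WdgMDeadlineMax */
} DeadlineConfig;

/* start_tick: counter value when the reference event was reported.
 * end_tick:   counter value when the final Checkpoint was reported. */
DeadlineResult deadline_evaluate(const DeadlineConfig *cfg,
                                 uint32_t start_tick, uint32_t end_tick)
{
    /* Unsigned subtraction is modulo 2^32, so one counter wrap between
     * start and end still yields the correct elapsed time. */
    uint32_t elapsed = end_tick - start_tick;

    if (elapsed < cfg->min_ticks) {
        return DEADLINE_TOO_EARLY;
    }
    if (elapsed > cfg->max_ticks) {
        return DEADLINE_TOO_LATE;
    }
    return DEADLINE_OK;
}
```

Note that a check like this only runs when the final Checkpoint is actually reported; a Checkpoint that never arrives has to be detected separately, for example in the periodic main function.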
5. Logical Supervision
Logical Supervision, also called program flow monitoring, checks that the program executes its Checkpoints in a permitted order. Each Logical Supervision has a Graph that describes the allowed transitions between the Checkpoints of the SE along the control flow. A simple program flow in an SE looks as follows:
CP0_0   i = 0;
CP0_1   while (i < n)
        {
CP0_2       if (a[i] < b[i])
CP0_3           a[i] = b[i];
CP0_4       else
                a[i] = 0;
CP0_5       i++;
CP0_6   }
For this simple program flow, program flow monitoring can be defined from CP0_0 to CP0_6. A transition Graph describes the Checkpoints and the allowed transitions between them, as shown below:
Logical Supervision uses two types of Graph: the internal Graph and the external Graph. In an internal Graph, all Checkpoints and transitions belong to the same SE and are connected by internal transitions; an SE can have zero or one internal Graph. An external Graph contains Checkpoints that belong to at least two different SEs, and these Checkpoints are connected by external transitions.
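One way to picture an internal Graph is as a transition table that is consulted every time a Checkpoint is reported. The sketch below encodes one plausible reading of the CP0_0 to CP0_6 example: CP0_1 is the loop condition, CP0_5 leads back to it, and CP0_6 is reached when the loop exits. That reading, and all type and function names, are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

enum { CP0_0, CP0_1, CP0_2, CP0_3, CP0_4, CP0_5, CP0_6, CP0_COUNT };

/* allowed[from][to] is true when the transition is part of the Graph. */
static const bool allowed[CP0_COUNT][CP0_COUNT] = {
    [CP0_0] = { [CP0_1] = true },
    [CP0_1] = { [CP0_2] = true, [CP0_6] = true }, /* enter or leave loop */
    [CP0_2] = { [CP0_3] = true, [CP0_4] = true }, /* if or else branch */
    [CP0_3] = { [CP0_5] = true },
    [CP0_4] = { [CP0_5] = true },
    [CP0_5] = { [CP0_1] = true },                 /* back to the condition */
    /* CP0_6 is the final Checkpoint: no outgoing transitions. */
};

typedef struct {
    bool    active; /* true once the initial Checkpoint has been seen */
    uint8_t last;   /* most recently accepted Checkpoint */
} FlowState;

/* Returns true if reporting `cp` is a legal step in the Graph. */
bool flow_checkpoint_reached(FlowState *st, uint8_t cp)
{
    if (cp >= CP0_COUNT) {
        return false;
    }
    if (!st->active) {
        if (cp != CP0_0) {
            return false;       /* the Graph must start at CP0_0 */
        }
        st->active = true;
        st->last = cp;
        return true;
    }
    if (!allowed[st->last][cp]) {
        return false;           /* transition not in the Graph */
    }
    st->last = cp;
    if (cp == CP0_6) {
        st->active = false;     /* final Checkpoint: Graph may restart */
    }
    return true;
}
```

For a single loop iteration that takes the if branch, the sequence CP0_0, CP0_1, CP0_2, CP0_3, CP0_5, CP0_1, CP0_6 is accepted, while jumping from CP0_0 straight to CP0_2 is rejected.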
6. Local Status & Global Status
In WdgM, each SE has its own Local Status that reflects the results of its Alive, Deadline and Logical Supervision. WdgM also has a single Global Status that represents the state of the monitoring function as a whole.
After WdgM initialization is completed, the Local Status of each SE and the Global Status are in the OK state. The Local Status of an SE can be OK, DEACTIVATED, FAILED or EXPIRED; the Global Status uses the same values and additionally has a STOPPED state.
While the SEs are being monitored, the MainFunction sets the Local Status of each SE according to the supervision results. Alive Supervision has its own status handling, while Deadline and Logical Supervision share one.
The MainFunction also derives the Global Status from the Local Status of all configured SEs. If the final Global Status indicates an error, the user can conclude that a timing or scheduling problem has caused a fault in the program, and can then trigger the corresponding error handling and fault reaction.
The following figure shows the entire WdgM state management. For specific WdgM state switching, you can refer to the CP AUTOSAR WdgM standard.
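The way a Global Status follows from the Local Status of all SEs can be sketched as a "worst status wins" reduction. This is a deliberately simplified illustration with hypothetical names; it leaves out the tolerance counters and the further state transitions that the AUTOSAR WdgM specification defines.

```c
#include <stddef.h>

typedef enum {
    LOCAL_OK,
    LOCAL_FAILED,
    LOCAL_EXPIRED,
    LOCAL_DEACTIVATED
} LocalStatus;

typedef enum {
    GLOBAL_OK,
    GLOBAL_FAILED,
    GLOBAL_EXPIRED
} GlobalStatus;

/* Derive one Global Status from the Local Status of every SE:
 * any EXPIRED SE dominates, otherwise any FAILED SE, otherwise OK.
 * Deactivated SEs are not supervised and do not degrade the result. */
GlobalStatus global_status_from_locals(const LocalStatus *locals, size_t count)
{
    GlobalStatus result = GLOBAL_OK;

    for (size_t i = 0; i < count; i++) {
        if (locals[i] == LOCAL_EXPIRED) {
            return GLOBAL_EXPIRED;
        }
        if (locals[i] == LOCAL_FAILED) {
            result = GLOBAL_FAILED;
        }
    }
    return result;
}
```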
7. Error Handling And Reaction
WdgM's handling of timing and program execution errors mainly includes the following options:
① When an error in the Global Status is detected, the affected SWC or CDD can be notified through a callback function, so that the application layer can run its recovery mechanism.
② Report the error status to DEM, so that the fault is managed and processed centrally by DEM.
③ Reset the Partition in which the affected function and its Task run, or shut down that Partition directly.
④ Call the error handling function to reset the ECU software.
⑤ Reset the MCU through the external Wdg module and cut off the power supply to the MCU.
The concrete error handling logic should be chosen according to the overall functional safety goals.
8. Aurix Tricore Wdg Timeout monitoring implementation
In addition to the time-related monitoring above, detecting a Timeout during program execution is also important. The Aurix Tricore chip has four Wdg modules: one safety watchdog, and one Wdg for each of the three cores.
Typically, the Wdg of each core is used by the WdgM of that core to monitor whether the corresponding program has a Timeout, and an external Wdg can be used to monitor the operation of the entire MCU. The chip also has a dedicated safety management unit, the SMU, which manages safety-related functions. Based on the Wdg in use and the SMU Alarm associated with it, the SMU implements the safety management of, and the safety reaction to, timing and program flow errors.
When a Timeout occurs, recovery can be attempted in the SMU, or the User can be notified through a function call or an interrupt. If recovery fails, the software can be reset directly. Alternatively, an external device such as an ASIC can be connected through a dedicated pin (e.g. the Error Pin) so that it powers off the MCU when recovery from the Timeout fails, as shown below:
That is all for this issue. Everyone is welcome to discuss and learn together, and please point out anything that is incorrect. I hope to keep learning and improving with everyone, and to do solid work in automotive software development step by step.