Introduction to RISC-V PMP physical memory protection
1. PMP Overview
PMP (Physical Memory Protection) is similar to the MPU on ARM Cortex-M and can be considered a simplified version of an MMU (Memory Management Unit). Usually, several groups of registers are provided to control the access attributes of several blocks of memory, generally a pair of registers per block: one sets the address range and the other the access attributes.
An important use in an RTOS is protecting the task stack: on each context switch, the PMP is programmed so that only the stack space of the current task can be accessed, so stack overflows and wild-pointer accesses can be detected. Some regions, such as the space used by the kernel, can also be made accessible only at a privileged level, so that the kernel running in privileged mode can access them while a user program running in unprivileged mode cannot. An illegal access raises an exception, which is then handled in the trap handler.
- PMP checks are applied to all accesses whose effective privilege mode is S or U, including:
- instruction fetches in S and U modes;
- data accesses in S and U modes when the MPRV bit in mstatus is clear;
- data accesses in any mode when the MPRV bit in mstatus is set and the MPP field in mstatus is S or U.
- PMP checks also apply to page-table accesses for virtual-address translation, for which the effective privilege mode is S.
- Note: PMP checks can optionally be applied to M-mode accesses (by setting the L attribute), in which case the PMP registers themselves are locked, so that even M-mode software cannot change them until the hart is reset.
PMP_M means that PMP checking is enforced in M mode as well. The L attribute is not used here because a PMP entry cannot be reconfigured once L is set; it can only be cleared by a reset. With PMP_M, PMP takes effect in M mode while the entries can still be reconfigured dynamically.
PMP violations are always trapped precisely on the processor, so the exact address of the violating access can be determined from the exception CSRs.
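Because the trap is precise, an M-mode trap handler can recover the faulting address and PC from mcause, mtval and mepc. A minimal sketch (assuming a read_csr() macro like the one used by the driver code later in this article):

/* Sketch: identify a PMP access fault inside an M-mode trap handler. */
void report_pmp_fault(void)
{
    unsigned long cause = read_csr(mcause); /* 1: instruction, 5: load, 7: store/AMO access fault */
    unsigned long addr  = read_csr(mtval);  /* address that triggered the fault */
    unsigned long epc   = read_csr(mepc);   /* PC of the faulting instruction */
    if (cause == 1 || cause == 5 || cause == 7) {
        /* log 'addr' and 'epc', then decide whether to recover or assert */
        (void)addr;
        (void)epc;
    }
}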
2. Related registers
PMP is configured through a set of entries, each consisting of an 8-bit configuration field (held in a CSR) and an MXLEN-bit address CSR. Up to 64 entries can be provided; implementations typically provide 0, 16, or 64, with the lowest-numbered entries implemented first.
The PMP CSRs are accessible only in M mode and are WARL (or hard-wired to zero): writing an illegal value does not raise an exception, and a read always returns a legal value even if the last write was illegal.
Since each configuration field is 8 bits, the 64 fields pmp0cfg–pmp63cfg are packed into the 16 registers pmpcfg0–pmpcfg15 (four per register on RV32). On RV64 they are packed eight per register into the even-numbered registers pmpcfg0, pmpcfg2, …, pmpcfg14, to stay compatible with RV32. Lower-numbered fields occupy the lower-order bytes of each register.
The 64 addresses are held in the 64 registers pmpaddr0–pmpaddr63.
For RV32 a register encodes bits 33–2 of a 34-bit address, i.e. the 34-bit address shifted right by 2 bits.
For RV64 a register encodes bits 55–2 of a 56-bit address, i.e. the 56-bit address shifted right by 2 bits.
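As a small illustration (not part of the driver), converting between a physical address and the pmpaddr register encoding is just the shift by 2 described above:

/* illustration only: pmpaddr holds the physical address shifted right by 2 */
static inline uint32_t phys_to_pmpaddr(uint32_t phys) { return phys >> 2; }
static inline uint32_t pmpaddr_to_phys(uint32_t reg)  { return reg << 2; }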
The chip used here implements 16 entries, so four pmpcfg registers are visible:
(gdb) info reg pmpcfg0
pmpcfg0 0x0 0
(gdb) info reg pmpcfg1
pmpcfg1 0x0 0
(gdb) info reg pmpcfg2
pmpcfg2 0x0 0
(gdb) info reg pmpcfg3
pmpcfg3 0x1100 4352
(gdb) info reg pmpcfg4
Invalid register `pmpcfg4'
(gdb)
From the output above, pmpcfg3 is 0x1100, so pmp13cfg is 0x11: the A field is NA4 and only R is set, i.e. a read-only 4-byte region.
The 16 address registers:
(gdb) info reg pmpaddr0
pmpaddr0 0x0 0
(gdb) info reg pmpaddr1
pmpaddr1 0x0 0
(gdb) info reg pmpaddr2
pmpaddr2 0x0 0
(gdb) info reg pmpaddr3
pmpaddr3 0x0 0
(gdb) info reg pmpaddr4
pmpaddr4 0x0 0
(gdb) info reg pmpaddr5
pmpaddr5 0x0 0
(gdb) info reg pmpaddr6
pmpaddr6 0x0 0
(gdb) info reg pmpaddr7
pmpaddr7 0x0 0
(gdb) info reg pmpaddr8
pmpaddr8 0x0 0
(gdb) info reg pmpaddr9
pmpaddr9 0x0 0
(gdb) info reg pmpaddr10
pmpaddr10 0x0 0
(gdb) info reg pmpaddr11
pmpaddr11 0x0 0
(gdb) info reg pmpaddr12
pmpaddr12 0x0 0
(gdb) info reg pmpaddr13
pmpaddr13 0xa0e0b06 168692486
(gdb) info reg pmpaddr14
pmpaddr14 0x0 0
(gdb) info reg pmpaddr15
pmpaddr15 0x0 0
(gdb) info reg pmpaddr16
Invalid register `pmpaddr16'
(gdb)
Entry 13's address is 0xa0e0b06 << 2 = 0x28382C18.
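Purely as an illustration, the same decoding can be written out in C (the field masks follow the layout described in the next section):

/* illustration: decode entry 13 from the values dumped above */
uint32_t cfg3  = 0x1100;                /* pmpcfg3 as read from gdb */
uint32_t cfg13 = (cfg3 >> 8) & 0xff;    /* byte 1 of pmpcfg3 -> pmp13cfg = 0x11 */
uint32_t a     = (cfg13 >> 3) & 0x3;    /* A field = 2 -> NA4 */
uint32_t r     = cfg13 & 0x1;           /* R = 1, W = X = 0 -> read-only */
uint32_t base  = 0x0a0e0b06u << 2;      /* pmpaddr13 -> 0x28382c18 */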
Configuration properties
The 8-bit configuration field is laid out as follows: bit 7 = L (lock), bits 6–5 reserved (bit 6 is used as PMP_M in the driver below), bits 4–3 = A (address matching mode), bit 2 = X, bit 1 = W, bit 0 = R.
R, W, and X set mean the region is readable, writable, and executable respectively; a cleared bit removes that permission.
Violating X: an instruction access-fault exception is raised on instruction fetch.
Violating R: a load access-fault exception is raised on LOAD-type accesses.
Violating W: a store/AMO access-fault exception is raised on STORE or AMO accesses.
A field: address matching mode
These modes support a 4-byte granularity.
When A=0 (OFF), the entry is disabled and matches no address.
When A=NAPOT, the region is naturally aligned and its size is a power of two.
A=NA4 is a special case of the above: a naturally aligned 4-byte region.
NAPOT mode starts at the encoded address (with the trailing 1s masked off), and the size is 2^(number of trailing 1s + 3).
When A=NAPOT, the low-order bits of the address register encode the size of the range:
if the number of trailing 1s in pmpaddr is G,
the size is 2^(G+3) bytes.
For example, 0x20000003 means 2^(2+3) = 32 bytes starting at 0x20000000 << 2 = 0x80000000.
Remember that the register value is the address shifted right by 2 bits.
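A small helper (illustrative; it assumes size is a power of two of at least 8 bytes and base is aligned to size) that produces exactly this encoding:

/* illustration: encode a NAPOT region into a pmpaddr value.
 * base must be aligned to size; size must be a power of two >= 8. */
static uint32_t napot_encode(uint32_t base, uint32_t size)
{
    return (base | (size / 2 - 1)) >> 2;
}
/* napot_encode(0x80000000, 32) == 0x20000003, matching the example above */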
A=TOR: top-of-range mode, bounding an arbitrary range with two address registers that give the start and end addresses.
The address register associated with the TOR entry gives the top boundary, and the preceding PMP address register gives the bottom boundary.
That is, if the A field of entry i is set to TOR, the entry matches any address y with pmpaddr(i−1) ≤ y < pmpaddr(i); the configuration of entry i−1 (pmpcfg(i−1)) is not consulted for this match.
If pmpaddr(i−1) ≥ pmpaddr(i) and pmpcfg(i).A=TOR, then PMP entry i matches no address.
If entry 0 is set to TOR, the range is y < pmpaddr(0).
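For instance (values chosen only for illustration), covering the range [0x20000000, 0x20001000) with entry i looks like this:

/* illustration: a TOR pair covering [0x20000000, 0x20001000) with entry i */
/* pmpaddr(i-1) = 0x20000000 >> 2;   bottom of the range                  */
/* pmpaddr(i)   = 0x20001000 >> 2;   top of the range, exclusive          */
/* pmpcfg(i).A  = TOR; pmpcfg(i-1) is not consulted for this match        */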
Locking and Privilege Levels
L means lock. When it is set, writes to the corresponding configuration and address registers are ignored until the hart is reset.
Correspondingly, if pmpcfg(i).A is TOR, then pmpaddr(i−1) is locked in addition to pmpaddr(i).
The lock imposed by L remains in effect even when the A field is OFF.
Besides locking the PMP entry, the L bit also controls whether the R/W/X permissions are enforced on M-mode accesses. When L is set, the permission checks apply to all privilege modes; when L is clear, any M-mode access that matches the entry succeeds, and the R/W/X permissions apply only to S and U modes.
That is, if PMP is to constrain M mode, the L bit must be set. But once L is set, the configured address can no longer be rewritten, so the dynamic switching needed for RTOS stack protection cannot be implemented that way; only a few fixed regions and attributes could be locked down?
Perhaps later revisions add something like the PMP_M attribute?
Priority
PMP table entries have a static priority. The lowest numbered PMP table entry that matches any byte of an access determines whether that access succeeds or fails. A matching PMP table entry must match all bytes of an access or the access will fail, regardless of the L, R, W, and X bit fields. For example, if a PMP table entry is configured to match the four-byte range 0xC-0xF, an eight-byte access to the range 0x8-0xF will fail, assuming the PMP table entry is the highest priority entry matching those addresses.
Once the address matches, the permission bits are checked:
If the PMP table entry matches all bytes of the access, the L, R, W, and X bits determine whether the access succeeds or fails. If the L bit is clear and the privilege mode of the access is M, the access succeeds. Otherwise, if the L bit is set or the privilege mode of the access is S or U, the access succeeds only if the R, W, or X bit corresponding to the access type is set.
That is, when L=0, an M-mode access that matches the entry always succeeds.
If no PMP entry matches an M-mode access, the access succeeds. If no PMP entry matches an S-mode or U-mode access but at least one PMP entry is implemented, the access fails. In other words, once PMP is implemented at all, it must be configured before running U- or S-mode code. If at least one PMP entry is implemented but every entry's A field is set to OFF, then all S-mode and U-mode memory accesses fail.
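The rules above can be condensed into the following pseudocode sketch (a summary of the spec's behaviour, not a description of the hardware; the helper names are placeholders):

/* sketch: PMP check for one access (addr, len, type, mode) */
for (int i = 0; i < pmp_entries; i++) {
    if (!matches_any_byte(i, addr, len))
        continue;
    if (!matches_all_bytes(i, addr, len))
        return FAULT;                      /* partial match always fails */
    if (mode == M_MODE && !entry_locked(i))
        return ALLOW;                      /* unlocked entries do not restrict M mode */
    return has_permission(i, type) ? ALLOW : FAULT;
}
return (mode == M_MODE) ? ALLOW : FAULT;   /* no match: M succeeds, S/U fail if any entry exists */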
A failed access will generate an Instruction, Load, or Store Access Fault exception. Note that a single instruction may generate multiple accesses, which may not be atomic with respect to each other. An Access Fault exception is generated if at least one access generated by an instruction fails, although other accesses generated by that instruction may succeed and produce visible side effects. It is worth noting that instructions that reference virtual memory are decomposed into multiple accesses.
In some implementations, unaligned loads, stores, and instruction fetches may also be decomposed into multiple accesses, some of which may succeed before an access fault exception occurs. In particular, part of an unaligned store that passes the PMP check may be visible even if another part does not pass the PMP check. The same behavior may also manifest for floating-point stores wider than XLEN bits (e.g., the FSD instruction in RV32D), even when the store address is naturally aligned.
Physical memory protection and paging
The physical memory protection mechanism is designed to work with a page-based virtual memory system. When paging is enabled, instructions that access virtual memory may result in multiple physical memory accesses, including implicit references to page tables. PMP checks apply to all of these accesses. The effective privilege mode for implicit page table accesses is S.
Implementations that use virtual memory are permitted to perform address translations speculatively, earlier than an explicit memory access requires, and to cache them in address-translation structures, possibly including the identity mappings from effective address to physical address used in Bare translation mode and in M mode. The PMP check of the resulting physical address may be performed (and possibly cached) at any point between the address translation and the explicit memory access. Therefore, when the PMP settings are modified, M-mode software must synchronize them with the virtual memory system and any PMP or address-translation caches. This is done by executing an SFENCE.VMA instruction with rs1=x0 and rs2=x0 after writing the PMP CSRs.
If page-based virtual memory is not implemented, memory accesses will synchronously check the PMP setting, so SFENCE.VMA is not needed.
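In C this synchronization can be expressed as a single inline-assembly fence after the PMP CSR writes (only required on harts that implement page-based virtual memory):

/* flush cached address translations / PMP results after reprogramming PMP */
static inline void pmp_sync(void)
{
    __asm__ volatile ("sfence.vma x0, x0" ::: "memory");
}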
3. Driver code
src/lib/riscv/inc/pmp.h
The enumeration defined there corresponds to the 8-bit configuration field. There is an additional PMP_M bit, which will be explained later.
enum {
PMP_R = 1 << 0,
PMP_W = 1 << 1,
PMP_X = 1 << 2,
PMP_A = 3 << 3,
PMP_M = 1 << 6,
PMP_L = 1 << 7,
};
The interfaces below are analyzed one by one.
src/lib/riscv/src/pmp.c
/* reset PMP setting */
void reset_pmp(void);
/* set up PMP record */
void setup_pmp(uint32_t base, uint32_t size, uint32_t flags);
void pmp_isr_stack_set(void);
void pmp_zero_addr_set(void);
void pmp_task_stack_set(uint32_t addr);
void pmp_cache_in_isr_set(void);
void pmp_cache_in_isr_clear(void);
void pmp_init(void);
Define the A attribute
enum {
PMP_OFF = 0, /* Null (off) */
PMP_TOR = 1 << 3, /* Top of Range */
PMP_NA4 = 2 << 3, /* Naturally aligned four-byte region */
PMP_NAPOT = 3 << 3, /* Naturally aligned power-of-two region */
};
Define the cache size
#define CACHE_NAPOT_BIT 24
Encapsulate macros for writing CSR
#define _PMP_WRITE_CSR(index, addr) write_csr(pmpaddr##index, addr)
#define PMP_WRITE_CSR(index, addr) _PMP_WRITE_CSR(index, addr)
Two layers of wrapping are used here because, when ## appears in a macro definition, an argument adjacent to it is not macro-expanded even if it is itself a macro. So in
#define PMP_WRITE_CSR(index, addr) _PMP_WRITE_CSR(index, addr)
the arguments index and addr are expanded first, and only then does
#define _PMP_WRITE_CSR(index, addr) write_csr(pmpaddr##index, addr)
paste the already-expanded tokens together.
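A minimal standalone illustration of the difference (the names here are just for the example):

#define IDX 13
#define _W(index, addr) write_csr(pmpaddr##index, addr)
#define  W(index, addr) _W(index, addr)

_W(IDX, 0)   /* pastes before expanding: write_csr(pmpaddrIDX, 0)  -- wrong   */
 W(IDX, 0)   /* expands IDX first:       write_csr(pmpaddr13, 0)   -- correct */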
Defining Macros
#define USE_NAPOT
#define PMP_SHIFT 2
#define GRANULE (1 << PMP_SHIFT)
Reserve several entries for fixed use
#define PMP_FIXED_INDEX_ISR_STACK 15
#define PMP_FIXED_INDEX_ZERO_ADDR 14
#define PMP_FIXED_INDEX_TASK_STACK 13
#define PMP_FIXED_INDEX_CACHE_IN_ISR 12
Define a data structure to save a table entry
/*
* This structure is used to temporarily record PMP
* configuration information.
*/
typedef struct {
/* used to record the value of pmpcfg[i] */
uint32_t cfg;
/*
* When generating a TOR type configuration,
* the previous entry needs to record the starting address.
* used to record the value of pmpaddr[i - 1]
*/
uint32_t previous_address;
/* used to record the value of pmpaddr[i] */
uint32_t address;
} pmpcfg_t;
Allocation Management
/* This variable is used to record which entries have been used. */
static uint32_t pmp_entry_used_mask = 0;
static int pmp_entries_num(void)
{
return 16;
}
/*
* find empty PMP entry by type
* TOR type configuration requires two consecutive PMP entries,
* others requires one.
*/
static int find_empty_pmp_entry(int is_range)
{
int free_entries = 0;
for (int i = 0; i < pmp_entries_num(); i++) {
if (pmp_entry_used_mask & ((uint32_t)1 << i)) {
free_entries = 0;
} else {
free_entries++;
}
if (is_range && (free_entries == 2)) {
return i;
}
if (!is_range && (free_entries == 1)) {
return i;
}
}
// Too many PMP configurations, no free entries can be used!
assert(0);
return -1;/*lint !e527 Unreachable code at token 'return'. */
}
/*
* mark PMP entry has be used
* this function need be used with find_entry_pmp_entry
*
* n = find_empty_pmp_entry(is_range)
* ... // PMP set operate
* mask_pmp_entry_used(n);
*/
static void mask_pmp_entry_used(int idx)
{
pmp_entry_used_mask |= (uint32_t)1 << idx;
}
static void mask_pmp_entry_free(int idx)
{
pmp_entry_used_mask &= ~((uint32_t)1 << idx);
}
Our chip supports 16 entries. A global variable is used as a bitmask recording which entries are in use: a set bit means the corresponding entry is used, a clear bit means it is free.
In find_empty_pmp_entry, if is_range is non-zero, two consecutive free entries are needed (for TOR mode); otherwise a single free entry is enough (for NAPOT/NA4 mode).
Read and write configuration and address registers
/* helper function used to read pmpcfg[idx] */
static uint32_t read_pmpcfg(int idx)
{
#if __riscv_xlen == 32
int shift = 8 * (idx & 3);
switch (idx >> 2) { /* >>2: divide by 4, one register holds four config bytes */
case 0:
return (read_csr(pmpcfg0) >> shift) & 0xff; /* >>shift: byte offset within the register, 8*(idx&3) bits */
case 1:
return (read_csr(pmpcfg1) >> shift) & 0xff;
case 2:
return (read_csr(pmpcfg2) >> shift) & 0xff;
case 3:
return (read_csr(pmpcfg3) >> shift) & 0xff;
default:
break;
}
#elif __riscv_xlen == 64
int shift = 8 * (idx & 7);
switch (idx >> 3) {
case 0:
return (read_csr(pmpcfg0) >> shift) & 0xff;
case 1:
return (read_csr(pmpcfg2) >> shift) & 0xff;
default:
break;
}
#endif
return (uint32_t) -1;
}
/* helper function used to write pmpcfg[idx] */
static void write_pmpcfg(int idx, uint32_t cfg)
{
uint32_t old;
uint32_t new;
#if __riscv_xlen == 32
int shift = 8 * (idx & 3);
switch (idx >> 2) {
case 0:
old = read_csr(pmpcfg0);
new = (old & ~((uint32_t)0xff << shift))
| ((cfg & 0xff) << shift);
write_csr(pmpcfg0, new);
break;
case 1:
old = read_csr(pmpcfg1);
new = (old & ~((uint32_t)0xff << shift))
| ((cfg & 0xff) << shift);
write_csr(pmpcfg1, new);
break;
case 2:
old = read_csr(pmpcfg2);
new = (old & ~((uint32_t)0xff << shift))
| ((cfg & 0xff) << shift);
write_csr(pmpcfg2, new);
break;
case 3:
old = read_csr(pmpcfg3);
new = (old & ~((uint32_t)0xff << shift))
| ((cfg & 0xff) << shift);
write_csr(pmpcfg3, new);
break;
default:
break;
}
#elif __riscv_xlen == 64
int shift = 8 * (idx & 7);
switch (idx >> 3) {
case 0:
old = read_csr(pmpcfg0);
new = (old & ~((uint32_t)0xff << shift))
| ((cfg & 0xff) << shift);
write_csr(pmpcfg0, new);
break;
case 1:
old = read_csr(pmpcfg2);
new = (old & ~((uint32_t)0xff << shift))
| ((cfg & 0xff) << shift);
write_csr(pmpcfg2, new);
break;
default:
break;
}
#endif
if (read_pmpcfg(idx) != cfg)
// write pmpcfg failure!
{
assert(0);
}
}
/* helper function used to read pmpaddr[idx] */
static uint32_t read_pmpaddr(int idx)
{
switch (idx) {
case 0:
return read_csr(pmpaddr0);
case 1:
return read_csr(pmpaddr1);
case 2:
return read_csr(pmpaddr2);
case 3:
return read_csr(pmpaddr3);
case 4:
return read_csr(pmpaddr4);
case 5:
return read_csr(pmpaddr5);
case 6:
return read_csr(pmpaddr6);
case 7:
return read_csr(pmpaddr7);
case 8:
return read_csr(pmpaddr8);
case 9:
return read_csr(pmpaddr9);
case 10:
return read_csr(pmpaddr10);
case 11:
return read_csr(pmpaddr11);
case 12:
return read_csr(pmpaddr12);
case 13:
return read_csr(pmpaddr13);
case 14:
return read_csr(pmpaddr14);
case 15:
return read_csr(pmpaddr15);
default:
break;
}
return (uint32_t) -1;
}
/* helper function used to write pmpaddr[idx] */
static void write_pmpaddr(int idx, uint32_t val)
{
switch (idx) {
case 0:
write_csr(pmpaddr0, val);
break;
case 1:
write_csr(pmpaddr1, val);
break;
case 2:
write_csr(pmpaddr2, val);
break;
case 3:
write_csr(pmpaddr3, val);
break;
case 4:
write_csr(pmpaddr4, val);
break;
case 5:
write_csr(pmpaddr5, val);
break;
case 6:
write_csr(pmpaddr6, val);
break;
case 7:
write_csr(pmpaddr7, val);
break;
case 8:
write_csr(pmpaddr8, val);
break;
case 9:
write_csr(pmpaddr9, val);
break;
case 10:
write_csr(pmpaddr10, val);
break;
case 11:
write_csr(pmpaddr11, val);
break;
case 12:
write_csr(pmpaddr12, val);
break;
case 13:
write_csr(pmpaddr13, val);
break;
case 14:
write_csr(pmpaddr14, val);
break;
case 15:
write_csr(pmpaddr15, val);
break;
default:
break;
}
if (read_pmpaddr(idx) != val)
// write pmpaddr failure
{
assert(0);
}
}
Generating configuration entries
#ifdef USE_NAPOT
/* Generate a PMP configuration of type NA4/NAPOT */
static pmpcfg_t generate_pmp_napot(uint32_t base, uint32_t size, uint32_t flags)
{
pmpcfg_t p;
flags = flags & (PMP_R | PMP_W | PMP_X);
p.cfg = flags | (size > GRANULE ? PMP_NAPOT : PMP_NA4) | PMP_M;
p.previous_address = 0;
p.address = (base + (size / 2 - 1)) >> PMP_SHIFT;
return p;
}
#endif
The above generates a power-of-two region, in PMP_NAPOT or PMP_NA4 mode.
The >> PMP_SHIFT is the right shift by 2 bits mentioned above (the 32-bit register represents a 34-bit address): the register value is the address shifted right by two bits.
flags | (size > GRANULE ? PMP_NAPOT : PMP_NA4) selects PMP_NAPOT when the size is greater than 4 bytes and PMP_NA4 otherwise.
size / 2 - 1 produces the trailing 1s that encode the size.
For example, if size is 32, then 32/2 - 1 = 15 = 0xF; after >> PMP_SHIFT (2) this contributes 0x3, i.e. two trailing 1s, so per the NAPOT encoding table (Table 3.11 of the spec) the size is 2^(2+3) = 32.
The caller is required to pass a base aligned to size, and size must be a power of two.
/* Generate a PMP configuration of type TOR */
static pmpcfg_t generate_pmp_range(
uint32_t base, uint32_t size, uint32_t flags)
{
pmpcfg_t p;
flags = flags & (PMP_R | PMP_W | PMP_X);
p.cfg = flags | PMP_TOR | PMP_M;
p.previous_address = base >> PMP_SHIFT;
p.address = (base + size) >> PMP_SHIFT;
return p;
}
The above configures a TOR range that starts at base and ends at base + size (exclusive).
Again, the register value is the address shifted right by 2 bits.
/* Generate a PMP configuration */
static pmpcfg_t generate_pmp(uint32_t base, uint32_t size, uint32_t flags)
{
#ifdef USE_NAPOT
if ((size >= 4) && IsPowerOfTwo(size) && ((base & (size - 1)) == 0)) {
return generate_pmp_napot(base, size, flags);
} else {
return generate_pmp_range(base, size, flags);
}
#else
return generate_pmp_range(base, size, flags);
#endif
}
generate_pmp chooses between the two methods automatically, based on the macro and the arguments: with USE_NAPOT defined, if the size is at least 4 bytes, is a power of two, and the base is aligned to the size, the NAPOT method is used; otherwise the TOR method is used.
/* set up PMP record */
void setup_pmp(uint32_t base, uint32_t size, uint32_t flags)
{
pmpcfg_t p;
int is_range, n;
p = generate_pmp(base, size, flags);
is_range = ((p.cfg & PMP_A) == PMP_TOR) ? 1 : 0;
n = find_empty_pmp_entry(is_range);
write_pmpaddr(n, p.address);
if (is_range) {
write_pmpaddr(n - 1, p.previous_address);
}
write_pmpcfg(n, p.cfg);
mask_pmp_entry_used(n);
if (is_range) {
mask_pmp_entry_used(n - 1);
}
}
setup_pmp writes the attributes and addresses. Depending on whether generate_pmp chose TOR or NAPOT,
it programs either two consecutive entries or a single one.
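Two hypothetical calls (addresses made up for illustration) show the two paths:

/* hypothetical: 4 KiB, size-aligned, power-of-two -> one NAPOT entry */
setup_pmp(0x20001000, 0x1000, PMP_R);

/* hypothetical: 0x180 bytes is not a power of two -> TOR, two consecutive entries */
setup_pmp(0x20003210, 0x180, PMP_R | PMP_W);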
Reset
/* reset PMP setting */
void reset_pmp(void)
{
// release lock bit first
for (int i = 0; i < pmp_entries_num(); i++) {
#if 0
if (read_pmpcfg(i) & PMP_L)
// Some PMP configurations are locked and cannot be reset!
{
assert(0);
}
#endif
write_pmpcfg(i, 0);
}
for (int i = 0; i < pmp_entries_num(); i++) {
write_pmpaddr(i, 0);
mask_pmp_entry_free(i);
}
}
reset_pmp sets every configuration to 0 and clears the allocation bitmask.
Resetting everything like this is only safe while running purely in M mode. If U or S mode is used, attributes must stay configured, e.g. by keeping a full-coverage range, because as quoted above: if at least one PMP entry is implemented but every entry's A field is set to OFF, all S-mode and U-mode memory accesses fail.
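If code also has to run in S or U mode, a common pattern (a sketch, not something this driver does) is to program the lowest-priority entry as a full-coverage "background" region that allows everything, with more restrictive entries placed in front of it:

/* sketch: an all-ones NAPOT pmpaddr covers the entire address space.
 * Index 15 is used here only as an example of "the last entry"; in this
 * driver entry 15 is already reserved for the ISR stack. */
write_pmpaddr(15, 0xffffffffu);
write_pmpcfg(15, PMP_NAPOT | PMP_R | PMP_W | PMP_X);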
Interrupt stack area settings
void pmp_isr_stack_set(void)
{
extern uint32_t __stack_top;
/*lint -esym(526,__stack_size) __stack_size defined in file "link.lds"*/
extern uint32_t __stack_size;
uint32_t address = ((&__stack_top - &__stack_size) + 3) >> 2;
uint32_t flags = PMP_NA4 | PMP_M | PMP_R;
/* clear */
write_pmpcfg(PMP_FIXED_INDEX_ISR_STACK, 0);
/* index 15 */
PMP_WRITE_CSR(PMP_FIXED_INDEX_ISR_STACK, address);
write_pmpcfg(PMP_FIXED_INDEX_ISR_STACK, flags);
}
__stack_top and __stack_size are the stack top and stack size defined in the linker script; subtracting them gives the bottom of the stack.
The +3 rounds up to 4-byte alignment, and >> 2 converts the address into the register encoding (address shifted right by 2 bits).
The attribute is read-only, in PMP_NA4 mode, using entry 15 (PMP_FIXED_INDEX_ISR_STACK).
Call path: main -> main_lib_init -> pmp_init
Here NA4 mode marks the 4 bytes at the bottom of the stack read-only, which implements a basic stack-overflow check: writing those 4 bytes faults, while reading them does not.
In practice this is limited, because an overflowing write may skip over those 4 bytes. It would be better to protect a whole region (its size configured by a macro: too large wastes memory, too small weakens detection), and to make it neither readable nor writable, since a read overflow is also a problem and should also trap; see the sketch below.
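A possible variant along those lines (a sketch, not the shipped code): protect a whole guard band at the stack bottom with no permissions at all. STACK_GUARD_SIZE is a hypothetical macro; it must be a power of two and the guard base must be aligned to it for the NAPOT path of setup_pmp to be usable.

#define STACK_GUARD_SIZE 32  /* hypothetical; tune per project */

/* sketch: no-access guard band at the bottom of the ISR stack */
void pmp_isr_stack_guard_set(void)
{
    extern uint32_t __stack_top;
    extern uint32_t __stack_size;
    uint32_t base = (uint32_t)&__stack_top - (uint32_t)&__stack_size;
    /* no R/W/X bits: both read and write overflows into the guard band fault */
    setup_pmp(base, STACK_GUARD_SIZE, 0);
}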
Address-0 setting
void pmp_zero_addr_set(void)
{
uint32_t address = 0;
uint32_t flags = PMP_NA4 | PMP_M;
/* clear */
write_pmpcfg(PMP_FIXED_INDEX_ZERO_ADDR, 0);
/* index 14 */
PMP_WRITE_CSR(PMP_FIXED_INDEX_ZERO_ADDR, address);
write_pmpcfg(PMP_FIXED_INDEX_ZERO_ADDR, flags);
}
Call path: main -> main_lib_init -> pmp_init
Here the 4 bytes starting at address 0 are made neither readable nor writable, in NA4 mode, which catches accesses through uninitialized (NULL) pointers.
Task Stack
void pmp_task_stack_set(uint32_t addr) IRAM_TEXT(pmp_task_stack_set);
void pmp_task_stack_set(uint32_t addr)
{
uint32_t address = (addr + 3) >> 2;
uint32_t flags = PMP_NA4 | PMP_M | PMP_R;
/* clear */
write_pmpcfg(PMP_FIXED_INDEX_TASK_STACK, 0);
/* index 13 */
PMP_WRITE_CSR(PMP_FIXED_INDEX_TASK_STACK, address);
write_pmpcfg(PMP_FIXED_INDEX_TASK_STACK, flags);
}
Call path: vTaskSwitchContext -> pmp_task_stack_set, with the address set to pxCurrentTCB->pxStack.
As with the interrupt stack, the bottom of the task stack is checked for write overflows; a sketch of how the hook can be wired in follows.
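One way to wire this in on a FreeRTOS-style kernel is through the task-switch trace hook; this is only a sketch, since the article just states the vTaskSwitchContext -> pmp_task_stack_set call path:

/* sketch: FreeRTOSConfig.h -- re-point entry 13 at the incoming task's
 * stack bottom on every context switch */
#define traceTASK_SWITCHED_IN() \
    pmp_task_stack_set((uint32_t)pxCurrentTCB->pxStack)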
Cache settings
void pmp_cache_in_isr_set(void) IRAM_TEXT(pmp_cache_in_isr_set);
void pmp_cache_in_isr_set(void)
{
uint32_t flags = PMP_NAPOT | PMP_M;
#if defined(BUILD_CORE_CORE0)
uint32_t cache_start = (DTOP_CACHE_START + (BIT(CACHE_NAPOT_BIT) - 1)) >> 2;
#elif defined(BUILD_CORE_CORE1)
uint32_t cache_start = (BT_CACHE_START + (BIT(CACHE_NAPOT_BIT) - 1)) >> 2;
#endif
/* clear */
write_pmpcfg(PMP_FIXED_INDEX_CACHE_IN_ISR, 0);
/* index 12 */
PMP_WRITE_CSR(PMP_FIXED_INDEX_CACHE_IN_ISR, cache_start);
write_pmpcfg(PMP_FIXED_INDEX_CACHE_IN_ISR, flags);
}
void pmp_cache_in_isr_clear(void) IRAM_TEXT(pmp_cache_in_isr_clear);
void pmp_cache_in_isr_clear(void)
{
/* clear */
write_pmpcfg(PMP_FIXED_INDEX_CACHE_IN_ISR, 0);
}
cache_start is the cache base address combined with the NAPOT size bits (BIT(CACHE_NAPOT_BIT) - 1), shifted right by 2 for the register encoding.
@todo Why is index 11 not set here? Is a start address missing?
Initialization
void pmp_init(void)
{
pmp_isr_stack_set();
pmp_zero_addr_set();
#if defined(LOW_POWER_ENABLE)
iot_dev_pm_register(&pmp_pm);
#endif
}
4. Conclusion
The detections above (task-stack overflow, address-0 access, ISR-stack overflow) all work by giving one specific region a non-writable (or non-accessible) attribute while leaving every other region unconfigured. This relies on the M-mode rule that an access matching no PMP entry succeeds.
The same approach cannot be used for U and S modes: there, an access that matches no entry fails, so U/S-mode configurations can only list the regions that are accessible, not the ones that are inaccessible. The logic is reversed.