CH32V307-Execution Speed

From Stm32World Wiki
Revision as of 08:22, 3 July 2022 by Niclas


Execution Speed

While running some tests, I noticed that an extra layer of function call added much more time than it should.

In TIM7_IRQHandler I did the following:

__attribute__((interrupt("WCH-Interrupt-fast"))) void TIM7_IRQHandler() {
    GET_INT_SP();
    rt_interrupt_enter();
    GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_SET);
    GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_RESET);
    set_testpoint(0);
    clear_testpoint(0);
    demiurge_tick();
    TIM7->INTFR = 0;
    rt_interrupt_leave();
    FREE_INT_SP();
}

with the set_testpoint(0) and clear_testpoint(0) functions defined as follows:

void set_testpoint(int point)
{
    GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_SET);
}
void clear_testpoint(int point)
{
    GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_RESET);
}

Clearly there is only one extra call. The assembler listing shows that the compiler has not inlined any of this code:


 21                    __attribute__((interrupt("WCH-Interrupt-fast"))) void TIM7_IRQHandler() {
 22 002c 73110134          GET_INT_SP();
 23                        rt_interrupt_enter();
 33                            csrrw sp,mscratch,sp
 34                    # 0 "" 2
 35                     #NO_APP
 36 0030 37140140              li      s0,1073811456
 37 0034 97000000              call    rt_interrupt_enter
 37      E7800000 
 38 003c 13050480              addi    a0,s0,-2048
 39 0040 0546                  li      a2,1
 40 0042 93050008              li      a1,128
 41 0046 97000000              call    GPIO_WriteBit
 41      E7800000 
 42 004e 0146                  li      a2,0
 43 0050 93050008              li      a1,128
 44 0054 13050480              addi    a0,s0,-2048
 45 0058 97000000              call    GPIO_WriteBit
 45      E7800000 
 46 0060 0145                  li      a0,0
 47 0062 97000000              call    set_testpoint
 47      E7800000 
 48 006a 0145                  li      a0,0
 49 006c 97000000              call    clear_testpoint
 49      E7800000 
 50 0074 97000000              call    demiurge_tick
 50      E7800000 
 51 007c B7170040              li      a5,1073745920
 52 0080 23980740              sh      zero,1040(a5)
 53 0084 97000000              call    rt_interrupt_leave
 53      E7800000 
-------
 29                    set_testpoint:
 30 0000 17030000              call    t0,__riscv_save_0
 30      E7020300 
 31 0008 37150140              li      a0,1073811456
 32 000c 0546                  li      a2,1
 33 000e 93050008              li      a1,128
 34 0012 13050580              addi    a0,a0,-2048
 35 0016 97000000              call    GPIO_WriteBit
 35      E7800000 
 36 001e 17030000              tail    __riscv_restore_0
 36      67000300 
 38                            .section        .text.clear_testpoint,"ax",@progbits
 39                            .align  1
 40                            .globl  clear_testpoint
 42                    clear_testpoint:
 43 0000 17030000              call    t0,__riscv_save_0
 43      E7020300 
 44 0008 37150140              li      a0,1073811456
 45 000c 0146                  li      a2,0
 46 000e 93050008              li      a1,128
 47 0012 13050580              addi    a0,a0,-2048
 48 0016 97000000              call    GPIO_WriteBit
 48      E7800000 
 49 001e 17030000              tail    __riscv_restore_0
 49      67000300 


But on the oscilloscope, I get a 70 ns pulse first and a 220 ns pulse thereafter. So why does

 46 0060 0145                  li      a0,0
 47 0062 97000000              call    set_testpoint


take the extra 150 ns (220 ns - 70 ns) to execute? For the call to GPIO_WriteBit we have five 32-bit instructions and one 16-bit instruction, and inside GPIO_WriteBit we get 3 assembly instructions. At 144 MHz, each fetch cycle is ~7 ns, so 7*9 = 63 ns, which is very close to the 70 ns I observe. So the

  li a0,0
  call set_testpoint

should only take 14 or maybe 21 ns, not 150 ns!

So why is this happening?
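As a sanity check on the arithmetic above, here is a minimal sketch of the same estimate in C. It assumes the idealised model used in the text (one instruction per clock cycle, no flash wait states, no pipeline stalls on jumps), and the helper names are made up for illustration.

```c
/* Duration of one core clock cycle in nanoseconds. */
static double cycle_ns(double clock_hz)
{
    return 1e9 / clock_hz;
}

/* Idealised time for n instructions, assuming one instruction per cycle
 * (an assumption of this sketch, not a measured property of the core). */
static double ideal_ns(unsigned n_instructions, double clock_hz)
{
    return n_instructions * cycle_ns(clock_hz);
}

/* How many clock cycles fit into a measured duration. */
static double cycles_in(double duration_ns, double clock_hz)
{
    return duration_ns / cycle_ns(clock_hz);
}
```

With a 144 MHz clock, ideal_ns(9, 144e6) comes out at about 62.5 ns, and cycles_in(150.0, 144e6) at about 21.6. So the unexplained 150 ns corresponds to roughly 21 or 22 clock cycles rather than the 2 or 3 expected.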

Hypotheses

  1. Different memory types, with fetches happening at different speeds. Perhaps because the code is inside an IRQ handler.
  2. Because it is Saturday and was raining earlier.
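One experiment that might help narrow this down is to remove the extra call layer and measure again. The listing shows that each wrapper also calls __riscv_save_0 on entry and tails into __riscv_restore_0 on exit, so the added path is longer than the two instructions at the call site. Below is a minimal sketch of wrappers that GCC is asked to always inline. The names set_testpoint_inline, clear_testpoint_inline and fake_port_a are hypothetical, and the fake register only exists so the sketch builds on a host PC. On the target, the bodies would call GPIO_WriteBit as in the original.

```c
#include <stdint.h>

/* Hypothetical stand-in for the GPIOA output register, so that the sketch
 * builds on a host PC. On the target, the bodies below would call
 * GPIO_WriteBit(GPIOA, GPIO_Pin_7, ...) instead. */
static volatile uint32_t fake_port_a;

#define TESTPOINT_PIN (1u << 7) /* bit 7 = 0x80, the 128 loaded into a1 in the listing */

/* always_inline asks GCC to expand the body at every call site, so no
 * call/return pair is emitted for the wrapper itself. */
static inline __attribute__((always_inline)) void set_testpoint_inline(int point)
{
    (void)point; /* unused, as in the original wrapper */
    fake_port_a |= TESTPOINT_PIN;
}

static inline __attribute__((always_inline)) void clear_testpoint_inline(int point)
{
    (void)point; /* unused, as in the original wrapper */
    fake_port_a &= ~TESTPOINT_PIN;
}
```

If the second pulse shrinks to about the width of the first with wrappers like these, the extra time is probably spent in the call, save and restore sequence rather than in slower instruction fetches.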