CH32V307-Execution Speed

From Stm32World Wiki


While running some tests, I noticed that one extra layer of function call added far more time than it should.

In TIM7_IRQHandler I did the following:

__attribute__((interrupt("WCH-Interrupt-fast"))) void TIM7_IRQHandler() {
    GET_INT_SP();
    rt_interrupt_enter();
    GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_SET);
    GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_RESET);
    set_testpoint(0);
    clear_testpoint(0);
    demiurge_tick();
    TIM7->INTFR = 0;
    rt_interrupt_leave();
    FREE_INT_SP();
}

The set_testpoint() and clear_testpoint() functions are as follows:

void set_testpoint(int point)
{
    GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_SET);
}
void clear_testpoint(int point)
{
    GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_RESET);
}

There is only one extra level of function call. The assembler listing shows that the compiler has not inlined any of this code:


 21                    __attribute__((interrupt("WCH-Interrupt-fast"))) void TIM7_IRQHandler() {
 22 002c 73110134          GET_INT_SP();
 23                        rt_interrupt_enter();
 33                            csrrw sp,mscratch,sp
 34                    # 0 "" 2
 35                     #NO_APP
 36 0030 37140140              li      s0,1073811456
 37 0034 97000000              call    rt_interrupt_enter
 37      E7800000 
 38 003c 13050480              addi    a0,s0,-2048
 39 0040 0546                  li      a2,1
 40 0042 93050008              li      a1,128
 41 0046 97000000              call    GPIO_WriteBit
 41      E7800000 
 42 004e 0146                  li      a2,0
 43 0050 93050008              li      a1,128
 44 0054 13050480              addi    a0,s0,-2048
 45 0058 97000000              call    GPIO_WriteBit
 45      E7800000 
 46 0060 0145                  li      a0,0
 47 0062 97000000              call    set_testpoint
 47      E7800000 
 48 006a 0145                  li      a0,0
 49 006c 97000000              call    clear_testpoint
 49      E7800000 
 50 0074 97000000              call    demiurge_tick
 50      E7800000 
 51 007c B7170040              li      a5,1073745920
 52 0080 23980740              sh      zero,1040(a5)
 53 0084 97000000              call    rt_interrupt_leave
 53      E7800000 
-------
 29                    set_testpoint:
 30 0000 17030000              call    t0,__riscv_save_0
 30      E7020300 
 31 0008 37150140              li      a0,1073811456
 32 000c 0546                  li      a2,1
 33 000e 93050008              li      a1,128
 34 0012 13050580              addi    a0,a0,-2048
 35 0016 97000000              call    GPIO_WriteBit
 35      E7800000 
 36 001e 17030000              tail    __riscv_restore_0
 36      67000300 
 38                            .section        .text.clear_testpoint,"ax",@progbits
 39                            .align  1
 40                            .globl  clear_testpoint
 42                    clear_testpoint:
 43 0000 17030000              call    t0,__riscv_save_0
 43      E7020300 
 44 0008 37150140              li      a0,1073811456
 45 000c 0146                  li      a2,0
 46 000e 93050008              li      a1,128
 47 0012 13050580              addi    a0,a0,-2048
 48 0016 97000000              call    GPIO_WriteBit
 48      E7800000 
 49 001e 17030000              tail    __riscv_restore_0
 49      67000300 


But on the oscilloscope, I get a 70 ns pulse first and a 220 ns pulse thereafter. So why does

 46 0060 0145                  li      a0,0
 47 0062 97000000              call    set_testpoint


take 150 ns to execute? For the direct GPIO_WriteBit call we have five 32-bit instructions and one 16-bit instruction, and inside GPIO_WriteBit we get 3 assembly instructions. At 144 MHz, each fetch cycle is ~7 ns, so 7*9 = 63 ns, which is very close to what I observe. So the

  li a0,0
  call set_testpoint

should only take 14 or maybe 21 ns, not 150 ns!
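The arithmetic above can be checked on a host PC with a small sketch like the one below. It only converts between nanoseconds and core clock cycles at 144 MHz and assumes, as the estimate above does, one cycle per instruction. The helper names (cycle_ns, ns_to_cycles, cycles_to_ns, print_timing_summary) are made up for this illustration and are not part of any vendor library.

```c
#include <stdio.h>

/* Core clock as stated in the text above: 144 MHz. */
#define CORE_CLOCK_HZ 144000000.0

/* Length of one core clock cycle in nanoseconds. */
static double cycle_ns(void)
{
    return 1e9 / CORE_CLOCK_HZ;
}

/* Convert a duration in nanoseconds to core clock cycles. */
static double ns_to_cycles(double ns)
{
    return ns / cycle_ns();
}

/* Convert a number of core clock cycles to nanoseconds. */
static double cycles_to_ns(double cycles)
{
    return cycles * cycle_ns();
}

/* Hypothetical helper: print the estimates next to the scope readings. */
static void print_timing_summary(void)
{
    const double first_pulse_ns  = 70.0;  /* measured, direct GPIO_WriteBit calls */
    const double second_pulse_ns = 220.0; /* measured, via set/clear_testpoint */
    const double extra_ns = second_pulse_ns - first_pulse_ns;

    printf("cycle period        : %.2f ns\n", cycle_ns());
    printf("9 cycles            : %.1f ns\n", cycles_to_ns(9.0));
    printf("2 to 3 cycles       : %.1f to %.1f ns\n",
           cycles_to_ns(2.0), cycles_to_ns(3.0));
    printf("extra time measured : %.0f ns = %.1f cycles\n",
           extra_ns, ns_to_cycles(extra_ns));
}
```

Calling print_timing_summary() from main shows that the exact period is about 6.94 ns and that the extra 150 ns corresponds to roughly 21.6 core cycles. Note that the listing also shows a `call t0,__riscv_save_0` and a `tail __riscv_restore_0` in both set_testpoint and clear_testpoint, so the path between the two pin writes contains more than `li` and `call`. How many cycles those add is not determined here.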

And in fact, the following code takes exactly the same time to execute

   GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_SET);
   uint16_t ch1 = (uint16_t) ((10.0f - outputs[0]) * 204.7f);
   uint16_t ch2 = (uint16_t) ((10.0f - outputs[1]) * 204.7f);
   
   DAC->RD12BDHR = ch1 + (ch2 << 16);
   GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_RESET);

as the simple additional function level

   set_testpoint(0);
   clear_testpoint(0);


So why is this happening??

Hypotheses

  1. Different memory types, with fetches happening at different speeds, perhaps because the code is inside an IRQ handler.
  2. Because it is Saturday and was raining earlier.
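
Whatever the cause turns out to be, one way to keep the test point helpers from distorting a measurement is to remove the extra call layer altogether. The sketch below is only an illustration, not the code used on this page. It assumes that setting and clearing a pin comes down to a single write to a set register and a reset register, which are named BSHR and BCR here on the assumption that they match the vendor's GPIO_TypeDef. The type testpoint_port_t and the functions testpoint_set and testpoint_clear are hypothetical names, and the struct is a stand-in so the example compiles on a host PC.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two GPIO registers used here.
 * On the target, the vendor header's GPIO_TypeDef would be used instead
 * (assumed to expose BSHR for setting and BCR for clearing pins). */
typedef struct {
    volatile uint32_t BSHR; /* writing a 1 bit sets the matching pin   */
    volatile uint32_t BCR;  /* writing a 1 bit clears the matching pin */
} testpoint_port_t;

/* Set the pins in pin_mask with a single register write.
 * always_inline is a GCC attribute that forces inlining, so no call,
 * prologue or epilogue is emitted at the call site. */
static inline __attribute__((always_inline))
void testpoint_set(testpoint_port_t *port, uint16_t pin_mask)
{
    port->BSHR = pin_mask;
}

/* Clear the pins in pin_mask with a single register write. */
static inline __attribute__((always_inline))
void testpoint_clear(testpoint_port_t *port, uint16_t pin_mask)
{
    port->BCR = pin_mask;
}
```

In the interrupt handler this would be used as testpoint_set(port, 0x0080) followed by testpoint_clear(port, 0x0080), where 0x0080 is the mask for pin 7 (the value 128 loaded into a1 in the listing). Whether this brings the second pulse down to the width of the first would have to be confirmed on the oscilloscope.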