CH32V307-Execution Speed
While running some tests, I noticed that one extra level of function call adds far more time than it should.
In the TIM7_IRQHandler I have:
__attribute__((interrupt("WCH-Interrupt-fast"))) void TIM7_IRQHandler() {
    GET_INT_SP();
    rt_interrupt_enter();
    // First pulse: toggle the pin directly
    GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_SET);
    GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_RESET);
    // Second pulse: same toggle through one wrapper level
    set_testpoint(0);
    clear_testpoint(0);
    demiurge_tick();
    TIM7->INTFR = 0;
    rt_interrupt_leave();
    FREE_INT_SP();
}
and the set_testpoint() and clear_testpoint() functions are as follows:
void set_testpoint(int point)
{
    GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_SET);
}
void clear_testpoint(int point)
{
    GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_RESET);
}
At source level there is literally only one extra call. The assembler listing shows that the compiler has not inlined any of this code:
21 __attribute__((interrupt("WCH-Interrupt-fast"))) void TIM7_IRQHandler() {
22 002c 73110134 GET_INT_SP();
23 rt_interrupt_enter();
33 csrrw sp,mscratch,sp
34 # 0 "" 2
35 #NO_APP
36 0030 37140140 li s0,1073811456
37 0034 97000000 call rt_interrupt_enter
37 E7800000
38 003c 13050480 addi a0,s0,-2048
39 0040 0546 li a2,1
40 0042 93050008 li a1,128
41 0046 97000000 call GPIO_WriteBit
41 E7800000
42 004e 0146 li a2,0
43 0050 93050008 li a1,128
44 0054 13050480 addi a0,s0,-2048
45 0058 97000000 call GPIO_WriteBit
45 E7800000
46 0060 0145 li a0,0
47 0062 97000000 call set_testpoint
47 E7800000
48 006a 0145 li a0,0
49 006c 97000000 call clear_testpoint
49 E7800000
50 0074 97000000 call demiurge_tick
50 E7800000
51 007c B7170040 li a5,1073745920
52 0080 23980740 sh zero,1040(a5)
53 0084 97000000 call rt_interrupt_leave
53 E7800000
-------
29 set_testpoint:
30 0000 17030000 call t0,__riscv_save_0
30 E7020300
31 0008 37150140 li a0,1073811456
32 000c 0546 li a2,1
33 000e 93050008 li a1,128
34 0012 13050580 addi a0,a0,-2048
35 0016 97000000 call GPIO_WriteBit
35 E7800000
36 001e 17030000 tail __riscv_restore_0
36 67000300
38 .section .text.clear_testpoint,"ax",@progbits
39 .align 1
40 .globl clear_testpoint
42 clear_testpoint:
43 0000 17030000 call t0,__riscv_save_0
43 E7020300
44 0008 37150140 li a0,1073811456
45 000c 0146 li a2,0
46 000e 93050008 li a1,128
47 0012 13050580 addi a0,a0,-2048
48 0016 97000000 call GPIO_WriteBit
48 E7800000
49 001e 17030000 tail __riscv_restore_0
49 67000300
But on the oscilloscope, I get a 70ns pulse first and a 220ns pulse thereafter. So why does
46 0060 0145 li a0,0
47 0062 97000000 call set_testpoint
take 150ns to execute? For the direct GPIO_WriteBit call we have 5 32-bit and 1 16-bit instruction, and inside GPIO_WriteBit we get 3 assembly instructions. At 144MHz, each fetch cycle is ~7ns, so assuming roughly one instruction per cycle, 7*9 = 63ns, which is very close to the 70ns I observe. So the
li a0,0
call set_testpoint
should only take 14 or maybe 21ns, not 150ns!
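To double-check the arithmetic above, here is a minimal sketch in C that converts between core cycles and time at 144MHz. The helper names cycles_to_ps and ns_to_cycles are made up for this post, and the one-instruction-per-cycle figure is only my working assumption, not something taken from the datasheet.

```c
#include <stdint.h>

/* Core clock from the post: 144 MHz. */
#define CORE_CLOCK_HZ 144000000ULL

/* Hypothetical helper: duration of `cycles` core cycles in picoseconds,
 * rounded to the nearest picosecond (integer math only). */
static uint64_t cycles_to_ps(uint64_t cycles)
{
    return (cycles * 1000000000000ULL + CORE_CLOCK_HZ / 2) / CORE_CLOCK_HZ;
}

/* Hypothetical helper: number of core cycles in `ns` nanoseconds,
 * rounded to the nearest whole cycle. */
static uint64_t ns_to_cycles(uint64_t ns)
{
    return (ns * CORE_CLOCK_HZ + 500000000ULL) / 1000000000ULL;
}
```

With these, cycles_to_ps(9) gives 62500 (62.5ns, matching the ~63ns estimate), while ns_to_cycles(150) gives 22. So the extra wrapper level costs about 22 core cycles per call instead of the 2 or 3 I expected.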
And in fact, the following code takes exactly the same time to execute
GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_SET);
uint16_t ch1 = (uint16_t) ((10.0f - outputs[0]) * 204.7f);
uint16_t ch2 = (uint16_t) ((10.0f - outputs[1]) * 204.7f);
DAC->RD12BDHR = ch1 + (ch2 << 16);
GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_RESET);
as the simple additional function level
set_testpoint(0);
clear_testpoint(0);
So why is this happening??
Hypotheses
- Different memory types, with fetches happening at different speeds, perhaps because the code is inside an IRQ handler.
- Because it is Saturday and was raining earlier.