Difference between revisions of "CH32V307-Execution Speed"
Latest revision as of 08:46, 3 July 2022
While running some tests, I noticed that one extra level of function call added far more time than it should.
In the TIM7_IRQHandler I did
    __attribute__((interrupt("WCH-Interrupt-fast"))) void TIM7_IRQHandler() {
        GET_INT_SP();
        rt_interrupt_enter();
        GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_SET);
        GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_RESET);
        set_testpoint(0);
        clear_testpoint(0);
        demiurge_tick();
        TIM7->INTFR = 0;
        rt_interrupt_leave();
        FREE_INT_SP();
    }
and the set_testpoint() and clear_testpoint() functions are defined as follows:
    void set_testpoint(int point) {
        GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_SET);
    }

    void clear_testpoint(int point) {
        GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_RESET);
    }
At the source level there is only one extra call per pin write. The assembler listing shows that the compiler has not inlined any of this code:
    21               __attribute__((interrupt("WCH-Interrupt-fast"))) void TIM7_IRQHandler() {
    22 002c 73110134 GET_INT_SP();
    23               rt_interrupt_enter();
    33               csrrw sp,mscratch,sp
    34               # 0 "" 2
    35               #NO_APP
    36 0030 37140140 li s0,1073811456
    37 0034 97000000 call rt_interrupt_enter
    37      E7800000
    38 003c 13050480 addi a0,s0,-2048
    39 0040 0546     li a2,1
    40 0042 93050008 li a1,128
    41 0046 97000000 call GPIO_WriteBit
    41      E7800000
    42 004e 0146     li a2,0
    43 0050 93050008 li a1,128
    44 0054 13050480 addi a0,s0,-2048
    45 0058 97000000 call GPIO_WriteBit
    45      E7800000
    46 0060 0145     li a0,0
    47 0062 97000000 call set_testpoint
    47      E7800000
    48 006a 0145     li a0,0
    49 006c 97000000 call clear_testpoint
    49      E7800000
    50 0074 97000000 call demiurge_tick
    50      E7800000
    51 007c B7170040 li a5,1073745920
    52 0080 23980740 sh zero,1040(a5)
    53 0084 97000000 call rt_interrupt_leave
    53      E7800000
    -------
    29               set_testpoint:
    30 0000 17030000 call t0,__riscv_save_0
    30      E7020300
    31 0008 37150140 li a0,1073811456
    32 000c 0546     li a2,1
    33 000e 93050008 li a1,128
    34 0012 13050580 addi a0,a0,-2048
    35 0016 97000000 call GPIO_WriteBit
    35      E7800000
    36 001e 17030000 tail __riscv_restore_0
    36      67000300
    38               .section .text.clear_testpoint,"ax",@progbits
    39               .align 1
    40               .globl clear_testpoint
    42               clear_testpoint:
    43 0000 17030000 call t0,__riscv_save_0
    43      E7020300
    44 0008 37150140 li a0,1073811456
    45 000c 0146     li a2,0
    46 000e 93050008 li a1,128
    47 0012 13050580 addi a0,a0,-2048
    48 0016 97000000 call GPIO_WriteBit
    48      E7800000
    49 001e 17030000 tail __riscv_restore_0
    49      67000300
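As a rough cross-check of the listing above, the following sketch tallies the machine instructions visible at each call site. It is only a static count under stated assumptions: each `call` and `tail` is counted as two instructions because the listing shows two 32-bit words for each of them, and the bodies of `__riscv_save_0`, `__riscv_restore_0` and `GPIO_WriteBit` are left out because they do not appear in the listing. The `struct op` type and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One listing entry and the number of machine instructions it occupies.
 * Hypothetical helper type, not part of any WCH header. */
struct op {
    const char *text;
    int instructions;
};

/* Direct pin write in TIM7_IRQHandler (listing lines 42-45). */
static const struct op direct_path[] = {
    {"li a2,0", 1},
    {"li a1,128", 1},
    {"addi a0,s0,-2048", 1},
    {"call GPIO_WriteBit", 2},      /* two 32-bit words in the listing */
};

/* Same pin write through clear_testpoint
 * (handler lines 48-49, then clear_testpoint lines 43-49). */
static const struct op wrapped_path[] = {
    {"li a0,0", 1},
    {"call clear_testpoint", 2},
    {"call t0,__riscv_save_0", 2},  /* helper body not in the listing */
    {"li a0,1073811456", 1},
    {"li a2,0", 1},
    {"li a1,128", 1},
    {"addi a0,a0,-2048", 1},
    {"call GPIO_WriteBit", 2},
    {"tail __riscv_restore_0", 2},  /* helper body not in the listing */
};

/* Sum the instruction counts of a path. */
int count_instructions(const struct op *ops, size_t n) {
    int total = 0;
    assert(ops != NULL);
    for (size_t i = 0; i < n; i++) {
        total += ops[i].instructions;
    }
    return total;
}

int direct_count(void) {
    return count_instructions(direct_path,
                              sizeof direct_path / sizeof direct_path[0]);
}

int wrapped_count(void) {
    return count_instructions(wrapped_path,
                              sizeof wrapped_path / sizeof wrapped_path[0]);
}

/* Print both tallies and their difference. */
void print_counts(void) {
    printf("direct:  %d instructions\n", direct_count());
    printf("wrapped: %d instructions\n", wrapped_count());
    printf("extra:   %d instructions (plus the save/restore helper bodies)\n",
           wrapped_count() - direct_count());
}
```

Under these assumptions the wrapped path shows 13 instructions against 5 for the direct call, so the listing itself contains more than one extra call: besides `call clear_testpoint` there is a call into `__riscv_save_0` and a tail jump into `__riscv_restore_0`. Whether that accounts for the whole difference is not something this count can show.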
But on the oscilloscope, I get a 70 ns pulse first and a 220 ns pulse thereafter, a difference of 150 ns. So why does

    46 0060 0145     li a0,0
    47 0062 97000000 call set_testpoint

take 150 ns to execute? For the direct GPIO_WriteBit call we have 5 32-bit and 1 16-bit instruction, and inside GPIO_WriteBit we get 3 assembly instructions. At 144 MHz, each fetch cycle is ~7 ns, so 7*9 = 63 ns, which is very close to the 70 ns I observe. So the

    li a0,0
    call set_testpoint

should only take 14 or maybe 21 ns, not 150 ns!!!
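The arithmetic behind this estimate can be written out as a small sketch. It assumes one instruction per clock cycle with no wait states or pipeline stalls, which is the same simplification the estimate above makes; the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Duration of one clock cycle in nanoseconds. */
double cycle_ns(double clock_hz) {
    assert(clock_hz > 0.0);
    return 1e9 / clock_hz;
}

/* Time for a number of instructions, ASSUMING one instruction per cycle. */
double estimate_ns(int instructions, double clock_hz) {
    return instructions * cycle_ns(clock_hz);
}

/* Number of clock cycles that fit into a measured duration. */
double cycles_in(double duration_ns, double clock_hz) {
    return duration_ns / cycle_ns(clock_hz);
}

/* Print the figures used in the text for a 144 MHz core clock. */
void print_estimates(void) {
    const double clock_hz = 144e6;
    printf("cycle time:              %.2f ns\n", cycle_ns(clock_hz));
    printf("9 instructions:          %.1f ns\n", estimate_ns(9, clock_hz));
    printf("150 ns of extra time is: %.1f cycles\n", cycles_in(150.0, clock_hz));
}
```

With these numbers one cycle is about 6.94 ns, nine instructions come to 62.5 ns, and the unexplained 150 ns corresponds to roughly 21.6 cycles rather than the two or three expected.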
And in fact, the following code takes exactly the same time to execute

    GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_SET);
    uint16_t ch1 = (uint16_t) ((10.0f - outputs[0]) * 204.7f);
    uint16_t ch2 = (uint16_t) ((10.0f - outputs[1]) * 204.7f);

    DAC->RD12BDHR = ch1 + (ch2 << 16);
    GPIO_WriteBit(GPIOA, GPIO_Pin_7, Bit_RESET);

as the single extra level of function call:

    set_testpoint(0);
    clear_testpoint(0);
So why is this happening??
Hypotheses
- The code may be fetched from different memory types at different speeds, perhaps because it runs inside an IRQ handler.
- Because it is Saturday and was raining earlier.