mailto: blog -at- heyrick -dot- eu


Fun with the ABC compiler

In my recent blog post "BASIC weirdness", both David Pilling and Gavin Wraith pondered the degree of optimisation applied by ABC. David commented on the speed of ABC doing very little and wondered if it is optimising, while Gavin talked about hashing and asked "Does ABC do this?"

I looked at ABC generated code a long time ago and noted that it was functional rather than intelligent. To this end, I'm going to pull apart the timed pH test loop program to see exactly what was generated, and to demonstrate that... optimisation? Rather an unknown concept!

ABC, itself, is starting to lose utility. Not because of the non-standard (read "broken") support for LOCAL variables, but because it is drifting ever further from standard BASIC. Hell, it can't even cope with COLOUR r%, g%, b% yet, and that was added to BASIC in '94 for RISC OS 3.50! I've asked, a lot, for the ABC sources to try to bring the language more up to date... but there's a point when you just give up asking.
ABC, which is now supplied with the DDE, is the only solution for compiling BASIC on modern 32 bit systems. There are arguments against compiling, but if you need greater speed than pure BASIC or if you wish to write a commercial product in BASIC and you're worried about being ripped off...ABC is still there for you.

Important: If you don't know what LDR R0, [R13, #-4] does, give up now. This is the geek equivalent of grindcore.

 

Still with me? Okay, here's the test program in BASIC. Nothing special.

REM >pHtest

PROCtest(2)
PROCtest(7)
PROCtest(12)
END

DEFPROCtest(A%)
  PRINT "Testing evaluation speed with pH value of ";A%
  T% = TIME : REPEAT : UNTIL (T% <> TIME)
  T% = TIME
  FOR l% = 1 TO 1000000
    CASE TRUE OF
      WHEN A% < 6 : REM Acid
      WHEN A% > 8 : REM Base
      OTHERWISE   : REM Okay
    ENDCASE
  NEXT
  PRINT "Case took ";TIME - T%
  :
  T% = TIME : REPEAT : UNTIL (T% <> TIME)
  T% = TIME
  FOR l% = 1 TO 1000000
    IF A% < 6 THEN
      REM "Acid"
    ELSE
      IF A% > 8 THEN
        REM Base
      ELSE
        REM Good
      ENDIF
    ENDIF
  NEXT
  PRINT "Nested IF took ";TIME - T%
ENDPROC

 

Now, here's an annotated disassembly of the business part of the executable, as generated by ABC 4.14:

[runtime jump table and such is above this]

000088E4 : .*Te : 65542A00 : LDRVSB  R2,[R4,#-2560]      ; String "Testing evaluation speed with pH value of "
000088E8 : stin : 6E697473 : MCRVS   CP4,3,R7,C9,C3,3
000088EC : g ev : 76652067 : STRVCBT R2,[R5],-R7,RRX
000088F0 : alua : 61756C61 : Undefined instruction
000088F4 : tion : 6E6F6974 : MCRVS   CP9,3,R6,C15,C4,3
000088F8 :  spe : 65707320 : LDRVSB  R7,[R0,#-800]!
000088FC : ed w : 77206465 : STRVC   R6,[R0,-R5,ROR #8]!
00008900 : ith  : 20687469 : RSBCS   R7,R8,R9,ROR #8
00008904 : pH v : 76204870 : Undefined instruction
00008908 : alue : 65756C61 : LDRVSB  R6,[R5,#-3169]!
0000890C :  of  : 20666F20 : RSBCS   R6,R6,R0,LSR #30
00008910 : ..Ne : 654E0F00 : STRVSB  R0,[R14,#-3840]     ; String "Nested IF took "
00008914 : sted : 64657473 : STRVSBT R7,[R5],#-1139
00008918 :  IF  : 20464920 : SUBCS   R4,R6,R0,LSR #18
0000891C : took : 6B6F6F74 : BLVS    &01BE46F4
00008920 :  ... : 00000020 : ANDEQ   R0,R0,R0,LSR #32
00008924 : ..AR : 52410300 : SUBPL   R0,R1,#&00,6        ; "ARM"?
00008928 : M... : 0000004D : ANDEQ   R0,R0,R13,ASR #32
0000892C : ..Ca : 61430A00 : Undefined instruction
00008930 : se t : 74206573 : STRVCT  R6,[R0],#-1395      ; String "Case took "
00008934 : ook  : 206B6F6F : RSBCS   R6,R11,PC,ROR #30
There's no terminator as the byte immediately prior to the string is the length.

Program start - this corresponds to the beginning of the BASIC program.
> PROCtest(2)
00008938 : .... : E3A00002 : MOV     R0,#2
0000893C : .... : EB00000B : BL      &00008970           ; PROCtest(2)
> PROCtest(7)
00008940 : .... : E3A00007 : MOV     R0,#7               
00008944 : .... : EB000009 : BL      &00008970           ; PROCtest(7)
> PROCtest(12)
00008948 : .... : E3A0000C : MOV     R0,#&0C             
0000894C : .... : EB000007 : BL      &00008970           ; PROCtest(12)
> END
00008950 : .... : EB000003 : BL      &00008964           ; ?? empty function?
00008954 : .... : E28F004C : ADR     R0,&000089A8       
00008958 : .... : E2800B01 : ADD     R0,R0,#&0400        ; ADRL to &8DA8 (is null word)
0000895C : .... : E35F0000 : CMP     PC,#0               ; Whuh?
00008960 : .... : EA00010C : B       &00008D98           ; Branch to routine to call OS_Exit

Empty routine?
This is called prior to exiting. It appears to stack R0 and the return address, and then
immediately unstack them (returning in the process). Uhhh?
00008964 : .... : E92D4001 : STMDB   R13!,{R0,R14}
00008968 : .... : E8BD8001 : LDMIA   R13!,{R0,PC}

Junk? The above pulls the stacked R14 into PC so it implicitly returns. Nothing appears to call this.
0000896C : .... : EAFFFEEE : B       &0000852C           ; call 88th ABCLib function

; PROCtest(<var>)
> DEFPROCtest(A%)
00008970 : .... : E50B0004 : STR     R0,[R11,#-4]        ; function init stuff (R0 on entry is parameter)
00008974 : .... : E24B9014 : SUB     R9,R11,#&14       
00008978 : .... : E59CA000 : LDR     R10,[R12,#0]        ; (looks like stack checking code to me)
0000897C : .... : E24AA04C : SUB     R10,R10,#&4C      
00008980 : .... : E59AA000 : LDR     R10,[R10,#0]
00008984 : .... : E159000A : CMP     R9,R10
00008988 : .... : 4AFFFEEB : BMI     &0000853C
0000898C : .... : E8894800 : STMIA   R9,{R11,R14}        ; Store stack pointer and return address
00008990 : .... : E1A0B009 : MOV     R11,R9              ; then swap temp with stack
00008994 : .... : E3A09000 : MOV     R9,#0
00008998 : .... : E58B9008 : STR     R9,[R11,#8]
0000899C : .... : E58B900C : STR     R9,[R11,#12]
000089A0 : .... : EF02002C : SWI     XOS_ReadEscapeState
000089A4 : .... : 2AFFFEE3 : BCS     &00008538
> PRINT "Testing evaluation speed with pH value of ";A%
000089A8 : .... : E24F10CC : ADR     R1,&000088E4        ; PRINT "Testing evaluation [...]"
000089AC : .... : E3A0A000 : MOV     R10,#0
000089B0 : .... : EBFFFEE4 : BL      &00008548
000089B4 : .... : E3A0100A : MOV     R1,#&0A             ; Print integer value of parameter
000089B8 : .... : E3A02001 : MOV     R2,#1
000089BC : .... : E59B0010 : LDR     R0,[R11,#16]        ; Read initial entry value
000089C0 : .... : EBFFFED0 : BL      &00008508
000089C4 : .... : EF020003 : SWI     XOS_NewLine         ; And finish with a newline
000089C8 : .... : 6AFFFED5 : BVS     &00008524

> T% = TIME
000089CC : .... : EBFFFE86 : BL      &000083EC           ; Read TIME
000089D0 : .... : E58C0060 : STR     R0,[R12,#96]        ; Assign T%
> REPEAT : UNTIL (T% <> TIME)
TimeLoopOne:
000089D4 : .... : EF02002C : SWI     XOS_ReadEscapeState
000089D8 : .... : 2AFFFED6 : BCS     &00008538
000089DC : .... : EBFFFE82 : BL      &000083EC           ; Read TIME
000089E0 : .... : E59C1060 : LDR     R1,[R12,#96]        ; Reload stored TIME
000089E4 : .... : E0300001 : EORS    R0,R0,R1            ; Same? [result is zero if they match]
000089E8 : .... : 13E00000 : MVNNE   R0,#0               ; Set result TRUE if they're different
000089EC : .... : E3300000 : TEQ     R0,#0               ; Was the result FALSE (same)?
000089F0 : .... : 0AFFFFF7 : BEQ     &000089D4           ; go to TimeLoopOne if same
> T% = TIME
000089F4 : .... : EBFFFE7C : BL      &000083EC           ; Read TIME
000089F8 : .... : E58C0060 : STR     R0,[R12,#96]        ; Assign T%
> FOR l% = 1 [TO 1000000]
000089FC : .... : E3A00001 : MOV     R0,#1               ; Set initial value of loop counter to 1
00008A00 : .... : E58C008C : STR     R0,[R12,#140]       ; Assign l%

CaseLoop:
> CASE TRUE OF -> WHEN A% < 6
00008A04 : .... : E59B0010 : LDR     R0,[R11,#16]        ; Read entry value
00008A08 : .... : E3500006 : CMP     R0,#6               ; Compare with 6 for "WHEN A% < 6"
00008A0C : .... : A3A00000 : MOVGE   R0,#0               ; It's FALSE if >= so set result to 0
00008A10 : .... : B3E00000 : MVNLT   R0,#0               ; It's TRUE if < so set result to -1
00008A14 : .... : E3E01000 : MVN     R1,#0               ; Set CASE test value to -1 (TRUE)
00008A18 : .... : E1300001 : TEQ     R0,R1               ; Compare test value (TRUE) with result
00008A1C : .... : 1A000000 : BNE     &00008A24           ; If comparison FALSE, skip to next clause
00008A20 : .... : EA000007 : B       &00008A44           ; Comparison TRUE, drop out of CASE
> CASE TRUE OF -> WHEN A% > 8
00008A24 : .... : E59B0010 : LDR     R0,[R11,#16]        ; Load entry value (again)
00008A28 : .... : E3500008 : CMP     R0,#8               ; Compare with '8'
00008A2C : .... : D3A00000 : MOVLE   R0,#0               ; This time, <= is zero (FALSE)
00008A30 : .... : C3E00000 : MVNGT   R0,#0               ; and > is -1 (TRUE)
00008A34 : .... : E3E01000 : MVN     R1,#0               ; Load -1 TRUE for comparison
00008A38 : .... : E1300001 : TEQ     R0,R1               ; Same?
00008A3C : .... : 1A000000 : BNE     &00008A44           ; If so... wait... WTF is this?
00008A40 : .... : EAFFFFFF : B       &00008A44           ; Haha! The result of no code! :-)

DropOutOfCase:
> [FOR l% = 1] TO 1000000
00008A44 : .... : EF02002C : SWI     XOS_ReadEscapeState ; See if anything interesting happened
00008A48 : .... : 2AFFFEBA : BCS     &00008538
00008A4C : .... : E59C108C : LDR     R1,[R12,#140]       ; Pick up our loop counter (l%)
00008A50 : .... : E2811001 : ADD     R1,R1,#1            ; Increment it
00008A54 : .... : E58C108C : STR     R1,[R12,#140]       ; Write it back (ARM is a Load/Store CPU!)
00008A58 : .... : E3A00D09 : MOV     R0,#&0240           ; Construct "1000000" in R0
00008A5C : .... : E380093D : ORR     R0,R0,#&000F4000
00008A60 : .... : E1510000 : CMP     R1,R0               ; Compare loop with 1000000
00008A64 : .... : DAFFFFE6 : BLE     &00008A04           ; If <= then back to CaseLoop

> PRINT "Case took ";TIME - T%
00008A68 : .... : E24F1F51 : ADR     R1,&0000892C        ; PRINT "Case took "
00008A6C : .... : E3A0A000 : MOV     R10,#0
00008A70 : .... : EBFFFEB4 : BL      &00008548
00008A74 : .... : EBFFFE5C : BL      &000083EC           ; Read TIME
00008A78 : .... : E59C1060 : LDR     R1,[R12,#96]        ; Read stored T%
00008A7C : .... : E0400001 : SUB     R0,R0,R1            ; Calculate (TIME - T%)
00008A80 : .... : E3A0100A : MOV     R1,#&0A             ; Print integer for duration
00008A84 : .... : E3A02001 : MOV     R2,#1
00008A88 : .... : EBFFFE9E : BL      &00008508
00008A8C : .... : EF020003 : SWI     XOS_NewLine
00008A90 : .... : 6AFFFEA3 : BVS     &00008524

> T% = TIME
00008A94 : .... : EBFFFE54 : BL      &000083EC           ; Read TIME
00008A98 : .... : E58C0060 : STR     R0,[R12,#96]        ; Assign T%
> REPEAT : UNTIL (T% <> TIME)
TimeLoopTwo:
00008A9C : .... : EF02002C : SWI     XOS_ReadEscapeState
00008AA0 : .... : 2AFFFEA4 : BCS     &00008538
00008AA4 : .... : EBFFFE50 : BL      &000083EC           ; Read TIME
00008AA8 : .... : E59C1060 : LDR     R1,[R12,#96]        ; Reload stored TIME
00008AAC : .... : E0300001 : EORS    R0,R0,R1            ; Same?
00008AB0 : .... : 13E00000 : MVNNE   R0,#0
00008AB4 : .... : E3300000 : TEQ     R0,#0
00008AB8 : .... : 0AFFFFF7 : BEQ     &00008A9C           ; Loop to TimeLoopTwo if same
> T% = TIME
00008ABC : .... : EBFFFE4A : BL      &000083EC           ; Read TIME
00008AC0 : .... : E58C0060 : STR     R0,[R12,#96]        ; Assign T%

> FOR l% = 1 [TO 1000000]
00008AC4 : .... : E3A00001 : MOV     R0,#1               ; Set initial loop counter
00008AC8 : .... : E58C008C : STR     R0,[R12,#140]       ; Assign l%

IfLoop:
> IF A% < 6 THEN
00008ACC : .... : E59B0010 : LDR     R0,[R11,#16]        ; Load entry value
00008AD0 : .... : E3500006 : CMP     R0,#6               ; Compare with 6
00008AD4 : .... : AA000000 : BGE     &00008ADC           ; If >= then skip to next clause
00008AD8 : .... : EA000003 : B       &00008AEC           ; Otherwise < (match) then drop out of IF
> ELSE
> IF A% > 8 THEN
00008ADC : .... : E59B0010 : LDR     R0,[R11,#16]        ; Load entry value
00008AE0 : .... : E3500008 : CMP     R0,#8               ; Compare with 8
00008AE4 : .... : DA000000 : BLE     &00008AEC           ; If <= then skip an instruction
00008AE8 : .... : EAFFFFFF : B       &00008AEC           ; Otherwise > (match) so drop out of IF

Which means both instructions point to here! :-)
DropOutOfIF:
DropOutOfCase:
> [FOR l% = 1] TO 1000000
00008AEC : .... : EF02002C : SWI     XOS_ReadEscapeState ; User wants out?
00008AF0 : .... : 2AFFFE90 : BCS     &00008538
00008AF4 : .... : E59C108C : LDR     R1,[R12,#140]       ; Pick up loop counter l%
00008AF8 : .... : E2811001 : ADD     R1,R1,#1            ; Increment it
00008AFC : .... : E58C108C : STR     R1,[R12,#140]       ; Write it back
00008B00 : .... : E3A00D09 : MOV     R0,#&0240           ; Construct 1000000
00008B04 : .... : E380093D : ORR     R0,R0,#&000F4000
00008B08 : .... : E1510000 : CMP     R1,R0               ; Compare l% with 1000000
00008B0C : .... : DAFFFFEE : BLE     &00008ACC           ; If <= then back to IfLoop

> PRINT "Nested IF took ";TIME - T%
00008B10 : .... : E24F1F82 : ADR     R1,&00008910        ; PRINT "Nested IF took "
00008B14 : .... : E3A0A000 : MOV     R10,#0
00008B18 : .... : EBFFFE8A : BL      &00008548
00008B1C : .... : EBFFFE32 : BL      &000083EC           ; Read TIME
00008B20 : .... : E59C1060 : LDR     R1,[R12,#96]        ; Pick up T%
00008B24 : .... : E0400001 : SUB     R0,R0,R1            ; Work out TIME - T%
00008B28 : .... : E3A0100A : MOV     R1,#&0A             
00008B2C : .... : E3A02001 : MOV     R2,#1
00008B30 : .... : EBFFFE74 : BL      &00008508           ; Print the integer
00008B34 : .... : EF020003 : SWI     XOS_NewLine         ; And a newline
00008B38 : .... : 6AFFFE79 : BVS     &00008524
> ENDPROC
00008B3C : .... : E89B8800 : LDMIA   R11,{R11,PC}        ; Pick up original stack pointer and return

; This code doesn't seem to be executed, but it's worth noting
; that it appears to be a copy of the code that represents END,
; along with the empty function, as seen at &8950.
; Is this automatically appended to every program "just in case"
; to exit tidily?
00008B40 : .... : EB000003 : BL      &00008B54
00008B44 : .... : E28F0F97 : ADR     R0,&00008DA8
00008B48 : .... : E2800000 : ADD     R0,R0,#0            ; notice an ADRL with nothing to add!
00008B4C : .... : E35F0000 : CMP     PC,#0
00008B50 : .... : EA000090 : B       &00008D98
00008B54 : .... : E92D4001 : STMDB   R13!,{R0,R14}
00008B58 : .... : E8BD8001 : LDMIA   R13!,{R0,PC}

; Executable initialisation begins here
; (first word in program branches to here)
00008B5C : .... : E28F1D09 : ADR     R1,&00008DA4         
00008B60 : .... : E2811000 : ADD     R1,R1,#0            ; another ADRL that isn't necessary!
00008B64 : .... : E581E000 : STR     R14,[R1,#0]
00008B68 : .... : E28FAFC7 : ADR     R10,&00008E8C
00008B6C : .... : E28AA000 : ADD     R10,R10,#0
00008B70 : .... : EF020010 : SWI     XOS_GetEnv
[init stuff continues...]
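
The string table at the top of the listing can be checked mechanically. Here's a minimal Python sketch, assuming the words are stored little-endian and that each string is laid out as a zero byte, then a length byte, then the characters. The purpose of that leading zero byte is a guess (it is zero for all three strings in the listing), and decode_string is just a name made up for this illustration.

```python
import struct

# Words copied from the disassembly at &88E4 onwards (the first string).
WORDS = [
    0x65542A00, 0x6E697473, 0x76652067, 0x61756C61,
    0x6E6F6974, 0x65707320, 0x77206465, 0x20687469,
    0x76204870, 0x65756C61, 0x20666F20,
]

def decode_string(words):
    """Unpack little-endian words and read a length-prefixed string.

    Assumed layout: byte 0 is zero (purpose unknown), byte 1 is the
    length, and the characters follow with no terminator.
    """
    raw = b"".join(struct.pack("<I", w) for w in words)
    length = raw[1]
    return raw[2:2 + length].decode("ascii")

text = decode_string(WORDS)
print(repr(text), len(text))
```

The length byte here is &2A (42), which is exactly the number of characters in the message, trailing space included.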

 

As I said, I don't have the ABC sources, so we can only guess at the logic that is being used in the compilation. Suffice to say that we can make several observations. As this is a blog, such observations naturally follow. ☺

Firstly, there is absolutely no attempt at register optimisation. It looks as if ABC 'prefers' to use the least number of registers possible, to the point where the compiled code appears to only ever use R0 or R1 (with some assignment of R2 for the print function).
This means that ABC generated code is quite heavy on the loads and stores.
A good example of this is the "T% = TIME : REPEAT : UNTIL (T% <> TIME)" part which looks like this:

  BL      ABC_ReadTime
  STR     R0,[R12,#96]
.loop
  SWI     XOS_ReadEscapeState
  BCS     ABC_EscapeHandler
  BL      ABC_ReadTime
  LDR     R1,[R12,#96]
  EORS    R0,R0,R1
  MVNNE   R0,#0
  TEQ     R0,#0
  BEQ     loop
Entire books have been written on the ways to optimise the output of compilers; however, something that stands out here is the LDR, which reloads T% from memory on every pass. The ARM processor has R3 to R8 that could be put to use here. I mean, it's not a 6502!
Having said that, register allocation is complicated. ABC is a much simpler beast than an optimising C compiler...

The next issue is that logical comparisons are based upon a BASIC-like concept of TRUE (-1) or FALSE (0). This is why there's all that EORS, MVNNE, TEQ rubbish at the end of the loop. The EORS will result in zero if the values are the same; the following MVNNE then sets the result to TRUE if they differ, leaving it as FALSE (zero) if they match. The TEQ afterwards compares with FALSE, and if it matches, the BEQ loops.
As opposed to, say... CMP R0,R1 to directly compare this time with that time, followed by a BEQ loop to go around again if they are the same.
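
To check that the two sequences really do make the same decision, here is a small Python model of each. It is only a sketch: abc_style_loops and direct_loops are names invented here, registers are modelled as 32-bit unsigned values, and nothing about ABC's internals is implied.

```python
MASK = 0xFFFFFFFF  # model 32-bit registers

def abc_style_loops(time_now, t_stored):
    """Mirror EORS / MVNNE / TEQ / BEQ from the listing."""
    r0 = (time_now ^ t_stored) & MASK   # EORS R0,R0,R1 (zero if equal)
    if r0 != 0:                         # MVNNE R0,#0 -> TRUE (-1) if different
        r0 = MASK
    return r0 == 0                      # TEQ R0,#0 : BEQ loop (loop while FALSE)

def direct_loops(time_now, t_stored):
    """Mirror the shorter CMP R0,R1 : BEQ loop."""
    return (time_now & MASK) == (t_stored & MASK)

samples = [(0, 0), (1, 0), (12345, 12345), (MASK, 0), (MASK, MASK), (100, 101)]
agree = all(abc_style_loops(a, b) == direct_loops(a, b) for a, b in samples)
print(agree)
```

Both versions go round again only while the two times match; the ABC-style one just takes two extra instructions to get there.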

There is actually a plausible explanation for this. It is quite possible that the code we see is built up as follows:

.loop
  SWI     XOS_ReadEscapeState  / this built by
  BCS     ABC_EscapeHandler    \ loop start code

  BL      ABC_ReadTime         - call TIME
  LDR     R1,[R12,#96]         - load T%

  EORS    R0,R0,R1             / this built by
  MVNNE   R0,#0                \ an expression parser

  TEQ     R0,#0                / this built by
  BEQ     loop                 \ loop tail code
With this in mind, the expression parser is obliged to return a TRUE or FALSE result, because all it knows is that we want to check non-equality (this <> that); it doesn't know what is going to be done with that result. It's conjecture, but logical.
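
That conjecture can be sketched in a few lines of Python. To be clear, this is a toy and not how ABC is known to work: the emitter functions and their names are hypothetical, and ABC_ReadTime / ABC_EscapeHandler are the same made-up labels used in the listing above.

```python
def emit_repeat_head(label):
    # Loop start: define the label and poll for Escape.
    return [f".{label}", "SWI XOS_ReadEscapeState", "BCS ABC_EscapeHandler"]

def emit_not_equal(load_r0, load_r1):
    # Expression parser: fetch both operands, then leave TRUE/FALSE in R0.
    # It cannot know the result only feeds a branch, so it materialises it.
    return load_r0 + load_r1 + ["EORS R0,R0,R1", "MVNNE R0,#0"]

def emit_until_tail(label):
    # Loop tail: go round again while the value in R0 is FALSE.
    return ["TEQ R0,#0", f"BEQ {label}"]

def compile_repeat_until(label):
    code = emit_repeat_head(label)
    code += emit_not_equal(["BL ABC_ReadTime"], ["LDR R1,[R12,#96]"])
    code += emit_until_tail(label)
    return code

program = compile_repeat_until("loop")
print("\n".join(program))
```

Each emitter works in isolation and hands its result over in R0, which is all it takes to reproduce the nine lines seen in the disassembly.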

This comes up again in the difference between the CASE code and the IF code. The tests are fundamentally the same, however the IF code can get it done in eight instructions as it is able to directly compare a value with a constant. "If A% is less than six" can be expressed pretty much as that (by rejecting A% being greater than or equal). The CASE code, on the other hand, works with expressions, so it needs to work out the comparison, then evaluate it in terms of TRUE or FALSE, and then compare with TRUE. That's why it takes sixteen instructions (twice as many) to perform the same logic.
On an ARM processor clocked at 900MHz, you won't notice twice as many instructions in the million passes, because they'll be cached and are probably nothing compared to the SWI overheads. Reading the escape state doesn't do much (it loads a flag from page zero; see EscapeState in RiscOS.Sources.Kernel.s.Kernel); it's the SWI mechanism itself that takes the most time here. I replaced the XOS_ReadEscapeState calls with NOPs (and the BCS afterwards too) and the results were rather more interesting:

*ABCpHTest2
Testing evaluation speed with pH value of 2
Case took 1
Nested IF took 2
Testing evaluation speed with pH value of 7
Case took 2
Nested IF took 1
Testing evaluation speed with pH value of 12
Case took 3
Nested IF took 1
*
With this sort of speed, you can start to see that CASE is slower.
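
Counting instructions along each path through the listing gives a rough idea of what to expect. The Python below is a back-of-envelope sketch, not a cycle-accurate model: it counts every instruction fetched between the loop label and the loop tail (conditional ones included, whether or not they execute), ignores branch and cache penalties, and uses function names invented for this example.

```python
def case_fetched(a):
    """Instructions fetched in the CASE body (&8A04 to &8A40) for a given A%."""
    n = 7                 # LDR, CMP, MOVGE, MVNLT, MVN, TEQ, BNE
    if a < 6:
        return n + 1      # BNE falls through to the B that leaves the CASE
    n += 7                # LDR, CMP, MOVLE, MVNGT, MVN, TEQ, BNE
    if a > 8:
        n += 1            # BNE falls through to the pointless B
    return n

def if_fetched(a):
    """Instructions fetched in the nested IF body (&8ACC to &8AE8)."""
    n = 3                 # LDR, CMP, BGE
    if a < 6:
        return n + 1      # BGE falls through to the B that leaves the IF
    n += 3                # LDR, CMP, BLE
    if a > 8:
        n += 1            # BLE falls through to the pointless B
    return n

for ph in (2, 7, 12):
    print(ph, case_fetched(ph), if_fetched(ph))
```

BASIC's TIME counts centiseconds, so measured values of 1 to 3 are right at the limit of its resolution; treat any comparison between these counts and the timings above as indicative only.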

 

All said and done, it seems clear that ABC translates BASIC to an executable by simply evaluating each keyword in turn and outputting some code to deal with said keyword. No optimisation. No intelligence. Just a translation from BASIC into executable. That being said, compiling something as ill-suited to compilation as BASIC into executable code is a thankless task. There's a reason RISC OS had more web browsers than BASIC compilers in the year 2000!

I say "simple". Because it's easy to forget the things one can abuse with PRINT.

REM >PrintAbuse
REM

PRINT FNthis+" "+FNis+" "+FNwhat(3)+CHR$(ASC("!"))
END

DEFFNwhat(choice%)
  CASE choice% OF
    WHEN 1 : =FNcrazy
    WHEN 2 : =FNnuts
    WHEN 3 : =FNmental
    WHEN 4 : =FNstupid
  ENDCASE
="something"

DEFFNthis   : = "This"
DEFFNis     : = "is"
DEFFNcrazy  : = "crazy"
DEFFNnuts   : = "nuts"
DEFFNmental : = "mental"
DEFFNstupid : = "stupid"

Oh, no. You do not want to look at what ABC did with that. ☺

 

One final thing

I tried giving functionally similar code to the Norcroft (DDE) compiler. The CASE statement was nicely converted to a jump table but, missing a trick here, every entry went to the same target. Oops!
        SUB      a1,v1,#1          ; pick up value stashed in R4
|L000094.J6.test|
        CMP      a1,#&d        ; pH is range 1 to 14
        ADDLS    pc,pc,a1,LSL #2
        B        |L0000d4.J7.test|
        B        |L0000d4.J7.test|
        B        |L0000d4.J7.test|
        B        |L0000d4.J7.test|
        B        |L0000d4.J7.test|
        B        |L0000d4.J7.test|
        B        |L0000d4.J7.test|
        B        |L0000d4.J7.test|
        B        |L0000d4.J7.test|
        B        |L0000d4.J7.test|
        B        |L0000d4.J7.test|
        B        |L0000d4.J7.test|
        B        |L0000d4.J7.test|
        B        |L0000d4.J7.test|
|L0000d4.J7.test|
        ADD      a4,a4,#1
        SUBS     ip,a4,#&f4000 ; sort out 1-1000000 loop 
        SUBGES   ip,ip,#&240
        BLE      |L000094.J6.test|
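
The dispatch relies on PC reading as the address of the current instruction plus 8. Here's a small Python model of it, assuming the label names encode code offsets (so the CMP is at &94 and the ADDLS at &98) and using a function name invented for this sketch.

```python
ADDLS_ADDR = 0x98           # CMP is at &94 (label L000094), ADDLS follows it
TARGET = 0xD4               # label L0000d4, where every branch goes

def dispatch(a1):
    """Address reached after CMP a1,#&d : ADDLS pc,pc,a1,LSL #2."""
    a1 &= 0xFFFFFFFF                     # the LS condition is an unsigned compare
    if a1 <= 0x0D:                       # LS: lower or same
        return ADDLS_ADDR + 8 + (a1 << 2)
    return ADDLS_ADDR + 4                # fall through to the first B

print([hex(dispatch(v)) for v in (0, 12, 13, 14)])
```

Under those assumptions the first B (at &9C) handles anything out of range, the other thirteen cover indices 0 to 12, and index 13 lands directly on the target label with no branch at all.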

The IF version of the code is even shorter, namely:

|L000114.J28.test|
        ADD      a2,a2,#1
        SUBS     ip,a2,#&f4000 ; sort out 1-1000000 loop 
        SUBGES   ip,ip,#&240
        BLE      |L000114.J28.test|
Yup, it simply spins an empty loop as it's realised that the IF code is essentially pointless.
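
Both compilers handle the 1000000 loop limit in two pieces (&F4000 and &240) because an ARM data-processing immediate is an 8-bit value rotated right by an even number of bits, and 1000000 doesn't fit that shape. A quick Python check follows; arm_immediate is a helper name invented for this sketch.

```python
def arm_immediate(value):
    """Return (imm8, rotate) if value fits an ARM immediate, else None.

    An ARM data-processing immediate is imm8 rotated right by an even
    amount (0, 2, ... 30) within a 32-bit word.
    """
    value &= 0xFFFFFFFF
    for rotate in range(0, 32, 2):
        # Undo the rotate-right by rotating left the same amount.
        imm8 = ((value << rotate) | (value >> (32 - rotate))) & 0xFFFFFFFF
        if imm8 < 0x100:
            return imm8, rotate
    return None

print(arm_immediate(1000000))      # None: needs more than one instruction
print(arm_immediate(0x240))        # (9, 26), as in MOV R0,#&0240 (E3A00D09)
print(arm_immediate(0xF4000))      # (61, 18), as in ORR R0,R0,#&F4000 (E380093D)
print(0x240 | 0xF4000)             # 1000000
```

ABC builds the constant in a register with MOV and ORR and then compares; Norcroft subtracts the two pieces directly with SUBS and SUBGES, which saves the separate CMP.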

 

 

Your comments:

Gavin Wraith, 20th June 2019, 09:48
Very interesting. Norcroft and GCC appear to use different strategies for compiling case statements. Norcroft makes a table of BL instructions; then it calculates the offset and branches into the table. GCC makes a table of addresses, calculates the offset and then loads the PC with the address, which would appear to be more efficient. However, I may be overlooking things here concerning ARM-specific quirks, the need for trampolines, and such.
Gavin Wraith, 20th June 2019, 09:51
Sorry, typo! That should have read B instructions, not BL instructions.
Rick, 20th June 2019, 10:44
Off the top of my head - the B version is restricted (+/- 64MiB or something) in where it can branch, however it is entirely position independent. 
Loading addresses into PC, on the other hand, allows for any address, but said addresses are fixed. 
 
And yes, on the face of it, a compare followed by a load is going to be more efficient than a compare followed by a branch followed by a branch; and while this may have big implications for module code, most application code runs from a fixed location so.....
David Pilling, 20th June 2019, 13:24
ABC then is simple. There must be inefficiency in the interpreted version. The compiler is worthwhile. My experiences were on BBC Micro, where speed ups were hard to come by. Thirty years on I am sorry I didn't try writing a Basic compiler. One could produce BBC Basic 2020 (have to get a move on), modern techniques, faster, better. 
Gavin Wraith, 20th June 2019, 14:09
Perhaps I should have said relative address. GCC loads the PC with LDRls PC,[PC,R4,LSL #2] where R4 is the offset into the table of addresses of routines relative to the load instruction - more or less.
