Special facilities
We all know that BASIC uses...
When you CALL or USR an assembly language segment, BASIC sets the registers as follows:
R0 | A% |
R1 | B% |
R2 | C% |
R3 | D% |
R4 | E% |
R5 | F% |
R6 | G% |
R7 | H% |
R8 | Pointer to BASIC's workspace (ARGP) |
R9 | Pointer to list of l-values for the parameters |
R10 | Number of parameters |
R11 | Pointer to BASIC's string accumulator (STRACC) |
R12 | BASIC's LINE pointer (points to current statement) |
R13 | Pointer to BASIC's stack (full, descending - as RISC OS uses) |
R14 | Return address, and environment information pointer |
CALL mycode, pointer%, string$, real
For each variable given, two words (word aligned) are used. The first is known as the l-value. In
English, this is the address where the value of this variable is stored. The second word is a
descriptor giving the type of the variable.
The list is in reverse order: the last variable passed to CALL is the first entry in the list, so
R9 always points to the entry for the last variable. The pointer is always valid, even when
the parameter count (R10) is zero.
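To make the layout concrete, here is a minimal Python sketch that models BASIC's memory as a byte array and walks the list. It is only an illustration: the helper names, the addresses, and the assumption that successive entries sit at increasing addresses from R9 are mine, not taken from BASIC itself.

```python
import struct

def read_word(mem, addr):
    """Read a little-endian 32-bit word from the modelled memory."""
    return struct.unpack_from("<I", mem, addr)[0]

def read_parameters(mem, r9, r10):
    """Return (l_value, type) pairs in the order written in the CALL.

    Assumes each entry is two words and that entries follow one
    another at increasing addresses starting at R9.
    """
    entries = []
    for i in range(r10):
        l_value = read_word(mem, r9 + 8 * i)
        var_type = read_word(mem, r9 + 8 * i + 4)
        entries.append((l_value, var_type))
    entries.reverse()  # stored last-parameter-first
    return entries

# Hypothetical image for: CALL mycode, pointer%, string$, real
mem = bytearray(64)
struct.pack_into("<6I", mem, 16,
                 0x9000, 0x05,   # real     (last parameter, first entry)
                 0x8F05, 0x80,   # string$
                 0x8F00, 0x04)   # pointer% (first parameter, last entry)
params = read_parameters(mem, r9=16, r10=3)
```

With the made-up addresses above, `params` comes back in source order: pointer%, string$, real.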
The possible variable types are:
Type | BASIC | l-value points to | in English... |
&00 | ?factor | byte-aligned byte | Pointer to actual byte |
&04 | !factor | byte-aligned word | Pointer to four-byte integer (may not be word-aligned) |
&04 | integer% | word-aligned word | Pointer to four-byte integer |
&04 | integer%(n) | word-aligned word | Pointer to four-byte integer |
&05 | |factor, real, real(n) | byte-aligned FP value (5 bytes) | Pointer to five-byte floating point value |
&08 | |factor, real, real(n) | word-aligned FP value (8 bytes) | Pointer to 8-byte floating point value (BASIC VI) |
&80 | string$, string$(n) | byte-aligned SIB (5 bytes) | Pointer to string information block |
&81 | $factor | byte-aligned byte-string (CR terminated) | Pointer to string |
&100 + &04 | integer%() | word-aligned array pointer | Pointer to word-aligned word. If array unallocated, or LOCAL but not yet DIMed, this word is less than 16. Otherwise, this word points to the array structure. |
&100 + &05, &100 + &08 | real() | word-aligned array pointer | As for integer%() |
&100 + &80 | string$() | word-aligned array pointer | As for integer%() |
The String Information Block (SIB) is five bytes long: four bytes holding the address of the string, followed by one byte giving the length of the string.
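As a rough model of why the block has to be read a byte at a time, this Python sketch decodes a SIB placed at a deliberately odd address. Memory is just a byte array here, and the names and addresses are invented for the illustration.

```python
def read_sib(mem, sib_addr):
    """Decode a 5-byte String Information Block at any byte alignment."""
    # Four little-endian address bytes, then one length byte.
    address = int.from_bytes(mem[sib_addr:sib_addr + 4], "little")
    length = mem[sib_addr + 4]
    return address, length

def read_string(mem, sib_addr):
    """Fetch the string bytes that the SIB describes."""
    address, length = read_sib(mem, sib_addr)
    return bytes(mem[address:address + length])

# Hypothetical memory image: the string lives at 32, the SIB at 3.
mem = bytearray(64)
mem[32:37] = b"HELLO"
sib_addr = 3                                   # not word aligned
mem[sib_addr:sib_addr + 4] = (32).to_bytes(4, "little")
mem[sib_addr + 4] = 5
text = read_string(mem, sib_addr)
```

Building the address from four separate bytes is the same trick the assembler uses with LDRB and shifted ADDs.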
The array structure (for types &100 + <something>) is a word-aligned list of integer subscript sizes (the values in the DIM, plus one), terminated by a zero word. This is followed by a word which contains the total number of elements in the array, and then by the entries in the array themselves.
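The array layout can be sketched in Python in the same way. This is a model under the description above, with invented helper names and addresses; it treats an array pointer below 16 as "no array yet".

```python
import struct

def read_array_header(mem, array_addr):
    """Return (subscript_sizes, total_elements, address_of_first_entry)."""
    sizes = []
    addr = array_addr
    while True:
        size = struct.unpack_from("<I", mem, addr)[0]
        addr += 4
        if size == 0:          # a zero word ends the list of sizes
            break
        sizes.append(size)
    total = struct.unpack_from("<I", mem, addr)[0]
    return sizes, total, addr + 4

def locate_array(mem, l_value):
    """Follow the l-value of an array parameter to the array itself."""
    pointer = struct.unpack_from("<I", mem, l_value)[0]
    if pointer < 16:
        return None            # unallocated, or LOCAL but not yet DIMed
    return read_array_header(mem, pointer)

# Hypothetical image of DIM a%(2,3): sizes 3 and 4, 12 elements.
mem = bytearray(80)
struct.pack_into("<I", mem, 0, 16)             # l-value word -> array at 16
struct.pack_into("<4I", mem, 16, 3, 4, 0, 12)  # sizes, terminator, total
struct.pack_into("<12I", mem, 32, *range(12))  # the entries
header = locate_array(mem, 0)
```

Note that the sizes are the DIM values plus one, so DIM a%(2,3) is stored as 3 and 4.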
Here is an example of passing a string to CALL, and printing it in assembler. Notice how we need to use LDRB to load the string pointer, as the address is byte aligned.
REM >stringprnt
REM
REM String passing (via CALL) demonstration
REM
REM By Richard Murray
REM Downloaded from http://www.heyrick.co.uk/assembler/
REM
codesize% = 180 : REM The code is 180 bytes
DIM code% codesize%
PROCbuild_code
INPUT "Please enter some text: "my_string$
PRINT '"Entering assembler..."
CALL print_string, my_string$
PRINT "...returned from assembler"'
END
:
DEFPROCbuild_code
FOR loop% = 8 TO 10 STEP 2
P% = code%
L% = code% + codesize%
[ OPT loop%
  \ Note... This is coded for CLARITY, not speed!
.print_string
  CMP R10, #1 ; Check one parameter was given
  BNE wrong_parameters
  LDR R0, [R9] ; Load pointer
  LDR R1, [R9, #4] ; Load type
  CMP R1, #&80 ; Is it a string?
  BNE wrong_var_type
  \ Word giving string pointer may not be word aligned, so we
  \ cannot use an LDR as results for non-aligned addresses
  \ are unpredictable...
  LDRB R1, [R0], #1
  LDRB R2, [R0], #1
  ADD R1, R1, R2, LSL#8
  LDRB R2, [R0], #1
  ADD R1, R1, R2, LSL#16
  LDRB R2, [R0], #1
  ADD R1, R1, R2, LSL#24
  LDRB R2, [R0], #4 ; Length
  CMP R2, #0 ; Check length
  BEQ exit
  \ Now, R1 is string pointer and R2 is string length.
.loop
  LDRB R0, [R1], #1
  SWI "OS_WriteC"
  SUBS R2, R2, #1 ; SUBS sets Z bit when = 0
  BNE loop
.exit
  SWI "OS_NewLine" ; So a blank string prints just that...
  MOV PC, R14
.wrong_parameters
  SWI "OS_WriteS"
  EQUS "Incorrect number of parameters"+CHR$0
  ALIGN
  B passed_to_call
.wrong_var_type
  SWI "OS_WriteS"
  EQUS "Wrong variable type"+CHR$0
  ALIGN
  ; B passed_to_call ; Not required, will fall through
.passed_to_call
  \ rudimentary optimisation! <g>
  SWI "OS_WriteS"
  EQUS " passed to call"+CHR$13+CHR$10+CHR$0
  ALIGN
  MOV PC, R14
]
NEXT
ENDPROC
R14 is the Link Register, used to return from your code to BASIC. The location it points to is
followed by a list of words, which are offsets from the ARGP (in R8).
For example, you might want to know the current value of PAGE without tying up a register to
pass that value to your code. PAGE is available at offset &08, so your code would be something
like:
LDR R0, [R14, #8]
LDR R0, [R8, R0]
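The two loads are a double lookup: the first fetches an offset from the table at R14, the second uses that offset against ARGP. Here is a small Python model of the same steps. Memory is a byte array, and every address and stored value in it is made up for the illustration.

```python
import struct

PAGE_ENTRY = 0x08   # position of the PAGE word in the table at R14

def read_word(mem, addr):
    """Read a little-endian 32-bit word from the modelled memory."""
    return struct.unpack_from("<I", mem, addr)[0]

def basic_value(mem, r14, r8, table_entry):
    """Model of the two LDR instructions shown above."""
    argp_offset = read_word(mem, r14 + table_entry)  # LDR R0, [R14, #8]
    return read_word(mem, r8 + argp_offset)          # LDR R0, [R8, R0]

# Hypothetical image: table at &40, ARGP at &100.
mem = bytearray(0x200)
struct.pack_into("<I", mem, 0x40 + PAGE_ENTRY, 0x24)  # invented offset
struct.pack_into("<I", mem, 0x100 + 0x24, 0x8F00)     # invented PAGE value
page = basic_value(mem, r14=0x40, r8=0x100, table_entry=PAGE_ENTRY)
```

The same pattern works for any of the data entries in the table, only the first offset changes.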
Offset | Name | Meaning |
&00 | RETURN | Return address to BASIC |
&04 | STRACC | String accumulator (256 bytes long) |
&08 | PAGE | The current value of PAGE |
&0C | TOP | The current value of TOP |
&10 | LOMEM | The current start of variable storage |
&14 | HIMEM | The current stack end |
&18 | MEMLIMIT | Limit of available memory |
&1C | FSA | Free space start (end of variables/stack limit) |
&20 | TALLY | Value of COUNT |
&24 | TRACEF |
TRACE FILE handle, or 0 if no file being TRACEd to.
Four words follow the trace file handle that may be of use to you:
  + 4 LOCALARLIST - a pointer to the list of local arrays
  + 8 INSTALLLIST - a pointer to the list of installed libraries
  +12 LIBRARYLIST - a pointer to the list of transient libraries
  +16 OVERPTR - a pointer to the overlay structure
Libraries are stored as a word which is the pointer to the next library, or 0 to end. This word is followed by the BASIC program which is the library.
Before OVERLAY has been executed, OVERPTR is zero. Afterwards, OVERPTR contains a pointer to the following structure:
  OVERPTR+&00 Pointer to base of OVERLAY array (such as 'lib$(0)')
  OVERPTR+&04 Index of current OVERLAY file (or -1 if none loaded)
  OVERPTR+&08 Total allowed size of OVERLAY area
  OVERPTR+&0C Start of current OVERLAY file in memory
|
&28 | ESCWORD | Exception flag word (contains escflg, trcflg) |
&2C | WIDTHLOC | The value of WIDTH - 1 |
Then come branches to internal BASIC routines...
&30 | VARIND |
Get the value of an l-value.
On entry:
  R0 = Address to load variable from
  R9 = Type of variable (as in CALL parameter block)
  R12 = LINE
Returns with R0 - R3 as the value (or F0 in BASIC VI), and R9 as the type:
  R9 = 0 - String; STRACC is start, R2 is end, [R2]-STRACC is the length
  R9 = &40000000 - Integer; in R0
  R9 = &80000000 - Float; in R0...R3
Registers preserved. |
&34 | STOREA |
Store a variable, optionally converting between formats.
On entry:
  R0...R3 = Value (or F0 if float in BASIC VI)
  R4 = Address to store at
  R5 = Type of variable (as in CALL parameter block)
  R8 = ARGP
  R9 = Type of value
  R12 = LINE
  R13 = Stack pointer
Returns with R0 - R7 corrupted. |
&38 | STSTORE |
Store a string into a string variable.
On entry:
  R2 = Length (address of byte beyond the last one)
  R3 = Address of start of string
  R4 = Address of l-value (ie, where to store it)
  R8 = ARGP
  R9 = Type of value
  R12 = LINE
  R13 = Stack pointer
Corrupts R0, R1, R5, R6 and R7. String must start on a word boundary, and length must be 255 or less. |
&3C | LVBLNK |
Looks up a variable by name.
On entry:
  R8 = ARGP
  R11 = Pointer to start of name
  R12 = LINE
  R13 = Stack pointer
May use the stack. Uses all registers.
If variable (more precisely, l-value) was found, returns with:
  Z flag = 0, R0 = address of l-value, and R9 = Type of l-value
If not found, returns with:
  Z flag = 1
  C flag = 1 if no way string could be a variable (such as "%value")
  C flag = 0 if could be a variable, but no such variable exists at present
If not found, and could be a variable, the registers are set up ready for a call to CREATE.
BASIC's documentation does not explicitly state it, but by prefixing a name with the token for PROC (&F2) or FN (&A4), it may be possible to look up function/procedure names. |
&40 | CREATE |
Create a new variable. The input is the register state left by LVBLNK failing to find something. It is recommended that you only call CREATE after a failed LVBLNK, with code such as:
  STMFD R13!, {R14}
Returns the same result as LVBLNK when an l-value is found. Uses all registers. |
&44 | EXPR |
Evaluates an expression pointed to by R11.
On entry:
  R8 = ARGP
  R11 = Pointer to start of string
  R12 = LINE
  R13 = Stack pointer
EXPR stops after reading one expression (like those in a PRINT statement).
Returns with R0 - R3 as the value (or F0 in BASIC VI), and R9 as the type:
  R9 = 0 - String; STRACC is start, R2 is end, [R2]-STRACC is the length
  R9 = &40000000 - Integer; in R0
  R9 = &80000000 - Float; in R0...R3
  R10 = First character of the expression
  R11 = Pointer to next character after R10
Additionally:
  Z set means expression was a string, else expression was a number
  If Z clear, then N set means expression was a floating point number, else expression was an integer
A useful thing about EXPR is that it can call BASIC functions. You do this as you would in an EVAL statement, by evaluating a string containing the name of a user-defined function, for example "FNget_next_directory_entry". This allows you to call routines which perform a task that would be tedious in assembler, such as inputting a floating point number from the user.
Unfortunately, there is a complication. The string to be evaluated should be tokenised. So you can either call MATCH, or (possibly preferably) store the string pre-tokenised. The token for FN is &A4. |
&48 | MATCH |
Takes a text string and tokenises it to another string.
On entry:
  R1 = Points to source string (ASCII 10 or 13 terminated)
  R2 = Points to destination string
  R3 = MODE
  R4 = CONSTA
  R13 = Stack pointer
MODE is 0 for left mode, which is for a statement at the start of a line, or before an equals; and 1 for right mode, in an expression. This is important, consider the following:
  var = TIME
It's the same word - TIME - but there are two different tokens for TIME, one for reading time and one for writing it.
CONSTA is 0 if you do not want BASIC to convert numbers which could be line numbers (0 to 65279) to internal format; and 1 if you do.
Internal format is the token &8D followed by three bytes containing the encoded line number. The advantage of the encoded numbering is that the bytes lie in the range 64-127, so do not contain any tokens or control codes. These tokens are used after GOTO, GOSUB, RESTORE, THEN and ELSE. They are fixed length, so the program can be RENUMBERed without shuffling lines around.
Both MODE and CONSTA may be updated during the use of this function. For example, PRINT will change MODE to 1 to read an expression.
Corrupts R0-R5. On exit, R1 and R2 are left pointing one byte beyond the terminating CR control codes of the strings. Additionally, R5 contains status information. Typically, values larger than &1000 imply mismatched brackets; and ( (R5 AND 255)=1 ) means mismatched quotes. |
&4C | TOKENADDR |
This converts a token value to a pointer to the text string that represents it.
On entry:
  R0 = Token value (ie, &A4 for FN)
  R12 = Pointer to next byte of token
Returns in R1 a pointer to the first character of the string, terminated by a value &7F or greater. R0 is updated to point to the base of the token table. The value of R12 is only used when matching a two-byte token. No other registers are used or required. |
If you are using BASIC V, additional floating point operations are available. R0...R3 contain an expanded floating point value, and R9 points to a packed floating point value (as accessed with the | operator).
&54 | 9 | This is a word giving the number of additional routines that are available. |
&58 | FSTA |
Store a four-word FP value into a five-byte variable.
On entry:
  R0...R3 = Source FP value
  R9 = Pointer to destination value
On exit, R2 may be altered, but this doesn't affect the FP value. |
&5C | FLDA |
Load a five-byte variable into a four-word FP value.
On entry:
  R9 = Pointer to source value
On exit, R0...R3 contain the loaded value. |
&60 | FADD |
Add the four-word FP value in R0...R3 to the variable pointed to by R9. Notionally: (R0...R3) + [R9]
On entry:
  R0...R3 = Source FP value
  R9 = Pointer to five-byte value
On exit, R0...R3 is the result, and R4...R7 are corrupted. Overflow errors are possible. |
&64 | FSUB |
Subtract R0...R3 from the value pointed to by R9. Notionally: [R9] - (R0...R3)
On entry:
  R0...R3 = FP value
  R9 = Pointer to five-byte value
On exit, R0...R3 is the result, and R4...R7 are corrupted. Overflow errors are possible. |
&68 | FMULL |
Multiply the four-word FP value in R0...R3 by the variable pointed to by R9. Notionally: (R0...R3) * [R9]
On entry:
  R0...R3 = Source FP value
  R9 = Pointer to five-byte value
On exit, R0...R3 is the result, and R4...R7 are corrupted. Overflow errors are possible. |
&6C | FDIV |
Divide the variable pointed to by R9 by the four-word FP value in R0...R3. Notionally: [R9] / (R0...R3)
On entry:
  R0...R3 = Source FP value
  R9 = Pointer to five-byte value
On exit, R0...R3 is the result, and R4...R7 are corrupted. Overflow errors and divide by zero are possible. |
&70 | FLOAT |
Convert an integer to a four-word floating point value.
On entry:
  R9 = Integer
On exit, R0...R3 is the floated version, and R9 is &80000000 (float type code). |
&74 | FIX |
Convert a four-word FP value into an integer.
On entry:
  R0...R3 = Floating point value
On exit, R0 is the fixed version (rounded towards zero), and R9 is &40000000 (integer type code). |
&78 | FSQRT |
Take the square root of the floating point number in R0...R3.
On entry:
  R0...R3 = Floating point value
On exit, R0...R3 is the result, and R4...R7 are corrupted. Negative root error possible. |
The floating point values in R0...R3 are given as follows:
  R0 = 32 bit mantissa, normalised (so bit 31 = 1)
  R1 = Exponent in excess-128 form
  R2 = Undefined
  R3 = Sign, 0 is positive and &80000000 is negative
This is informational only, and the developers reserve the right to change the format. You are asked to treat R0...R3 as a single item, without worrying about the constituent parts.
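Purely to illustrate the description above, here is a Python sketch that turns the three meaningful registers into an ordinary number. It assumes the mantissa is a binary fraction in the range 0.5 to 1 (as in BASIC's packed five-byte format), which the text above does not state, so treat the scaling as an assumption. The function name is invented.

```python
import math

def expanded_to_float(r0, r1, r3):
    """Interpret an expanded FP value held in R0, R1 and R3.

    Assumption: value = (R0 / 2^32) * 2^(R1 - 128), with the sign
    taken from the top bit of R3. R2 is undefined and is ignored.
    """
    magnitude = math.ldexp(r0 / 2**32, r1 - 128)
    return -magnitude if r3 & 0x80000000 else magnitude

one = expanded_to_float(0x80000000, 0x81, 0)
minus_three = expanded_to_float(0xC0000000, 0x82, 0x80000000)
```

Under that assumption a mantissa of &80000000 with an exponent of &81 is 1. Real code should still use FLDA, FSTA and friends rather than picking the registers apart.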
 
Here is an example program which will list all of the tokens recognised by BASIC. It is completely written in assembler, so could be saved as a utility. Note, however, that it must be loaded and executed from within BASIC as the extended environment is only available from BASIC.
Note also that passing invalid token values returns junk. You can see this for yourself if you alter the second token set to end at a number higher than 183.
REM >listtokens
REM
REM Lists the tokens recognised by BASIC
DIM code% 396
FOR pass = 8 TO 10 STEP 2
P% = code%
L% = code% + 396
[ OPT pass
.begin
  STMFD R13!, {R14}
  MOV R5, R14 ; for token print routine
  ADR R0, starttitle
  SWI "OS_Write0"
  BL firstset
  SWI "OS_NewLine"
  BL secondset
  ADR R0, endtitle
  SWI "OS_Write0"
  LDMFD R13!, {PC}
.token
  \ This prints a token using BASIC's internal routine.
  \ Call with R0 set to the token number.
  STMFD R13!, {R14}
  CMP R0, #255
  ADRHI R12, tokenbuffer
  SUBHI R0, R0, #256
  STRHI R0, tokenbuffer
  MOVHI R0, #200
  ADR R14, back
  ADD PC, R5, #&4C
.back
.tokenloop
  LDRB R0, [R1], #1
  CMP R0, #&7F
  SWICC "OS_WriteC"
  BCC tokenloop
  LDMFD R13!, {PC}
.tokenbuffer
  EQUD 0
.firstset
  \ These are the first tokens, 127 to 255 (but not 200)
  STMFD R13!, {R14}
  MOV R10, #127
.firstloop
  SWI "OS_WriteS"
  EQUS "Token "+CHR$0
  ALIGN
  MOV R0, R10
  BL print_number
  SWI "OS_WriteS"
  EQUS " is "+CHR$0
  ALIGN
  MOV R0, R10
  CMP R0, #200
  BLEQ special
  BLNE token
  SWI "OS_NewLine"
  ADD R10, R10, #1
  CMP R10, #256
  BLT firstloop
  LDMFD R13!, {PC}
.special
  SWI "OS_WriteS"
  EQUS "extension token"+CHR$0
  ALIGN
  MOV PC, R14
.secondset
  \ These are the second tokens, 127 to 183
  STMFD R13!, {R14}
  MOV R10, #127
.secondloop
  SWI "OS_WriteS"
  EQUS "Token 200+"+CHR$0
  ALIGN
  MOV R0, R10
  BL print_number
  SWI "OS_WriteS"
  EQUS " is "+CHR$0
  ALIGN
  MOV R0, R10
  ADD R0, R0, #256
  BL token
  SWI "OS_NewLine"
  ADD R10, R10, #1
  CMP R10, #184
  BLT secondloop
  LDMFD R13!, {PC}
.print_number
  ADR R1, number_buffer
  MOV R2, #8
  SWI "OS_BinaryToDecimal"
  ADR R0, number_buffer
  SWI "OS_Write0"
  MOV PC, R14
.number_buffer
  EQUD 0
  EQUD 0
.starttitle
  EQUS "BASIC: tokens and their keywords"+CHR$13+CHR$10
  EQUS "--------------------------------"+CHR$13+CHR$10+CHR$13+CHR$10+CHR$0
  ALIGN
.endtitle
  EQUS CHR$13+CHR$10+"Finished."+CHR$13+CHR$10+CHR$13+CHR$10+CHR$0
  ALIGN
]
NEXT
CALL begin
END
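On the subject of tokens, the internal line number format mentioned under MATCH (token &8D plus three bytes in the range 64-127) can be modelled in a few lines of Python. The bit layout used here is the one commonly described for BBC BASIC programs. The text above does not spell it out, so treat it as an assumption; the function names are invented.

```python
def encode_line_number(line):
    """Encode a line number as &8D followed by three bytes."""
    if not 0 <= line <= 65279:
        raise ValueError("line number out of range")
    lo, hi = line & 0xFF, line >> 8
    # The top two bits of each byte are packed into the first byte.
    top_bits = ((lo & 0xC0) >> 2) | ((hi & 0xC0) >> 4)
    return bytes([0x8D,
                  top_bits ^ 0x54,
                  (lo & 0x3F) | 0x40,
                  (hi & 0x3F) | 0x40])

def decode_line_number(encoded):
    """Reverse encode_line_number()."""
    if len(encoded) != 4 or encoded[0] != 0x8D:
        raise ValueError("not an encoded line number")
    top_bits = encoded[1] ^ 0x54
    lo = ((top_bits & 0x30) << 2) | (encoded[2] & 0x3F)
    hi = ((top_bits & 0x0C) << 4) | (encoded[3] & 0x3F)
    return (hi << 8) | lo

encoded_ten = encode_line_number(10)
```

Every byte after the &8D token has bit 6 set and bit 7 clear, which is what keeps the encoded number free of tokens and control codes.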