Introduction to
ASSEMBLER

 

26th September 2000:
In the first release of this assembler programming tutorial, I missed out on a vital piece of information. Why the hell would you want to do this in the first place?
That, I cannot answer. Everybody has their reasons. Maybe your boss wants you to make something that little bit faster and more efficient. Maybe you have been given the source to a program that you often use, and it is up to you to update it. Maybe you are just doing this 'cos you think it might be a worthwhile hobby.

One thing that is important is the psychology of it all. Here, in this section, I have examined my motives, my feelings. I'd never really thought about it before, but it makes for interesting reading.
I won't repeat it all here, but the two principal documents you should have a look at are:

Now, on with our scheduled documentation.....

 

 

If you want to knock up a quick little program, then you can't pick much better than BASIC.

   PRINT "HeyRick!"

If you want to write a less-hackable program with large speed benefits, you should pick C or C++. For speed concerns, the old CastAVote vote editor (written in BASIC) took around 20 to 30 seconds to delete a vote at the beginning of a full file. The new vote editor tasks this out to VoteModule (written in C). It does it in a second.

   printf("HeyRick\n");
If you want the ultimate speed and flexibility with the ability to perform low-level hackery, you should be looking to assembler.
On the left, BASIC's assembler (we'll cover this a lot); on the right, the same thing written for Acorn's objasm assembler...
                                         GET   h.SWIs
                                         AREA  |asm$$code|, CODE, READONLY
   ADR   R0, text                        ENTRY
   SWI   "OS_PrettyPrint"              
   SWI   "OS_NewLine"                    ADR   r0, text
   MOV   PC, R14                         SWI   OS_PrettyPrint
                                         SWI   OS_NewLine
   .text                                 MOV   pc, r14
   EQUS  "HeyRick"
   EQUB  0                         text  DCB   "HeyRick", 0
   ALIGN                                 ALIGN
As you can see, assembler is much more involved. There are quicker ways to write the above, but it is a reasonable example.

You may know that it is possible to call BASIC functions in assembler, provided you are using the BASIC assembler to compile your code:

   MOV   R0, #RND(64)
But if you plan to write a fully-assembler application, you'll need to know how to write the RND code.

Let's take that example and expand it into a fully fledged application...

REM Assembler demo #1
:
DIM code% 64
FOR pass%=0 TO 2 STEP 2
P%=code%
[ OPT pass%

    ADR    R0, text
    SWI    "OS_PrettyPrint"
    SWI    "OS_NewLine"
    MOV    PC, R14

  .text
    EQUS   "HeyRick"
    EQUB   0
    ALIGN
]
NEXT

CALL code%

Let's work through the program. The first line is "DIM code% 64". This reserves enough memory for your program. BASIC ensures that your memory begins on a word boundary.

The next line is "FOR pass%=0 TO 2 STEP 2". This is important, as the first time through, the assembler cannot resolve references to labels that are defined further down the code. Therefore, in conjunction with the OPT statement, the code is actually assembled twice.
The first time through, all errors are ignored. Then, once all the references have been set up, the code is assembled again and the references can be resolved.
Read opt.html for details of the available OPTions.
If you want to see the assembly taking place, amend the line to say:

   FOR pass%=1 TO 3 STEP 2
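
To see why two passes are needed, here is a toy sketch in Python. It is not how BASIC's assembler works internally; the names `source` and `assemble_pass` and the fixed 4-byte instruction size are assumptions made up for illustration. The point is only that a label such as `text` has no known address until the assembler has walked past it once.

```python
# Toy model of two-pass assembly. Everything here is illustrative:
# each instruction is assumed to occupy 4 bytes, and an operand is
# simply the name of a label whose address we want.

source = [
    ("instr", "ADR", "text"),   # forward reference to a label defined later
    ("instr", "SWI", None),
    ("instr", "SWI", None),
    ("instr", "MOV", None),
    ("label", "text", None),
]

def assemble_pass(source, labels, report_errors):
    """Walk the source once, recording label addresses and resolving operands."""
    address = 0
    output = []
    for kind, name, operand in source:
        if kind == "label":
            labels[name] = address      # a label takes the current address
            continue
        resolved = None
        if operand is not None:
            if operand in labels:
                resolved = labels[operand]
            elif report_errors:
                raise KeyError(f"unknown label: {operand}")
            # otherwise: first pass, ignore the error and carry on
        output.append((address, name, resolved))
        address += 4
    return output

labels = {}
first = assemble_pass(source, labels, report_errors=False)
second = assemble_pass(source, labels, report_errors=True)

print(first[0])    # the ADR entry after pass one, operand not yet known
print(second[0])   # the same entry after pass two, operand resolved
```

On the first pass the ADR entry carries no address for `text`; by the second pass the label table has been filled in, so the same line resolves cleanly.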

The third line, "P%=code%" tells the assembler where to compile the code. The variable P% is assumed to be the pointer to the code - so you must set it to point to the start of the memory block before each pass.
Do not forget to set P% if you are cobbling together some test code in, say, a TaskWindow, because P% is likely to be initialised to zero, and trying to compile code over the hardware vectors is not good for the health of your data, or you...
If you are using offset assembly, P% is set to zero and the pointer to the memory block is placed in O% instead. This is demonstrated later on.

"[ OPT pass%" is an important line. The opening square bracket denotes that the code which follows is assembler. The "OPT" then specifies which compilation options are to be used on this pass (refer to opt.html for details of the options available).

"ADR" is not a real instruction. What it does is place the address of the specified value into the given register. In this line ("ADR R0, text") it places the address of text into register zero.
Read this for more on ADR.

Next, two SWIs are called. This is similar to BASIC's SYS command. Firstly "SWI "OS_PrettyPrint"" prints the text, and automatically wraps it to fit nicely in the available space. Secondly, "SWI "OS_NewLine"" prints a newline character. Unlike BASIC, this is not implicit.

Many registers exist, and some of them have special functions. This is detailed in regs.html. However I shall tell you here that register 15 (also known as "PC") is the program counter. When a program is started, or when a branch with link occurs, the return address is stored in register 14 - the link register.
Therefore it becomes easier to understand the line "MOV PC, R14". It places the value of R14 (currently holding the return address back to BASIC) into the program counter. Hey presto, we are back!

Lastly the block ".text" defines a zero-terminated string and aligns the tail end so it is on a word boundary. The definition is marked by a label, which is a name following a period. The data follows, and is written like a regular instruction - thus what comes after a label can be either active code, or simply statements to load data into the current memory location.
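
For the curious, rounding up to a word boundary is simple arithmetic. A minimal sketch in Python, assuming 4-byte ARM words; the helper name `align_to_word` is made up for illustration, and this shows the sum only, not how ALIGN itself is implemented.

```python
WORD_SIZE = 4  # an ARM word is four bytes

def align_to_word(address):
    """Round address up to the next multiple of WORD_SIZE (unchanged if already aligned)."""
    return (address + WORD_SIZE - 1) & ~(WORD_SIZE - 1)

# "HeyRick" plus its zero terminator is 8 bytes, so it already ends on a boundary.
string_length = len("HeyRick") + 1
print(align_to_word(string_length))

# A 6-byte string would leave the pointer mid-word; aligning moves it on.
print(align_to_word(6))
```

In the example program the string happens to be exactly eight bytes, so ALIGN has nothing to do, but it is a good habit to include it after any data.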

The closing square bracket marks the end of an assembler section. Unlike in BASIC, a closing statement does not mean the end of the routine. You must explicitly return before closing up, as shown in this example.

The "NEXT" matches the "FOR" (above).

Finally, we "CALL code%", or in other words we branch to the address held in the variable code% and begin execution there - hence we execute the assembler section.
We DO NOT call P% as it is incremented as each instruction is compiled. Only code% points to the start of the assembler code.

Try it.

 

 

Another thing to note is that assembler is a "compiled language" similar to C, but in the loosest sense of the concept. You cannot usually type assembler at the BASIC prompt and get it to run. Programs are written by the following steps:

  1. Program coded in !Edit, or some other kind of editor.
  2. Assembler is compiled into a machine code file.
  3. Machine code file can then be executed.
The sheer power of BASIC's assembler means that you can insert small sections of code into a program which are compiled at program initialisation, and can then be used in the program. The BASIC language can also be used as a rather powerful macro assembler.

There is another way of writing assembler - the APCS specification. This method is used to link assembler with other high-level compiled languages (such as C or Pascal), or to take advantage of Acorn's Desktop Development Environment. You can read about it here, but you are advised to become familiar with assembler before venturing on to APCS.

 

 

The last thing to cover is the notation that is used in this section. It will be familiar to BASIC coders, but may seem confusing to others...

  &    Denotes a hexadecimal number.
       For example &16F is the number 367 in hex.

       Other ways of denoting hexadecimal are:
         $16F   0x16F   16Fh   H16F


  >>   Binary shift right, has the effect of dividing
       a number by two for each place shifted.

       Thus:  12 >> 1 equals 6
        and: 128 >> 3 equals 16.

       Work this out in binary if you are unsure.


  <<   Binary shift left, has the effect of multiplying
       a number by two for each place shifted.

       Thus: 12 << 1 equals 24
        and:  3 << 8 equals 768


  %    Denotes a binary number.
       For example %11010010 is 210 represented in binary.
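
If you would like to check the notation above for yourself, here is a minimal sketch in Python, chosen only because it is easy to run. Python writes `&16F` as `0x16F` and `%11010010` as `0b11010010`; the shift operators are spelt the same way.

```python
# Hexadecimal: BASIC's &16F is written 0x16F in Python.
hex_value = 0x16F
parsed_hex = int("16F", 16)      # the same number, parsed from a string

# Shifting right by n places divides by 2**n (any remainder is discarded).
right_one = 12 >> 1
right_three = 128 >> 3

# Shifting left by n places multiplies by 2**n.
left_one = 12 << 1
left_eight = 3 << 8

# Binary: BASIC's %11010010 is written 0b11010010 in Python.
binary_value = 0b11010010

print(hex_value, parsed_hex)
print(right_one, right_three)
print(left_one, left_eight)
print(binary_value)
```
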
If you have not come across binary before, you should at least know that a computer represents data (text, pictures, etc) as a series of bytes. These bytes themselves are made up of eight bits. These bits reflect the ON/OFF patterns used by the computer. Binary is a way of representing these patterns in a readable form.
When you get the hang of binary, it is really quite simple. Binary reads from the rightmost 'bit' (called bit zero, NOT bit one) to the leftmost. The rightmost bit is worth '1'. The second bit is worth '2', and so on, doubling each time. For example:
  %1    1    1    0    0    1    1    1    1    1    1    1

   2048 1024 512  256  128  64   32   16   8    4    2    1
The upper row is an example binary number. The lower row shows the values associated with each bit. To calculate the value of the binary number, simply add it all up! As you can see, bits seven and eight are zero, so skip them when adding...
1 + 2 + 4 + 8 + 16 + 32 + 64 + 512 + 1024 + 2048 = 3711
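
The same adding-up can be sketched in a few lines of Python. The function name `binary_value` is just a name picked for this illustration; it walks the bits from the right, counting from bit zero, and adds the place value of every bit that is set.

```python
def binary_value(bits):
    """Sum the place values of the '1' bits, with bit zero being the rightmost."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 1 << position   # bit n is worth 2 to the power n
    return total

print(binary_value("111001111111"))
```
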

Finally, here's a quick tip for converting binary to hexadecimal.
People are not good at remembering large number sequences (well, except Carol Vorderman) and trying to get %1011111011101111000100000110 right will prove difficult, especially if you are trying to remember other stuff at the same time.
You can, however, convert between binary and hex extremely easily, and you only need to count up to 15 in binary.

Chop the number into groups of four, remember to begin on
the right hand side:

        1011 1110 1110 1111 0001 0000 0110

Then convert each group of four into a single denary number:

        1011 1110 1110 1111 0001 0000 0110

        =11  =14  =14  =15  =1   =0   =6

And convert that denary number into hex:

        1011 1110 1110 1111 0001 0000 0110

        11   14   14   15   1    0    6

        =B   E    E    F    1    0    6

Which tells us that %1011111011101111000100000110 can be
better remembered as &BEEF106.

This process works in reverse too.

Unfortunately you cannot stop at the denary version as there
is no direct correlation between base 2 or base 16 and base
10. In case you are interested, that number is 200208646.
:-)
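
The chop-into-fours trick, and its reverse, can be sketched in Python as follows. The function names `binary_to_hex` and `hex_to_binary` are made up for this illustration.

```python
def binary_to_hex(bits):
    """Convert a binary string to hex by grouping the bits in fours from the right."""
    # Pad on the left so the length is a multiple of four.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    digits = "0123456789ABCDEF"
    # Each group of four is a number from 0 to 15, which is one hex digit.
    return "".join(digits[int(group, 2)] for group in groups)

def hex_to_binary(hex_digits):
    """The reverse: expand each hex digit into its group of four bits."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_digits)

print(binary_to_hex("1011111011101111000100000110"))
print(hex_to_binary("BEEF106"))
```
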

Copyright © 2004 Richard Murray