The Amélie project
Information - 6502asm

I know it is extremely geeky, but since I didn't like the lack of Acornisms in the existing 6502 assemblers, I wrote my own!

 

Introduction

6502asm is a 6502 assembler designed for Amélie.
Modern assemblers contain all sorts of complicated lexical analysis code, hash tables, and the kinds of things that make a program capable but a nightmare to figure out.
6502asm takes the exact opposite approach: it has been designed to be internally simple.

It has been designed to be simple to write, simple to use, and portable. At the time of writing it is only available for MS-DOS, but it is expected that a RISC OS version will be along soon.

 

The assembly process

6502asm is a two-pass assembler. It will start at the beginning of the first part of the file passed, and it will carry right through to the end, ignoring "label not found" errors but recording the addresses of labels. It will then start again, at the beginning, and reassemble.
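
To make the two passes concrete, here is a minimal sketch in C. It is not taken from the 6502asm source: the Stmt and Label types and the run_pass(), label_store() and label_lookup() names are invented for illustration, and every instruction is assumed to be three bytes long to keep things short.

```c
#include <string.h>

/* Hypothetical toy model: each statement either defines a label or is a
   3-byte instruction (opcode + 16-bit address) that refers to one. */
typedef struct {
    const char *define;   /* label defined at this point, or NULL */
    const char *refer;    /* label used by a 3-byte instruction, or NULL */
} Stmt;

typedef struct {
    const char *name;
    unsigned address;
} Label;

#define MAX_LABELS 16

static Label labels[MAX_LABELS];
static int label_count = 0;

/* Return the label's address, or -1 if it has not been seen yet. */
static long label_lookup(const char *name)
{
    for (int i = 0; i < label_count; i++)
        if (strcmp(labels[i].name, name) == 0)
            return (long)labels[i].address;
    return -1;
}

/* Record a label, updating the address if it is already known. */
static void label_store(const char *name, unsigned address)
{
    for (int i = 0; i < label_count; i++) {
        if (strcmp(labels[i].name, name) == 0) {
            labels[i].address = address;
            return;
        }
    }
    if (label_count < MAX_LABELS) {
        labels[label_count].name = name;
        labels[label_count].address = address;
        label_count++;
    }
}

/* Run one pass starting at 'origin'. Returns the number of references to
   labels that were not yet known: harmless on pass 1, an error on pass 2. */
static int run_pass(const Stmt *prog, int n, unsigned origin)
{
    unsigned pc = origin;
    int unresolved = 0;
    for (int i = 0; i < n; i++) {
        if (prog[i].define)
            label_store(prog[i].define, pc);
        if (prog[i].refer) {
            if (label_lookup(prog[i].refer) < 0)
                unresolved++;
            pc += 3;
        }
    }
    return unresolved;
}
```

Calling run_pass() twice over the same statements mirrors the behaviour described above: the first call may report forward references as unresolved, while the second finds every label already in the table.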

This process is liable to come unstuck if you define page zero locations after the code that uses them. The first pass will have assumed non-page-zero (three byte) instructions, but the second pass knows the labels are in page zero and assembles the shorter two byte forms, so all subsequent offsets will be a byte off. Currently 6502asm doesn't warn you of this.
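
The size mismatch can be sketched as follows. This is only an illustration under the assumption that an unknown label is treated as a full 16-bit address on the first pass; operand_size() is a made-up name, not a function in 6502asm.

```c
/* Size in bytes of an "absolute or zero page" instruction.
   'value' is the operand, or -1 if the label is not yet known. */
static int operand_size(long value)
{
    if (value >= 0 && value < 256)
        return 2;   /* opcode + one-byte zero page address */
    return 3;       /* opcode + two-byte absolute address (also the fallback) */
}
```

With a label that is undefined on pass one but turns out to be, say, &70 on pass two, this returns 3 and then 2 for the same instruction, which is exactly the one byte drift described above. Defining page zero labels before the code that uses them avoids it.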

Within 6502asm, you are setting up labels and assembling into a flat 64K block of memory - this represents the addressing range of the 6502 processor. By using the commands BOT and TOP, you can control which part of this memory you actually want to save (by default BOT = &0000 and TOP = last instruction assembled).
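
As a rough sketch of how BOT and TOP select what gets saved, assuming TOP is inclusive (which the ROM directive's TOP &FFFF suggests) and using the invented names range_length() and save_range():

```c
#include <stdio.h>

#define MEM_SIZE 0x10000UL   /* the 6502's full 64K address range */

/* Number of bytes in the inclusive range bot..top, or 0 if it is invalid. */
static unsigned long range_length(unsigned long bot, unsigned long top)
{
    if (bot > top || top >= MEM_SIZE)
        return 0;
    return top - bot + 1;
}

/* Write memory[bot..top] to 'out'; returns the number of bytes written. */
static unsigned long save_range(const unsigned char *memory,
                                unsigned long bot, unsigned long top,
                                FILE *out)
{
    unsigned long length = range_length(bot, top);
    if (length == 0)
        return 0;
    return (unsigned long)fwrite(memory + bot, 1, (size_t)length, out);
}
```

For example, BOT = &F000 and TOP = &FFFF would save 4096 bytes, which is the final 4K of the memory area.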

 

How the line is parsed

With the exception of quotes, the line is 'compressed'. This means that every space is removed, so:

LDA pickup, X    ; read this pickup point with X indexed

becomes:

ldapickup,x;readthispickuppointwithxindexed

The line is also made lowercase to simplify instruction matching. The original (unmodified) version is available for cases where string literals are to be read.
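
A minimal sketch of this compression step might look like the following. It is an illustration rather than the code in line.c: compress_line() is a hypothetical name, and treating tabs like spaces and using double quotes as the only quote character are assumptions.

```c
#include <ctype.h>

/* Copy 'src' to 'dst' with spaces and tabs removed (except inside double
   quotes) and all letters lowercased. 'dst' must have room for
   strlen(src) + 1 bytes. 'src' is left untouched so that string literals
   can still be read from it in their original case. */
static void compress_line(const char *src, char *dst)
{
    int in_quotes = 0;
    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        if (c == '"')
            in_quotes = !in_quotes;
        if (!in_quotes && (c == ' ' || c == '\t'))
            continue;
        *dst++ = (char)tolower(c);
    }
    *dst = '\0';
}
```

Because the source buffer is never modified, the caller always has both versions to hand: the compressed copy for matching instructions and the original for string literals.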

 

Of directives and instructions

An assembler directive is a command that directs the assembler to do something. On the whole, directives look like instructions. This is intentional: pretty much every assembler recognises the org directive to set the current assembly address, so you should rarely have to do weird things with directives, such as "{org}" and the like.

6502asm has a rich set of directives, most of which are three characters in length. These are matched in "command_parse()" within "command.c".

  • BOT <addr> [FORCE]
    Sets the lowest address to be saved. The 'force' option is used when raising the bottom address above a known address; e.g. if you want to save the final 4K of the memory area but you have defined zero page labels, you would use BOT &F000 FORCE.
     
  • CNT <filename>
    Continue assembling from the file given.
     
  • DCB / DCW / DCS / DCZ
    Insert a byte, 16bit word, string literal, or zero-terminated string literal.
     
  • EQUx
    Alternative form of DCx.
     
  • FIL <count>, <byte>
    Insert <count> repetitions of <byte>.
     
  • INF <file>
    Insert the entire contents of the named file.
     
  • ORG <address>
    Set the assembly address.
     
  • ROM
    This command was inserted for use within Amélie. It will clear the entire memory map to &FF and then perform the equivalent of:
    • BOT  &F000
    • TOP  &FFFF
    • ORG  &FFFA
    • DCW  nmi_vector
    • DCW  reset_vector
    • DCW  irq_vector
    • BOT  &F000
    • TOP  &FFFF
    • ORG  &F000
    You must provide the three vector labels in your code.
     
  • TOP <address>
    This sets the upper address. Normally this is automatically set from the last instruction assembled.

Anything not recognised at this point is assumed to be an instruction, and is looked up in the instruction table.
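
Going back to the ROM directive, its effect on the memory block can be sketched like this. The names poke_word() and rom_setup() are hypothetical, and the vector addresses are passed in as plain numbers here, whereas in 6502asm they come from labels you define in your code.

```c
#include <string.h>

#define MEM_SIZE 0x10000UL   /* the 6502's full 64K address range */

/* Store a 16-bit value low byte first, as the 6502 expects. */
static void poke_word(unsigned char *memory, unsigned long addr, unsigned value)
{
    memory[addr]     = (unsigned char)(value & 0xFF);
    memory[addr + 1] = (unsigned char)((value >> 8) & 0xFF);
}

/* Clear the map to &FF, write the three vectors at &FFFA..&FFFF, then
   leave BOT, TOP and ORG set up for a 4K ROM image at &F000. */
static void rom_setup(unsigned char *memory,
                      unsigned nmi, unsigned reset, unsigned irq,
                      unsigned long *bot, unsigned long *top,
                      unsigned long *org)
{
    memset(memory, 0xFF, MEM_SIZE);
    poke_word(memory, 0xFFFA, nmi);     /* DCW nmi_vector   */
    poke_word(memory, 0xFFFC, reset);   /* DCW reset_vector */
    poke_word(memory, 0xFFFE, irq);     /* DCW irq_vector   */
    *bot = 0xF000;
    *top = 0xFFFF;
    *org = 0xF000;
}
```

Filling with &FF first means any unused part of the image reads as &FF, which is what a blank EPROM contains.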

 

Addressing modes

6502asm supports the NMOS addressing modes, detected as follows (in "command_findaddrmode()" within "command.c"):

    • Implied
      <instruction>
       
    • Accumulator
      <instruction> A
       
    • Immediate
      <instruction> #<value>
       
    • Absolute or zero page (if value less than 256)
      <instruction> <value>
       
    • Absolute/zero page + X
      <instruction> <value>, x
       
    • Absolute/zero page + Y
      <instruction> <value>, y
       
    • Indexed X (pre-indexed indirect)
      <instruction> (<value>, x)
       
    • Indexed Y (post-indexed indirect)
      <instruction> (<value>), y

Things to note: accumulator addressing must have the 'A' register specified. Many assemblers will assume it based upon the instruction; 6502asm will not. Also, there is no detection for relative addressing, as this depends upon the instruction (all branch instructions use relative addressing).
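
The detection rules listed above could be sketched along these lines. This is not the real command_findaddrmode(): the AddrMode enum, ends_with() and find_addr_mode() are invented names, the operand is assumed to be already compressed (no spaces, lowercase) with any comment removed, and the plain indirect case is included only because the instruction table has a column for it.

```c
#include <string.h>

typedef enum {
    MODE_IMPLIED, MODE_ACCUMULATOR, MODE_IMMEDIATE, MODE_ABSOLUTE_OR_ZP,
    MODE_INDEXED_X, MODE_INDEXED_Y, MODE_INDIRECT_X, MODE_INDIRECT_Y,
    MODE_INDIRECT
} AddrMode;

/* Return 1 if 's' ends with 'suffix', otherwise 0. */
static int ends_with(const char *s, const char *suffix)
{
    size_t ls = strlen(s), lx = strlen(suffix);
    return ls >= lx && strcmp(s + ls - lx, suffix) == 0;
}

/* Classify a compressed operand purely by its shape. */
static AddrMode find_addr_mode(const char *operand)
{
    if (operand[0] == '\0')            return MODE_IMPLIED;
    if (strcmp(operand, "a") == 0)     return MODE_ACCUMULATOR;
    if (operand[0] == '#')             return MODE_IMMEDIATE;
    if (operand[0] == '(') {
        if (ends_with(operand, ",x)")) return MODE_INDIRECT_X;  /* (value,x) */
        if (ends_with(operand, "),y")) return MODE_INDIRECT_Y;  /* (value),y */
        if (ends_with(operand, ")"))   return MODE_INDIRECT;    /* (value)   */
    }
    if (ends_with(operand, ",x"))      return MODE_INDEXED_X;
    if (ends_with(operand, ",y"))      return MODE_INDEXED_Y;
    return MODE_ABSOLUTE_OR_ZP;
}
```

Note that nothing here needs to know which instruction is being assembled, which is why relative addressing cannot be detected at this stage: a branch target looks exactly like an absolute address.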

Just to re-iterate, 6502asm does not support the additional 65C02 or plentiful 65CE02 addressing modes.

 

The instruction table

The instruction table is opcode.dat and it looks like:

;     AccImmAbsZpaZpxZpyAbxAbyImpRelInxInyAbiEmu
inc,0,--,--,EE,E6,F6,--,FE,--,--,--,--,--,--,--
inx,0,--,--,--,--,--,--,--,--,E8,--,--,--,--,--
iny,0,--,--,--,--,--,--,--,--,C8,--,--,--,--,--
jmp,0,--,--,4C,--,--,--,--,--,--,--,--,--,6C,--
jsr,0,--,--,20,--,--,--,--,--,--,--,--,--,--,--
lda,0,--,A9,AD,A5,B5,--,BD,B9,--,--,A1,B1,--,--
ldx,0,--,A2,AE,A6,--,B6,--,BE,--,--,--,--,--,--
ldy,0,--,A0,AC,A4,B4,--,BC,--,--,--,--,--,--,--

On the left, the instruction. Then follows a CPU type ID, which is allocated as follows:

    0. All versions of the 6502
    1. NMOS 6502
    2. CMOS 65C02
    3. CMOS extended 65CE02
    4. Other

Following this, the instruction codes per addressing mode, or "--" if invalid. From the snippet above, you can see that LDA #4 (the immediate mode) is instruction &A9.
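
Reading one line of the table is straightforward because every field has a fixed width. The sketch below assumes exactly that layout (a three letter mnemonic, a single digit CPU type, then fourteen two character fields); OpcodeRow, the COL_ names and parse_opcode_line() are made up for this example and are not the names used in 6502asm.

```c
#include <stdlib.h>
#include <string.h>

/* Column order as given in the table's header comment. */
enum { COL_ACC, COL_IMM, COL_ABS, COL_ZPA, COL_ZPX, COL_ZPY, COL_ABX,
       COL_ABY, COL_IMP, COL_REL, COL_INX, COL_INY, COL_ABI, COL_EMU,
       COL_COUNT };

typedef struct {
    char mnemonic[4];
    int  cpu;
    int  opcode[COL_COUNT];   /* 0..255, or -1 where the table has "--" */
} OpcodeRow;

/* Parse one non-comment line of the table.
   Returns 1 on success, 0 if the line is malformed. */
static int parse_opcode_line(const char *line, OpcodeRow *row)
{
    if (strlen(line) < (size_t)(6 + 3 * COL_COUNT - 1)
        || line[3] != ',' || line[5] != ',')
        return 0;

    memcpy(row->mnemonic, line, 3);
    row->mnemonic[3] = '\0';
    row->cpu = line[4] - '0';

    for (int i = 0; i < COL_COUNT; i++) {
        const char *field = line + 6 + 3 * i;   /* each field is "XX," */
        if (field[0] == '-' && field[1] == '-') {
            row->opcode[i] = -1;
        } else {
            char hex[3] = { field[0], field[1], '\0' };
            char *end;
            long value = strtol(hex, &end, 16);
            if (*end != '\0')
                return 0;
            row->opcode[i] = (int)value;
        }
    }
    return 1;
}
```

After parsing the lda line from the snippet, row.opcode[COL_IMM] holds &A9, matching the LDA #4 example above.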

Some things you could add to the table:

  • Copy ADC and SBC and call them ADD and SUB respectively.
     
  • The mnemonics use 'P' for the processor status register (PHP, PLP) and 'S' for the stack pointer (TSX, TXS), which is easy to misread. You could add aliases as you see fit.
     
  • You could alias extra branch instructions to reflect the flag behaviour - BEQ and BNE should also be BZS and BZC (akin to BCC and BCS).
     
  • Provided no fruity addressing modes are required, you could add some 65C02 instructions, such as...
    • BBR (Branch on Bit Reset) - this could be aliased as BBC?
    • BBS (Branch on Bit Set)
    • BRA (BRanch Always)
    • PHX (PusH X onto stack)
    • PHY (PusH Y onto stack)
    • PLX (PulL X from stack)
    • PLY (PulL Y from stack)
    • RMB (Reset Memory Bit)

 

Just show me the code!

The code is written in plain C, with C style comments. It has been written to be completely portable.

6502asm compiles on these systems:

    • 16-bit DOS (all versions of MS-DOS) = TurboC++ v1.0
      The project files supplied are for use with TurboC++, which is downloadable from Borland (look for the museum). Because the assembly memory array is bigger than a segment, you'll need to use the huge memory model.
       
    • 32-bit DOS (DOS shell under NT, W95/98/ME/2000/XP) = OpenWatcom v1.2
      Project files are not available. Take the C and H files, and create a new project (I called mine 6502asm32), set release options (i.e. no debug, best optimisations, flat memory model) and hit MAKE ALL.
       
    • RISC OS = any Norcroft/Acorn compiler
      MakeFile is not currently available. Take the C and H files and create a new MakeFile for them. I recommend the use of the latest compiler for proper 26/32bit neutrality, but if you have an older compiler (i.e. v4.00) then I don't see any reason why it won't work on older machines, or the Iyonix with Aemulor.
       
    • Other compilers
      This should compile under other compilers: gcc, lcc, EasyC, etc. You're on your own though...

If you need any help with compiling 6502asm, please read the documentation that came with your compiler.

If you need any help with 6502asm's code, feel free to contact me.

 

Modules

command.c is the main core of the assembler, and where most of the work happens.

hash.c is where the '#' commands are handled. These are special commands for things like conditional compilation and the like (much of it not yet written!).

label.c is the code that handles allocating memory for labels, storing labels and addresses, and looking up labels.

line.c is code to read a line from file, and remove spaces. It copes with RISC OS (&0A terminated) lines and MS-DOS (&0D/&0A terminated) lines. You will need to modify this if your system uses a different method.
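
Coping with both line endings can be as simple as trimming any trailing &0D or &0A bytes once a line has been read. A minimal sketch, using the invented name strip_line_ending() rather than anything from line.c:

```c
#include <string.h>

/* Remove any trailing CR (&0D) and LF (&0A) characters in place, so that
   RISC OS and MS-DOS style lines end up looking the same. */
static void strip_line_ending(char *line)
{
    size_t n = strlen(line);
    while (n > 0 && (line[n - 1] == '\n' || line[n - 1] == '\r'))
        line[--n] = '\0';
}
```

A system that terminates lines with &0D alone would still need the reading code itself changed, since a reader that splits on &0A would see the whole file as one line.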

memarea.c allocates a 64K block of memory. On useful systems, memory is flat and you can just call calloc() and free(); however, DOS (via TurboC++) is segmented, so the __MSDOS__ conditional code will call a 'far' version to achieve the same thing. If your system has unusual memory allocation ideas, this is where to sort it out. The rest of 6502asm expects it to just be a character array.

wrapper.c is the entry point. It organises initialisation and opens the files. It then reads in each line and, depending on the first character, will call the appropriate handler.

 

Global variables and structures

I will not describe these here - the 6502asm.h file is fully commented.

 

Release notes

6502asm is not yet 'finished', nor has it really been tested, so I have nothing to add at this time.

© 2005 Rick Murray