It is pretty standard that most programmers, when testing a build environment, write a little program to spit "Hello World!" to the display. There is no real requirement for originality here as we don't really care much for what is written, only that it actually works. After all, it is utterly pointless to try any sort of coding when you don't have a way to turn the source into a running program.
The prerequisites are the OSD sources and something to run them on (I use Portable Ubuntu - details here):
That is so the system can find the "arm-linux-<blah>" tools and their libraries/resources.
Anyway, I wrote the following:
And compiled it with:
arm-linux-gcc -o helloworld helloworld.c
And I was rewarded with a 7,154 byte program.
You WHAT? A little shy of 8K just to write 13 characters to the screen? You're kidding, right? I know there will be some overheads for file headers and library code, but come on. I once made a complete GIF decoder in nearly a quarter of that size.
The next optimisation is to realise that there is a lot of junk put into the file that isn't necessary. Luckily there's a tool to cleanse the file:
Now we're looking at a file 3,072 bytes in size. To print a 13 character string.
So, being the sort of boring geek that I am, I decided to do something about this.
The first step was to brush up on the ELF file format (you'll find the spec at http://refspecs.freestandards.org/elf/elf.pdf), and how one calls system functions under Linux (you'll need to Google around and wade through lots of INT 0x80 trivia).
This is important because - sorry guys - we're pretty much going to have to drop C in the bin. Or /dev/null if you prefer.
So let me introduce, to you, the all new revised helloworld_tiny.s program.
The first part is the ELF header:
@ Write a basic ELF header [ALL words are written backwards!]
.word 0x464C457F @ ELF "magic" value
.word 0x61010101 @ Type = 32 bit, word order LSB, ver 1
.word 0 @ padding
.word 0 @ padding
.word 0x00280002 @ executable file, ARM CPU
.word 0x00000001 @ version = 1 (current)
.word 0x00008068 @ entry point, start of execution
.word 0x00000034 @ program header table offset
.word 0 @ section table offset (there is none)
.word 0x00000002 @ processor specific flags (2 = EF_ARM_HASENTRY)
.word 0x00200034 @ ELF header size, size of ptab entry
.word 0x00000001 @ num of ptab ents, size of sectab ents
.word 0 @ num sectab ents, ptr to string table
This defines a fairly simple header to say we're for a 32 bit ARM system, there's one program table entry to define what we'll be loading. Oh, and our entry point is &8068.
Some of the "unknown" values (processor-specific flags, bytes that should be zero but aren't) have just been copied from the headers of other files.
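As a sanity check on the "written backwards" business, the thirteen words can be packed as little-endian and read back using the field layout of the spec's Elf32_Ehdr structure. This is only an illustrative Python sketch, run on whatever machine is handy; the variable names are made up for the example and it is not part of the build.

```python
import struct

# The thirteen .word values from the listing, in order.
WORDS = [0x464C457F, 0x61010101, 0, 0, 0x00280002, 0x00000001, 0x00008068,
         0x00000034, 0, 0x00000002, 0x00200034, 0x00000001, 0]

# Little-endian storage puts the low byte of each word first in the file.
header = struct.pack("<13I", *WORDS)

# Elf32_Ehdr: a 16-byte ident block followed by halfword and word fields.
FIELDS = ("e_ident e_type e_machine e_version e_entry e_phoff e_shoff e_flags "
          "e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx").split()
ehdr = dict(zip(FIELDS, struct.unpack("<16sHHIIIIIHHHHHH", header)))

print(header[:4])              # b'\x7fELF'
print(ehdr["e_machine"])       # 40, the ELF machine number for ARM
print(hex(ehdr["e_entry"]))    # 0x8068
```

Note how the word 0x00280002 splits into two halfword fields: the low half (2, executable) is e_type and the high half (0x28, ARM) is e_machine.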
So obviously, the next step is the program header table, which is:
@ Now for a basic program header table
.word 0x00000001 @ type = PT_LOAD (loadable)
.word 0 @ offset (0 = load from start)
.word 0x00008000 @ virtual address to load to
.word 0x00008000 @ physical address to load to
.word 0x00000080 @ number of bytes to load
.word 0x00000080 @ size of memory image
.word 0x00000005 @ flags = Executable (1) and Read (4)
.word 0x00008000 @ alignment
This is it for the headers. Note that the ELF specification (v1.2) says that a section table (that .text, .bss stuff) is optional. Likewise, the PT_PHDR definition in the program header table does not have to be present.
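The program header entry is just eight little-endian words, so it can be modelled in a few lines of Python. The sketch below is an illustration only: the function name is hypothetical, and passing 128 (the size of the finished file) as the segment size assumes the single segment covers the whole file.

```python
import struct

PT_LOAD = 1                  # segment type: loadable
PF_X, PF_W, PF_R = 1, 2, 4   # permission bits from the ELF spec

def load_segment_header(address, size, flags, align):
    """Pack one Elf32_Phdr describing a segment loaded from file offset 0."""
    # Field order: p_type, p_offset, p_vaddr, p_paddr,
    #              p_filesz, p_memsz, p_flags, p_align
    return struct.pack("<8I", PT_LOAD, 0, address, address,
                       size, size, flags, align)

# Assumption: the segment spans the entire 128-byte file.
phdr = load_segment_header(0x8000, 128, PF_R | PF_X, 0x8000)
print(len(phdr), hex(PF_R | PF_X))   # 32 0x5
```

The 32-byte length matches the "size of ptab entry" (0x20) declared in the ELF header, and the flags value of 5 is simply Read (4) plus Executable (1).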
So the final part of the equation is some actual code:
@ Now for some really simple code to print the message to the terminal.
message:
.ascii "Hello World! :-)\n"
.balign 4 @ pad to a word boundary so the code starts at &8068
mov r0, #1 @ 1 = stdout
adr r1, message @ pointer to message
mov r2, #17 @ message length
swi 0x900004 @ swi call for Sys_Write
mov r0, #0 @ set return code
swi 0x900001 @ swi call for Sys_Exit
@ That's it! Done.
This sets up, basically, a call to the system "write" function to write the 17 bytes of the message to the console (the stdout pseudo-file). Then we call the system "exit" function with a return code of zero.
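The entry point of &8068 and the final file size both fall out of simple arithmetic on the pieces described so far. Here is a rough Python sketch of that sum, assuming the message is padded up to a word boundary (which the entry point implies) and that the code is six 4-byte ARM instructions; the names are invented for the example.

```python
EHDR_SIZE = 0x34      # ELF header size (e_ehsize)
PHDR_SIZE = 0x20      # one program header entry (e_phentsize)
LOAD_ADDR = 0x8000    # virtual address from the program header
MESSAGE = b"Hello World! :-)\n"
INSTRUCTIONS = 6      # each ARM instruction is 4 bytes

def align4(n):
    """Round n up to the next multiple of four."""
    return (n + 3) & ~3

message_offset = EHDR_SIZE + PHDR_SIZE                 # 0x54
code_offset = align4(message_offset + len(MESSAGE))    # 0x68
entry_point = LOAD_ADDR + code_offset                  # 0x8068
file_size = code_offset + INSTRUCTIONS * 4             # 128

print(len(MESSAGE), hex(entry_point), file_size)       # 17 0x8068 128
```

If the message changes length, the entry point in the ELF header has to be recalculated by hand in the same way, since nothing in this file is relocated by a linker.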
This assembles with:
arm-linux-gcc -c -o helloworld_tiny helloworld_tiny.s
But we are not done yet.
If you examine the file, you'll notice it is still quite big. Because:
pubuntu@pubuntu:~/osd/our-apps/helloworld$ hd helloworld_tiny
00000000 7f 45 4c 46 01 01 01 61 00 00 00 00 00 00 00 00 |.ELF...a........|
00000010 01 00 28 00 01 00 00 00 00 00 00 00 00 00 00 00 |..(.............|
00000020 e0 00 00 00 00 00 00 00 34 00 00 00 00 00 28 00 |........4.....(.|
00000030 07 00 04 00 7f 45 4c 46 01 01 01 61 00 00 00 00 |.....ELF...a....|
00000040 00 00 00 00 02 00 28 00 01 00 00 00 68 80 00 00 |......(.....h...|
00000050 34 00 00 00 00 00 00 00 02 00 00 00 34 00 20 00 |4...........4. .|
00000060 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 |................|
00000070 00 80 00 00 00 80 00 00 80 00 00 00 80 00 00 00 |................|
00000080 05 00 00 00 00 80 00 00 48 65 6c 6c 6f 20 57 6f |........Hello Wo|
00000090 72 6c 64 21 20 3a 2d 29 0a 00 00 00 01 00 a0 e3 |rld! :-)........|
000000a0 20 10 1f e5 0d 20 a0 e3 04 00 90 ef 00 00 a0 e3 | .... ..........|
000000b0 01 00 90 ef 00 2e 73 79 6d 74 61 62 00 2e 73 74 |......symtab..st|
000000c0 72 74 61 62 00 2e 73 68 73 74 72 74 61 62 00 2e |rtab..shstrtab..|
000000d0 74 65 78 74 00 2e 64 61 74 61 00 2e 62 73 73 00 |text..data..bss.|
000000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
What the hell? There's two ELF headers?
Yes. That is correct. Sadly the GNU assembler does not appear to have an option to spit out a flat binary file. It is a marked-up ELF for passing to the linker. I'm not actually sure if it is physically possible to get a pure binary file out of the GNU tools.
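For what it's worth, GNU binutils does include objcopy, and its "-O binary" output format writes raw section contents with no ELF wrapper, so something along the lines of "arm-linux-objcopy -O binary" on the object file may well do the job (the tool name is a guess based on the arm-linux- prefix, and this is untested with the OSD toolchain). The same idea can be sketched in Python: walk the outer file's section table, find .text, and keep only its bytes. The function name below is hypothetical and the code assumes a 32-bit little-endian ELF object like the one in the dump.

```python
import struct

def extract_text_section(obj: bytes) -> bytes:
    """Return the raw contents of .text from a 32-bit little-endian ELF object."""
    if obj[:4] != b"\x7fELF" or obj[4] != 1 or obj[5] != 1:
        raise ValueError("not a 32-bit little-endian ELF file")

    # Section table location and shape, read from the outer ELF header.
    (e_shoff,) = struct.unpack_from("<I", obj, 0x20)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", obj, 0x2E)

    def section(index):
        # Elf32_Shdr: name, type, flags, addr, offset, size,
        #             link, info, addralign, entsize
        return struct.unpack_from("<10I", obj, e_shoff + index * e_shentsize)

    names_offset = section(e_shstrndx)[4]   # where the section names live
    for index in range(e_shnum):
        sh = section(index)
        start = names_offset + sh[0]
        name = obj[start:obj.index(b"\0", start)]
        if name == b".text":
            return obj[sh[4]:sh[4] + sh[5]]
    raise ValueError("no .text section found")

# Usage sketch (file names are placeholders):
# with open("helloworld_tiny", "rb") as f:
#     flat = extract_text_section(f.read())
# with open("helloworld_tiny.bin", "wb") as f:
#     f.write(flat)
```

In the dump above the section table sits at offset &E0 with seven 40-byte entries, and the hand-written ELF image is the .text payload that begins at offset &34.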
At this point I cheat. A lot. I copy the file back to Windows, and use Hexplorer to simply rip out the bits that are not wanted. As we are working in pure assembly here, we know exactly what our file should look like. And it should look like this:
Now for the moment of truth. Over to the OSD:
~ $ wget http://azumi/helloworld_tiny
--2011-04-03 23:59:57-- http://azumi/helloworld_tiny
Connecting to 192.168.0.10:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 128 [application/octet-stream]
Saving to: `helloworld_tiny'
100%[======================================>] 128 --.-K/s in 0.001s
Last-modified header invalid -- time-stamp ignored.
2011-04-03 23:59:57 (213 KB/s) - `helloworld_tiny' saved [128/128]
~ $ chmod 755 helloworld_tiny
~ $ ./helloworld_tiny
Hello World! :-)
And there, ladies and gentlemen, you have it. The smallest legal program to display a Hello World message on ARM Linux, a mere 128 bytes. The stripped C version is 24 times larger.
There are more optimisations possible: dropping the smiley from the message would save a word, and I could start abusing the ELF headers. But I'm looking for the smallest valid program, not the smallest nightmare. ☺
There is a change, I should mention, where the later EABI kernels tidy up the messy SWI dispatch mechanism by placing the call number in R7 and calling SWI 0. But the OSD doesn't run such a kernel (unless you are called Gerry Boland!), and the later kernels are backward compatible.
Right now I have a massive smile on my face. Not just because I got this working, but because I have been able to write some ARM code (all <cough>six</cough> lines of it!). There is just something karmically pleasing about ARM code.
|joe, 4th April 2011, 15:07|
I cannot comment on this problem, because I don't have enough ARM assembler knowledge, I've just started on this journey, I would appreciate some pointers, how to know, which bits you can delete from the binary file using hexeditor and where to find the opcodes and reverse engineering tips.
© 2011 Rick Murray
This web page is licenced for your personal, private, non-commercial use only. No automated processing by advertising systems is permitted.
RIPA notice: No consent is given for interception of page transmission.