From ARMwiki

In the early days of computing, you had a lump of silicon which performed a number of instructions.

As time progressed, more and more facilities were required, so more and more instructions were added. However, according to the 80/20 rule, roughly 20% of the available instructions are likely to be used 80% of the time, with some instructions used only very rarely. Some of these little-used instructions are very complex, so implementing them directly in silicon is an arduous task. Instead, the processor designer uses microcode.

To illustrate this, we should consider a modern CISC processor, such as the Intel Atom® (as in the Eee PC and similar). When the processor is asked to execute x86 instructions, it translates them into "micro-ops", which are effectively RISC instructions that are executed on the RISC core.
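
The translation step can be pictured with a small Python sketch. It is only an illustration: the tuple format, the temporary register name "tmp0" and the memory label "counter" are all invented here, and a real decoder is far more involved than this.

```python
# Hypothetical micro-op decoder. The tuple format and the temporary
# register name "tmp0" are assumptions made for illustration only.
def decode(instr):
    """Expand one CISC-style instruction into simple micro-ops."""
    op, dst, src = instr
    if dst.startswith("["):              # memory destination, e.g. ADD [counter], EBX
        addr = dst.strip("[]")
        return [
            ("load", "tmp0", addr),      # tmp0 <- memory[addr]
            (op, "tmp0", src),           # tmp0 <- tmp0 op src
            ("store", addr, "tmp0"),     # memory[addr] <- tmp0
        ]
    return [(op, dst, src)]              # register-only form is already simple

micro_ops = decode(("add", "[counter]", "EBX"))
for uop in micro_ops:
    print(uop)
```

In this sketch an instruction that reads, modifies and writes memory becomes three simple steps, while a register-to-register instruction passes through unchanged.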

The CISC design principle was to create an instruction set that, as far as possible, supports the programmer directly in hardware. High-level languages were not common and the majority of coding was in assembler, so the idea was to make the processor able to do more and more.
While this may sound like a bizarre thing to suggest, we arrive at a situation where we can have processors that understand high-level concepts, such as arrays and array boundaries.

The problem with CISC processors is that representing all of this in actual hardware brings quirks and limitations. For example, the x86 processors have specific "general purpose" registers (EAX, EBX, ECX, EDX), in various guises (AL/AH, etc), that carry built-in requirements: the multiply instruction, for example, places its result in (E)AX. You don't have a choice.
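
As a rough illustration of that constraint, the Python sketch below models the one-operand form of the x86 MUL instruction in 32-bit mode, where EAX is an implicit source and the 64-bit product is written to EDX:EAX. The function name mul32 and the dictionary used as a register file are assumptions made for this example.

```python
# Sketch of the one-operand x86 MUL (32-bit form): EDX:EAX = EAX * source.
# The register file is modelled as a plain dict; mul32 is a made-up name.
def mul32(regs, src):
    product = regs["EAX"] * regs[src]          # EAX is always one of the operands
    regs["EAX"] = product & 0xFFFFFFFF         # low 32 bits of the result
    regs["EDX"] = (product >> 32) & 0xFFFFFFFF # high 32 bits overwrite EDX

regs = {"EAX": 0x10000, "EBX": 0x10000, "ECX": 7, "EDX": 0x1234}
mul32(regs, "EBX")
print(hex(regs["EAX"]), hex(regs["EDX"]))
```

The caller names only one operand. EAX and EDX are both overwritten whether or not that suits the surrounding code, so anything useful held in them has to be saved first.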

The RISC principle, on the other hand, takes the alternative approach. It is important to understand that RISC means (Reduced Instruction) Set Computer, not Reduced (Instruction Set) Computer. What this means is that the RISC principle lies not in reducing the number of available instructions but in reducing their complexity. Suppose, for example, we had the instruction:

  LDXORAD  A, [B], C

which would load a word from the address pointed to by 'B', exclusive OR it with C, and then add it to the contents of A (the result being written to A). This fictional instruction is an example of CISC. RISC, on the other hand, would implement it something like:

  LDR  D, [B]    ; load word pointed to by B into D
  EOR  D, C, D   ; EOR D with C, result stored in D
  ADD  A, A, D   ; Add A and D, result ending up in A

We introduce an additional register, D, so B and C do not need to be corrupted. It is for this reason that RISC processors tend to have a fairly large number of registers, though this is not a rule.
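
To check that the two forms really are equivalent, here is a minimal Python sketch of a toy machine. The 32-bit word size, the dictionary-based registers and memory, and the function names are all assumptions made for illustration; LDXORAD itself is the fictional instruction from the example above.

```python
# Toy model of the fictional example. Registers and memory are dicts;
# a 32-bit word size is assumed.
MASK = 0xFFFFFFFF

def ldxorad(regs, mem, a, b, c):
    """CISC style: load, exclusive OR and add in a single instruction."""
    regs[a] = (regs[a] + (mem[regs[b]] ^ regs[c])) & MASK

def ldr(regs, mem, rd, rn):
    regs[rd] = mem[regs[rn]]                 # load word pointed to by rn

def eor(regs, rd, rn, rm):
    regs[rd] = regs[rn] ^ regs[rm]           # exclusive OR

def add(regs, rd, rn, rm):
    regs[rd] = (regs[rn] + regs[rm]) & MASK  # wrap at 32 bits

def run_cisc(start, mem):
    regs = dict(start)                       # work on a copy
    ldxorad(regs, mem, "A", "B", "C")
    return regs

def run_risc(start, mem):
    regs = dict(start)
    ldr(regs, mem, "D", "B")                 # LDR  D, [B]
    eor(regs, "D", "C", "D")                 # EOR  D, C, D
    add(regs, "A", "A", "D")                 # ADD  A, A, D
    return regs

start = {"A": 10, "B": 0x100, "C": 0x0F, "D": 0}
memory = {0x100: 0xF0}
cisc = run_cisc(start, memory)
risc = run_risc(start, memory)
print(cisc["A"], risc["A"])
```

Both routes leave the same value in A, and B and C are untouched in either case. The only visible difference is that the three-instruction version leaves the intermediate value behind in D.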

One thing RISC generally does offer, however, is register independence - a factor of the orthogonality which typifies RISC design. The ARM defines at minimum R15 as the program counter, and R14 as the link register (although, after saving the contents of R14, you can use this register as you wish). R0 to R13 can be used in any way you choose, although the Operating System usually defines R13 to be used as a stack pointer. There are other conventions applied to most code; for example, APCS (ARM Procedure Call Standard) applies firmer rules and assigns more functions to registers (such as Stack Limit, Frame Pointer, etc). However, none of these - with the exception of R15 and sometimes R14 - is a constraint applied by the processor. You do not need to worry about saving your accumulator in long instructions; you simply make good use of the available registers.

The x86 offers registers A, B, C, and D which are not just "starting from A" but have meanings - Accumulator, Base, Count, and Data - which give a clue as to their intended functions, as do the segment registers (in a 16 bit world) for Code, Data, Extra, and Stack; then there are SI and DI, which are indexing registers. The x86 is somewhat less flexible than the ARM in its register use. That, coupled with fewer registers...

The RISC principle is, therefore, to reduce the instruction set to building blocks, as we know that any complex instruction can be built from a sequence of simpler instructions. But not only that, we know also that in each case we can customise our code for the best possible performance, rather than following the quirks of a complicated instruction set. This is the other aspect of orthogonality, where pretty much every instruction can work with every register and in every addressing mode.

RISC suffers from two problems. If you place identically clocked RISC and CISC processors side by side, the CISC will usually win. This is because the CISC processor is able to perform "more" per instruction than the RISC, and even multi-cycle instructions are liable to execute more quickly than the equivalent series of RISC instructions (because RISC cannot avoid the requirement of fetching and decoding each instruction). In addition, code density is lower, so an application in RISC is likely to consume more memory than the equivalent in CISC; however, the ARM has a few tricks up its sleeve to reverse this assumption. The ARM lends itself to clean, concise code which is simpler to work with and understand.

These RISC problems are far outweighed by the benefits of RISC. RISC is often a lot friendlier to program. You have to think at a lower level about exactly what you want to achieve, but once you have done so, the processor aids rather than hinders. A loop within a loop within a loop is a simple thing to achieve. Doing likewise on a CISC may involve PUSH and POP of the loop counter register, or some other heinous method of shifting things around to please the processor. Then we get into silicon complexity. RISC processors tend to be quite simple designs - indeed the ARM is not microcoded, it is bare silicon all the way. This means it is cheaper to produce and can be more efficient. The ARM processors used in the majority of Android mobile phones can extract a healthy life from a tiny battery, and even when playing video they can remain cool within a hermetically sealed container (such as the Motorola DEFY which is, to a degree, waterproof!). Compare the x86 family, which is physically larger, consumes rather more power, and creates sufficient heat that even an Atom-powered netbook requires a fan running on tick-over to keep the machine cool.

Arguably RISC has won the war, for CISC processors use microcoded RISC instructions internally. However, in the world as we see it, most processors have a place - the 2GHz+ multi-core x86 kit for heavy number crunching, the long-data-word ST20 processors which lend themselves to digital video applications, the Atmels and the PICs for itty-bitty embedded designs, and the ARM carving itself a niche in the ultra-low-power market. ARM might not be much of a contender in a desktop machine; however, if you compare the number of PCs (x86) with the number of PlayStations (MIPS) and the number of Apple/Android mobiles (ARM), you'll see they're very different markets with very different requirements.
