Processor setup via co-processor 15
The setup is controlled by co-processor 15 registers, accessed with MRC and MCR in non-user mode.
These registers are particular to the processor specified.
Bits 0 - 7    Revision of processor
Bits 8 - 15   Should be '3', identifying processor as an ARM3
Bits 16 - 23  Manufacturer code (&56 = VLSI Technology Inc.)
Bits 24 - 31  Designer code (&41 = ARM Ltd)

Bit 0 - Turns the cache on (1) or off (0)
Bit 1 - Determines if user mode and non-user mode use the same address mapping: 1 if they do, 0 if not. Should be 1 for use with MEMC.
Bit 2 - 0 for normal operation, 1 for special monitor mode (the processor runs at memory speed, and address/data are always put on the external pins even if the data is fetched from cache, so that a logic analyser can trace the program properly).
Other bits are reserved.

Bit 0  - 1 if virtual addresses &0000000-&01FFFFF are cachable, 0 if not
Bit 1  - 1 if virtual addresses &0200000-&03FFFFF are cachable, 0 if not
...
Bit 31 - 1 if virtual addresses &3E00000-&3FFFFFF are cachable, 0 if not

Bit 0  - 1 if virtual addresses &0000000-&01FFFFF are updateable, 0 if not
Bit 1  - 1 if virtual addresses &0200000-&03FFFFF are updateable, 0 if not
...
Bit 31 - 1 if virtual addresses &3E00000-&3FFFFFF are updateable, 0 if not

Bit 0  - 1 if virtual addresses &0000000-&01FFFFF are disruptive, 0 if not
Bit 1  - 1 if virtual addresses &0200000-&03FFFFF are disruptive, 0 if not
...
Bit 31 - 1 if virtual addresses &3E00000-&3FFFFFF are disruptive, 0 if not
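The cachable, updateable and disruptive registers all use the same mapping: one bit per 2 MB block of the 26 bit address space. As a rough sketch of that arithmetic (written in Python purely for illustration; the function names are made up here and are not part of any ARM or RISC OS interface):

```python
REGION_SIZE = 0x200000  # 2 MB per bit, taken from the address ranges listed above


def region_bit(address):
    """Return the bit number (0-31) that covers a 26 bit virtual address."""
    if not 0 <= address <= 0x3FFFFFF:
        raise ValueError("address outside the 64 MB (26 bit) space")
    return address // REGION_SIZE


def is_flagged(register_value, address):
    """True if the region containing 'address' has its bit set."""
    return bool((register_value >> region_bit(address)) & 1)


def set_region(register_value, address, enabled):
    """Return a new 32 bit register value with one region bit changed."""
    mask = 1 << region_bit(address)
    if enabled:
        return register_value | mask
    return register_value & ~mask & 0xFFFFFFFF
```

So an address of &0200000 lands on bit 1, and anything from &3E00000 upwards lands on bit 31. The resulting value would still have to be written to the register with MCR from a privileged mode.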
Bits 0 - 7    Revision of processor (&1x)
Bits 8 - 15   Processor identity
Bits 16 - 23  Manufacturer code (&56 = VLSI Technology Inc.)
Bits 24 - 31  Designer code (&41 = ARM Ltd)

Bit 0 - On-chip MMU turned off (0) or on (1)
Bit 1 - Address alignment fault disabled (0) or enabled (1)
Bit 2 - Instruction/data cache turned off (0) or on (1)
Bit 3 - Write buffer turned off (0) or on (1)
Bit 4 - 26 bit program space if 0, 32 bit program space if 1
Bit 5 - 26 bit data space if 0, 32 bit data space if 1
Bit 6 - Early abort mode if 0, late abort mode if 1
Bit 7 - Little-endian operation if 0, big-endian if 1
Bit 8 - System bit - controls the ARM610 permission system

00  No Access - Domain fault generated if tried to access
01  Client    - Accesses are checked against permission bits in section/page descriptor
10  Reserved  - Currently behaves like no access mode
11  Manager   - Accesses are NOT checked, permission faults cannot be generated

Bits 0 - 3    Status
Bits 4 - 7    Domain
Bits 8 - 11   Set to zero
Bits 12 - 31  Whatever was the last value on the internal data bus
Bits 0 - 3    Revision of processor?
Bits 4 - 15   Processor identity - &710
Bits 16 - 23  Manufacturer code
Bits 24 - 31  Designer code (&41 = ARM Ltd)

Bit 0 - On-chip MMU turned off (0) or on (1)
Bit 1 - Address alignment fault disabled (0) or enabled (1)
Bit 2 - Instruction/data cache turned off (0) or on (1)
Bit 3 - Write buffer turned off (0) or on (1)
Bit 4 - 26 bit program space if 0, 32 bit program space if 1
Bit 5 - 26 bit data space if 0, 32 bit data space if 1
Bit 6 - Early abort mode if 0, late abort mode if 1
Bit 7 - Little-endian operation if 0, big-endian if 1
Bit 8 - System bit - controls the ARM710 permission system
Bit 9 - ROM bit - controls the ARM710 permission system
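If you find it easier to think in names than in bit numbers, the control register bits listed above can be put together something like this. It is only a sketch in Python: the flag names are invented for this example, and only the bit positions come from the list.

```python
# Bit positions from the ARM710 control register list above.
# The names are hypothetical labels chosen for this sketch.
CONTROL_BITS = {
    "mmu": 0,
    "alignment_fault": 1,
    "cache": 2,
    "write_buffer": 3,
    "prog32": 4,
    "data32": 5,
    "late_abort": 6,
    "big_endian": 7,
    "system": 8,
    "rom": 9,
}


def build_control(**flags):
    """Combine named flags into a control register value."""
    value = 0
    for name, enabled in flags.items():
        if name not in CONTROL_BITS:
            raise KeyError("unknown control flag: " + name)
        if enabled:
            value |= 1 << CONTROL_BITS[name]
    return value


def describe_control(value):
    """List the names of the flags set in 'value', lowest bit first."""
    return [name for name, bit in CONTROL_BITS.items() if (value >> bit) & 1]
```

For instance, build_control(mmu=True, cache=True, write_buffer=True) gives &0D, which is the value you would then hand to MCR.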
00  No Access - Domain fault generated if tried to access
01  Client    - Accesses are checked against permission bits in section/page descriptor
10  Reserved  - Currently behaves like no access mode
11  Manager   - Accesses are NOT checked, permission faults cannot be generated

Bits 0 - 3    Status
Bits 4 - 7    Domain
Bits 8 - 11   Set to zero
Bits 12 - 31  Whatever was the last value on the internal data bus
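Since the upper bits of this status value can hold leftovers from the internal data bus, it is worth masking them off before looking at the status and domain fields. A minimal Python sketch (the function name is made up for this example):

```python
def decode_fault_status(value):
    """Split a fault status value into its status and domain fields.

    Only bits 0-7 are meaningful; anything above is discarded.
    """
    return {
        "status": value & 0xF,         # bits 0 - 3
        "domain": (value >> 4) & 0xF,  # bits 4 - 7
    }
```

Both &35 and &FFFFF035 decode to status 5 in domain 3, because the junk above bit 7 is ignored.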
&41077100
.
Bits 0 - 3 Processor revision number
Bit 0  - On-chip MMU turned off (0) or on (1)
Bit 1  - Address alignment fault disabled (0) or enabled (1)
Bit 2  - Data cache turned off (0) or on (1)
Bit 3  - Write buffer turned off (0) or on (1)
Bit 7  - Little-endian operation if 0, big-endian if 1
Bit 8  - System bit - controls the MMU permission system
Bit 9  - ROM bit - controls the MMU permission system
Bit 12 - Instruction cache turned off (0) or on (1)

Bits 0 - 3   Status
Bits 4 - 7   Domain
Bit 8        Zero
Bits 9 - 31  Undefined on read, ignored on write
The OPC_2 and CRm co-processor fields select which cache operation should occur:

  Function           OPC_2   CRm     Data
  Flush I + D        %0000   %0111   -
  Flush I            %0000   %0101   -
  Flush D            %0000   %0110   -
  Flush D single     %0001   %0110   Virtual address
  Clean D entry      %0001   %1010   Virtual address
  Drain write buf.   %0100   %1010   -

The OPC_2 and CRm co-processor fields select which cache operation should occur:

  Function           OPC_2   CRm     Data
  Flush I + D        %0000   %0111   -
  Flush I            %0000   %0101   -
  Flush D            %0000   %0110   -
  Flush D single     %0001   %0110   Virtual address

The OPC_2 and CRm co-processor fields select the following...

  Function                                  OPC_2   CRm
  Enable odd word loading of Icache LFSR    %0001   %0001
  Enable even word loading of Icache LFSR   %0001   %0010
  Clear Icache LFSR                         %0001   %0100
  Move LFSR to R14,Abort                    %0001   %1000
  Enable clock switching                    %0010   %0001
  Disable clock switching                   %0010   %0010
  Disable nMCLK output                      %0010   %0100
  Wait for interrupt                        %0010   %1000
 10 DIM code% 32
 20 P% = code%
 30 [ OPT 3
 40   SWI "OS_EnterOS"
 50   MRC CP15, 0, R0, C0, C0
 60   TSTP PC, #&F0000000
 70   MOV R0, R0
 80   MOV PC, R14
 90 ]
100 PRINT ~USR(code%)

When run, this would print:
>RUN
00008FAC                    OPT 3
00008FAC EF000016           SWI "OS_EnterOS"
00008FB0 EE100F10           MRC CP15, 0, R0, C0, C0
00008FB4 E31FF20F           TSTP PC, #&F0000000
00008FB8 E1A00000           MOV R0, R0
00008FBC E1A0F00E           MOV PC, R14
41077100
>

Note that this code must run in a privileged mode.
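The value printed can be pulled apart into its fields. The following Python sketch assumes the ARM7-style ID layout (revision in bits 0 - 3, processor identity in bits 4 - 15, manufacturer in bits 16 - 23, designer in bits 24 - 31); other processors lay the register out differently, so treat the field boundaries as an assumption. The function name is invented for the example.

```python
def decode_id(value):
    """Split an ID register value using the assumed ARM7-style layout."""
    return {
        "revision": value & 0xF,               # bits 0 - 3
        "identity": (value >> 4) & 0xFFF,      # bits 4 - 15
        "manufacturer": (value >> 16) & 0xFF,  # bits 16 - 23
        "designer": (value >> 24) & 0xFF,      # bits 24 - 31
    }


fields = decode_id(0x41077100)
for name, field in fields.items():
    print(name, "= &%X" % field)
```

With &41077100 this gives a designer code of &41 (ARM Ltd), a manufacturer code of &07, a processor identity of &710 and a revision of 0.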
Here is a short exercise for you:
10 DIM code% 16
20 P% = code%
30 [ OPT 3
40   CDP CP1, 0, C0, C1, C2, 0
50   ADFS F0, F1, F2
60   MOV PC, R14
70 ]

>RUN
00008F78                    OPT 3
00008F78 EE010102           CDP CP1, 0, C0, C1, C2
00008F7C EE010102           ADFS F0, F1, F2
00008F80 E1A0F00E           MOV PC, R14
>

What do you notice? :-)
When the ARM executes a co-processor instruction, or an undefined instruction, it will offer it
to any co-processors which may be presently attached. If hardware is available to process the
given instruction, then it is expected to do so. If it is busy at the time the instruction is
offered, the ARM will wait for it.
If there is no co-processor capable of executing the instruction, the ARM will take its
undefined instruction trap, in which case the following will happen:
To return, restore the saved PC and PSR (exactly how depends on whether the system is 26 or 32 bit)
to the current PC and PSR, for example with MOVS PC, R14 on 26 bit systems. Execution will pick up
with the instruction following the one which caused the trap.
All of the co-processor instructions can be executed conditionally. Please note that the
conditionals relate to the status of the ARM processor, and not the status of any of the
co-processors. This is because the ARM always tries the instruction first, then offers it around,
and only then maybe takes the undefined instruction trap, so the conditions are ARM related.
To make this clearer:
 10 DIM code% 32
 20 P% = code%
 30 [ OPT 3
 40   FLTS F0, R0
 50   FLTS F1, R1
 60   FMLS F2, F0, F1
 70   FIX R0, F2
 80   MOVS PC, R14
 90 ]
100 INPUT "First number : "A%
110 INPUT "Second number: "B%
120 PRINT USR(code%)

This probably won't assemble without an enhanced BASIC assembler.
Anyway, you might think the ARM will hand over to the floating point co-processor to do the four
FP instructions, then hand back afterwards.
If you did, you would be incorrect!
What actually is executed is:
  MCR CP1, 0, R0, C0, C0
  MCR CP1, 0, R1, C1, C0
  CDP CP1, 9, C2, C0, C1
  MRC CP1, 0, R0, C0, C2
It is worth pointing out that objasm specifies co-processor registers using the CR notation
(ie, CR0 - CR15), which is first defined with the CN directive. It does not appear as if default
co-processor instructions are defined in Nick Roberts' ASM, though I've only looked in the
instructions at the "defined symbols" section...
Darren Salt's ExtBASICasm provides the register names C0 - C15 to refer to the co-processor
registers. So if any of these examples fail when you try to assemble them, please check what
format your assembler expects for these instructions.
MRC transfers a co-processor register to an ARM register. It takes the form:

  MRC <co-pro>, <op>, <ARM reg>, <co-pro reg>, <co-pro reg2>, <op2>

The co-processor is denoted in most assemblers by CPx. <co-pro reg> is written to <ARM reg>,
using operation <op>. This may, possibly, be further modified by <co-pro reg2> and <op2>. For an
idea of the sorts of times when this might be necessary, consider instructions of the form
LDR Ra, [Rb], #x.

<op2> may be omitted, as it is in the example, but the other parts of the MRC instruction must
be supplied.
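To see how those fields end up in the instruction word, here is a small Python sketch that packs them using the standard ARM register-transfer encoding (condition in bits 28 - 31, %1110 in bits 24 - 27, <op> in bits 21 - 23, the load bit in bit 20, then <co-pro reg>, <ARM reg>, the co-processor number, <op2>, a fixed 1 in bit 4, and <co-pro reg2>). The function and parameter names are made up for the example.

```python
def encode_register_transfer(to_arm, cp, op, rd, crn, crm, op2=0, cond=0xE):
    """Pack an MRC (to_arm=True) or MCR (to_arm=False) instruction word.

    cp  = co-processor number     op  = <op>           rd  = <ARM reg>
    crn = <co-pro reg>            crm = <co-pro reg2>  op2 = <op2>
    cond defaults to &E, meaning 'always'.
    """
    limits = {"cp": (cp, 15), "op": (op, 7), "rd": (rd, 15), "crn": (crn, 15),
              "crm": (crm, 15), "op2": (op2, 7), "cond": (cond, 15)}
    for name, (value, limit) in limits.items():
        if not 0 <= value <= limit:
            raise ValueError("%s out of range" % name)
    return ((cond << 28) | (0b1110 << 24) | (op << 21) | (int(bool(to_arm)) << 20)
            | (crn << 16) | (rd << 12) | (cp << 8) | (op2 << 5) | (1 << 4) | crm)


# MRC CP15, 0, R0, C0, C0
print("%08X" % encode_register_transfer(True, cp=15, op=0, rd=0, crn=0, crm=0))
```

This prints EE100F10, matching the word the assembler produced for MRC CP15, 0, R0, C0, C0 in the ID register listing.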
MCR transfers an ARM register to a co-processor register. It takes the form:

  MCR <co-pro>, <op>, <ARM reg>, <co-pro reg>, <co-pro reg2>, <op2>

The co-processor is free to interpret the fields as it desires, but the standard interpretation
is that the contents of the ARM register are written to the co-processor register using the
operation code given, which may be further modified by the second co-processor register and/or
the second operation code.
LDC loads data from memory into the co-processor register, while STC saves data from a
co-processor register to memory.

  LDC  <co-pro>, <co-pro reg>, <address>
  LDCL <co-pro>, <co-pro reg>, <address>
  STC  <co-pro>, <co-pro reg>, <address>
  STCL <co-pro>, <co-pro reg>, <address>

If the 'L' flag is specified, a long transfer is performed. Otherwise a short transfer is
performed. The 'L' flag follows the condition code, like LDCEQL.

The address can take the forms:

  [Rx]
  [Rx, #x] !
  [Rx], #x

These are like those used for the LDR instruction. However the offsets are only eight bits wide
and specify word offsets (the ARM types are 12 bit and byte offset).

For example:

  STC CP0, CR1, [R2, #16]!
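Because the offset field is eight bits of words rather than twelve bits of bytes, the byte offset you write in the source has to be a multiple of four and no further than 1020 bytes either way. A Python sketch of that check (the function name is invented; the separate up/down bit is part of the standard ARM encoding rather than something stated above):

```python
def encode_cp_offset(byte_offset):
    """Return (up_bit, imm8) for an LDC/STC byte offset.

    up_bit is 1 to add the offset and 0 to subtract it;
    imm8 is the magnitude of the offset counted in words.
    """
    if byte_offset % 4 != 0:
        raise ValueError("offset must be a multiple of 4 bytes")
    words = abs(byte_offset) // 4
    if words > 0xFF:
        raise ValueError("offset must lie within +/-1020 bytes")
    return (1 if byte_offset >= 0 else 0, words)
```

An offset of #16 is therefore stored as 4 in the instruction, and #-1020 is the furthest you can reach downwards.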
CDP instructs the co-processor to do some processing. It takes the form:

  CDP <co-pro>, <op>, <co-pro reg1>, <co-pro reg2>, <co-pro reg3>, <op2>

This tells the co-processor to do something. The ARM will not wait for it to finish, nor is any
sort of status sent back to the ARM. It is possible for a co-processor to maintain a queue of
instructions, allowing it and the ARM to process in parallel.
For example, instead of:

  FLTE F0, R0
  FLTE F1, R1
  MUFE F2, F0, F1
  FIX R0, F2
  MOV R1, #0

you could save a small amount of time with:

  FLTE F0, R0
  FLTE F1, R1
  MUFE F2, F0, F1
  MOV R1, #0
  FIX R0, F2

as the FPU could be finishing the MUF while you MOV. The hardware FPU (as in the 7500FE) runs
asynchronously; you can switch to synchronous operation by setting a bit in the FPSR. The software
emulation always runs synchronously, and as it uses the ARM in order to emulate the FP
instructions, there is no possible advantage to be gained.
There are no rules for the register types and/or the operation codes. These depend upon the co-processor.
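The ARM itself only sees fixed-width fields in a CDP instruction; what they mean is the co-processor's business. A Python sketch that unpacks a CDP word using the standard ARM data-operation encoding (the function name and dictionary keys are made up for the example):

```python
def decode_cdp(word):
    """Unpack the fields of a CDP instruction word."""
    if (word >> 24) & 0xF != 0b1110 or (word >> 4) & 1 != 0:
        raise ValueError("not a CDP instruction")
    return {
        "cond": (word >> 28) & 0xF,
        "op": (word >> 20) & 0xF,    # operation code
        "reg2": (word >> 16) & 0xF,  # second co-processor register (CRn)
        "reg1": (word >> 12) & 0xF,  # first co-processor register (CRd)
        "cp": (word >> 8) & 0xF,     # co-processor number
        "op2": (word >> 5) & 0x7,    # second operation code
        "reg3": word & 0xF,          # third co-processor register (CRm)
    }


print(decode_cdp(0xEE010102))
```

Fed the word EE010102, this reports co-processor 1, operation 0 and registers 0, 1 and 2 in that order. Whether those registers are called C0 - C2 or F0 - F2 is purely a matter of how the assembler, and the co-processor, choose to name them.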