PowerPC assembly language beginners guide.
"The Sun's power is dwarfed when compared to that of a single, human mind, but..."
What is machine code and what is assembly language - does it look like Java?
Machine code is the code that your computer actually runs. Processors don't understand English, Swedish, Basic, C or anything else recognizable to humans (if C is?). Only ones and noughts (binary, meaning "two possible states") - machine code is ones and noughts. Luckily, nobody has to program in machine code any more. We use assemblers and compilers, but we do need to know what machine code is.
When your computer is running, it continuously gets instructions from memory. These instructions tell the processor what to do now. On a PowerPC based machine, the instructions are made up of thirty two binary digits or "bits". A bit can be in one of two states; either a 1 or a 0. 32 bits together are called a word.
The pattern of the word will tell the processor how to execute the instruction. A certain word may add two numbers together, whilst another word could make the processor store a number back in memory, and so on. A program consists of lots of instructions strung together, and of course, some data for the instructions to operate on.
Programs can be written in many different languages; some common ones are BASIC (Beginner's All-purpose Symbolic Instruction Code), C, and Pascal. These are termed high level languages. A high level language needs a compiler or interpreter to translate the terms used by the programmer into machine code. This is fine, except that the translator only knows a certain way of doing things, and sometimes has to string together long sets of ones and noughts to get the desired end result. What this means in practice is that a program written in a compiled language will only be as good as the compiler.
Assembly language on the other hand is a low level language. Each assembly language mnemonic translates directly into a machine code instruction. The difference between a high level language such as C and assembly language is that an intelligent being is doing the compiling. Because a human is writing the code and not a machine, the code can be written in the best way to achieve maximum speed, and use the minimum of memory. For this reason assembly language in the hands of a competent programmer can be a lot faster than the same program written in a compiled language.
Just a little note here, and this has nothing to do with this guide, but... The acronym TWAIN stands for Technology Without An Interesting Name - I just thought you might not know that and find it "Interesting". Oh well, onwards...
What is a mnemonic?
The dictionary defines it as "something to help the memory" - in this case it's a word that represents a machine code instruction.
The PowerPC family of processors understands about 60 basic instructions, which means the PowerPC assembly language programmer has 60 different mnemonics to remember. Compared to a high level language, which may have as many as 800 different instructions (BASIC for example), this is a small number to memorize. In practice not all these instructions will be used, and you may find yourself using as few as 10 instructions regularly, so it's not difficult to learn.
What does a mnemonic look like?
The PowerPC has an instruction to add two numbers together and store the result away. In ones and noughts, add r3,r4,r5 looks like 01111100011001000010101000010100, which is a complete and utter mouthful, so we use the mnemonic "add". We get an assembler to translate the "add" to the required machine code. It's as simple as that.
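To make the idea of a mnemonic standing for a bit pattern concrete, here is a minimal sketch in Python (used only for illustration; it has nothing to do with Fantasm). It packs add rD,rA,rB into a 32 bit word, assuming the usual PowerPC field layout for add: primary opcode 31, extended opcode 266, and the OE and Rc bits left clear. The function name encode_add is made up for this example.

```python
def encode_add(rd, ra, rb):
    """Pack 'add rd,ra,rb' into a 32-bit PowerPC instruction word.

    Assumed layout (most significant bit first): 6-bit primary opcode (31),
    5-bit rD, 5-bit rA, 5-bit rB, 1-bit OE (0), 9-bit extended opcode (266),
    1-bit Rc (0).
    """
    for reg in (rd, ra, rb):
        if not 0 <= reg <= 31:
            raise ValueError("register numbers run from 0 to 31")
    return (31 << 26) | (rd << 21) | (ra << 16) | (rb << 11) | (266 << 1)


word = encode_add(3, 4, 5)      # add r3,r4,r5
print(format(word, "032b"))     # the 32 ones and noughts
print(format(word, "08X"))      # the same word written in hex
```

A real assembler does this kind of field packing for every mnemonic, which is why we never have to type the ones and noughts ourselves.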
Some other useful mnemonics include:
LI - Load immediate - loads a number into a processor register.
STW - Store a word from a processor register into memory
LWZ - Load a word from memory into a processor register.
Thus, assembly language uses mnemonics to represent machine instructions, or code.
Basic processor operation
The goal of the processor is to take data (the input), perform some form of processing on the data, and then store the data in a useful way (output). To be able to do this, the processor needs temporary storage within itself called registers. The PowerPC family all have two types of registers - integer registers identified as r0 to r31 and floating point registers named f0 to f31. We won't be looking at the floating point registers for a while yet, but they are one of the most powerful aspects of the PowerPC family.
The processor can put data into these registers and then use the registers as inputs to calculations or other operations. The result of the operation can then be stored in another register, or one of the input registers. For example, the instruction add r3,r4,r5 adds register 5 to register 4 and stores the result in register 3, whereas add r5,r4,r5 adds the contents of register 5 to register 4 and stores the result back in register 5.
To show you how easy it is, examine the following PowerPC assembly language "snippet".
do_add: li      r3,10           *load register 3 with the number 10
        li      r4,20           *load register 4 with the number 20
        add     r5,r4,r3        *add r3 to r4 and store the result in r5
        stw     r5,sum(rtoc)    *store the contents of r5 (i.e. 30) into the memory location
                                *called "sum"
        blr                     *end of this piece of code
sum:    ds.w    1               *define the storage we need for the result.
The first thing to notice is the layout. All assembler languages tend to be like this. The doing bits of the instructions (store, add, etc) - the actual instruction, is in the "second column". Technically it is preceded by white space - either tabs or space characters. This way the assembler knows they are instructions.
So what is the word "do_add:" up against the left hand edge of the page? A Label. Labels mark a position in the code and are also used when we need to define things, such as data. Notice the label has a colon : after it. When you are referencing these labels in the code, you don't actually use the colon - for example in the line "stw r5,sum(rtoc)". In this case, the label "do_add:" is used as an identifier, or name, for this piece of code. Labels are used instead of real memory addresses because we don't care what address is assigned to the label, that's the assembler's worry. If the assembler knows the address of the label, then we can use the labels in our program and let Fantasm worry about tying up all the addresses. If we want to run a routine called fred, then we can just branch to fred. The assembler will work out what fred's address is and do the right thing.
The colon after the label is not essential when you are defining a label in the first column, but it helps in two ways. Firstly it clearly identifies this thing as a label to the assembler, the editor AND us humans. Secondly, it can be helpful when you are editing the program, and need to find a specific label. If you search for just "sum" you will find all the places that access sum - but if you search for "sum:" you will go to the place where the label "sum" is defined. (Note that Anvil does not need colons at the end of labels to be able to hyperjump to them.)
Labels are always placed right up against the left hand edge of the window; there must be no white space in front of them. The label is followed by some "white space", normally a tab character (the tab key on your keyboard), although a normal space or run of spaces is fine.
Not all lines need a label, just those lines of your program that you want to branch to, or lines that define data that you want to reference by name.
Following the optional label comes the instruction. This tells the processor what it's going to be doing. The instruction is followed by some more white space before the "operands". These are the things the instruction manipulates. PowerPC instructions sometimes have three operands or more, and sometimes none at all - it depends on the instruction. If there are operands, then the first is usually the destination for the result of the operation (for a store instruction such as stw it is the source of the data instead). The other two provide the data to be worked on - for example add r3,r4,r5; in this case, r5 is added to r4 and the result is placed in r3.
Any ideas about what this program does? It adds 20 and 10 and puts the result in sum. Here's the breakdown.
Line 1 - do_add: li r3,10
The 'do_add' is a 'label'. Labels are used to reference lines of a program so we can change the program flow, by 'branching' to labels. Think of it as a name for this part of the program.
li means load immediate. In this case it means move the word '10' (remember that a word is 32 bits) into the processor register 3.
Line 2 - li r4,20
This line has no label, and therefore can't be branched to.
The instruction moves another number, in this case 20, into the processor register 4.
Line 3 - add r5,r4,r3
This line instructs the processor to add the two numbers together and place the result in register number 5.
Line 4 - stw r5,sum(rtoc)
This instruction moves the result of the addition into memory. Where in memory? At "sum".
Line 5 - blr
This instruction returns from this part of the program. If you like, it means "that's the end of this piece of code, so go back to whatever piece of code called it". "blr" is a mnemonic for branch to link register. Generally, whenever we jump to a new piece of code, the address of the calling code is stored in the link register so the code can return when finished.
Line 6 - sum: ds.w 1
This line is used to define the location of 'sum' in memory. It's not an instruction to the processor, as the processor has finished with this code in line 5. This line is used by Fantasm to set up 'sum' in memory ready for the program when it is run. The 'sum' tells Fantasm that this is the name we want to use. Again it is a label.
The ds.w means define space as words. Fantasm reserves the number of words needed for this label in memory. This is called a 'directive', meaning that it is a directive to the assembler (Fantasm), and not a mnemonic to be translated into machine code.
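The line by line breakdown can also be followed as a small simulation. The sketch below is in Python rather than assembly, purely so it can be run anywhere; the dictionaries standing in for the registers and for "sum", and the helper functions li, add and stw, are inventions for this illustration and only model the three instructions as this guide describes them.

```python
# Model the register file and the one memory word the snippet uses.
registers = {f"r{n}": 0 for n in range(32)}   # r0..r31, all starting at 0
memory = {"sum": 0}                            # stands in for "sum: ds.w 1"

MASK32 = 0xFFFFFFFF                            # registers are 32 bits wide


def li(dest, value):
    """li dest,value : load an immediate number into a register."""
    registers[dest] = value & MASK32


def add(dest, a, b):
    """add dest,a,b : dest = a + b, wrapped to 32 bits."""
    registers[dest] = (registers[a] + registers[b]) & MASK32


def stw(src, label):
    """stw src,label(rtoc) : store a register's word into memory."""
    memory[label] = registers[src]


li("r3", 10)             # Line 1
li("r4", 20)             # Line 2
add("r5", "r4", "r3")    # Line 3
stw("r5", "sum")         # Line 4
print(memory["sum"])
```

Nothing here is how the processor really works inside; it just shows the data moving from immediates to registers to memory in the same order as the assembly.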
If you found that complicated, don't worry as we'll come back to directives and program structure later.
If you had trouble understanding that, read through it again. There's nothing devious or particularly clever about assembly language programming - just common sense most of the time. Staying awake is important too.
If after rereading the above you still have difficulty understanding it, don't worry about it; just carry on with this text, as it was throwing you in at the deep end. However, if you followed it just fine, example project #1 will build the code and throw you into the low level debugger (if you have one installed).
This is what the above code, as a complete program, will look like in Anvil (your colors may be different): [screenshot of the program in the Anvil editor]
As you may already know, all computers have 'memory'. Memory is a very wide term, that can be broken down as follows.
1. Long term memory. This type of memory is for long term storage of information.
Theoretically speaking it means any form of memory that doesn't forget when you switch the power off. This boils down to disks, both floppy and hard. Once the data is written to disk, it stays there until it's either overwritten, or your four year old decides to open the case on one of your floppies and play frisbee with what's inside.
There are big differences between floppy disks and hard disks. Floppy disks are removable, cheap and slow. If you slide back the metal cover of a floppy you'll see a brown plastic disk inside the jacket. This is the actual floppy disk. Data is recorded onto the disk in the same way that music is recorded onto a cassette, one bit at a time using a magnetic recording head.
Because the disk is arranged in tracks (80 on a HD disk), and your Mac can select any one of these tracks at random, you don't have to fast forward over the whole disk to get at the data you want.
The big disadvantage with floppy disks is that they are very slow compared to hard disks.
Hard disks are not (generally) removable, not cheap and not so slow.
A hard disk works in a similar way to floppy disks. They have tracks, and head(s), but because the disk spins a lot faster (5400 R.P.M. and upwards) and the heads are a lot closer to the disk, you can store a lot more information on them.
Hard disks have a reputation for being fragile. This is not just paranoia about them, but a fact. In a hard disk, to get the heads as close as possible to the recording surface, the heads fly on a cushion of air created by the disk spinning. If the hard disk drive is knocked whilst it is in use, it's quite possible for the heads to crash into the disk surface! This is a good reason why you should use the shutdown menu item in the Finder, so the drive can move the heads away from the disk before the power goes off and the disk slows down. Most hard drives automatically move the heads away from the disk when the power goes off (this is known as auto parking), but just in case...
All disks, whether hard or floppy, segment the tracks up into sectors. This makes the drives more usable. To see why, imagine a high density floppy disk capable of holding 1.4 megabytes (1 megabyte is 1 million bytes) that has 80 tracks and no sectors. Each track can hold 0.0175 megabytes, or 17.5 kilobytes.
This means that the smallest amount of information that could be written to a floppy disk is 17.5K. If a 1k file was saved to disk, it would still take up 17.5K on the floppy! If however, each track is further split up into sectors, and the drive knew where each of these sectors were, then small files would take up less space. For example if the track was split up into 20 sectors, then the smallest addressable unit on the floppy would be 17.5 K divided by 20, which is 875 bytes. Now a 1k file only takes up 2 sectors on disk amounting to 1.75K.
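The track and sector sums above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the text: the figure of 20 sectors per track is the illustrative one used above, not a real floppy format, and the name space_used is made up for the example.

```python
import math

DISK_BYTES = 1_400_000      # 1.4 megabytes, with 1 megabyte = 1 million bytes
TRACKS = 80
SECTORS_PER_TRACK = 20      # the illustrative figure used in the text

track_bytes = DISK_BYTES // TRACKS                 # bytes in one track
sector_bytes = track_bytes // SECTORS_PER_TRACK    # bytes in one sector


def space_used(file_bytes, unit_bytes):
    """Round a file size up to a whole number of allocation units."""
    units = math.ceil(file_bytes / unit_bytes)
    return units * unit_bytes


print(space_used(1000, track_bytes))    # whole-track allocation
print(space_used(1000, sector_bytes))   # sector allocation
```

The smaller the allocation unit, the less space is wasted rounding each file up to a whole number of units.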
Getting data off a disk, whether it's a floppy, a hard disk, a CD or any other kind of storage medium is a slow process - far too slow for the processor. The data in long term memory, such as a program, is transferred to short term memory before it can be run, or accessed by the processor.
The other type of long term memory, associated with Macintoshes, is PRAM, or parameter random access memory. This memory doesn't forget its data when a Mac is switched off because it has a battery that keeps the memory running. In here the Mac stores vital information about its configuration, so that the next time it's switched on, it can read the set up information from PRAM and configure itself exactly as it was. Examples of the data stored in PRAM are the sound volume, how many flashes a menu bar makes, keyboard repeat speed etc.
(Other computers may have this type of memory. PC-compatibles use a type of memory that is usually referred to as CMOS (Complementary Metal Oxide Semiconductor) memory. Using this name is a bit naughty, since it should be called "battery backed CMOS memory" - CMOS memory itself loses its contents without power. In PCs this stores things like hard drive type, floppy types, time and date, which drive is called drive C: (yeah, real hi-tech stuff) and other general set up information.)
2. Short term memory - this is the memory the processor has direct access to. It is split into random access memory (RAM) and read only memory (ROM). RAM can be written to and read from, whereas ROM can only be read from. RAM is generally faster than ROM.
The more RAM you have, the more data can be stored inside the computer at any one time. If the data is a program, then the more RAM you have, the bigger the program you can run.
As was noted previously, the PowerPC range of processors use instructions made up from 32 bits. The basic unit of memory is a byte, which is 8 bits, so the PowerPC needs to read four bytes for every basic instruction.
RAM and ROM are fast enough for the processor to run programs from. We'll come back to memory later when we talk about caches, but for now that's enough.
As you know, computers these days are referred to as being digital, but a long time ago there were analog computers that did their calculations by whizzing around servos linked to potentiometers which would give a result as a voltage. Anyway, "digital" means numerical, so it stands to reason that computers run by using numbers? Quite true, as a number is just a number, but the clever bit is that numbers can also be used to represent codes for such things as what operation to perform, or a letter, or a sound volume etc.
So if a computer needs numbers to strut its stuff, what form are these numbers held in, do we just say "65", and it puts letter A up on the screen in glorious Technicolor? Not quite. The numbers the computer needs to carry out processing are held in the binary format, just the same as the instructions the computer executes. Binary is a number system in which a number can be one of two values, either zero or one, whereas denary means a number can have one of ten values, 0 through to 9.
How can we represent numbers larger than 1 in binary?
Well, let's take a look at our human denary system first.
If we start counting from zero upwards, we eventually get to 11, which means one ten plus one unit; then when we get to 101, this means one hundred, no tens and one unit. If we term units, tens, hundreds and thousands as "multipliers", then any number can be expressed in terms of its multipliers; 1234 can be expressed as:
1 times 10^3, plus
2 times 10^2, plus
3 times 10^1, plus
4 times 10^0
(note that in programming, * generally means multiply and ^ means to the power of, so 10^3=10*10*10, and any number to the power of 0 is 1).
As the computer uses binary and not denary, numbers can be expressed as powers of 2 instead of ten. Where in denary the multipliers go 1,10,100,1000,10000 etc, in binary the multipliers go 1,2,4,8,16,32,64,128,256 etc.
It is helpful to be able to remember the multipliers up to a certain level, here's a list you should try to learn:
1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536
To express the number 13 in binary, break it down into powers of two.
13 is one 8, one 4, no 2 and one 1 -
8 4 2 1
1 1 0 1
Thus, 13 is 1101 in binary.
We can take a very short break here, just to clarify some terms that are about to appear. Binary digits, 1's and 0's, are termed "bits", which is just a short form of "binary digit". It's not very hip to keep shouting "Binary digit 3" when "Bit 3" sounds far cooler and saves time. When referring to bit locations within a group of bits (for example, within a word), bits are numbered from zero upwards, with bit zero being the rightmost bit. If you confuse left with right as I sometimes do, now is a good time to finally get it sorted out. Get two of those little "Post it(tm)" notes, and write "Left" on one and stick it on the left side of your monitor, and do the same for the right side, only this time the note should have "Right" written on it. Ok, so if we take a byte (which is 8 bits) then the bits are numbered as follows:
76543210 - bit 7 on the left and bit 0 on the right.
Confusingly, IBM's books for PowerPC sometimes refer to bits the other way round, but hey, that's IBM.
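As a small illustration of the two numbering schemes, here is a Python sketch. It assumes that the "other way round" convention numbers the leftmost (most significant) bit as bit 0; the function names bit_value and ibm_bit_number are made up for this example.

```python
def bit_value(number, bit):
    """Return bit 'bit' of 'number', with bit 0 as the rightmost bit."""
    return (number >> bit) & 1


def ibm_bit_number(bit, width=32):
    """Convert a rightmost-is-zero bit number to leftmost-is-zero numbering."""
    return width - 1 - bit


value = 0b00001101                                        # decimal 13
print([bit_value(value, n) for n in range(7, -1, -1)])    # bit 7 down to bit 0
print(ibm_bit_number(0))                                  # rightmost bit of a 32 bit word
```

Whichever book you are reading, it pays to check which end it calls bit 0 before trusting a bit number.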
Ok, back to the lesson....
Now try converting decimal 255 to binary.
Start with the multiplier above 255, 256. This is too great, so try 128.
255 divided by 128=1 remainder 127. So we have a 128.
127 divided by 64 =1 remainder 63.
63 divided by 32 = 1 remainder 31
31 divided by 16 = 1 remainder 15
15 divided by 8 = 1 remainder 7
7 divided by 4 = 1 remainder 3
3 divided by 2 = 1 remainder 1
1 divided by 1 = 1 remainder 0
Therefore 255 in binary is 11111111.
The last example is 471. 512 is too big, so:
471 divided by 256 = 1 remainder 215
215 divided by 128 = 1 remainder 87
87 divided by 64 = 1 remainder 23
23 divided by 32 = 0 (because 32 doesn't divide into 23 even once. i.e. it won't "go")
23 divided by 16 = 1 remainder 7
7 divided by 8 = 0 (because it won't go)
7 divided by 4 = 1 remainder 3
3 divided by 2 = 1 remainder 1
1 divided by 1 = 1 remainder 0
Thus 471 in binary is:
111010111 - that's nine bits and it means that 1*1+1*2+1*4+1*16+1*64+1*128+1*256=471 or more simply 1+2+4+16+64+128+256=471.
I admit, it isn't easy, but here are some more examples expanded to 8 bits.
       128  64  32  16   8   4   2   1
 25 =    0   0   0   1   1   0   0   1
129 =    1   0   0   0   0   0   0   1
 56 =    0   0   1   1   1   0   0   0
 90 =    0   1   0   1   1   0   1   0
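The divide-by-the-multiplier method used in these examples can be written out as a short Python sketch. The function name to_binary and its bits parameter are made up for the illustration; Python also has a built in way of doing this, but the loop below follows the same steps as the worked examples.

```python
def to_binary(number, bits=8):
    """Convert a non-negative integer to a string of ones and noughts."""
    if not 0 <= number < 2 ** bits:
        raise ValueError("number does not fit in the requested width")
    digits = []
    for position in range(bits - 1, -1, -1):
        multiplier = 2 ** position                  # 128, 64, ... 2, 1 for 8 bits
        quotient, number = divmod(number, multiplier)
        digits.append(str(quotient))                # 1 if it "goes", else 0
    return "".join(digits)


for n in (25, 129, 56, 90, 255):
    print(n, "=", to_binary(n))
print(471, "=", to_binary(471, bits=9))
```

Each pass of the loop is one "divided by" line from the examples: the quotient becomes the next bit and the remainder is carried on to the next smaller multiplier.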
In Fantasm, to show that a number is binary, we precede it with a percent sign - %10101010
Eight bits are termed a byte. One byte can hold 256 different values, so every letter, digit and punctuation mark on your keyboard can be defined in one byte, or alternatively, 256 different codes can be defined, or 256 different colors for a pixel.
As we know, the PowerPC demands its instructions in 32 bit chunks, these are called words - how many possible values are there with 32 bits?
2^32 = 4294967296 (easy with a calculator).
With 32 bits, 4294967296 different values can be defined.
With 32 bits making up a basic PowerPC instruction, there are 4294967296 possible bit patterns the processor could be given - in practice there are about 60 or so basic instructions. The other bits within the instruction are used to hold the data the instruction works on.
For data, 32 bits can hold big numbers, but sometimes that is too wasteful; for example to store the character "A" we only need 8 bits, or a byte. And sometimes you may want an intermediate size of 16 bits (65536 possible values), so we also have a data size called a "half", which is half a word. More on this later.
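If you don't have a calculator handy, a few lines of Python will do the sums for the three data sizes mentioned so far. This is just the arithmetic from the text; the SIZES table is made up for the example.

```python
SIZES = {"byte": 8, "half": 16, "word": 32}   # widths in bits

for name, bits in SIZES.items():
    count = 2 ** bits                 # number of distinct bit patterns
    largest = count - 1               # biggest unsigned value that fits
    print(f"{name:5} {bits:2} bits  {count:>10} values  0 to {largest}")
```

Note that the largest value is always one less than the number of patterns, because zero takes up one of them.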
Hex is just a number system, the same as decimal and binary. In decimal we use base 10, in binary we use base 2, and in hex we use base 16. How can we use base 16 if we only have ten digits (0-9)? Good question. The digit set is extended to 15 by using the letters A-F.
If you can understand binary, hex isn't a big deal. Each hex digit represents 4 bits or a nibble.
An example is probably easiest to understand. Here's an easy one:
255 in binary is 11111111 (8 ones).
To get 255 in hex first convert to binary, then split it up into nibbles (i.e. half bytes):
1111 1111
Each hex digit is a nibble, and 1111 is 15 in decimal, or F in hexadecimal.
Therefore 255 is 11111111 in binary or FF in hex. To show this is a hex number we precede it with a dollar sign - $FF or the C language standard of "0x" - 0xFF. The choice of preceding hex number with either a "$" or "0x" is up to you.
To convert from hex to decimal, convert the hex to binary, then decimal:
convert $FACE to decimal -
   F    A    C    E
1111 1010 1100 1110

32768 16384 8192 4096 2048 1024 512 256 128 64 32 16 8 4 2 1
    1     1    1    1    1    0   1   0   1  1  0  0 1 1 1 0
then add together all the ones:
32768+16384+8192+4096+2048+512+128+64+8+4+2 = 64206
Now try $9276
You should get 37494. Of course the quickest way is to use the computer to do it for you and "drop" into Macsbug! (a debugger, covered later).
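To check your answers without a debugger, here is a minimal Python sketch of hex conversion in both directions, working one nibble at a time. The function names hex_to_decimal and decimal_to_hex are made up for this example, and the "$" and "0x" prefixes are handled the way this guide writes them.

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_decimal(text):
    """Convert a hex string such as '$FACE' or '0xFACE' to an integer."""
    text = text.upper()
    if text.startswith("$"):
        text = text[1:]
    elif text.startswith("0X"):
        text = text[2:]
    total = 0
    for digit in text:
        total = total * 16 + HEX_DIGITS.index(digit)   # one nibble per digit
    return total


def decimal_to_hex(number):
    """Convert a non-negative integer to hex digits, one nibble at a time."""
    digits = ""
    while True:
        number, nibble = divmod(number, 16)
        digits = HEX_DIGITS[nibble] + digits
        if number == 0:
            return "$" + digits


print(hex_to_decimal("$FACE"))
print(hex_to_decimal("$9276"))
print(decimal_to_hex(255))
```

Multiplying the running total by 16 before adding each digit is the same as shifting it left by one nibble, which is why hex and binary convert so easily.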
The PowerPC family can manipulate numbers as bits, bytes, halfwords, words and doubles (64 bit via the floating point unit, or FPU).
Now that we know a little of how the computer uses numbers, how are they stored in memory? Because the numbers are represented in binary, which is a string of ones and noughts, it's easy to go from theory to practice. In memory, a 1 is represented by a voltage, whilst a nought is represented by no voltage. The memory is laid out in bytes (8 bits), so for the processor to get one instruction it needs to read four bytes.
Four bytes make up one word, so it stands to reason that words must live in quad aligned locations in memory (addresses that divide exactly by four), if the memory starts at location 0 (which it does). Thus if our first instruction was at address 0x80000, then our next instruction will be at 0x80004, the next at 0x80008 and the next at 0x8000C, etc.
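The run of instruction addresses can be generated with a short Python sketch. The function name instruction_addresses is made up for this example; the only facts it relies on are the ones in the text, that each instruction is four bytes and must start on a four byte boundary.

```python
WORD_BYTES = 4                      # one 32-bit instruction is four bytes


def instruction_addresses(start, count):
    """Addresses of 'count' consecutive instructions beginning at 'start'."""
    if start % WORD_BYTES != 0:
        raise ValueError("instructions must start on a four-byte boundary")
    return [start + n * WORD_BYTES for n in range(count)]


for address in instruction_addresses(0x80000, 4):
    print(hex(address))
```

Notice that a quad aligned address always ends in 0, 4, 8 or C when written in hex.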
If you really want to know more about the intricacies of the floating point number formats, go check with Random Rob from the Programmers Dream.
A quick summary:
The number system used in computers is binary. Binary means one of two values: a digit is either a one or a nought. When counting in binary, powers of two are used as the multipliers. A bit is one binary digit, either a 1 or 0. A byte is 8 binary digits that can hold 256 possible values. A half is 16 bits that can hold 65536 possible values, and a word is 32 bits that can hold 4294967296 possible values.
Binary numbers are preceded by a percent sign - % and may contain the digits 1 and 0 only.
Decimal numbers are written as per normal and may contain the digits 0,1,2,3,4,5,6,7,8,9.
Hexadecimal numbers are preceded with either a $ or a 0x (your choice) and may contain the characters 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F.
Addressing is the term given to this question. "How does the processor know where the data is and how does the data get from the memory to the processor and back again?"
All computers have busses; highways for information. Typically there is a data bus and an address bus. The address bus tells the memory where the processor wants to read or write to. To get a word from memory location 1000, the processor puts 1000 on the address bus, then tells the memory to "read". The memory system will put the data on the data bus, and the processor can read it in. From now on, instead of the word "location", we'll use the word "address" to mean a location that the processor can access.
Inside the processor the address can either be stored in a "register" or form part of an instruction - for example lwz r3,fred(rtoc) - the address we are reading from is "fred". The PowerPC has 32 integer registers, each being 32 bits wide. Programs can modify these registers, so that the processor can keep temporary pointers to memory locations. A register is like a small piece of memory that is internal to the processor, and hence very fast. A PowerPC processor also has 32 floating point registers but we won't concern ourselves with these just yet.
An important point to realize is that it is not only memory and the processor that can access the address and data busses. Most peripherals, such as disk controllers, keyboards, screen driver hardware can also access these busses.
These peripherals are normally given a memory address that's well out of the way of the main program and data memory, so if the computer's main memory ends at address $1000000, the peripheral hardware addresses may start at $80000000. If the video driver hardware lives at $80000000, the processor can send and read data by reading and writing to this address.
We'll come back to peripherals later.
There is a special register called the program counter. This one keeps track of where in a program the processor is. Normally it increments by the size of each instruction, as each instruction is read in - that is it increments by four bytes (32 bits) after reading the current instruction so it points to the next instruction. Thus, if the program starts at address 1000, after the processor has read the first instruction, the program counter will be pointing to location 1004.
Normally, the program counter (PC) is incremented by four to point at the next instruction. However programs need a way of making decisions, and going off to do something else if need be. This is called branching or jumping. As an example consider a program that accepts names from the keyboard until ten names have been entered.
The program could go something like this:
step 1: get name 1 from keyboard
step 2: print the name on the screen
step 3: get name 2 from keyboard
step 4: print the name on screen
step 5: get name 3 from the keyboard
step 6: print name....
step 7: get name 4....
step 8: and so on....
As you can see the program is a repetition of steps 1 and 2. What would be nice is if we had a way of using steps 1 and 2 ten times over. By using a conditional check and a counter we can:
step 1: set counter to 0
step 2: get name from keyboard
step 3: print name on screen
step 4: add 1 to the counter
step 5: is the counter equal to 10? This is the conditional check
step 6: if no, then go to step 2
step 7: end the program
Step 6 is a conditional branch - it is taken if the condition is met - if the counter doesn't equal 10 then branch to step 2.
The processor would have to scrap the contents of the program counter and replace it with the memory location for the instruction at step 2.
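The counter and conditional check can be sketched in Python as follows. The counter starts at 0 here so that exactly ten names are read before it reaches 10. The function collect_names and the stand-ins for the keyboard and screen are made up for this example so that it can run unattended.

```python
def collect_names(read_name, show, limit=10):
    """Read and show 'limit' names using a counter and a conditional check."""
    counter = 0                      # no names read yet
    while True:
        name = read_name()           # get name from keyboard
        show(name)                   # print name on screen
        counter += 1                 # add 1 to the counter
        if counter == limit:         # the conditional check
            break                    # fall through to "end the program"
        # otherwise branch back to the top of the loop
    return counter


# Stand-ins for the keyboard and screen so the sketch runs unattended.
pending = iter(f"name {n}" for n in range(1, 11))
shown = []
count = collect_names(lambda: next(pending), shown.append)
print(count, "names read")
```

In assembly the "branch back to the top" is a real conditional branch instruction that reloads the program counter; in Python the while loop hides that detail.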
To be able to perform conditional branching, or jumping, the processor has to have a method of flagging the result of operations.
In the above program the processor needs to know if the counter has reached 10.
How does it do this? It compares the value of the counter to 10. If it equals 10 the processor sets a flag in the "condition code" register. At step 6 this flag is checked. If the flag isn't set, then the program can branch back to step 2.
A compare is simply a subtract operation, but the processor just makes a note of the result (was it positive, negative, equal to zero etc.) and throws the result away.
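The idea of a compare as a subtract that keeps only the flags can be shown in a few lines of Python. This is a sketch of the principle only: the function name compare and the three flag names are chosen for the illustration and are not a model of the real condition register layout.

```python
def compare(a, b):
    """Subtract b from a, keep only the flags, and discard the difference."""
    difference = a - b
    return {
        "less": difference < 0,
        "greater": difference > 0,
        "equal": difference == 0,
    }


flags = compare(7, 10)      # counter is 7, limit is 10
print(flags)
flags = compare(10, 10)     # counter has reached the limit
print(flags)
```

A conditional branch then only has to look at one of these flags to decide whether to change the program counter.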
Hopefully, we have now covered enough ground to be able to summarize how a computer works and the basics of a PowerPC processor as follows:
1. The computer reads and executes instructions.
2. The instructions act on data.
3. Instructions are read from memory via the data bus. The address in memory that the instruction comes from is set up by the processor on the address bus.
4. Data can be read from and written to memory via the data bus. Again, the address in memory of where the data is coming from or going to is set up on the address bus.
5. Data coming in from memory to the processor is termed as being read.
6. Data going out from the processor to memory is termed as being written.
7. By reading and writing data from certain areas of memory, the processor can control peripherals.
8. The processor knows where to get the next instruction from because the program counter register always points to (holds the address of) the next instruction to be executed.
9. The value in the program counter can be altered as a result of conditional checks during the running of a program.
10. The link register can be used to hold the return address when the processor decides to jump to another piece of code. This return address can be jumped to by executing a blr instruction.
11. The PowerPC processor has 32 general purpose registers. These registers are 32 bits wide.
12. The PowerPC processor has 32 floating point registers. These registers are 64 bits wide.
13. The PowerPC processor instructions generally take three operands.
14. Binary is a number system based around the multiplier 2.
15. Hexadecimal is a number system based around the multiplier 16. The numbers 10 through 15 are identified with the letter "A" to "F". A=10, B=11, C=12 etc.
16. When identifying bit positions, bit zero is the rightmost bit.