Authors: jon stokes
Tags: #Computers, #Systems Architecture, #General, #Microprocessors
the destination field is set to a nonzero value, and the offset is stored in the second byte. Again, the base address for a register-relative store can theoretically be stored in any register other than A, although by convention it’s
stored in D.
Translating an Example Program into Machine Language
For our simple computer with four registers, three instructions, and 256
memory cells, it’s tedious but trivial to translate Program 1-1 into machine-
readable binary representation using the previous tables and instruction
formats. Program 2-1 shows the translation.
Line  Assembly Language  Machine Language
1     load #12, A        10100000 00001100
2     load #13, B        10100001 00001101
3     add A, B, C        00000001 10000000
4     store C, #14       10111000 00001110
Program 2-1: A translation of Program 1-1 into machine language
The 1s and 0s in the rightmost column of Program 2-1 represent the
high and low voltages that the computer “thinks” in.
Real machine language instructions are usually longer and more complex
than the simple ones I’ve given here, but the basic idea is exactly the same.
Program instructions are translated into machine language in a mechanical,
predefined manner, and even in the case of a fully modern microprocessor,
doing such translations by hand is merely a matter of knowing the instruction
formats and having access to the right charts and tables.
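As a rough illustration of how mechanical this translation is, the following Python sketch encodes the four lines of Program 2-1. The field layout (a mode bit, a three-bit opcode, and two-bit register fields), the opcode and register codes, and all function names are assumptions inferred from the bit patterns in Program 2-1, not a reproduction of the chapter's tables, which remain the authority.

```python
# Assumed encodings, inferred from the bit patterns shown in Program 2-1.
REGISTERS = {"A": 0b00, "B": 0b01, "C": 0b10, "D": 0b11}
OPCODES = {"add": 0b000, "load": 0b010, "store": 0b011}


def encode_add(source1, source2, destination):
    """Register-type add: mode bit 0, opcode, two sources, then destination."""
    byte1 = (OPCODES["add"] << 4) | (REGISTERS[source1] << 2) | REGISTERS[source2]
    byte2 = REGISTERS[destination] << 6  # the low six bits are left as zeros
    return byte1, byte2


def encode_load(address, destination):
    """Immediate-type load: mode bit 1, source field 00, address in byte 2."""
    byte1 = (1 << 7) | (OPCODES["load"] << 4) | REGISTERS[destination]
    return byte1, address & 0xFF


def encode_store(source, address):
    """Immediate-type store: mode bit 1, destination field 00, address in byte 2."""
    byte1 = (1 << 7) | (OPCODES["store"] << 4) | (REGISTERS[source] << 2)
    return byte1, address & 0xFF


def to_binary(instruction):
    """Render a two-byte instruction in the style of Program 2-1."""
    return " ".join(f"{byte:08b}" for byte in instruction)


program = [
    encode_load(12, "A"),       # load #12, A
    encode_load(13, "B"),       # load #13, B
    encode_add("A", "B", "C"),  # add A, B, C
    encode_store("C", 14),      # store C, #14
]
for instruction in program:
    print(to_binary(instruction))
```

Under these assumptions, the printed lines match the Machine Language column of Program 2-1.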
Of course, for the most part the only people who do such translations by
hand are computer engineering or computer science undergraduates who’ve
been assigned them for homework. This wasn’t always the case, though.
The Mechanics of Program Execution
The Programming Model and the ISA
Back in the bad old days, programmers had to enter programs into the
computer directly in machine language (after having walked five miles in
the snow uphill to work). In the very early stages of computing, this was done
by flipping switches. The programmer toggled strings of 1s and 0s into the
computer’s very limited memory, ran the program, and then pored over the
resulting strings of 1s and 0s to decode the answer.
Once memory sizes and processing power increased to the point where
programmer time and effort were valuable enough relative to computing
time and memory space, computer scientists devised ways of allowing the
computer to use a portion of its power and memory to take on some of the
burden of making its cryptic input and output a little more human-friendly.
In short, the tedious task of converting human-readable programs into
machine-readable binary code was automated; hence the birth of assembly
language programming. Programs could now be written using mnemonics,
register names, and memory locations, before being converted by an
assembler into machine language for processing.
In order to write assembly language programs for a machine, you have
to understand the machine’s available resources: how many registers it has,
what instructions it supports, and so on. In other words, you need a well-
defined model of the machine you’re trying to program.
The Programming Model
The programming model is the programmer’s interface to the microprocessor.
It hides all of the processor’s complex implementation details behind a
relatively simple, clean layer of abstraction that exposes to the programmer
all of the processor’s functionality. (See Chapter 4 for more on the history
and development of the programming model.)
Figure 2-7 shows a diagram of a programming model for an eight-register
machine. By now, most of the parts of the diagram should be familiar to you.
The ALU performs arithmetic, the registers store numbers, and the
input-output unit (I/O unit) is responsible for interacting with memory and
the rest of the system (via loads and stores). The parts of the processor
that we haven’t yet met lie in the control unit. Of these, we’ll cover the
program counter and the instruction register now.
The Instruction Register and Program Counter
Because programs are stored in memory as ordered sequences of instructions
and memory is arranged as a linear series of addresses, each instruction
in a program lives at its own memory address. In order to step through and
execute the lines of a program, the computer simply begins at the program’s
starting address and then steps through each successive memory location,
fetching each successive instruction from memory, placing it in a special
register, and executing it as shown in Figure 2-8.
Chapter 2
[Figure 2-7: The programming model for a simple eight-register machine. Labeled parts: the control unit (program counter, instruction register, and processor status word, or PSW), registers A through H, the ALU, the I/O unit, the data bus, and the address bus.]
The instructions in our DLW-1 computer are two bytes long. If we
assume that each memory cell holds one byte, then the DLW-1 must step
through memory by fetching instructions from two cells at a time.
[Figure 2-8: A simple computer with instruction and data registers. Labeled parts: main memory, instruction register, registers, ALU, and CPU.]
For example, if the starting address in Program 1-1 were #500, it would
look like Figure 2-9 in memory (with the instructions rendered in machine
language, not assembly language, of course).
Address      Instruction
#500-#501    load #12, A
#502-#503    load #13, B
#504-#505    add A, B, C
#506-#507    store C, #14
Figure 2-9: An illustration of Program 1-1 in memory, starting at address #500
The Instruction Fetch: Loading the Instruction Register
An instruction fetch is a special type of load that happens automatically
for every instruction. It always takes the address that’s currently in the
program counter register as its source and the instruction register as its
destination. The control unit uses a fetch to load each instruction of a
program from memory into the instruction register, where that instruction
is decoded before being executed. While that instruction is being decoded,
the processor increments the address that’s currently in the program
counter, so that the newly incremented address points to the next
instruction in the sequence. In the case of our DLW-1, the program counter
is incremented by two every time an instruction is fetched, because the
two-byte instructions begin at every other byte in memory.
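A fetch as described here can be sketched in a few lines of Python. This is only an illustration: the function name `fetch`, the 512-cell memory (sized so that the example address #500 exists), and the bytes placed in it are assumptions made for the sketch.

```python
INSTRUCTION_SIZE = 2  # DLW-1 instructions are two bytes long


def fetch(memory, program_counter):
    """Copy one instruction into the instruction register and advance the PC.

    Returns the new contents of the instruction register and the
    incremented program counter.
    """
    instruction_register = bytes(
        memory[program_counter:program_counter + INSTRUCTION_SIZE]
    )
    return instruction_register, program_counter + INSTRUCTION_SIZE


# One byte per memory cell; 512 cells is an assumption for this sketch.
memory = bytearray(512)
memory[500:504] = bytes([
    0b10100000, 0b00001100,  # load #12, A
    0b10100001, 0b00001101,  # load #13, B
])

program_counter = 500
first, program_counter = fetch(memory, program_counter)   # PC is now 502
second, program_counter = fetch(memory, program_counter)  # PC is now 504
```

Each call reads the two cells that the program counter points to and leaves the counter pointing at the next instruction.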
Running a Simple Program: The Fetch-Execute Loop
In Chapter 1 we discussed the steps a processor takes to perform calculations
on numbers using the ALU in combination with a fetched arithmetic
instruction. Now let’s look at the steps the processor takes in order to
fetch a series of instructions (a program) and feed them to either the ALU
(in the case of arithmetic instructions) or the memory access hardware (in
the case of loads and stores):
1. Fetch the next instruction from the address stored in the program
   counter, and load that instruction into the instruction register.
   Increment the program counter.
2. Decode the instruction in the instruction register.
3. Execute the instruction in the instruction register, using the
   following rules:
   a. If the instruction is an arithmetic instruction, execute it using
      the ALU and register file.
   b. If the instruction is a memory access instruction, execute it using
      the memory-access hardware.
These three steps are fairly straightforward, and with one modification
they describe the way that microprocessors execute programs (as we’ll see
in the section “Branch Instructions” on page 30). Computer scientists often
refer to these steps as the fetch-execute loop or the fetch-execute cycle.
The fetch-execute loop is repeated for as long as the computer is powered
on. The machine iterates through the entire loop, from step 1 to step 3,
over and over again many millions or billions of times per second in order
to run programs.
Let’s run through the three steps with our example program as shown
in Figure 2-9. (This example presumes that #500 is already in the program
counter.) Here’s what the processor does, in order:
1. Fetch the instruction beginning at #500, and load the instruction
   load #12, A into the instruction register. Increment the program
   counter to #502.
2. Decode load #12, A in the instruction register.
3. Execute load #12, A from the instruction register, using the
   memory-access hardware.
4. Fetch the instruction beginning at #502, and load the instruction
   load #13, B into the instruction register. Increment the program
   counter to #504.
5. Decode load #13, B in the instruction register.
6. Execute load #13, B from the instruction register, using the
   memory-access hardware.
7. Fetch the instruction beginning at #504, and load the instruction
   add A, B, C into the instruction register. Increment the program
   counter to #506.
8. Decode add A, B, C in the instruction register.
9. Execute add A, B, C from the instruction register, using the ALU and
   register file.
10. Fetch the instruction beginning at #506, and load the instruction
    store C, #14 into the instruction register. Increment the program
    counter to #508.
11. Decode store C, #14 in the instruction register.
12. Execute store C, #14 from the instruction register, using the
    memory-access hardware.
NOTE
To zoom in on the execute steps of the preceding sequence, revisit Chapter 1,
and particularly the sections “Refining the File-Clerk Model” on page 6 and
“RAM: When Registers Alone Won’t Cut It” on page 8. If you do, you’ll gain a
pretty good understanding of what’s involved in executing a program on any
machine. Sure, there are important machine-specific variations for most of
what I’ve presented here, but the general outlines (and even a decent number
of the specifics) are the same.
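The twelve steps traced above can be approximated with a small Python simulation of the fetch-execute loop. It is a sketch under stated assumptions, not the DLW-1's actual design: instructions are kept in symbolic form rather than as binary, the values stored at #12 and #13 are made up, the name `run` is hypothetical, and the loop stops when it runs past the last instruction, whereas a real processor keeps cycling for as long as it is powered on.

```python
def run(instructions, data, program_counter):
    """Run a fetch-execute loop over symbolic two-byte instructions."""
    registers = {"A": 0, "B": 0, "C": 0, "D": 0}
    fetched_from = []  # addresses the program counter held at each fetch

    while program_counter in instructions:  # stopping rule is an assumption
        # Step 1: fetch into the instruction register, then increment the PC.
        instruction_register = instructions[program_counter]
        fetched_from.append(program_counter)
        program_counter += 2

        # Step 2: decode the instruction into an opcode and its operands.
        opcode, *operands = instruction_register

        # Step 3: execute with the ALU or with the memory-access hardware.
        if opcode == "add":
            source1, source2, destination = operands
            registers[destination] = registers[source1] + registers[source2]
        elif opcode == "load":
            address, destination = operands
            registers[destination] = data[address]
        elif opcode == "store":
            source, address = operands
            data[address] = registers[source]
        else:
            raise ValueError(f"unknown opcode: {opcode}")

    return registers, fetched_from, program_counter


# The program as laid out in Figure 2-9, starting at address #500.
instructions = {
    500: ("load", 12, "A"),
    502: ("load", 13, "B"),
    504: ("add", "A", "B", "C"),
    506: ("store", "C", 14),
}
data = {12: 2, 13: 3}  # hypothetical contents of memory cells #12 and #13

registers, fetched_from, final_pc = run(instructions, data, 500)
print(fetched_from, final_pc, data[14])
```

With the made-up inputs 2 and 3, the run fetches from #500, #502, #504, and #506, leaves the program counter at #508, and stores the sum in cell #14.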
The Clock
Steps 1 through 12 in the previous section don’t take an arbitrary amount of
time to complete. Rather, they’re performed according to the pulse of the
clock that governs every action the processor takes.
This clock pulse, which is generated by a clock generator module on the
motherboard and is fed into the processor from the outside, times the
functioning of the processor so that, on the DLW-1 at least, all three steps
of the fetch-execute loop are completed in exactly one beat of the clock.
Thus, the
program in Figure 2-9, as I’ve traced its execution in the preceding section,
takes exactly four clock beats to finish execution, because a new instruction is fetched on each beat of the clock.
One obvious way to speed up the execution of programs on the DLW-1
would be to speed up its clock generator so that each step takes less time to
complete. This is generally true of all microprocessors, hence the race among
microprocessor designers to build and market chips with ever-higher clock
speeds. (We’ll talk more about the relationship between clock speed and
performance in Chapter 3.)
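The arithmetic behind this claim is simple enough to sketch in Python. The clock frequencies below are hypothetical values chosen only for illustration, and the function name is an assumption; the one-instruction-per-beat rule is the DLW-1 behavior described above.

```python
def execution_time_seconds(instruction_count, clock_hz):
    """Time to run a program when one instruction completes per clock beat."""
    clock_beats = instruction_count  # one beat per instruction on the DLW-1
    return clock_beats / clock_hz


# The four-instruction program of Figure 2-9 at two hypothetical clock speeds.
slow = execution_time_seconds(4, 1_000_000)  # 1 MHz clock (illustrative)
fast = execution_time_seconds(4, 2_000_000)  # 2 MHz clock (illustrative)
print(slow, fast)
```

Doubling the clock frequency halves the time each beat takes, so the same four beats finish in half the time.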
Branch Instructions
As I’ve presented it so far, the processor moves through each line in a
program in sequence until it reaches the end of the program, at which point
the program’s output is available to the user.
There are certain instructions in the instruction stream, however, that
allow the processor to jump to a program line that is out of sequence. For
instance, by inserting a branch instruction into line 5 of a program, we
could cause the processor’s control unit to jump all the way down to line 20
and begin executing there (a forward branch), or we could cause it to jump
back up to line 1 (a backward branch). Because a program is an ordered
sequence of instructions, by including forward and backward branch
instructions, we can arbitrarily move about in the program. This is a
powerful ability, and branches are an essential part of computing.
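The line-5 example can be sketched in Python as follows. This is a loose illustration that works with program line numbers, as the paragraph above does, rather than with memory addresses; the names `next_line` and `trace`, the placeholder `("work",)` instruction, and the step limit are all assumptions of the sketch.

```python
def next_line(current_line, instruction):
    """Return the line to execute next: the branch target, or the next line."""
    if instruction[0] == "jump":
        return instruction[1]
    return current_line + 1


def trace(program, start, max_steps):
    """List the lines visited, stopping at the program's end or the step limit."""
    visited = []
    line = start
    while line in program and len(visited) < max_steps:
        visited.append(line)
        line = next_line(line, program[line])
    return visited


# A 21-line program of placeholder instructions.
straight_line = {number: ("work",) for number in range(1, 22)}

forward = dict(straight_line)
forward[5] = ("jump", 20)  # forward branch from line 5 down to line 20

backward = dict(straight_line)
backward[5] = ("jump", 1)  # backward branch from line 5 back up to line 1

print(trace(forward, 1, 50))
print(trace(backward, 1, 12))
```

The forward branch skips lines 6 through 19 entirely, while the backward branch repeats lines 1 through 5 until the step limit cuts the trace off.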
Rather than thinking about forward or backward branches, it’s more
useful for our present purposes to categorize all branches as being one of the
following two types: conditional branches or unconditional branches.
Unconditional Branch
An unconditional branch instruction consists of two parts: the branch
instruction and the target address.
jump #target
For an unconditional branch, #target can be either an immediate value,