Inside the Machine: An Illustrated Introduction to Microprocessors and Computer Architecture (9 page)


Authors: Jon Stokes

Tags: #Computers, #Systems Architecture, #General, #Microprocessors


the destination field is set to a nonzero value, and the offset is stored in the second byte. Again, the base address for a register-relative store can theoretically be stored in any register other than A, although by convention it’s

stored in D.

Translating an Example Program into Machine Language

For our simple computer with four registers, three instructions, and 256 memory cells, it’s tedious but trivial to translate Program 1-1 into machine-readable binary representation using the previous tables and instruction formats. Program 2-1 shows the translation.

Line   Assembly Language   Machine Language
1      load #12, A         10100000 00001100
2      load #13, B         10100001 00001101
3      add A, B, C         00000001 10000000
4      store C, #14        10111000 00001110

Program 2-1: A translation of Program 1-1 into machine language

The 1s and 0s in the rightmost column of Program 2-1 represent the

high and low voltages that the computer “thinks” in.
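The mechanical nature of this translation can be sketched in a few lines of Python. The bit layout used here is an assumption inferred from the bit patterns in Program 2-1, since the tables and instruction formats it relies on are not reproduced in this excerpt: a mode bit, a 3-bit opcode, and two 2-bit register fields in the first byte, with the second byte holding either the immediate value or the destination register. The function names are hypothetical.

```python
# ASSUMED encodings, inferred from the bit patterns shown in Program 2-1.
REGISTERS = {"A": 0b00, "B": 0b01, "C": 0b10, "D": 0b11}
OPCODES = {"add": 0b000, "load": 0b010, "store": 0b011}


def encode_register_type(op, src1, src2, dest):
    """Mode bit 0, 3-bit opcode, two 2-bit sources; destination leads byte 2."""
    byte1 = (0 << 7) | (OPCODES[op] << 4) | (REGISTERS[src1] << 2) | REGISTERS[src2]
    byte2 = REGISTERS[dest] << 6
    return byte1, byte2


def encode_immediate_type(op, src_field, dest_field, immediate):
    """Mode bit 1, 3-bit opcode, 2-bit source and destination fields;
    byte 2 carries the immediate value."""
    byte1 = (1 << 7) | (OPCODES[op] << 4) | (src_field << 2) | dest_field
    return byte1, immediate & 0xFF


def bits(pair):
    """Render a two-byte instruction the way Program 2-1 prints it."""
    return " ".join(f"{byte:08b}" for byte in pair)


program = [
    encode_immediate_type("load", 0, REGISTERS["A"], 12),   # load #12, A
    encode_immediate_type("load", 0, REGISTERS["B"], 13),   # load #13, B
    encode_register_type("add", "A", "B", "C"),             # add A, B, C
    encode_immediate_type("store", REGISTERS["C"], 0, 14),  # store C, #14
]

for pair in program:
    print(bits(pair))
```

Under these assumptions the four printed lines match the Machine Language column of Program 2-1. A real assembler does the same kind of table lookup and bit packing, only with more instruction formats.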

Real machine language instructions are usually longer and more complex

than the simple ones I’ve given here, but the basic idea is exactly the same.

Program instructions are translated into machine language in a mechanical,

predefined manner, and even in the case of a fully modern microprocessor,

doing such translations by hand is merely a matter of knowing the instruction

formats and having access to the right charts and tables.

Of course, for the most part the only people who do such translations by

hand are computer engineering or computer science undergraduates who’ve

been assigned them for homework. This wasn’t always the case, though.

The Mechanics of Program Execution

The Programming Model and the ISA

Back in the bad old days, programmers had to enter programs into the

computer directly in machine language (after having walked five miles in

the snow uphill to work). In the very early stages of computing, this was done

by flipping switches. The programmer toggled strings of 1s and 0s into the

computer’s very limited memory, ran the program, and then pored over the

resulting strings of 1s and 0s to decode the answer.

Once memory sizes and processing power increased to the point where

programmer time and effort were valuable enough relative to computing

time and memory space, computer scientists devised ways of allowing the

computer to use a portion of its power and memory to take on some of the

burden of making its cryptic input and output a little more human-friendly.

In short, the tedious task of converting human-readable programs into machine-readable binary code was automated; hence the birth of assembly language programming. Programs could now be written using mnemonics and then converted by an assembler into machine language for processing.

In order to write assembly language programs for a machine, you have to understand the machine’s available resources: how many registers it has, what instructions it supports, and so on. In other words, you need a well-defined model of the machine you’re trying to program.

The Programming Model

The programming model is the programmer’s interface to the microprocessor. It hides all of the processor’s complex implementation details behind a relatively simple, clean layer of abstraction that exposes to the programmer all of the processor’s functionality. (See Chapter 4 for more on the history and development of the programming model.)

Figure 2-7 shows a diagram of a programming model for an eight-register machine. By now, most of the parts of the diagram should be familiar to you. The ALU performs arithmetic, the registers store numbers, and the input-output unit (I/O unit) is responsible for interacting with memory and the rest of the system (via loads and stores). The parts of the processor that we haven’t yet met lie in the control unit. Of these, we’ll cover the program counter and the instruction register now.

The Instruction Register and Program Counter

Because programs are stored in memory as ordered sequences of instructions and memory is arranged as a linear series of addresses, each instruction in a program lives at its own memory address. In order to step through and execute the lines of a program, the computer simply begins at the program’s starting address and then steps through each successive memory location, fetching each successive instruction from memory, placing it in a special register (the instruction register), and executing it.

Figure 2-7: The programming model for a simple eight-register machine (components shown: the control unit with its program counter, instruction register, and processor status word (PSW); the registers; the ALU; the I/O unit; the data bus; and the address bus)

The instructions in our DLW-1 computer are two bytes long. If we

assume that each memory cell holds one byte, then the DLW-1 must step

through memory by fetching instructions from two cells at a time.

Figure 2-8: A simple computer with instruction and data registers


For example, if the starting address in Program 1-1 were #500, it would

look like Figure 2-9 in memory (with the instructions rendered in machine

language, not assembly language, of course).

Address      Instruction
#500-#501    load #12, A
#502-#503    load #13, B
#504-#505    add A, B, C
#506-#507    store C, #14

Figure 2-9: An illustration of Program 1-1 in memory,

starting at address #500
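As a rough sketch of the layout in Figure 2-9, the eight bytes from Program 2-1 can be placed into a model of memory starting at address #500. Modeling memory as a Python dictionary keyed by address is an assumption made for illustration; the names `memory` and `instruction_starts` are hypothetical.

```python
# The eight bytes from Program 2-1, in program order.
machine_code = [
    0b10100000, 0b00001100,  # load #12, A
    0b10100001, 0b00001101,  # load #13, B
    0b00000001, 0b10000000,  # add A, B, C
    0b10111000, 0b00001110,  # store C, #14
]

START = 500  # the starting address used in Figure 2-9

# One byte per memory cell, keyed by address (a modeling assumption).
memory = {START + offset: byte for offset, byte in enumerate(machine_code)}

# Two-byte instructions begin at every other address.
instruction_starts = list(range(START, START + len(machine_code), 2))

for address in instruction_starts:
    first, second = memory[address], memory[address + 1]
    print(f"#{address}-#{address + 1}: {first:08b} {second:08b}")
```

Each instruction occupies two consecutive cells, which is why the starting addresses advance by two.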

The Instruction Fetch: Loading the Instruction Register

An instruction fetch is a special type of load that happens automatically for every instruction. It always takes the address that’s currently in the program counter register as its source and the instruction register as its destination. The control unit uses a fetch to load each instruction of a program from memory into the instruction register, where that instruction is decoded before being executed. While that instruction is being decoded, the processor places the address of the next instruction into the program counter by incrementing the address that’s currently there, so that the newly incremented address points to the next instruction in the sequence. In the case of our DLW-1, the program counter is incremented by two every time an instruction is fetched, because the two-byte instructions begin at every other byte in memory.
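The fetch described above might be sketched as follows. This is a minimal illustration, not a description of real hardware: the function name `fetch`, the dictionary model of memory, and the choice to place the first byte in the upper half of the instruction register are all assumptions.

```python
def fetch(memory, program_counter):
    """Perform one instruction fetch on a DLW-1-style machine.

    Reads the two bytes starting at the address in the program counter,
    combines them into a 16-bit instruction register value, and returns
    that value together with the incremented program counter.
    """
    first_byte = memory[program_counter]
    second_byte = memory[program_counter + 1]
    instruction_register = (first_byte << 8) | second_byte  # assumed byte order
    return instruction_register, program_counter + 2       # two-byte instructions


# The first two instructions of Program 2-1, placed at #500 as in Figure 2-9.
memory = {500: 0b10100000, 501: 0b00001100, 502: 0b10100001, 503: 0b00001101}

ir, pc = fetch(memory, 500)
print(f"{ir:016b}", pc)
ir_next, pc_next = fetch(memory, pc)
print(f"{ir_next:016b}", pc_next)
```

Note that the source of the fetch is always the program counter and the destination is always the instruction register, so neither needs to be named in the instruction itself.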

Running a Simple Program: The Fetch-Execute Loop

In Chapter 1 we discussed the steps a processor takes to perform calculations on numbers using the ALU in combination with a fetched arithmetic instruction. Now let’s look at the steps the processor takes in order to fetch a series of instructions (a program) and feed them to either the ALU (in the case of arithmetic instructions) or the memory access hardware (in the case of loads and stores):

1. Fetch the next instruction from the address stored in the program counter, and load that instruction into the instruction register. Increment the program counter.
2. Decode the instruction in the instruction register.
3. Execute the instruction in the instruction register, using the following rules:
   a. If the instruction is an arithmetic instruction, execute it using the ALU and register file.
   b. If the instruction is a memory access instruction, execute it using the memory-access hardware.
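The decode step and the two execution rules above can be illustrated with a small sketch. It assumes the same bit layout inferred from Program 2-1 (mode bit first, then a 3-bit opcode), which is not spelled out in this excerpt, and the names `decode` and `execution_unit` are hypothetical.

```python
# ASSUMED opcode values, inferred from the bit patterns in Program 2-1.
OPCODES = {0b000: "add", 0b010: "load", 0b011: "store"}

# Which part of the processor executes each kind of instruction.
UNITS = {
    "add": "ALU and register file",
    "load": "memory-access hardware",
    "store": "memory-access hardware",
}


def decode(instruction):
    """Split a 16-bit instruction into a mnemonic and its raw fields."""
    first_byte = instruction >> 8
    return {
        "mode": first_byte >> 7,                       # 1 = immediate-type
        "mnemonic": OPCODES[(first_byte >> 4) & 0b111],
        "second_byte": instruction & 0xFF,
    }


def execution_unit(instruction):
    """Apply the execute-step rules: arithmetic goes to the ALU,
    loads and stores go to the memory-access hardware."""
    return UNITS[decode(instruction)["mnemonic"]]


print(execution_unit(0b1010000000001100))  # load #12, A
print(execution_unit(0b0000000110000000))  # add A, B, C
```

The point of the sketch is only that decoding turns a bit pattern into a routing decision; the real control unit does this with logic circuits rather than table lookups.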

These three steps are fairly straightforward, and with one modification they describe the way that microprocessors execute programs (as we’ll see in the section “Branch Instructions” on page 30). Computer scientists often refer to these steps as the fetch-execute loop or the fetch-execute cycle. The fetch-execute loop is repeated for as long as the computer is powered on. The machine iterates through the entire loop, from step 1 to step 3, over and over again many millions or billions of times per second in order to run programs.

Let’s run through the three steps with our example program as shown

in Figure 2-9. (This example presumes that #500 is already in the program

counter.) Here’s what the processor does, in order:

1. Fetch the instruction beginning at #500, and load load #12, A into the instruction register. Increment the program counter to #502.
2. Decode load #12, A in the instruction register.
3. Execute load #12, A from the instruction register, using the memory-access hardware.
4. Fetch the instruction beginning at #502, and load load #13, B into the instruction register. Increment the program counter to #504.
5. Decode load #13, B in the instruction register.
6. Execute load #13, B from the instruction register, using the memory-access hardware.
7. Fetch the instruction beginning at #504, and load add A, B, C into the instruction register. Increment the program counter to #506.
8. Decode add A, B, C in the instruction register.
9. Execute add A, B, C from the instruction register, using the ALU and register file.
10. Fetch the instruction beginning at #506, and load store C, #14 into the instruction register. Increment the program counter to #508.
11. Decode store C, #14 in the instruction register.
12. Execute store C, #14 from the instruction register, using the memory-access hardware.
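The twelve steps above can be reproduced with a short simulation. This is a sketch under several assumptions: instructions are kept in symbolic form rather than as bits, `load #12, A` is taken to copy the contents of memory cell #12 into register A, the data values 7 and 5 are made up, and the loop stops when the program counter leaves the program (a real machine keeps looping while it is powered on). The function name `run` is hypothetical.

```python
def run(program, data, start):
    """Run symbolic instructions through a fetch-decode-execute loop.

    Returns (registers, data, program_counter, trace).
    """
    registers = {"A": 0, "B": 0, "C": 0, "D": 0}
    pc = start
    trace = []
    while pc in program:
        # Fetch: copy the instruction into the "instruction register".
        instruction_register = program[pc]
        trace.append(f"fetch #{pc}")
        pc += 2  # two-byte instructions

        # Decode: work out which operation this is.
        op = instruction_register[0]
        trace.append(f"decode {op}")

        # Execute: memory-access hardware for load/store, ALU for add.
        if op == "load":
            _, address, dest = instruction_register
            registers[dest] = data[address]
        elif op == "store":
            _, src, address = instruction_register
            data[address] = registers[src]
        elif op == "add":
            _, src1, src2, dest = instruction_register
            registers[dest] = registers[src1] + registers[src2]
        else:
            raise ValueError(f"unknown instruction: {op}")
        trace.append(f"execute {op}")
    return registers, data, pc, trace


program = {
    500: ("load", 12, "A"),
    502: ("load", 13, "B"),
    504: ("add", "A", "B", "C"),
    506: ("store", "C", 14),
}
data = {12: 7, 13: 5}  # made-up contents for memory cells #12 and #13

registers, data, pc, trace = run(program, data, 500)
for number, step in enumerate(trace, start=1):
    print(number, step)
```

The trace has one fetch, one decode, and one execute entry per instruction, in the same order as the numbered list.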

NOTE

To zoom in on the execute steps of the preceding sequence, revisit Chapter 1, and particularly the sections “Refining the File-Clerk Model” on page 6 and “RAM: When Registers Alone Won’t Cut It” on page 8. If you do, you’ll gain a pretty good understanding of what’s involved in executing a program on any machine. Sure, there are important machine-specific variations for most of what I’ve presented here, but the general outlines (and even a decent number of the specifics) are the same.

The Clock

Steps 1 through 12 in the previous section don’t take an arbitrary amount of

time to complete. Rather, they’re performed according to the pulse of the

clock that governs every action the processor takes.

This clock pulse, which is generated by a clock generator module on the motherboard and is fed into the processor from the outside, times the functioning of the processor so that, on the DLW-1 at least, all three steps of the fetch-execute loop are completed in exactly one beat of the clock. Thus, the program in Figure 2-9, as I’ve traced its execution in the preceding section, takes exactly four clock beats to finish execution, because a new instruction is fetched on each beat of the clock.

One obvious way to speed up the execution of programs on the DLW-1

would be to speed up its clock generator so that each step takes less time to

complete. This is generally true of all microprocessors, hence the race among

microprocessor designers to build and market chips with ever-higher clock

speeds. (We’ll talk more about the relationship between clock speed and

performance in Chapter 3.)
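The arithmetic behind these two paragraphs is simple enough to sketch. The one-beat-per-instruction figure comes from the DLW-1 description above; the clock rates of 1 MHz and 2 MHz are made-up values chosen only to show the effect of a faster clock, and the function names are hypothetical.

```python
def beats_needed(instruction_count, beats_per_instruction=1):
    """Clock beats required, assuming every instruction takes the same number."""
    return instruction_count * beats_per_instruction


def run_time_seconds(instruction_count, clock_hz, beats_per_instruction=1):
    """Execution time: total beats divided by beats per second."""
    return beats_needed(instruction_count, beats_per_instruction) / clock_hz


beats = beats_needed(4)                          # the four-instruction program
slow = run_time_seconds(4, clock_hz=1_000_000)   # hypothetical 1 MHz clock
fast = run_time_seconds(4, clock_hz=2_000_000)   # hypothetical 2 MHz clock

print(beats, slow, fast)
```

Doubling the clock rate halves the time for the same four beats, which is the simple relationship this section describes.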

Branch Instructions

As I’ve presented it so far, the processor moves through each line in a program in sequence until it reaches the end of the program, at which point the program’s output is available to the user.

There are certain instructions in the instruction stream, however, that allow the processor to jump to a program line that is out of sequence. For instance, by inserting a branch instruction into line 5 of a program, we could cause the processor’s control unit to jump all the way down to line 20 and begin executing there (a forward branch), or we could cause it to jump back up to line 1 (a backward branch). Because a program is an ordered sequence of instructions, by including forward and backward branch instructions, we can arbitrarily move about in the program. This is a powerful ability, and branches are an essential part of computing.
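One way to picture a branch is as an instruction that overwrites the program counter instead of letting it advance normally. The sketch below follows the line-number example in the text rather than byte addresses. The `nop` placeholder instruction, the `run_lines` function, and the `max_steps` cutoff (needed because a backward branch to line 1 loops forever) are all hypothetical.

```python
def run_lines(program, max_steps=100):
    """Return the order in which program lines are executed."""
    pc = 1  # start at line 1
    visited = []
    while pc in program and len(visited) < max_steps:
        instruction = program[pc]
        visited.append(pc)
        pc += 1                      # default: move to the next line
        if instruction[0] == "jump":
            pc = instruction[1]      # branch: overwrite the program counter
    return visited


# Forward branch: line 5 jumps down to line 20.
forward_program = {line: ("nop",) for line in range(1, 23)}
forward_program[5] = ("jump", 20)
forward = run_lines(forward_program)

# Backward branch: line 5 jumps back up to line 1.
backward_program = {line: ("nop",) for line in range(1, 6)}
backward_program[5] = ("jump", 1)
backward = run_lines(backward_program, max_steps=12)

print(forward)
print(backward)
```

In the forward case lines 6 through 19 are never executed; in the backward case lines 1 through 5 repeat until the step limit stops the sketch.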

Rather than thinking about forward or backward branches, it’s more

useful for our present purposes to categorize all branches as being one of the

following two types: conditional branches or unconditional branches.

Unconditional Branch

An unconditional branch instruction consists of two parts: the branch instruction and the target address.

jump #target

For an unconditional branch, #target can be either an immediate value,