Inside the Machine: An Illustrated Introduction to Microprocessors and Computer Architecture (8 page)


Authors: Jon Stokes

Tags: #Computers, #Systems Architecture, #General, #Microprocessors


of the program’s immediate address values have to be changed to reflect the

data segment’s actual location in memory.

Because both memory addresses and regular integer numbers are stored in the same registers, these registers are called general-purpose registers (GPRs).

On the DLW-1, A, B, C, and D are all GPRs.

Basic Computing Concepts

THE MECHANICS OF PROGRAM EXECUTION

Now that we understand the basics of computer organization, it’s time to take a closer look at the nuts and bolts of how stored programs are actually executed by the computer. To that end, this chapter will cover core programming concepts like machine language, the programming model, the instruction set architecture, branch instructions, and the fetch-execute loop.

Opcodes and Machine Language

If you’ve been following the discussion so far, it shouldn’t surprise you to learn that both memory addresses and instructions are ordinary numbers that can be stored in memory. All of the instructions in a program like Program 1-1 are represented inside the computer as strings of numbers. Indeed, a program is one long string of numbers stored in a series of memory locations.

How is a program like Program 1-1 rendered in numerical notation so that it can be stored in memory and executed by the computer? The answer is simpler than you might think.

As you may already know, a computer actually only understands 1s and 0s (or “high” and “low” electric voltages), not English words like add, load, and store, or letters and base-10 numbers like A, B, 12, and 13. In order for the computer to run a program, therefore, all of its instructions must be rendered in binary notation. Think of translating English words into Morse code’s dots and dashes and you’ll have some idea of what I’m talking about.

Machine Language on the DLW-1

The translation of programs of any complexity into this binary-based machine language is a massive undertaking that’s meant to be done by a computer, but I’ll show you the basics of how it works so you can understand what’s going on. The following example is simplified, but useful nonetheless.

The English words in a program, like add, load, and store, are mnemonics (meaning they’re easy for people to remember), and they’re all mapped to strings of binary numbers, called opcodes, that the computer can understand. Each opcode designates a different operation that the processor can perform. Table 2-1 maps each of the mnemonics used in Chapter 1 to a 3-bit opcode for the hypothetical DLW-1 microprocessor. We can also map the four register names to 2-bit binary codes, as shown in Table 2-2.

Table 2-1: Mapping of Mnemonics to Opcodes for the DLW-1

Mnemonic    Opcode
add         000
sub         001
load        010
store       011

Table 2-2: Mapping of Registers to Binary Codes for the DLW-1

Register    Binary Code
A           00
B           01
C           10
D           11

The binary values representing both the opcodes and the register codes are arranged in one of a number of 16-bit (or 2-byte) formats to get a complete machine language instruction, which is a binary number that can be stored in RAM and used by the processor.

Chapter 2

NOTE

Because programmer-written instructions must be translated into binary codes before a computer can read them, it is common to see programs in any format (binary, assembly, or a high-level language like BASIC or C) referred to generically as “code” or “codes.” So programmers sometimes speak of “assembler code,” “binary code,” or “C code” when referring to programs written in assembly, binary, or C language. Programmers also will often describe the act of programming as “writing code” or “coding.” I have adopted this terminology in this book, and will henceforth use the term “code” regularly to refer generically to instruction sequences and programs.

Binary Encoding of Arithmetic Instructions

Arithmetic instructions have the simplest machine language instruction formats, so we’ll start with them. Figure 2-1 shows the format for the machine language encoding of a register-type arithmetic instruction.

Byte 1: mode | opcode | source1 | source2
Byte 2: destination | 000000

Figure 2-1: Machine language format for a register-type instruction

In a register-type arithmetic instruction (that is, an arithmetic instruction that uses only registers and no immediate values), the first bit of the instruction is the mode bit. If the mode bit is set to 0, then the instruction is a register-type instruction; if it’s set to 1, then the instruction is of the immediate type.

Bits 1–3 of the instruction specify the opcode, which tells the computer what type of operation the instruction represents. Bits 4–5 specify the instruction’s first source register, 6–7 specify the second source register, and 8–9 specify the destination register. The last six bits are not needed by register-to-register arithmetic instructions, so they’re padded with 0s (they’re zeroed out in computer jargon) and ignored.
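The field boundaries just described can be sketched in a few lines of Python. This is only an illustration: the function name decode_register_type and the dictionary keys are assumptions made for the example, and bit 0 is taken to be the leftmost bit of the 16-bit instruction, as in the text.

```python
def decode_register_type(word: int) -> dict:
    """Split a 16-bit register-type instruction into its named fields.

    Bit 0 is the leftmost (most significant) bit, matching the text.
    """
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction must fit in 16 bits")
    bits = format(word, "016b")  # zero-padded 16-character bit string
    return {
        "mode": bits[0],            # bit 0
        "opcode": bits[1:4],        # bits 1-3
        "source1": bits[4:6],       # bits 4-5
        "source2": bits[6:8],       # bits 6-7
        "destination": bits[8:10],  # bits 8-9
        "padding": bits[10:16],     # last six bits, zeroed out
    }


fields = decode_register_type(0b0000000110000000)
for name, value in fields.items():
    print(f"{name:12} {value}")
```

Slicing a bit string keeps the sketch close to the way the fields are described; a real decoder would more likely use shifts and masks.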

Now, let’s use the binary values in Tables 2-1 and 2-2 to translate the add instruction in line 3 of Program 1-1 into a 2-byte (or 16-bit) machine language instruction:

Assembly Language Instruction    Machine Language Instruction
add A, B, C                      00000001 10000000


Here are a few more examples of arithmetic instructions, just so you can get the hang of it:

Assembly Language Instruction    Machine Language Instruction
add C, D, A                      00001011 00000000
add D, B, C                      00001101 10000000
sub A, D, C                      00010011 10000000
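The translations above can be reproduced mechanically. The following Python sketch is one way to do it, not the book's own tool: the names OPCODES, REGISTERS, and encode_register_type are assumptions for the example, and the register codes are the ones implied by the worked examples in this section.

```python
# Opcodes from Table 2-1; register codes as implied by the worked examples.
OPCODES = {"add": "000", "sub": "001", "load": "010", "store": "011"}
REGISTERS = {"A": "00", "B": "01", "C": "10", "D": "11"}


def encode_register_type(mnemonic, source1, source2, destination):
    """Build the 16-bit string for a register-type arithmetic instruction."""
    bits = (
        "0"                       # mode bit: 0 means register type
        + OPCODES[mnemonic]       # 3-bit opcode
        + REGISTERS[source1]      # first source register
        + REGISTERS[source2]      # second source register
        + REGISTERS[destination]  # destination register
        + "000000"                # unused bits, zeroed out
    )
    return bits[:8] + " " + bits[8:]  # show the two bytes separately


print(encode_register_type("add", "A", "B", "C"))
print(encode_register_type("sub", "A", "D", "C"))
```

Each call concatenates the fields in the order they appear in the register-type format, so the printed strings can be compared directly with the tables.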

Increasing the number of binary digits in the opcode and register fields increases the total number of instructions the machine can use and the number of registers it can have. For example, if you know something about binary notation, then you probably know that a 3-bit opcode allows the processor to map up to 2^3 mnemonics, which means that it can have up to 2^3, or 8, instructions in its instruction set; increasing the opcode size to 8 bits would allow the processor’s instruction set to contain up to 2^8, or 256, instructions. Similarly, increasing the number of bits in the register field increases the possible number of registers that the machine can have.
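The arithmetic behind these limits is simply a power of two. As a quick check, here is a small Python sketch; the function name max_encodings is an assumption for the example.

```python
def max_encodings(field_bits: int) -> int:
    """Number of distinct values an n-bit field can represent."""
    return 2 ** field_bits


for bits in (2, 3, 8):
    print(f"{bits}-bit field: {max_encodings(bits)} possible values")
```

A 2-bit register field therefore distinguishes four registers, which is just enough for A, B, C, and D.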

Arithmetic instructions containing an immediate value use an immediate-type instruction format, which is slightly different from the register-type format we just saw. In an immediate-type instruction, the first byte contains the opcode, the source register, and the destination register, while the second byte contains the immediate value, as shown in Figure 2-2.

Byte 1: mode | opcode | source | destination
Byte 2: 8-bit immediate value

Figure 2-2: Machine language format for an immediate-type instruction

Here are a few immediate-type arithmetic instructions translated from assembly language to machine language:

Assembly Language Instruction    Machine Language Instruction
add C, 8, A                      10001000 00001000
add 5, A, C                      10000010 00000101
sub 25, D, C                     10011110 00011001
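The immediate-type translations can be checked the same way. In this Python sketch the names OPCODES, REGISTERS, and encode_immediate_type are assumptions for the example, the register codes are those implied by the worked examples, and the function takes the source register and the immediate value as separate arguments regardless of the order in which they appear in the assembly text.

```python
OPCODES = {"add": "000", "sub": "001", "load": "010", "store": "011"}
REGISTERS = {"A": "00", "B": "01", "C": "10", "D": "11"}


def encode_immediate_type(mnemonic, source, immediate, destination):
    """Build the 16-bit string for an immediate-type arithmetic instruction."""
    if not 0 <= immediate <= 255:
        raise ValueError("immediate value must fit in 8 bits")
    byte1 = (
        "1"                       # mode bit: 1 means immediate type
        + OPCODES[mnemonic]       # 3-bit opcode
        + REGISTERS[source]       # source register
        + REGISTERS[destination]  # destination register
    )
    byte2 = format(immediate, "08b")  # 8-bit immediate value
    return byte1 + " " + byte2


print(encode_immediate_type("add", "C", 8, "A"))
print(encode_immediate_type("sub", "D", 25, "C"))
```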


Binary Encoding of Memory Access Instructions

Memory-access instructions use both register- and immediate-type instruction formats exactly like those shown for arithmetic instructions. The only difference lies in how they use them. Let’s take the case of a load first.

The load Instruction

We’ve previously seen two types of load, the first of which was the immediate type. An immediate-type load (see Figure 2-3) uses the immediate-type instruction format, but because the load’s source is an immediate value (a memory address) and not a register, the source field is unneeded and must be zeroed out. (The source field is not ignored, though, and in a moment we’ll see what happens if it isn’t zeroed out.)

Byte 1: mode | opcode | source (zeroed out) | destination
Byte 2: 8-bit immediate source address

Figure 2-3: Machine language format for an immediate-type load

Now let’s translate the immediate-type load in line 1 of Program 1-1 (12 is 1100 in binary notation):

Assembly Language Instruction    Machine Language Instruction
load #12, A                      10100000 00001100

The 2-byte machine language instruction on the right is a binary representation of the assembly language instruction on the left. The first byte corresponds to an immediate-type load instruction that takes register A as its destination. The second byte is the binary representation of the number 12, which is the source address in memory that the data is to be loaded from.
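To see where each bit of this load comes from, the two bytes can be assembled field by field. The Python sketch below is illustrative only: REGISTERS and encode_immediate_load are names assumed for the example, and the register codes are those implied by the worked examples.

```python
REGISTERS = {"A": "00", "B": "01", "C": "10", "D": "11"}
LOAD_OPCODE = "010"  # from Table 2-1


def encode_immediate_load(address, destination):
    """Build the 16-bit string for an immediate-type load."""
    if not 0 <= address <= 255:
        raise ValueError("address must fit in 8 bits")
    byte1 = (
        "1"                       # mode bit: immediate type
        + LOAD_OPCODE             # 3-bit opcode for load
        + "00"                    # source field, zeroed out
        + REGISTERS[destination]  # destination register
    )
    byte2 = format(address, "08b")  # 8-bit source address
    return byte1 + " " + byte2


print(encode_immediate_load(12, "A"))
```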

The second type of load we’ve seen is the register type. A register-type load uses the register-type instruction format, but with the source2 field zeroed out and ignored, as shown in Figure 2-4.

In Figure 2-4, the source1 field specifies the register containing the memory address that the processor is to load data from, and the destination field specifies the register that the loaded data is to be placed in.


Byte 1: mode | opcode | source1 | source2 (zeroed out)
Byte 2: destination | 000000

Figure 2-4: Machine language format for a register-type load

For a register-relative addressed load, we use a version of the immediate-type instruction format, shown in Figure 2-5, with the base field specifying the register that contains the base address and the offset stored in the second byte of the instruction.

Byte 1: mode | opcode | base | destination
Byte 2: 8-bit immediate offset

Figure 2-5: Machine language format for a register-relative load

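Recall from Table 2-2 that 00 is the binary number that designates register A. Because a base field of 00 looks exactly like the zeroed-out source field of an ordinary immediate-type load, the result of this encoding scheme is that any register but A could theoretically be used to store the base address for a register-relative load.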
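One way to read the remark about register A is that a decoder can tell the two kinds of immediate-format load apart by looking at the source/base field alone. The Python sketch below assumes exactly that rule, which is an interpretation of the text rather than something it spells out; the names REGISTER_NAMES and classify_load, and the tuple layout of the result, are likewise assumptions for the example.

```python
REGISTER_NAMES = {"00": "A", "01": "B", "10": "C", "11": "D"}


def classify_load(word: int):
    """Classify an immediate-format load as absolute or register-relative.

    Assumed rule: a source/base field of 00 means the second byte is an
    absolute address; any other value names the base register.
    """
    bits = format(word, "016b")
    if bits[0] != "1" or bits[1:4] != "010":
        raise ValueError("not an immediate-format load")
    base_field = bits[4:6]
    destination = REGISTER_NAMES[bits[6:8]]
    value = int(bits[8:], 2)  # second byte: address or offset
    if base_field == "00":
        return ("immediate", value, destination)
    return ("register-relative", REGISTER_NAMES[base_field], value, destination)


print(classify_load(0b1010000000001100))  # source field zeroed out
print(classify_load(0b1010110001101100))  # base field names register D
```

Under this rule a base field of 00 can never mean "use register A as the base," which is why A is the one register excluded.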

The store Instruction

The register-type binary format for a store instruction is the same as it is for a load, except that the destination field specifies a register containing a destination memory address, and the source1 field specifies the register containing the data to be stored to memory.
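The way load and store swap the roles of the same fields is easy to see side by side. In the Python sketch below, the function names and the choice of registers in the two calls are assumptions for illustration, and the register codes are those implied by the worked examples.

```python
OPCODES = {"add": "000", "sub": "001", "load": "010", "store": "011"}
REGISTERS = {"A": "00", "B": "01", "C": "10", "D": "11"}


def _register_format(opcode, source1, destination):
    """Register-type format with the source2 field zeroed out."""
    bits = "0" + opcode + source1 + "00" + destination + "000000"
    return bits[:8] + " " + bits[8:]


def encode_register_load(address_reg, destination_reg):
    """source1 holds the address register; destination receives the data."""
    return _register_format(
        OPCODES["load"], REGISTERS[address_reg], REGISTERS[destination_reg]
    )


def encode_register_store(data_reg, address_reg):
    """source1 holds the data register; destination holds the address register."""
    return _register_format(
        OPCODES["store"], REGISTERS[data_reg], REGISTERS[address_reg]
    )


print(encode_register_load("D", "A"))   # load into A from the address held in D
print(encode_register_store("A", "D"))  # store A's data at the address held in D
```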

The immediate-type machine language format for a store, pictured in Figure 2-6, is also similar to the immediate-type format for a load, except that since the destination register is not needed (the destination is the immediate memory address) the destination field is zeroed out, while the source field specifies which register holds the data to be stored.


Byte 1: mode | opcode | source | destination (zeroed out)
Byte 2: 8-bit immediate destination address

Figure 2-6: Machine language format for an immediate-type store
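As a sketch of the immediate-type store format, the Python below builds the two bytes for a store of register C to memory address 14. That particular instruction is a made-up example, and REGISTERS and encode_immediate_store are names assumed for the illustration; the register codes are those implied by the worked examples.

```python
REGISTERS = {"A": "00", "B": "01", "C": "10", "D": "11"}
STORE_OPCODE = "011"  # from Table 2-1


def encode_immediate_store(source, address):
    """Build the 16-bit string for an immediate-type store."""
    if not 0 <= address <= 255:
        raise ValueError("address must fit in 8 bits")
    byte1 = (
        "1"                  # mode bit: immediate type
        + STORE_OPCODE       # 3-bit opcode for store
        + REGISTERS[source]  # register holding the data to store
        + "00"               # destination field, zeroed out
    )
    byte2 = format(address, "08b")  # 8-bit destination address
    return byte1 + " " + byte2


print(encode_immediate_store("C", 14))
```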

The register-relative store, on the other hand, uses the same immediate-