
Mind map of computer composition principles

This is a mind map about the principles of computer composition, covering an overview, data operations, storage systems, the central processing unit, and more.

Edited at 2023-12-08 15:47:38

PlotWizard


Computer composition principles

Chapter One Overview

computer system hierarchy

hardware

von Neumann computer

stored program

control-flow driven

Programs and data prepared in advance are loaded into main memory; the computer then executes the instructions one by one

uniprocessor

Instructions and data have the same status

Instructions and data are both binary codes

Centered on the arithmetic unit

modern computer

memory-centric

structure

input device

output device

memory

Main memory (internal storage)

memory bank

Auxiliary storage (external storage)

CPU

MAR (memory address register)

The number of bits corresponds to the number of storage units

Stores the address being accessed

MDR (memory data register)

The number of bits equals the storage word length

Temporarily stores information read from or written to memory

arithmetic unit

ALU (Arithmetic Logic Unit)

combinational logic circuit

ACC (Accumulator)

MQ (Multiplier Quotient Register)

X general purpose register

Temporary storage of operands and intermediate results

IX (index register)

BR (Base Address Register)

PSW (Program Status Word Register/Flag Register)

Store the flag information obtained by the operation, such as: overflow, carry and borrow

controller

PC (Program Counter)

Stores the address of the instruction currently to be executed. After execution, it is automatically incremented by one to form the address of the next instruction.

Its bit width is determined by the number of memory words

IR (Instruction Register)

Stores the current instruction

Content from MDR

Instruction word length

CU (Control Unit)

software

composition

system software

Operating system (OS)

Database management system (DBMS)

language processing system

Network software system

service program

application

Daily use software

language

machine language

Can be executed directly by the computer

binary code

Assembly language

Uses English words and abbreviations as mnemonics

Must be translated into machine language before execution

high-level language

Java, C, C++, etc.

Source program to executable file

multi-level hierarchy

Interpreter: translates and executes statement by statement

Performance

word length

An integer multiple of a byte (1 B = 8 bits)

machine word length

Data path width within the CPU for integer arithmetic

Bit width of the ALU

Number of general register bits

Instruction word length

The number of binary code bits in an instruction word

Storage word length

Binary code length of one memory unit

data path bandwidth

The number of bits transmitted in parallel on the data bus at one time

Main memory capacity

calculating speed

Throughput

Number of requests processed per unit time

Main frequency (clock frequency)

CPU clock frequency

Usually measured in Hz

1 Hz means one cycle per second

1 GHz = 10⁹ Hz

Number of clock cycles per second

CPU clock cycle

Reciprocal of the main frequency

1 / main frequency

Duration of one clock cycle, in seconds

CPI

The number of clock cycles it takes to execute an instruction

How many clock cycles does it take to execute an instruction?

related to three factors

Instruction Set

Programming (system structure)

Computer organization (architecture)

CPU execution time

(number of instructions × CPI) / main frequency

MIPS

How many million (M) instructions are executed per second?

Main frequency / (CPI×10⁶)
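The CPU execution time and MIPS formulas above can be checked with a short calculation. This is a minimal sketch; the function names and the sample workload (2 million instructions, CPI of 2.5, 1 GHz clock) are made up for illustration.

```python
def cpu_time(instruction_count, cpi, clock_hz):
    """CPU execution time in seconds = (number of instructions x CPI) / main frequency."""
    return instruction_count * cpi / clock_hz


def mips(clock_hz, cpi):
    """Millions of instructions per second = main frequency / (CPI x 10^6)."""
    return clock_hz / (cpi * 1e6)


# Hypothetical workload: 2 million instructions, CPI = 2.5, 1 GHz clock.
print(cpu_time(2_000_000, 2.5, 1e9))  # 0.005 (seconds)
print(mips(1e9, 2.5))                 # 400.0
```

Doubling the clock frequency halves the execution time only if the CPI and the instruction count stay the same.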

FLOPS

mflops

How many millions of floating point operations are performed per second?

Number of floating point operations / (execution time × 10⁶)

gflops

billion

10⁹

tflops

trillions

10¹²

pflops

Quadrillions

10¹⁵

zflops

Transparency

Invisible content

For high-level language programmers, the instruction format, data operation process, etc. are invisible

For machine language or assembly language programmers, the instruction format, machine structure, and data format are visible

Programmers cannot view the contents of MAR, MDR, and IR in the CPU

Chapter Two Data operations

Positional number systems

binary

The base is 2

Octal

Carry one at every eight

Tip: one octal digit corresponds to three binary digits

decimal

hexadecimal

0123456789ABCDEF

Tip ①: one hexadecimal digit corresponds to four binary digits

Tip ②: A represents 10

BCD code

8421 code

Four binary digits represent one decimal digit

Correction method for addition:

① If the sum of two digits is less than or equal to decimal 9, no correction is needed.

② If the sum of two digits is greater than or equal to decimal 10, add decimal 6 (0110).
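The add-6 correction for 8421 BCD addition can be sketched as follows. The function name `bcd_add_digit` is hypothetical, and the sketch handles a single decimal digit position with a carry in and out.

```python
def bcd_add_digit(a, b, carry_in=0):
    """Add two 8421 BCD digits (0-9). Returns (result_digit, carry_out)."""
    total = a + b + carry_in
    if total >= 10:
        # Correction: add 6 (0110) to skip the six unused codes 1010..1111.
        total += 6
        return total & 0xF, 1   # keep the low four bits, carry into the next digit
    return total, 0             # 9 or less: no correction needed


print(bcd_add_digit(4, 5))  # (9, 0)
print(bcd_add_digit(8, 5))  # (3, 1), i.e. decimal 13
```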

Excess-3 code

Add 3 (0011) to the 8421 code, so every digit is offset by three

2421 code

From the highest bit to the lowest, the weights are 2, 4, 2, 1. For example: 5 = (1011), because digits of 5 or more set the highest bit

coded representation

Sign-magnitude (original code)

The representation of 0 is not unique (+0 and -0)

With a word length of n+1, the range of signed sign-magnitude numbers is -(2ⁿ-1) ≤ x ≤ 2ⁿ-1

Ones' complement (inverse code)

Positive number: sign-magnitude = ones' complement

Negative number: the sign bit stays unchanged and the magnitude bits are inverted

Two's complement

The representation of 0 is unique

Positive number: sign-magnitude = ones' complement = two's complement

Negative number, quick conversion from sign-magnitude: scanning from right to left, find the first 1, then invert all magnitude bits to the left of that 1 to get the two's complement

With a word length of n+1, the range of signed two's complement numbers is -2ⁿ ≤ x ≤ 2ⁿ-1

Signed integers in C are stored in two's complement form

Biased code (excess code)

The representation of 0 is unique

Used to represent the exponent of a floating-point number

Offset

Invert the sign bit of the two's complement → biased code
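The four encodings can be compared side by side with a small sketch. The function names are hypothetical, the word length is assumed to be 8 bits, and the biased code here uses an offset of 2^(bits-1) = 128, which is what inverting the two's complement sign bit produces (IEEE 754 uses 127 instead).

```python
def sign_magnitude(x, bits=8):
    """Sign-magnitude (original code): sign bit followed by |x|."""
    if abs(x) >= 1 << (bits - 1):
        raise ValueError("magnitude does not fit")
    sign = 1 << (bits - 1) if x < 0 else 0
    return sign | abs(x)


def ones_complement(x, bits=8):
    """Ones' complement (inverse code): invert the magnitude bits of a negative number."""
    code = sign_magnitude(x, bits)
    if x < 0:
        code ^= (1 << (bits - 1)) - 1   # flip everything except the sign bit
    return code


def twos_complement(x, bits=8):
    """Two's complement: x modulo 2**bits."""
    return x & ((1 << bits) - 1)


def biased(x, bits=8):
    """Biased code: two's complement with the sign bit inverted."""
    return twos_complement(x, bits) ^ (1 << (bits - 1))


for name, fn in [("sign-magnitude", sign_magnitude), ("ones' complement", ones_complement),
                 ("two's complement", twos_complement), ("biased", biased)]:
    print(f"{name:17s} {fn(-5):08b}")
# sign-magnitude    10000101
# ones' complement  11111010
# two's complement  11111011
# biased            01111011
```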

Fixed-point shifts

arithmetic shift

For signed numbers

The sign bit remains unchanged when shifting

Positive numbers: vacated bits are filled with 0 in sign-magnitude, ones' complement, and two's complement

Negative numbers: sign-magnitude fills with 0; two's complement fills with 0 on a left shift and 1 on a right shift; ones' complement fills with 1

logical shift

For unsigned numbers

Logical left shift, high bit is lost, low bit is filled with zeros

Logical right shift, low bits are lost, high bits are filled with zeros

Addition and subtraction of fixed point numbers

Two's complement addition and subtraction

Addition: operate directly on the two's complement operands

Subtraction: [A] two's complement + [-B] two's complement

Flags

unsigned number

CF: carry/borrow flag

Determines overflow of unsigned numbers

Carry out of an addition

Borrow in a subtraction

Subtracting a larger number from a smaller one produces a borrow

ZF: zero flag

If the result is 0, then ZF = 1

signed number

ZF: zero flag

SF: sign flag

Indicates the sign of the result

OF overflow flag

Determine overflow of signed numbers

Overflow judgment

one sign bit

two sign bits (modulo-4 two's complement)

Only one sign bit is kept when the value is stored

If the two sign bits are the same, there is no overflow

If the two sign bits differ, overflow has occurred

01: positive overflow

10: negative overflow
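The four flags can be illustrated for an addition with a short sketch. It assumes an 8-bit word and covers addition only; the function name `add_with_flags` is hypothetical. OF is derived with the single-sign-bit rule: overflow occurs when both operands have the same sign and the result has the opposite sign.

```python
def add_with_flags(a, b, bits=8):
    """Add two bit patterns and return (result, flags) for the given word length."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    full = a + b
    result = full & mask
    flags = {
        "CF": int(full > mask),                               # carry out: unsigned overflow
        "ZF": int(result == 0),                               # result is zero
        "SF": int(bool(result & sign)),                       # sign bit of the result
        "OF": int(bool((a ^ result) & (b ^ result) & sign)),  # signed overflow
    }
    return result, flags


print(add_with_flags(0x7F, 0x01))  # (128, {'CF': 0, 'ZF': 0, 'SF': 1, 'OF': 1})
print(add_with_flags(0xFF, 0x01))  # (0, {'CF': 1, 'ZF': 1, 'SF': 0, 'OF': 0})
```

The first call is 127 + 1 as signed numbers (positive overflow); the second is 255 + 1 as unsigned numbers (carry out), which is also -1 + 1 = 0 as signed numbers with no overflow.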

Multiplication of fixed-point numbers

Sign-magnitude one-bit multiplication

Hand calculation

Similar to decimal, calculate directly

computer calculation

Calculate A×B

Illustration

The multiplicand A is stored in the X general register

Store multiplier B in MQ

The calculation results of each round are stored in ACC

process

The multiplicand and multiplier take part in the operation as absolute values; the sign bit is handled separately (XOR of the two signs)

① When is the number in ACC added to the number in general register X? Check the lowest bit of MQ. If it is 1, ACC and X are added and the result is placed in ACC

(ACC) + (X) → ACC

② After the new result is stored in ACC, ACC and MQ are logically shifted right together: the high bit of ACC is filled with 0, the bit shifted out of the low end of ACC moves into the high bit of MQ, and the low bit of MQ is discarded
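The add-and-shift loop can be simulated on integers. This is a sketch under stated assumptions: magnitudes fit in n bits, ACC is modeled as an unbounded Python integer instead of a fixed-width register, and the function name is hypothetical.

```python
def sign_magnitude_multiply(a, b, n=4):
    """Sign-magnitude one-bit multiplication of integers with n-bit magnitudes."""
    negative = (a < 0) ^ (b < 0)       # sign handled separately by XOR
    x, mq, acc = abs(a), abs(b), 0     # X: multiplicand, MQ: multiplier, ACC: partial product
    for _ in range(n):
        if mq & 1:                     # lowest bit of MQ is 1: (ACC) + (X) -> ACC
            acc += x
        # Logical right shift of ACC and MQ together:
        # the low bit of ACC enters the high bit of MQ, the low bit of MQ is dropped.
        mq = (mq >> 1) | ((acc & 1) << (n - 1))
        acc >>= 1
    product = (acc << n) | mq          # high half in ACC, low half in MQ
    return -product if negative else product


print(sign_magnitude_multiply(13, -11))  # -143
```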

Two's complement one-bit multiplication (Booth's algorithm)

Hand calculation

computer calculation

Auxiliary bit - MQ lowest bit = 1: (ACC) + [x] two's complement → ACC

Auxiliary bit - MQ lowest bit = 0: (ACC) + 0 → ACC

Auxiliary bit - MQ lowest bit = -1: (ACC) + [-x] two's complement → ACC

process

Two's complement one-bit multiplication uses two sign bits

Perform n rounds of add-and-shift, then one final addition without a shift

Each shift is an arithmetic right shift

The sign bit participates in the operation

The auxiliary bit sits after the lowest bit of MQ and is initially 0. The bit shifted out of MQ on each right shift replaces the previous value of the auxiliary bit

The add-and-shift procedure is otherwise the same as in sign-magnitude one-bit multiplication
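The Booth procedure (n rounds of add and arithmetic right shift, then one final add) can be sketched on integers. Assumptions: operands lie in -2ⁿ..2ⁿ-1 (n+1 bits including the sign), ACC is an unbounded Python integer so the two sign bits are not modeled explicitly, and the function name is hypothetical.

```python
def booth_multiply(a, b, n=4):
    """Two's complement one-bit (Booth) multiplication of (n+1)-bit signed integers."""
    acc = 0
    mq = b & ((1 << (n + 1)) - 1)   # multiplier as an (n+1)-bit two's complement pattern
    aux = 0                         # auxiliary bit after the lowest bit of MQ

    def add_step(acc):
        diff = aux - (mq & 1)       # auxiliary bit minus lowest bit of MQ
        if diff == 1:
            return acc + a          # (ACC) + [x]
        if diff == -1:
            return acc - a          # (ACC) + [-x]
        return acc                  # (ACC) + 0

    for _ in range(n):
        acc = add_step(acc)
        # Arithmetic right shift of ACC, MQ and the auxiliary bit as one unit.
        aux = mq & 1
        mq = (mq >> 1) | ((acc & 1) << n)
        acc >>= 1                   # Python's >> on a negative int is an arithmetic shift
    acc = add_step(acc)             # final addition, no shift
    return (acc << n) + (mq >> 1)   # the lowest MQ bit is the old sign bit, not product data


print(booth_multiply(-3, 5))  # -15
```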

Fixed point division

Sign-magnitude division (restoring remainder method)

a÷b

ACC stores the dividend a and later the remainder

MQ stores the quotient (initially all 0)

X stores the divisor b

process

First assume a quotient bit of 1; if that turns out to be wrong, change it to 0 and restore the remainder

Quotient bit 1: perform (ACC) + [-|b|] two's complement → ACC

If the sign bit of the result is 1, the quotient bit is wrong: change it to 0 and restore the remainder with (ACC) + [|b|] two's complement → ACC

After the addition yields the new remainder, ACC and MQ are logically shifted left together, and the low bit of MQ is filled with 0

Repeat the above operation
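The restoring procedure can be simulated with integers that stand for fixed-point fractions scaled by 2ⁿ. This is a sketch under stated assumptions: |a| < |b| as required for fractional division, registers are unbounded Python integers, and the function name is hypothetical.

```python
def restoring_divide(a, b, n=4):
    """Sign-magnitude restoring division.

    a and b are integers representing fractions a/2**n and b/2**n with |a| < |b|.
    Returns (quotient, remainder) where quotient is scaled by 2**n and
    |a| * 2**n == |quotient| * |b| + remainder.
    """
    if b == 0 or abs(a) >= abs(b):
        raise ValueError("requires 0 <= |a| < |b|")
    negative = (a < 0) ^ (b < 0)        # quotient sign handled separately
    acc, x, mq = abs(a), abs(b), 0      # ACC: remainder, X: divisor, MQ: quotient
    for step in range(n + 1):
        acc -= x                        # assume quotient bit 1: (ACC) + [-|b|]
        if acc < 0:
            acc += x                    # wrong guess: restore the remainder, bit is 0
        else:
            mq |= 1                     # guess was right: quotient bit is 1
        if step < n:                    # shift ACC and MQ left, except after the last step
            acc <<= 1
            mq <<= 1
    return (-mq if negative else mq), acc


print(restoring_divide(11, 13))  # (13, 7): 0.1011 / 0.1101 = 0.1101, remainder 0.0111 x 2^-4
```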

Sign-magnitude division (alternating add-subtract method, non-restoring)

At the start: [|a|] two's complement + [-|b|] two's complement gives the new remainder

If the remainder is negative, the quotient bit is 0; shift the remainder left logically, then add [|b|] two's complement

If the remainder is positive, the quotient bit is 1; shift the remainder left logically, then add [-|b|] two's complement

Last step: if the remainder is negative, the quotient bit is 0, and [|b|] two's complement is added once more to get the correct remainder

Two's complement division (alternating add-subtract method)

The sign bit participates in the operation; two sign bits are used

At the start: check whether the dividend and divisor have the same sign. If they do, compute dividend - divisor; if not, compute dividend + divisor

Subsequent steps: if the remainder and divisor have the same sign, the quotient bit is 1, the remainder and quotient (ACC, MQ) shift left, and the divisor is subtracted; if they have different signs, the quotient bit is 0, the remainder and quotient shift left, and the divisor is added

Data storage arrangement

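big-endian storage

The most significant byte is stored at the lowest address, matching the order in which people read numbers

little-endian storage

The opposite: the least significant byte is stored at the lowest address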
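The two byte orders can be seen by packing the same 32-bit value with Python's standard `struct` module. The value 0x01234567 is an arbitrary example.

```python
import struct

value = 0x01234567

big = struct.pack(">I", value)     # ">" selects big-endian, "I" is a 4-byte unsigned int
little = struct.pack("<I", value)  # "<" selects little-endian

print(big.hex())     # 01234567  (most significant byte first)
print(little.hex())  # 67452301  (least significant byte first)
```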

Boundary alignment

integer type

int type

4B 32bit

long

4B 32bit

short type

2B 16bit

char type

1B 8bit

floating point number

Representation format

sign

exponent

Represented in biased code

mantissa

Expressed as a sign-magnitude fixed-point fraction

Normalization

Left normalization

Shift the mantissa one position to the left and decrease the exponent by one.

Right normalization

Shift the mantissa one position to the right and increase the exponent by one.

IEEE754

32-bit single precision

Composition

1 sign bit

Exponent (biased code): 8 bits

True exponent = biased exponent - bias (127)

Biased exponent range: 1 to 254

Mantissa: 23 bits

Bias: 127

conversion process

① Sign

② Mantissa

③ True value of the exponent

④ Biased exponent = true exponent + bias

⑤ Shift the mantissa according to the true value of the exponent

example

For normalized numbers, the exponent field cannot be all 0s or all 1s
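The single-precision fields can be extracted with Python's standard `struct` module. This is a minimal sketch; the function names are hypothetical, and `normalized_value` handles normalized numbers only (biased exponent 1 to 254), where the mantissa has an implicit leading 1.

```python
import struct


def decode_float32(x):
    """Split a value into its IEEE 754 single-precision fields."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                 # 1 sign bit
    exponent = (bits >> 23) & 0xFF    # 8-bit biased exponent
    fraction = bits & 0x7FFFFF        # 23-bit mantissa field
    return sign, exponent, fraction


def normalized_value(sign, exponent, fraction):
    """Rebuild the value of a normalized number from its three fields."""
    if not 1 <= exponent <= 254:
        raise ValueError("not a normalized number")
    true_exponent = exponent - 127    # true exponent = biased exponent - bias
    return (-1) ** sign * (1 + fraction / 2 ** 23) * 2.0 ** true_exponent


fields = decode_float32(-0.75)        # -0.75 = -1.1 (binary) x 2^-1
print(fields)                         # (1, 126, 4194304)
print(normalized_value(*fields))      # -0.75
```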

64-bit double precision

1 sign bit

Exponent: 11 bits

Mantissa: 52 bits

Bias: 1023

Floating point addition and subtraction

① Exponent alignment

The smaller exponent is aligned to the larger one

The mantissa of the operand with the smaller exponent is shifted right one position for every 1 added to its exponent

② Sum of mantissas

③ Normalization

Right normalization

When the mantissa has the form 1x.xx, normalize right

Shift the mantissa right and add one to the exponent

Left normalization

When the mantissa has the form 0.0x, normalize left

Shift the mantissa left and decrease the exponent by 1

Normalized number

In sign-magnitude, the highest mantissa bit is 1; in two's complement, the highest mantissa bit differs from the sign bit

Two's complement

Normalized forms with a single sign bit

0.1xxx 1.0xxx

Normalized forms with two sign bits

00.1xxx 11.0xxx

④ Rounding

Rounding may be needed during exponent alignment or right normalization

Drop-0, carry-1 method

Similar to decimal rounding half up

Always-set-1 method

Regardless of whether the highest discarded bit is 1 or 0, set the last bit of the right-shifted mantissa to 1.

⑤ Overflow judgment

For positive underflow and negative underflow, the computer treats the result as 0

Type conversion

char→int→long→double

float→double

Chapter Three Storage System

Memory classification

Classified by level

cache

main memory

auxiliary storage

Classified by storage medium

magnetic surface

tape

disk

magnetic core

semiconductor

Classified by access method

RAM - random access memory

ROM - read only memory

Modern ROMs can be erased electrically

serial access memory

Classified by information retention

Volatile (contents lost on power-off)

RAM

non-volatile

ROM

magnetic surface

Memory performance metrics

Storage capacity

Number of words stored × word length

Storage speed

Access time

The time from the start of an access to its completion

Storage cycle

Access time + recovery time

main memory bandwidth

data transfer rate

The maximum amount of information transferred into or out of main memory per second

multi-level storage system

CPU-cache-main memory-auxiliary memory

Data transfer between the CPU and the cache is done by hardware (invisible)

Data transfer between main memory and auxiliary memory is done by hardware and the operating system (invisible to application programmers)

main memory

random access memory

RAM

cache is implemented by SRAM

SRAM

static random access memory

bistable flip-flop

non-destructive readout

High cost, fast speed, low integration

Main memory is implemented by DRAM

DRAM

dynamic random access memory

Gate capacitance

Must be refreshed periodically

Centralized refresh

Refreshes all rows in one fixed block of time

Has dead time

Distributed refresh

Spreads the row refreshes across the individual access cycles

No dead time

Asynchronous refresh

Refresh period divided by the number of rows: one row is refreshed at each such interval

Maximum refresh interval: 2 ms

Dead time still exists

Low cost, slow speed, high integration

Address pin multiplexing

ROM

Features

Simple structure

non-volatile

Classification

MROM: mask read-only memory

Content cannot be changed

PROM: one-time programmable read-only memory

Once written, it cannot be changed

EPROM: erasable programmable read-only memory

Rewritable, but with a limited number of programming cycles and a long write time

Flash memory

Long-term storage of information

Fast online erase and rewrite

CD-ROM

Read-only optical disc

SSD: solid-state drive

Long-term storage of information

Fast erase and rewrite

parallel memory

dual port memory

spatial parallelism

multi-body parallel memory

time parallelism

High-order interleaving (sequential mode)

serial access

sequential memory

Low-order interleaving (interleaved addressing mode)

The low-order address bits select the module; the high-order bits are the address within the module

Pipelined access

Let the access cycle for one word be T and the bus transfer cycle be r

Number of interleaved modules ≥ T/r

The time required to access m words continuously is T + (m-1)r
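The timing of low-order interleaving can be compared with sequential access in a few lines. The values T = 200 ns and r = 50 ns are made-up example figures, the function names are hypothetical, and sequential (high-order) access is assumed to cost m × T.

```python
import math


def min_modules(T, r):
    """Minimum number of interleaved modules so the pipeline never stalls: at least T / r."""
    return math.ceil(T / r)


def interleaved_time(m, T, r):
    """Time to read m consecutive words with low-order interleaving: T + (m - 1) * r."""
    return T + (m - 1) * r


def sequential_time(m, T):
    """Time to read m consecutive words one after another from a single module: m * T."""
    return m * T


T, r = 200, 50                      # hypothetical access cycle and bus cycle, in ns
print(min_modules(T, r))            # 4
print(interleaved_time(4, T, r))    # 350
print(sequential_time(4, T))        # 800
```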

Main memory and CPU connection

Data lines are bidirectional; address lines are unidirectional

Capacity expansion

word expansion

A 2-to-4 decoder generates the chip-select signals

bit extension

The chip-select signal CS is connected to all chips

Memory to CPU connection

High-order address bits drive chip select; low-order bits are the in-chip address lines

① Address lines

② Data lines

③ Read/write command line

④ Chip-select lines

External storage

disk

Minimum read/write unit: one sector

Features

① Low cost and large capacity

②Long-term storage

③Non-destructive readout

Classification

disk storage

storage area

Magnetic head (number of recording sides)

One recording surface corresponds to one magnetic head

cylinder

How many tracks are there on each platter?

sector

How many sectors are there on each track?

Located on the same sector → all data can be read out in one memory access

Performance

average access time

seek time

The time it takes for the head to move to the destination track

Rotational latency

The time it takes for the target sector to rotate under the magnetic head

Transmission time

Time to transfer data

Average rotational latency

Time of one rotation / 2

data transfer rate

revolutions per second × capacity of each track (in bits)

Read and write operations are serial

Disk Array

RAID0

No redundancy, no checksum

RAID1

Mirror disk array

mutual backup

RAID2

Error correcting Hamming code disk array

RAID3

bit-interleaved parity

RAID4

block-interleaved parity

RAID5

Block-interleaved parity with no dedicated parity disk

SSD flash memory

Same underlying technology as a USB flash drive

Random writes are slow

Subject to wear

Cache

working principle

The CPU issues a memory access request. If the data at that main memory physical address is in the cache, it is a hit and the cache is read directly. On a miss, main memory must still be accessed and the block is loaded from main memory into the cache.

Cache block length, also known as line length

Invisible to all programmers

Mapping method of cache and main memory

direct mapping

The lowest hit rate and the shortest time required

set associative mapping

The tag and set number together form the main memory block number

If each set has r cache lines, the cache is called r-way set associative (r lines form one set)

The number of comparators in the cache is also r

Cache capacity (bits) = number of lines × (data bits per line + valid bit + dirty bit + replacement control bits + tag bits)

The data in each line is counted in bits
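The capacity formula can be turned into a small calculator. This is a sketch under stated assumptions: byte-addressed main memory, power-of-two sizes, one valid bit per line, and a hypothetical function name and parameter set. The example figures (8 lines, 64-byte blocks, 28-bit addresses, direct mapped, write-back) are made up.

```python
def cache_total_bits(num_lines, block_bytes, addr_bits, ways=1,
                     dirty_bit=False, replace_bits=0):
    """Total cache size in bits, including the control fields of every line."""
    num_sets = num_lines // ways
    offset_bits = block_bytes.bit_length() - 1     # log2(block size in bytes)
    index_bits = num_sets.bit_length() - 1         # log2(number of sets)
    tag_bits = addr_bits - index_bits - offset_bits
    per_line = (block_bytes * 8                    # data bits per line
                + 1                                # valid bit
                + (1 if dirty_bit else 0)          # dirty bit (write-back only)
                + replace_bits                     # replacement control bits
                + tag_bits)                        # tag bits
    return num_lines * per_line


# Offset = 6 bits, index = 3 bits, tag = 19 bits, so each line holds 512 + 1 + 1 + 19 bits.
print(cache_total_bits(8, 64, 28, ways=1, dirty_bit=True))  # 4264
```

Direct mapping corresponds to `ways=1`; a fully associative cache corresponds to `ways=num_lines`, which leaves no index bits and the longest tag.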

Dirty bit (consistency maintenance)

Used with the write-back method

Replacement control bits

Needed when a replacement algorithm is used; not needed with a random replacement strategy

replacement algorithm

① Random algorithm

② First in, first out algorithm

Replace the oldest line first

③ Least recently used (LRU) algorithm, based on the principle of locality

Replace the line that has gone unaccessed the longest

④ Least frequently used (LFU) algorithm
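LRU replacement can be simulated with an ordered dictionary. This sketch models a fully associative cache (or a single set) of `num_lines` lines; the function name and the access sequence are made up for illustration.

```python
from collections import OrderedDict


def lru_hits(accesses, num_lines):
    """Count hits for a sequence of block numbers under LRU replacement."""
    cache = OrderedDict()                  # ordered from least to most recently used
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            if len(cache) >= num_lines:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return hits


print(lru_hits([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], num_lines=4))  # 4
```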

fully associative mapping

The highest hit rate and the longest time required

cache write strategy

①

Write-through method

High safety

Write hit

Data must be written to main memory and the cache at the same time. When a block needs to be replaced, it is simply overwritten by the new block

write buffer

Reduces the time lost by writing directly to main memory

The CPU writes data to the cache and the write buffer at the same time; the write buffer then writes the contents to main memory

No-write-allocate method

Write miss

Write only to main memory; the block is not loaded into the cache

②

Write-back method

Suited to write-intensive access

Write hit

Data is written only to the cache, not to main memory immediately; it is written to main memory only when the block is swapped out

Write-allocate method

Write miss

Load the main memory block into the cache, and then update the cache block

Disadvantage: every miss requires reading a block from main memory.

virtual memory

The logical address (virtual address) is used for user programming; the address of a main memory unit is the physical address (real address)

Virtual address = virtual page number + address within the page

Real address = main memory page number + address within the page

The CPU issues a virtual address, and auxiliary hardware looks up the mapping between virtual and real addresses. If the page is in main memory, the address is translated and the access proceeds. If it is not in main memory, the page is brought into main memory and then accessed by the CPU. If main memory is full, a replacement algorithm chooses a page to evict
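Address translation in a paged system can be sketched with a dictionary standing in for the page table. The page size of 4 KB, the table contents, and the function name are assumptions for illustration; a real system would bring the page in on a fault instead of raising an error.

```python
PAGE_SIZE = 4096  # assumed page size: 4 KB, so the in-page address is the low 12 bits


def translate(virtual_address, page_table):
    """Translate a virtual address to a real address using a page table dict."""
    virtual_page, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table.get(virtual_page)
    if frame is None:
        # Page not in main memory: the operating system would load it here.
        raise LookupError(f"page fault on virtual page {virtual_page}")
    return frame * PAGE_SIZE + offset   # main memory page number + address within the page


page_table = {0: 5, 1: 2}               # hypothetical mapping: virtual page -> main memory page
print(hex(translate(0x1234, page_table)))  # 0x2234
```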

A miss (page fault) has a large impact on system performance

Fully associative mapping is used to improve the hit rate

The write-back method is used

Invisible to application programmers, visible to system programmers

Classification

Paged virtual memory

The page is the basic unit

page table

Find the corresponding page table entry based on the virtual page number in the high bits of the virtual address.

TLB (fast table)

Built from associative memory

Content addressing

Uses set-associative or fully associative mapping

TLB tag

Fully associative: the virtual page number

Set-associative: the high bits of the virtual page number

multi-level storage system

The relationship between TLB, page table and cache

The TLB stores a copy of part of the page table

The cache stores a copy of part of main memory

If the TLB hits, the page must be in main memory. If the TLB misses, the page may still be in main memory. If the page is not in main memory, both the TLB and the cache must miss.

Cache misses are handled by hardware. Page faults are handled by software (the operating system's page fault handler). TLB misses can be handled by either hardware or software

Segmented virtual memory

Virtual address = segment number + address within the segment

Segment table records: ① segment start address ② loaded bit ③ segment length

Segmented-paged virtual memory

The page is the basic transfer unit

① Segment number ② Page number within the segment ③ Address within the page

Chapter Four Instruction System

Instruction set architecture ISA

Main content: instruction format, data types and formats, operand storage method, the number and numbering of program-accessible registers, memory space size and addressing method, addressing modes, instruction execution control, etc.

Instruction format

Operation code OP, address code A

The instruction word length is an integer multiple of bytes

Classification

zero-address instruction

Only the opcode OP

Instructions that need no operand: no-op, halt, disable interrupts. Arithmetic instructions in stack computers (the operands come from the top of the stack and the element below it)

one-address instruction

Single-operand instruction with only a destination operand

The result is saved back to the original address

Two-operand instruction with an implicit second operand

The other operand is supplied implicitly by ACC, and the operation result is stored in ACC

Two address instructions

Give the destination operand and source operand

The result is saved to the destination operand address

Three address instructions

Give the destination operand, source operand and result

Access memory 4 times: fetch instruction once, fetch operand 2 times, store result once

Four address instructions

op, destination operand, source operand, result, next address

The operation code is 8 bits, and the 4 address codes are 6 bits each.

Extended opcode instruction format

① A short opcode must not be a prefix of any long opcode

② No two instructions may share the same opcode.

Extension mode

All 1s are reserved for the next address instruction expansion.

Operation type

① Data transmission

② Arithmetic and logical operations

③ Shift operation

④ Branch (transfer) operation

unconditional branch

Executed in every case

conditional branch

Executed only under specific conditions

The difference between branch instructions and call instructions

A call instruction saves the address of the next instruction (the return address); a branch instruction does not return

⑤ Input and output operations

Program control instructions

unconditional branch

conditional branch

subroutine call

Return instruction

Loop instructions

Privileged instructions

For operating systems and system software

Not available to ordinary users

Instruction addressing mode

instruction addressing

sequential addressing

(PC) + 1 → PC forms the address of the next instruction

jump addressing

Implemented through branch instructions

The current instruction modifies the PC value, and the next instruction address is still given by the PC

Data addressing

Classification

implicit addressing

The other operand of a one-address instruction can be obtained implicitly from ACC

program specification

immediate addressing

The address field directly gives the operand itself, using two's complement representation.

Convenient

Direct addressing

EA=A

The number of bits in A determines the addressing range of the operand

Reduced command length

1 memory access

Indirect addressing

EA = (A)

One level of indirection requires two memory accesses

Facilitates expansion of the addressing range

Register addressing

EA = Ri

The register number is given directly

The execution phase does not access main memory, only registers

The address field is short

Register indirect addressing

EA = (Ri)

Ri holds not the operand but the address of the main memory unit where the operand is located

Requires a memory access

The operand is in main memory

relative addressing

EA = (PC) + A

A is the displacement relative to the current PC value, expressed in two's complement

Used for branch instructions

1 memory access

base addressing

EA = (BR) + A

Used for multiprogramming; BR is the base address register (visible)

1 memory access

indexed addressing

Effective address = formal address A + contents of index register IX, i.e. EA = (IX) + A

User-oriented

Suited to array processing

1 memory access

Stack addressing

last in first out
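The effective-address rules for the main data addressing modes can be collected in one small function. Everything here is illustrative: the memory contents, register values, mode names, and the function name are made up, and memory is modeled as a dictionary from address to stored word.

```python
# Hypothetical machine state for the example.
memory = {100: 300, 300: 42}
regs = {"PC": 2000, "BR": 5000, "IX": 8, "R1": 300}


def effective_address(mode, a=0, reg=None):
    """Return the effective address EA for formal address a under the given mode."""
    if mode == "direct":
        return a                  # EA = A
    if mode == "indirect":
        return memory[a]          # EA = (A): one extra memory access
    if mode == "register_indirect":
        return regs[reg]          # EA = (Ri)
    if mode == "relative":
        return regs["PC"] + a     # EA = (PC) + A, A may be negative
    if mode == "base":
        return regs["BR"] + a     # EA = (BR) + A
    if mode == "indexed":
        return regs["IX"] + a     # EA = (IX) + A
    raise ValueError(f"unknown mode: {mode}")


print(effective_address("direct", 100))                  # 100
print(effective_address("indirect", 100))                # 300
print(effective_address("register_indirect", reg="R1"))  # 300
print(effective_address("relative", -4))                 # 1996
print(effective_address("base", 16))                     # 5016
print(effective_address("indexed", 100))                 # 108
```

Immediate and register addressing are left out because the operand comes from the instruction or a register, so no memory address is formed.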

Machine level code representation

Assembly format

AT&T format

The first operand is the source and the second is the destination; data flows from left to right

Registers are prefixed with %, and immediates are prefixed with $

Memory addressing uses ( )

Intel format

The first operand is the destination and the second is the source; data flows from right to left

Registers and immediates need no prefix

Memory addressing uses [ ]

Compared

Common instructions

Intel format

Read from right to left

Data transfer class

mov instruction

Copies a value from one location to another

e.g. mov eax, ebx: copy the value of ebx to eax

push instruction

Push onto the stack

pop instruction

Pop from the stack

Arithmetic and logical operations

add/sub instructions

Addition and subtraction

The result is saved to the first operand

e.g. sub eax, 10: eax - 10 → eax

inc/dec instructions (increment/decrement)

Add 1 or subtract 1

imul instruction (multiplication)

Signed multiplication

The result is stored in the first operand, which must be a register

e.g. imul eax, [var]: eax × [var] → eax

e.g. imul esi, edx, 25: 25 × edx → esi

idiv instruction (division)

Signed division

There is only one operand, the divisor

e.g. idiv ebx

and/or/xor instructions

AND, OR, XOR

The result is placed in the first operand

e.g. and eax, 0fH: the upper 28 bits of eax are cleared and the lower 4 bits remain unchanged

not instruction

Bitwise inversion

0→1, 1→0

neg instruction (negate)

Take the negative

e.g. neg eax: -eax → eax

shl/shr instructions (shift)

Logical shift left, logical shift right

The first operand is the value to shift; the second is the shift count

e.g. shl eax, 1: eax is logically shifted left by 1 bit

e.g. shr ebx, cl: ebx is logically shifted right by n bits (n is the value in cl)

control flow class

jmp instruction

Unconditional branch instruction

jcondition instruction

Conditional branch instructions

cmp/test instructions

cmp compares two values; test performs a bitwise AND on its operands

call/ret instructions

Subroutine call and return

CISC and RISC

CISC

Complex instruction set computer

RISC

Reduced instruction set computer

Compared

Chapter Five CPU

CPU structure

arithmetic unit

Arithmetic logic unit ALU

Arithmetic and logical operations

Temporary register

Temporarily stores data read from main memory

Accumulator register ACC

Temporarily stores ALU results

General register X

Freely programmable by the user; can store data and addresses

Same width as the machine word length

visible

Program status word register PSW

Keep various status information of the results

visible

shifter

Perform shift operations on operands or operation results

Counter CT

Control the number of steps for multiplication and division operations

controller

program counter PC

Indicates the storage address in main memory of the instruction to be executed.

The number of bits is the same as the number of memory address bits Memory address depends on storage capacity

visible

Instruction register IR

Holds the instruction currently being executed

The same length as the instruction word length

cannot be replaced by general purpose registers

instruction decoder

Decode the opcode

Invisible

memory address register MAR

Store the address of the main memory unit

Invisible

Memory Data Register MDR

Store information written to or read from main memory

Invisible

timing system

Used to generate various timing signals

Signal generator

CPU function

The controller coordinates and controls all components of the computer as they carry out the program's instruction sequence (fetch, analyze, execute). The arithmetic unit processes data

① Instruction control, completing the operations of fetching, analyzing, and executing instructions

② Operation control

③ Time control

④ Data processing

⑤ Interrupt processing

Instruction execution process

instruction cycle

The time it takes to fetch and execute an instruction

Represented by several machine cycles

machine cycle

Fixed length

Variable length

Instruction fetch, indirect addressing, execution, interrupt

fetch cycle

According to the PC content, the instruction code is retrieved from main memory and placed in the IR

The PC holds the address of the instruction. The instruction is fetched from the memory unit at that address and placed in the IR

(PC) + 1 → PC while fetching

Instructions are fetched automatically by the machine

indirect addressing cycle

Gets the effective address of the operand

The address field of the instruction is sent to MAR and onto the address bus. The CU issues a read command to obtain the effective address, which is finally stored in MDR

Two memory accesses

execution cycle

Fetch the operand and produce the result through an ALU operation, according to the opcode of the instruction in IR

interrupt cycle

Handles interrupt requests

Instruction execution schemes

single instruction cycle

multiple instruction cycles

Pipelined scheme

data path

Function

Data transmission path among functional components

Describes where the information starts, which register or multiplexer it passes through, and which register it is finally transmitted to.

controlled by control unit

basic structure

CPU internal single bus mode

CPU internal multi-bus mode

The input and output ports of all registers are connected to multiple common paths

Dedicated data path approach

Instruction fetch

(PC)→MAR, 1→R, MEM(MAR)→MDR, (MDR)→IR

(MDR)→MAR, 1→R, MEM(MAR)→MDR, (MDR)→Y, (ACC)+(Y)→Z, (Z)→ACC

controller

hardwired controller

control unit

The opcode field of the instruction is the input signal to the control unit

FR (Flag Register)

Feedback information from the execution unit

Timing

Clock-step (beat) generator

Generates machine-cycle signals and beat (clock-step) signals

The control unit also accepts control signals from the system bus, such as interrupts, DMA

Micro operations

Multiple micro-operations can be completed in one machine cycle

fetch

(PC)→MAR; 1→R; M(MAR)→MDR; (MDR)→IR; (PC)+1→PC

indirect address

Ad(IR)→MAR; 1→R; M(MAR)→MDR

Execute

Non-memory-access instructions

CLA

Clear ACC

COM

Complement ACC

SHR

arithmetic right shift

CSL

Rotate (circular shift) left

STP

Halt

Memory-access instructions

Addition: Ad(IR)→MAR; M(MAR)→MDR; (ACC)+(MDR)→ACC
STA X (store): Ad(IR)→MAR; 1→W; (ACC)→MDR; (MDR)→M(MAR)
LDA X (load): Ad(IR)→MAR; 1→R; M(MAR)→MDR; (MDR)→ACC

Branch instructions

JMP X (unconditional branch): Ad(IR)→PC
BAN X (conditional branch)
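The fetch and execute micro-operations above can be traced with a small simulation. This is a minimal sketch of a hypothetical single-accumulator machine: the tuple encoding of instructions, the `run` function, and the memory layout are assumptions made for illustration, not part of the original notes.

```python
def run(memory, pc=0, max_steps=100):
    """Simulate a hypothetical accumulator machine.

    memory is a list whose cells hold either data (ints) or
    instructions encoded as (opcode, address) tuples.
    """
    acc = 0
    trace = []
    for _ in range(max_steps):
        # Fetch cycle: (PC)->MAR; 1->R; M(MAR)->MDR; (MDR)->IR; (PC)+1->PC
        mar = pc
        mdr = memory[mar]
        ir = mdr
        pc = pc + 1
        op, addr = ir  # OP(IR) drives the control unit, Ad(IR) is the address field
        trace.append(op)

        # Execute cycle
        if op == "CLA":            # clear ACC
            acc = 0
        elif op == "LDA":          # Ad(IR)->MAR; 1->R; M(MAR)->MDR; (MDR)->ACC
            mar = addr
            mdr = memory[mar]
            acc = mdr
        elif op == "ADD":          # Ad(IR)->MAR; M(MAR)->MDR; (ACC)+(MDR)->ACC
            mar = addr
            mdr = memory[mar]
            acc = acc + mdr
        elif op == "STA":          # Ad(IR)->MAR; 1->W; (ACC)->MDR; (MDR)->M(MAR)
            mar = addr
            mdr = acc
            memory[mar] = mdr
        elif op == "JMP":          # Ad(IR)->PC
            pc = addr
        elif op == "STP":          # halt
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return acc, memory, trace


# Program: load M[5], add M[6], store the sum in M[7], then halt.
program = [("LDA", 5), ("ADD", 6), ("STA", 7), ("STP", None), 0, 2, 3, 0]
acc, mem, trace = run(list(program))
```

Each branch of the `if` chain mirrors one micro-operation sequence, so the register transfers can be followed one step at a time.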

control method

Synchronous control method

unified clock

Asynchronous control mode

Each component works at its own inherent speed

Combined control method

Most operations use synchronous control; a few use asynchronous control

microprogrammed controller

Uses stored logic: micro-operation signals are encoded as microinstructions

Each machine instruction is implemented as a microprogram, and each microprogram contains multiple microinstructions. A machine instruction can be decomposed into a sequence of micro-operations (the most basic operations, which cannot be divided further)

Normally, one microprogram cycle corresponds to one instruction cycle

It is to ensure the synchronization of the entire machine control signal.

Microinstructions are stored in the control storage unit

Microcommands are the control signals of microoperations, and microoperations are the execution processes of microcommands.

Main memory is used to store data and programs, and is implemented outside the CPU using RAM. The control memory CM is used to store microprograms. It is implemented with ROM (EPROM) inside the CPU and is accessed according to the address of the microinstruction.

structure

work process

fetch microinstructions

From the opcode field of the machine instruction, the micro-address forming component generates the microprogram entry address corresponding to the machine instruction and sends it to CMAR.

Fetch the corresponding microinstructions one by one from the CM and execute them

Return to the microprogram entry address and repeat the operation

Encoding

direct control

Field direct encoding method

The microcommand field is divided into several subfields. Mutually exclusive microcommands are placed in the same subfield, and compatible microcommands are placed in different subfields.

All 0 means no operation

Field indirect encoding method

The microcommands of one field are interpreted with the help of another field, which shortens the microinstruction word further
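The saving from field direct encoding can be estimated with a short calculation. The sketch below assumes that each field reserves the all-zero code for "no operation", so a field holding n mutually exclusive microcommands needs ⌈log₂(n+1)⌉ bits; the function names and the example grouping are hypothetical.

```python
def field_bits(num_microcommands):
    """Bits needed for one field of mutually exclusive microcommands.

    One extra code (all zeros) is reserved for "no operation", so the
    field must distinguish num_microcommands + 1 states. For n >= 1,
    ceil(log2(n + 1)) equals n.bit_length().
    """
    if num_microcommands < 1:
        raise ValueError("a field needs at least one microcommand")
    return num_microcommands.bit_length()


def direct_control_bits(groups):
    """Direct control: one bit per microcommand."""
    return sum(groups)


def field_encoded_bits(groups):
    """Field direct encoding: one encoded field per mutually exclusive group."""
    return sum(field_bits(n) for n in groups)


# Hypothetical grouping: five groups of mutually exclusive microcommands.
groups = [7, 3, 12, 5, 6]
print(direct_control_bits(groups), field_encoded_bits(groups))
```

With this grouping, direct control needs 33 bits while field direct encoding needs 15, at the cost of a decoder per field.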

Microinstruction address formation method

Next-address field (decision) method

Given by the next-address field of the microinstruction

Formed from the opcode of the machine instruction

Incrementing counter method

Flag (branch) method

Test-network method

Hardware method

Microinstruction format

horizontal microinstructions

Strong parallel operation capability

Short execution time

vertical microinstructions

Only one basic operation per microinstruction, so little parallelism

long execution time

hybrid microinstructions

Compared

Exceptions and interrupts

Exceptions

Unexpected events generated inside the CPU are called exceptions (internal interrupts)

Examples: hardware fault interrupts (memory check error, bus error) and programmatic exceptions (division by 0, overflow, breakpoint, single-step tracing, illegal instruction, stack overflow, address out of bounds, page fault)

Internal interrupts cannot be masked

Exception detection is performed by the CPU itself

Classification

Programmatic exceptions (software interrupts)

Fault

Segment fault (missing segment)

Page fault (missing page)

Load the required segments or pages from disk into main memory and return to the failed instruction to continue execution.

Illegal opcode

Divisor is 0

These faults cannot be recovered by the exception handler (the handler is still invoked); execution cannot return to the breakpoint, and the process is terminated directly.

Trap (self-trap)

Used, for example, for breakpoints set during program debugging. When a trap instruction is executed, the operating system kernel is invoked automatically, either unconditionally or conditionally

Abort (hardware interrupt)

Controller error, memory check error, etc.

Call the interrupt service routine and restart the system

External interrupt (hardware interrupt)

An interrupt request issued to the CPU by a device outside the CPU is called an interrupt (external interrupt)

Examples: IO interrupts issued by IO devices (keyboard input, printer out of paper) and special events (the user presses the Esc key, a timer expires)

The CPU must obtain interrupt source information through the interrupt request line

Classification

Maskable interrupt (low priority)

The interrupt request is issued to the CPU through the maskable interrupt request line INTR. The CPU can mask or unmask it by setting the corresponding mask word in the interrupt controller. Masked interrupt requests are not sent to the CPU

Non-maskable interrupt (high priority)

For example: power outage

Interrupt request issued to the CPU through the dedicated non-maskable interrupt request line NMI

response process

The entire response process cannot be interrupted

Turn off interrupts

Disable corresponding new interrupts by setting the interrupt enable bit IF flip-flop.

Save breakpoints and program state

In order to be able to return to the interrupted place to continue execution after exception and interrupt processing is completed.

Identify exceptions and interrupts and go to the appropriate program

Identification method

Software identification

The CPU sets an exception status register to record the cause of the exception. Query the exception status register in priority order and then go to the corresponding handler in the kernel

Hardware identification

vectored interrupt

The entry address of an exception or interrupt handler is called an interrupt vector. All interrupt vectors are placed in the interrupt vector table. Each interrupt or exception corresponds to an interrupt type number.

instruction pipeline

definition

An instruction is divided into multiple stages, each stage is completed by the corresponding functional component

Requirements

Instructions should all have the same length
The instruction format should be regular
Only load/store instructions access memory; no other instructions can
Data and instructions are aligned in memory

Pipeline hazards

① Structural hazard

Multiple instructions compete for the same resource at the same time

Solutions:
1) Stall the subsequent instructions
2) Provide separate data memory and instruction memory

② Data hazard

The next instruction will use the calculation result of the current instruction, and the two instructions will cause data conflict.

Read after write (RAW)

Write after read (WAR)

Write after write (WAW)

Solutions:
1) Stall the conflicting instruction and the instructions after it for one or several cycles
2) Add dedicated data paths (data bypassing, also called forwarding) so that a result is passed on as soon as it is produced
3) Reorder the instructions

③ Control hazard

Branch, call, and return instructions change the PC value and break the flow of the pipeline.

Solutions:
1) Predict the outcome of branch instructions and generate the branch target address as early as possible
2) Prefetch instructions along both control-flow paths, branch taken and branch not taken.
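The three kinds of data hazard (read after write, write after read, write after write) can be told apart by comparing the registers two instructions read and write. The sketch below is a simplified illustration: the `(dest, sources)` representation and the function name `classify_hazards` are assumptions, and a real pipeline would also consider how far apart the instructions are.

```python
def classify_hazards(first, second):
    """Return the data hazards between two instructions in program order.

    Each instruction is a (dest, sources) pair: the register it writes
    (or None) and the set of registers it reads.
    """
    d1, s1 = first
    d2, s2 = second
    hazards = []
    if d1 is not None and d1 in s2:
        hazards.append("RAW")  # second reads what first writes
    if d2 is not None and d2 in s1:
        hazards.append("WAR")  # second writes what first reads
    if d1 is not None and d1 == d2:
        hazards.append("WAW")  # both write the same register
    return hazards


# ADD R1 <- R2, R3 followed by SUB R4 <- R1, R5
print(classify_hazards(("R1", {"R2", "R3"}), ("R4", {"R1", "R5"})))
```

In a simple in-order pipeline only the read-after-write case normally causes trouble; the other two matter once instructions can complete out of order.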

Pipeline performance indicators

Pipeline throughput

Pipeline speedup
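Throughput and speedup can be computed under the usual ideal-pipeline model, which is an assumption here because the notes give no formulas: k stages of equal duration Δt, n instructions, and no stalls, so the total time is (k + n − 1)·Δt. The function names are hypothetical.

```python
def pipeline_time(k, n, dt):
    """Time for n instructions on an ideal k-stage pipeline with stage time dt."""
    return (k + n - 1) * dt


def throughput(k, n, dt):
    """Instructions completed per unit time: n / ((k + n - 1) * dt)."""
    return n / pipeline_time(k, n, dt)


def speedup(k, n, dt):
    """Non-pipelined time (n * k * dt) divided by pipelined time."""
    return (n * k * dt) / pipeline_time(k, n, dt)


# Example: 5 stages, 100 instructions, 1 time unit per stage.
print(pipeline_time(5, 100, 1), throughput(5, 100, 1), speedup(5, 100, 1))
```

As n grows, the throughput approaches 1/Δt and the speedup approaches k.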

Advanced pipeline technology

superscalar pipeline

Dynamic multiple-issue technique

Combined with dynamic pipeline scheduling and techniques such as dynamic branch prediction, multiple independent instructions can be executed concurrently in each clock cycle

Very long instruction word technology

Static multiple-issue technique

Combine multiple instructions that can operate in parallel into one very long instruction word with multiple opcodes

Superpipelining

The functional stages are subdivided into more, shorter stages

multiprocessor

Single instruction stream single data stream SISD

serial

One processor and one memory

sequential execution

Single instruction stream multiple data stream SIMD

Data parallel technology

vector processor

Multiple instruction stream single data stream MISD

Does not exist in practice

Multiple instruction stream multiple data stream MIMD

parallel computing

Hardware multi-threading

Fine-grained multithreading

Multiple threads take turns and are executed in an interleaved fashion

Similar to time slice rotation

Coarse-grained multithreading

Switch threads only when the running thread hits a costly stall

Simultaneous multi-threading

In the same clock cycle, multiple instructions in multiple different threads are issued for execution.

shared memory multiprocessor (SMP)

The processors share one physical address space, while each program can still run independently in its own virtual address space

Uniform memory access (UMA)

How the processor is connected to shared memory

Non-uniform memory access (NUMA)

Combinational logic circuits and sequential logic circuits

Combinational logic circuits have no unified clock control
Sequential logic circuits must work under clock ticks

Combinational logic circuits contain no memory cells to store signals
Sequential logic circuits contain memory cells that store signals
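The contrast can be illustrated in software. In this minimal sketch a half adder stands in for a combinational circuit (the output depends only on the current inputs) and a D flip-flop stands in for a sequential circuit (the state changes only on a clock tick); both components were chosen for illustration and do not appear in the notes.

```python
def half_adder(a, b):
    """Combinational: outputs are a pure function of the current inputs."""
    return a ^ b, a & b  # (sum, carry)


class DFlipFlop:
    """Sequential: holds one bit that is updated only when tick() is called."""

    def __init__(self):
        self.q = 0  # stored state

    def tick(self, d):
        """Model one clock edge: latch the input bit d."""
        self.q = d & 1
        return self.q


ff = DFlipFlop()
print(half_adder(1, 1), ff.q, ff.tick(1), ff.q)
```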

Chapter Six Bus

A public information transmission line that can be shared by multiple components in a time-shared manner

Time-shared sending, simultaneous receiving

Bus classification

On-chip bus

The bus inside the chip

system bus

address bus

A one-way bus that the CPU uses to select a main memory unit address or an IO port address; nothing can be transmitted back over it.

Data Bus

Transmit data information, two-way line

control bus

I/O bus

Connect low to medium speed IO devices

Data lines

Transfer the contents of the data buffer register and the command/status registers

Address lines

Transmit the address of the port that exchanges data with the CPU

control line

Send read and write signals

communication bus

external bus

A bus that transmits information between computer systems or between computer systems and other systems

System bus structure

Single bus structure

Dual bus structure

The main memory bus transfers data between the CPU, main memory, and channels
The IO bus transfers data between external devices and channels

Three bus structure

Main memory bus, IO bus, DMA bus

bus standard

ISA

Industry standard architecture

EISA

Extended ISA, for 32-bit CPUs

VESA

Video Electronics Standards Association 32-bit local bus

PCI

Peripheral Component Interconnect

High-performance 32 or 64-bit bus

plug and play

Belongs to local bus

AGP

accelerated graphics port

Belongs to local bus

PCI-E

A newer bus interface standard

serial

Replaces PCI and AGP

RS-232C

serial communication bus

USB

universal serial bus

PCMCIA

Laptop interface

IDE

Integrated Drive Electronics

SCSI

small computer system interface

SATA

Serial Advanced Technology Attachment

Performance

Bus transfer cycle

Consists of several bus clock cycles

Bus clock cycle

Bus operating frequency

= clock frequency/N

bus clock frequency

The reciprocal of the bus clock period

bus bandwidth

= Bus operating frequency × (bus width/8)
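The two formulas above combine into a short calculation. The sketch below uses illustrative numbers (a 100 MHz bus clock, 4 clock cycles per transfer, a 32-bit bus); these values and the function names are assumptions chosen for the example.

```python
def bus_operating_frequency(clock_hz, clocks_per_transfer):
    """Operating frequency = clock frequency / N (transfers per second)."""
    return clock_hz / clocks_per_transfer


def bus_bandwidth(operating_hz, width_bits):
    """Bandwidth in bytes per second = operating frequency x (bus width / 8)."""
    return operating_hz * (width_bits / 8)


freq = bus_operating_frequency(100e6, 4)   # transfers per second
bandwidth = bus_bandwidth(freq, 32)        # bytes per second
print(freq, bandwidth)
```

With these numbers the bus completes 25 million transfers per second and moves 100 MB/s.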

bus transaction

Request

arbitration

addressing

transmission

Release

Burst transfer

The address is sent first, followed by the data. The address only needs to be transmitted once, and the subsequent data is sent continuously.

Timing mode

Synchronous timing mode

The system uses a unified clock signal

High transmission efficiency

Poor reliability

Asynchronous timing mode

Timing by handshake signal

Non-interlocked mode

Semi-interlocked mode

Fully interlocked mode

Bus arbitration control (bus arbitration)

centralized

Daisy-chain polling

circuit design

The device closest to the bus controller has the highest priority

Easy to expand equipment

Sensitive to circuit faults

Only two lines are needed to determine which device uses the bus

Counter-timed polling

circuit design

Counting starts from 0

Fixed priority

Counting starts from where the previous count ended

All devices have equal priority

The counter value is set by the program

variable priority

Complex control

Take log₂n lines

independent request

circuit design

quick response

Flexible priority control

Bus control is complicated

Take 2n lines

distributed

Control logic is dispersed across various devices connected to the bus
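For the counter-timed polling scheme described under centralized arbitration, the effect of the counter's start value on priority can be sketched as follows. The function `counter_poll` and the choice to resume from the device after the last one granted are assumptions made for illustration.

```python
def counter_poll(requests, start):
    """Grant the bus to the first requesting device found by the counter.

    requests[i] is True when device i is requesting the bus; the counter
    starts at `start` and wraps around. Returns the granted device number,
    or None when nobody is requesting.
    """
    n = len(requests)
    for step in range(n):
        device = (start + step) % n
        if requests[device]:
            return device
    return None


requests = [False, True, False, True]

# Counting from 0 every time: fixed priority, device 1 always beats device 3.
fixed = counter_poll(requests, start=0)

# Counting from where the last grant ended: devices take turns.
after_device_1 = counter_poll(requests, start=(1 + 1) % len(requests))
print(fixed, after_device_1)
```

Loading the start value from a program-controlled register gives the variable-priority variant.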

Chapter Seven Input and Output System

IO system

IO hardware

IO devices are connected to the mainboard system bus through the device controller

external device

input device

Keyboard, mouse

output device

monitor

screen size

resolution

Grayscale

The number of brightness (or color) levels a pixel can show; the more gray levels, the more realistic the image

8-bit (256 levels) → 256 colors

refresh

refresh rate

display memory

VRAM capacity = resolution × number of grayscale bits
VRAM bandwidth = resolution × number of grayscale bits × frame rate
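As a worked example of the two VRAM formulas, the sketch below assumes a 1024 × 768 display with 24 bits per pixel refreshed 60 times per second; the figures and function names are illustrative only.

```python
def vram_capacity_bytes(width, height, bits_per_pixel):
    """Capacity = resolution x bits per pixel, converted from bits to bytes."""
    return width * height * bits_per_pixel // 8


def vram_bandwidth_bytes_per_s(width, height, bits_per_pixel, frame_rate):
    """Bandwidth = capacity of one frame x frames per second."""
    return vram_capacity_bytes(width, height, bits_per_pixel) * frame_rate


capacity = vram_capacity_bytes(1024, 768, 24)
bandwidth = vram_bandwidth_bytes_per_s(1024, 768, 24, 60)
print(capacity, bandwidth)
```

Here one frame needs 2,359,296 bytes (2.25 MiB), and refreshing it 60 times per second needs 141,557,760 bytes per second.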

printer

Classification by working method

dot matrix printer

Dot matrix printer

Inkjet Printers

laser printer

In a computer, the internal code of one Chinese character occupies 2 bytes of main memory.

interface

A logical device that coordinates the transfer of data between peripherals and the host

input device

output device

External storage

IO software

driver

User program

management program

IO control method

Programmed polling (program query) method

The CPU repeatedly checks, under program control, whether the IO device is ready

Once the CPU starts IO, it must suspend the program it is currently running.

Features

The CPU busy-waits ("marks time")

It keeps repeating the same check until the transfer completes

The CPU and the IO device work serially
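The busy-wait behaviour of programmed polling can be sketched with a simulated device. The `Device` class, its `ready`/`read` methods, and the `polled_read` helper are hypothetical stand-ins for a status register and a data register.

```python
class Device:
    """Simulated IO device that becomes ready after a number of status checks."""

    def __init__(self, ready_after, data):
        self.remaining = ready_after
        self.data = data

    def ready(self):
        """Model reading the status register."""
        if self.remaining > 0:
            self.remaining -= 1
            return False
        return True

    def read(self):
        """Model reading the data register."""
        return self.data


def polled_read(device):
    """Programmed polling: the CPU does nothing else until the device is ready."""
    wasted_polls = 0
    while not device.ready():
        wasted_polls += 1  # CPU time spent waiting
    return device.read(), wasted_polls


value, wasted = polled_read(Device(ready_after=3, data=0x41))
print(value, wasted)
```

Every pass through the loop is CPU time that does no useful work, which is the cost the interrupt and DMA methods are designed to avoid.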

Program interrupt mode

When the IO device is ready, issue an interrupt request to the CPU

Function

The CPU and IO devices work in parallel

Handling hardware failures, software errors

human-computer interaction

Multiprogramming and time-sharing operation

work process

Interrupt response occurs at the end of an instruction execution

interrupt request

Interrupt response arbitration

Non-maskable interrupt ＞ Internal exception ＞ Maskable interrupt

Interrupt response conditions

The interrupt source has an interrupt request

The CPU has interrupts enabled

The current instruction has finished executing

interrupt vector

The entry address of the interrupt service routine

Determine interrupt type

After the CPU responds to the interrupt, it obtains the interrupt type number by identifying the interrupt source and calculates the address of the corresponding interrupt vector. Then according to this address, the entry address of the interrupt service program is taken out from the interrupt vector table, and then sent to the PC, and then the interrupt service program is executed.
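The lookup described above can be sketched in a few lines. The table base address, the 4-byte entry size, and the dictionary used to model memory are assumptions made for illustration; real architectures define their own table layout.

```python
VECTOR_SIZE = 4  # assumed size of one table entry in bytes


def vector_address(table_base, type_number):
    """Address of the table entry for a given interrupt type number."""
    return table_base + type_number * VECTOR_SIZE


def dispatch(memory, table_base, type_number):
    """Return the new PC: the handler entry address stored in the table."""
    return memory[vector_address(table_base, type_number)]


# Hypothetical memory image: the entry for type 0x21 holds the handler address.
memory = {vector_address(0x0000, 0x21): 0x2000}
pc = dispatch(memory, 0x0000, 0x21)
print(hex(vector_address(0x0000, 0x21)), hex(pc))
```

The type number selects a table entry and never encodes the handler address itself, so handlers can be relocated by rewriting the table.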

response priority

Implemented by a hardware queuing (priority arbitration) circuit

processing priority

Dynamically adjust using interrupt masking technology

interrupt mask word

Interrupt handling process

multiple interrupts

DMA mode (direct memory access)

Direct data path, between main memory and IO devices

There is a direct data path between main memory and the DMA interface

logical path

A method of controlling information transmission entirely by hardware

During data transfer, the determination of the main memory address is directly completed by the hardware circuit.

The main memory must open a dedicated buffer to provide and receive peripheral data in a timely manner.

DMA needs to be pre-processed by the program before transmission, and post-processed by interrupt after the transmission is completed.

The DMA response occurs at the end of a bus transaction, and DMA takes priority over the CPU for use of the bus.

DMA transfer method

1) Stop CPU access to memory

2) Cycle stealing

3) DMA and CPU alternately access memory

DMA transfer process

1) Preprocessing (occupies the CPU): the CPU completes the necessary preparations

The DMA controller does not interact with the user program directly; the device driver acts as the intermediary.

2) Data transfer (does not occupy the CPU): fully controlled by the DMA hardware

3) Post-processing (occupies the CPU): the DMA controller sends an interrupt request to the CPU

channel mode

When the host executes an IO command, it starts the relevant channel, executes the channel program, and completes the IO operation.

Channel processor, a coprocessor specially used for IO management, with interrupt, DMA, and program control functions

The channel program is in main memory, executed by the channel, and can only be executed in an IO system with a channel

Most efficient

IO interface (IO controller)

Function

Address decoding, device selection

Communication between host and peripherals

data buffer

Signal format conversion

Transmit control commands and status information

basic structure

The IO interface is connected to the memory and CPU through the IO bus on the host side.

Ports are registers used for reading and writing in interface circuits. Multiple ports and control logic together form an interface

The IO instruction is a privileged instruction that is used by the underlying IO software of the operating system kernel.

type

Divided by data transmission method

serial

parallel

Divided by control method

program interface

Interrupt interface

DMA interface

Divided by function

Programmable

Not programmable

IO port

Registers in the interface circuit that the CPU can access directly

Addressing

Unified addressing

memory mapping

IO ports and memory units are distinguished by their address range

Treat IO ports as storage units for address allocation

Use unified memory access instructions to access IO ports

Independent addressing

IO mapping method

Set up special IO instructions to access

When executing an IO instruction, the CPU uses the address lines to select the IO port
and the data lines to transfer data between CPU registers and the IO port
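The difference between unified and independent addressing can be sketched with two small models. The class names, the `io_base` boundary, and the `inp`/`out` method names are hypothetical; they only illustrate that unified addressing reuses ordinary load/store operations while independent addressing needs separate IO operations.

```python
class UnifiedAddressing:
    """Memory-mapped IO: ports occupy the addresses from io_base upward."""

    def __init__(self, ram_size, port_count):
        self.ram = [0] * ram_size
        self.ports = [0] * port_count
        self.io_base = ram_size  # assumed layout: ports follow RAM

    def store(self, addr, value):
        # The same store operation reaches RAM or a port, by address range.
        if addr >= self.io_base:
            self.ports[addr - self.io_base] = value
        else:
            self.ram[addr] = value

    def load(self, addr):
        if addr >= self.io_base:
            return self.ports[addr - self.io_base]
        return self.ram[addr]


class IndependentAddressing:
    """Port-mapped IO: ports have their own address space and operations."""

    def __init__(self, ram_size, port_count):
        self.ram = [0] * ram_size
        self.ports = [0] * port_count

    def store(self, addr, value):
        self.ram[addr] = value

    def load(self, addr):
        return self.ram[addr]

    def out(self, port, value):  # dedicated IO write
        self.ports[port] = value

    def inp(self, port):         # dedicated IO read
        return self.ports[port]


u = UnifiedAddressing(ram_size=16, port_count=4)
u.store(3, 7)    # RAM cell 3
u.store(17, 9)   # port 1, reached through a memory address
i = IndependentAddressing(ram_size=16, port_count=4)
i.store(1, 5)    # RAM cell 1
i.out(1, 8)      # port 1: same number, different address space
```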