MindMap Gallery Mind map of computer composition principles
This is a mind map about computer composition principles, including an overview, data operations, storage systems, the central processing unit, etc.
Edited at 2023-12-08 15:47:38
Computer composition principles
Chapter One Overview
computer system hierarchy
hardware
von Neumann computer
stored program
control-flow driven
Programs and data prepared in advance are loaded into main memory, and the computer executes the instructions one by one
uniprocessor
Instructions and data have the same status
Instruction data are all binary codes
Centered on the arithmetic unit
modern computer
memory-centric
structure
input device
output device
memory
Main memory (internal storage)
memory bank
Auxiliary storage (external storage)
CPU
MAR (memory address register)
The number of bits reflects the number of storage units (n bits can address 2ⁿ units)
Stores the address to be accessed
MDR (memory data register)
The number of bits is equal to the storage word length
Temporarily stores information read from or to be written to memory
arithmetic unit
ALU (Arithmetic Logic Unit)
combinational logic circuit
ACC (Accumulator)
MQ (Multiplier Quotient Register)
X general purpose register
Temporary storage of operands and intermediate results
IX (index register)
BR (Base Address Register)
PSW (Program Status Word Register/Flag Register)
Store the flag information obtained by the operation, such as: overflow, carry and borrow
controller
PC (Program Counter)
Stores the address of the instruction currently to be executed. After execution, it is automatically incremented by one to form the address of the next instruction.
Its number of bits corresponds to the number of main memory words (addressable units)
IR (Instruction Register)
Stores the instruction currently being executed
Its content comes from MDR
Its number of bits equals the instruction word length
CU (Control Unit)
software
composition
system software
Operating system (OS)
Database management system (DBMS)
language processing system
Network software system
service program
application
Daily use software
language
machine language
Computer can be executed directly
binary code
Assembly language
English words and abbreviation letters
Convert to machine language before execution
high level language
Java, C, C++, etc.
Source program to executable file
multi-level hierarchy
Interpreter program: Translate and execute at the same time
Performance
word length
An integer multiple of a byte (1 B = 8 bits)
machine word length
Data path width within the CPU for integer arithmetic
Number of bits of the ALU
Number of general register bits
Instruction word length
The number of binary code bits in an instruction word
Storage word length
Binary code length of one memory unit
data path bandwidth
The number of bits transmitted in parallel on the data bus at one time
Main memory capacity
calculating speed
Throughput
Number of requests processed per unit time
Main frequency
CPU clock frequency
Usually expressed in Hz
1 Hz means one cycle per second
1 GHz = 10⁹ Hz
Number of clock cycles per second
CPU clock cycle
Reciprocal of the main frequency
1 / main frequency
Duration of one clock cycle in seconds
CPI
The number of clock cycles it takes to execute one instruction
related to three factors
Instruction Set
Programming (system structure)
Computer organization (architecture)
CPU execution time
(number of instructions × CPI) / main frequency
MIPS
How many million (M) instructions are executed per second?
Main frequency / (CPI×10⁶)
FLOPS
MFLOPS
Millions of floating point operations per second
Number of floating point operations / (execution time × 10⁶)
GFLOPS
Billions
10⁹
TFLOPS
Trillions
10¹²
PFLOPS
Quadrillions
10¹⁵
ZFLOPS
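The performance formulas in this branch can be checked with a short calculation. The sketch below is only an illustration: the function names and the sample numbers (a 2 GHz clock, CPI of 2, 4 million instructions) are assumptions chosen for the example, not values from the notes.

```python
def cpu_time(instr_count, cpi, clock_hz):
    """CPU execution time in seconds: (number of instructions x CPI) / main frequency."""
    return instr_count * cpi / clock_hz


def mips(clock_hz, cpi):
    """Millions of instructions per second: main frequency / (CPI x 10^6)."""
    return clock_hz / (cpi * 1e6)


def mflops(fp_ops, exec_time_s):
    """Millions of floating point operations per second."""
    return fp_ops / (exec_time_s * 1e6)


if __name__ == "__main__":
    clock = 2e9  # hypothetical 2 GHz main frequency
    print("clock cycle (s):", 1 / clock)
    print("CPU time (s):", cpu_time(4_000_000, 2.0, clock))
    print("MIPS:", mips(clock, 2.0))
    print("MFLOPS:", mflops(5e8, 0.5))
```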
Transparency
Invisible content
For high-level language programmers, the instruction format, data operation process, etc. are invisible
For machine language or assembly language programmers, the instruction format, machine structure, and data format are visible
Programmers cannot see the contents of MAR, MDR, and IR in the CPU
Chapter Two Data operations
Number systems (radix notation)
binary
The base is 2
Octal
Carry one on reaching eight
Tip: one octal digit corresponds to three binary digits
decimal
hexadecimal
0123456789ABCDEF
Tip ①: one hexadecimal digit corresponds to four binary digits
Tip ②: A represents 10
BCD code
8421 code
Four binary digits represent one decimal digit
Correction method for addition:
① If the sum of the two digits is less than or equal to decimal 9, no correction is needed.
② If the sum of the two digits is greater than or equal to decimal 10, add decimal 6 (0110) to correct it.
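The add-6 correction for 8421 BCD addition can be sketched as follows. This is a minimal illustration under the assumption that each decimal digit is held in 4 bits; the helper names `bcd_add_digit` and `bcd_add` are made up for this example.

```python
def bcd_add_digit(a, b, carry_in=0):
    """Add two 8421 BCD digits (0..9) and return (digit, carry_out)."""
    s = a + b + carry_in
    if s >= 10:
        # The 4-bit sum is not a valid BCD digit (or overflowed 4 bits):
        # add 6 (0110) so the low 4 bits hold the digit and bit 4 the carry.
        s += 6
    return s & 0xF, s >> 4


def bcd_add(x, y):
    """Add two non-negative decimal integers digit by digit through BCD."""
    result, carry, place = 0, 0, 1
    while x or y or carry:
        digit, carry = bcd_add_digit(x % 10, y % 10, carry)
        result += digit * place
        place *= 10
        x //= 10
        y //= 10
    return result


if __name__ == "__main__":
    print(bcd_add_digit(8, 5))  # 8 + 5 = 13 -> digit 3 with a carry
    print(bcd_add(58, 67))
```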
Excess-3 code
8421 code plus 3 (0011), so every digit's code is three larger than its 8421 code
2421 code
From the highest bit to the lowest bit, the weights are 2, 4, 2, 1. Digits of 5 and above have the highest bit set, such as: 5 = (1011)
Signed number representations
Sign-magnitude (original code)
The representation of 0 is not unique (+0 and -0)
With a word length of n+1, the signed range is -(2ⁿ-1) ≤ x ≤ 2ⁿ-1
Ones' complement (inverse code)
Positive number: sign-magnitude = ones' complement
Negative number: the sign bit remains unchanged and the magnitude bits are inverted.
Two's complement
The representation of 0 is unique
Positive number: sign-magnitude = ones' complement = two's complement
Negative number: to convert between sign-magnitude and two's complement, scan the code from right to left, find the first 1, and invert all the magnitude bits to the left of this 1
With a word length of n+1, the signed range is -2ⁿ ≤ x ≤ 2ⁿ-1
Signed integers in C are stored in two's complement form
Excess code (biased code)
The representation of 0 is unique
Represents the exponent of a floating point number
Excess code = true value + offset
Inverting the sign bit of the two's complement gives the excess code
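To make the four encodings concrete, the sketch below prints them for one integer. It assumes an 8-bit word (1 sign bit plus n = 7 magnitude bits) and an excess-code offset of 2ⁿ, which is what inverting the two's complement sign bit produces; note that IEEE 754 uses a different offset (127 for single precision). The function name `encodings` is made up for this example.

```python
def encodings(x, bits=8):
    """Return (sign-magnitude, ones' complement, two's complement, excess code) bit strings."""
    n = bits - 1                          # number of magnitude bits
    assert -(2**n - 1) <= x <= 2**n - 1   # range shared by all four encodings
    mask = (1 << bits) - 1
    sign = 1 if x < 0 else 0
    sign_mag = (sign << n) | abs(x)
    # Negative: keep the sign bit, invert the magnitude bits.
    ones = sign_mag if x >= 0 else (~abs(x)) & mask
    twos = x & mask
    # Excess code with offset 2**n: invert the sign bit of the two's complement.
    excess = twos ^ (1 << n)
    return tuple(format(v, f"0{bits}b") for v in (sign_mag, ones, twos, excess))


if __name__ == "__main__":
    for value in (5, -5):
        print(value, encodings(value))
```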
fixed point shift
arithmetic shift
For signed numbers
The sign bit remains unchanged when shifting
Positive number: fill with 0 for sign-magnitude, ones' complement, and two's complement
Negative number: fill with 0 for sign-magnitude; for two's complement, fill with 0 on a left shift and 1 on a right shift; fill with 1 for ones' complement
logical shift
For unsigned numbers
Logical left shift, high bit is lost, low bit is filled with zeros
Logical right shift, low bits are lost, high bits are filled with zeros
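The difference between logical and arithmetic shifts shows up on negative two's complement patterns. The following sketch assumes an 8-bit word; the function names are illustrative only.

```python
def logical_shl(v, k, bits=8):
    """Logical left shift: high bits are lost, low bits are filled with 0."""
    return (v << k) & ((1 << bits) - 1)


def logical_shr(v, k, bits=8):
    """Logical right shift: low bits are lost, high bits are filled with 0."""
    return (v & ((1 << bits) - 1)) >> k


def arithmetic_shr(v, k, bits=8):
    """Arithmetic right shift of a two's complement pattern: vacated bits copy the sign."""
    mask = (1 << bits) - 1
    v &= mask
    if v >> (bits - 1):        # sign bit set: reinterpret as a negative Python int
        v -= 1 << bits
    return (v >> k) & mask     # Python's >> on negative ints fills with 1s


if __name__ == "__main__":
    pattern = 0b11111000       # -8 in 8-bit two's complement
    print(format(logical_shr(pattern, 1), "08b"))
    print(format(arithmetic_shr(pattern, 1), "08b"))
```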
Addition and subtraction of fixed point numbers
Two's complement addition and subtraction
Addition: operate directly
Subtraction: [A]complement + [-B]complement
Flags
unsigned number
CF: carry/borrow flag
Determines overflow of unsigned numbers
Carry out of an addition
Borrow in a subtraction
Smaller minus larger produces a borrow
ZF: zero flag
If the result is 0, then ZF=1
signed number
ZF: zero flag
SF: sign flag
The sign of the result
OF: overflow flag
Determines overflow of signed numbers
Overflow judgment
one sign bit
two sign bits (modulo-4 two's complement)
Only one sign bit is kept when the number is stored
If the two sign bits are the same, there is no overflow.
If the two sign bits differ, there is overflow
01
Positive overflow
10
negative overflow
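A small model of an adder with flags may help tie these rules together. This is a sketch assuming an 8-bit adder in which subtraction is performed as A + NOT(B) + 1 and CF is taken as carry-out XOR carry-in (so it reads as a borrow for subtraction); the function name and dictionary layout are made up for this example.

```python
def add_with_flags(a, b, bits=8, subtract=False):
    """Return (result, flags) for a + b or a - b on a `bits`-wide adder."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    operand = (~b) & mask if subtract else b   # subtraction adds the inverted operand...
    carry_in = 1 if subtract else 0            # ...plus 1, i.e. the complement of -B
    total = a + operand + carry_in
    result = total & mask
    carry_out = total >> bits

    def sign(v):
        return v >> (bits - 1)

    flags = {
        "ZF": int(result == 0),
        "SF": sign(result),
        "CF": carry_out ^ carry_in,            # carry for addition, borrow for subtraction
        # One-sign-bit rule: inputs to the adder share a sign the result lacks.
        "OF": int(sign(a) == sign(operand) and sign(result) != sign(a)),
    }
    return result, flags


if __name__ == "__main__":
    print(add_with_flags(127, 1))                # signed overflow
    print(add_with_flags(3, 5, subtract=True))   # unsigned borrow
```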
Multiplication of fixed-point numbers
Sign-magnitude one-bit multiplication
Hand calculation
Similar to decimal, calculate directly
computer calculation
Calculate A×B
Illustration
The multiplicand A is stored in the general register X
The multiplier B is stored in MQ
The result of each round is stored in ACC
process
The multiplicand and multiplier take part in the operation as absolute values; the sign bit is processed separately (XOR operation)
① When is the number in ACC added to the number in general register X? Check whether the lowest bit of MQ is 1. If it is 1, ACC and X are added, and the result is placed in ACC
(ACC) + (X) → ACC
② After the new result is stored in ACC, ACC and MQ are logically shifted right together: the high bit of ACC is filled with 0, the bit shifted out of the low end of ACC moves into the high bit of MQ, and the low bit of MQ is shifted out
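The add-and-shift procedure above can be sketched with three variables standing in for ACC, MQ, and X. This is an integer illustration that assumes magnitudes of n bits; the function name is hypothetical.

```python
def sign_magnitude_multiply(a, b, n=4):
    """Multiply two signed integers whose magnitudes fit in n bits."""
    negative = (a < 0) ^ (b < 0)      # sign is handled separately by XOR
    x, mq, acc = abs(a), abs(b), 0    # X: multiplicand, MQ: multiplier, ACC: partial product
    assert x < (1 << n) and mq < (1 << n)
    for _ in range(n):
        if mq & 1:                    # lowest bit of MQ is 1: (ACC) + (X) -> ACC
            acc += x
        # Logical right shift of the ACC:MQ pair by one position:
        # the low bit of ACC moves into the high bit of MQ.
        mq = (mq >> 1) | ((acc & 1) << (n - 1))
        acc >>= 1
    product = (acc << n) | mq         # high half in ACC, low half in MQ
    return -product if negative else product


if __name__ == "__main__":
    print(sign_magnitude_multiply(13, 11))
    print(sign_magnitude_multiply(-13, 11))
```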
Two's complement one-bit multiplication (Booth's algorithm)
Hand calculation
computer calculation
Auxiliary bit - MQ lowest bit = 1: (ACC) + [x]complement
Auxiliary bit - MQ lowest bit = 0: (ACC) + 0
Auxiliary bit - MQ lowest bit = -1: (ACC) + [-x]complement
process
Two's complement one-bit multiplication uses two sign bits
Perform n rounds of addition and shifting, then one final addition.
Each shift is an arithmetic right shift
The sign bit participates in the operation
The auxiliary bit sits after the lowest bit of MQ and is initially 0. On each right shift, the bit shifted out of MQ replaces the previous value of the auxiliary bit.
The addition and shifting steps are otherwise the same as in sign-magnitude one-bit multiplication.
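The two's complement one-bit multiplication rules (auxiliary bit minus the lowest bit of MQ) can be sketched as below. This is an integer illustration that assumes operands of 1 sign bit plus n magnitude bits; because Python integers are unbounded, ACC is kept as an ordinary signed integer instead of modelling the two sign bits explicitly. The function name is made up.

```python
def booth_multiply(x, y, n=4):
    """Multiply two signed integers that fit in (n + 1)-bit two's complement."""
    width = n + 1
    mask = (1 << width) - 1
    assert -(1 << n) <= y < (1 << n)
    acc, mq, aux = 0, y & mask, 0        # aux: auxiliary bit after the lowest bit of MQ
    for step in range(n + 1):
        diff = aux - (mq & 1)            # +1: add [x], 0: add 0, -1: add [-x]
        acc += diff * x
        if step == n:                    # the final round adds but does not shift
            break
        # Arithmetic right shift of ACC:MQ:aux by one position.
        aux = mq & 1
        mq = (mq >> 1) | ((acc & 1) << (width - 1))
        acc >>= 1                        # Python's >> keeps the sign
    # Product: ACC as the high part, followed by the upper n bits of MQ.
    return acc * (1 << n) + (mq >> 1)


if __name__ == "__main__":
    print(booth_multiply(-3, 5))
    print(booth_multiply(-3, -5))
```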
Fixed point division
Sign-magnitude division (restoring remainder method)
a÷b
ACC stores the dividend a and then the remainder
MQ stores the quotient (initially all 0)
X stores the divisor b
process
First assume the quotient bit is 1; if this turns out to be wrong, change it to 0 and restore the remainder.
When the quotient bit is 0, perform (ACC) + [|b|]complement to restore the remainder and put the result in ACC
When trying quotient bit 1, perform (ACC) + [-|b|]complement
If the sign bit of the result is 1, the quotient bit is wrong and the remainder must be restored
After the addition gives the new result, ACC and MQ are logically shifted left together, and the low bit of MQ is filled with 0
Repeat the above operation
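The restoring method can be sketched on magnitudes only (the quotient sign would be obtained separately by XOR of the operand signs). The sketch assumes n-bit fixed-point fractions scaled by 2ⁿ with |a| < |b|, so `a=11, b=13, n=4` stands for 0.1011 ÷ 0.1101; the function name is hypothetical.

```python
def restoring_divide(a, b, n=4):
    """Return (quotient, remainder) with a * 2**n == quotient * b + remainder."""
    assert 0 <= a < b < (1 << n)
    acc, mq = a, 0                 # ACC: dividend/remainder, MQ: quotient (initially 0)
    for i in range(n + 1):
        acc -= b                   # tentatively take quotient bit 1: (ACC) + [-b]
        if acc < 0:                # negative result: the guess was wrong
            acc += b               # restore the remainder: (ACC) + [b]
            bit = 0
        else:
            bit = 1
        mq = (mq << 1) | bit       # the quotient bit enters the low end of MQ
        if i < n:
            acc <<= 1              # shift the remainder left (not after the last bit)
    return mq, acc


if __name__ == "__main__":
    q, r = restoring_divide(11, 13)
    print(format(q, "04b"), format(r, "04b"))  # quotient and remainder bit patterns
```

With these inputs the quotient bits read as the fraction 0.1101 and the remainder as 0.0111 × 2⁻⁴.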
Sign-magnitude division (alternating addition and subtraction method)
At the start, compute [|a|]complement + [-|b|]complement to get the first remainder
If the remainder is negative, the quotient bit is 0; the remainder is logically shifted left and then [|b|]complement is added
If the remainder is positive, the quotient bit is 1; the remainder is logically shifted left and then [-|b|]complement is added
Last step: if the remainder is negative, the quotient bit is 0, and [|b|]complement is added to get the correct remainder
Two's complement division (alternating addition and subtraction)
The sign bit participates in the operation, with double sign bits
At the start, determine whether the dividend and divisor have the same sign: if they have the same sign, compute dividend - divisor; if they have different signs, compute dividend + divisor.
Afterwards: if the remainder and divisor have the same sign, the quotient bit is 1, the remainder and quotient (ACC and MQ) shift left, and the divisor is subtracted; if the remainder and divisor have different signs, the quotient bit is 0, the remainder and quotient shift left, and the divisor is added
Data storage arrangement
big endian storage
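Matches human reading order (the most significant byte is at the lowest address)
little endian storage
The opposite (the least significant byte is at the lowest address)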
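The two byte orders are easy to see by serialising one 32-bit value both ways. The value 0x01234567 is an arbitrary example chosen for this sketch.

```python
value = 0x01234567

# Big endian: most significant byte first (lowest address).
big = value.to_bytes(4, "big")
# Little endian: least significant byte first (lowest address).
little = value.to_bytes(4, "little")

if __name__ == "__main__":
    print("big endian   :", big.hex())
    print("little endian:", little.hex())
```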
Boundary alignment
integer type
int type
4B 32bit
long
4B 32bit
short type
2B 16bit
char type
1B 8bit
floating point number
Representation format
Sign of the number
Exponent
Represented in excess code
mantissa
Represented as a sign-magnitude fixed-point fraction
Normalization
Left normalization
Shift the mantissa one position to the left and decrease the exponent by one.
Right normalization
Shift the mantissa one position to the right and add one to the exponent.
IEEE754
32-bit single precision
Composition
Sign: 1 bit
Exponent (excess code): 8 bits
True value of the exponent = excess code - offset value 127
Exponent code range 1~254
Mantissa: 23 bits
Offset value 127
conversion process
① Sign
② Mantissa
③ True value of the exponent
④ Exponent code (excess code) = true value of the exponent + offset value
⑤ The binary point of the mantissa is moved according to the true value of the exponent
example
The exponent code of a normalized number cannot be all 0s or all 1s
64-bit double precision
Sign: 1 bit
Exponent: 11 bits
Mantissa: 52 bits
Offset value 1023
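The single-precision layout can be inspected with Python's standard `struct` module. The sketch below splits a value into sign, exponent code, and mantissa fields and rebuilds a normalized number from them; the helper names and the sample value -8.25 are assumptions for illustration.

```python
import struct


def float32_fields(x):
    """Return (sign, exponent code, mantissa field) of x as IEEE 754 single precision."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31
    exp_code = (raw >> 23) & 0xFF
    fraction = raw & 0x7FFFFF
    return sign, exp_code, fraction


def decode_normalized(sign, exp_code, fraction):
    """Rebuild the value of a normalized number (exponent code 1..254)."""
    assert 1 <= exp_code <= 254
    true_exponent = exp_code - 127                 # excess code minus the offset
    return (-1) ** sign * (1 + fraction / 2**23) * 2.0**true_exponent


if __name__ == "__main__":
    # -8.25 = -1000.01b = -1.00001b x 2^3, so the exponent code is 3 + 127
    s, e, m = float32_fields(-8.25)
    print(s, e, format(m, "023b"))
    print(decode_normalized(s, e, m))
```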
Floating point addition and subtraction
① Exponent alignment
The smaller exponent is aligned to the larger one
The mantissa of the number with the smaller exponent is shifted right one position for each 1 added to its exponent
② Sum of mantissas
③ Normalization
Right normalization
When the mantissa overflows (of the form 01.××…× or 10.××…× with two sign bits), normalize right
Shift the mantissa right and add one to the exponent
Left normalization
When the mantissa is of the form 00.0××…× or 11.1××…× (two's complement with two sign bits), normalize left
Shift the mantissa left and decrease the exponent by 1
Normalized number
In sign-magnitude, the highest mantissa bit is not 0; in two's complement, the highest mantissa bit differs from the sign bit.
Two's complement
Normalized forms with a single sign bit
0.1xxx 1.0xxx
Normalized forms with two sign bits
00.1xxx 11.0xxx
④ Rounding
Rounding occurs during exponent alignment or right normalization
Discard-0, round-up-1 method
Similar to decimal rounding
Always-set-1 method
Regardless of whether the highest discarded bit is 1 or 0, set the last bit of the mantissa after the right shift to 1.
⑤ Overflow judgment
For positive underflow and negative underflow, the computer treats the result as 0
Type conversion
char→int→long→double
float→double
Chapter Three Storage System
Memory classification
Classified by level
cache
main memory
auxiliary storage
Classified by storage medium
magnetic surface
tape
disk
magnetic core
semiconductor
optical disc
Classified by access method
RAM - random access memory
ROM - read only memory
Modern ROMs can be erased and rewritten electrically
serial access memory
Classified by information retention
Volatile (information is lost on power-off)
RAM
non-volatile
ROM
magnetic surface
optical disc
Memory performance metrics
Storage capacity
Number of words stored × word length
Storage speed
Access time
The period of time from initiation of access to completion of access
Storage cycle
Access time + recovery time
main memory bandwidth
data transfer rate
The maximum amount of information transferred into or out of main memory per second
multi-level storage system
CPU-cache-main memory-auxiliary memory
Data transfer between the CPU and the cache is done by hardware (invisible)
The connection between main memory and secondary memory is completed by the hardware and operating system (not visible to the application programmer)
main memory
random access memory
RAM
cache is implemented by SRAM
SRAM
static random access memory
bistable flip-flop
non-destructive readout
High cost, fast speed, low integration
Main memory is implemented by DRAM
DRAM
dynamic random access memory
Gate capacitance
Must be refreshed periodically
Centralized refresh
All rows are refreshed one after another in a fixed block of time
Has dead time
Distributed refresh
The refresh of each row is spread across the access cycles
No dead time
Asynchronous refresh
Refresh period divided by the number of rows gives the interval at which one row is refreshed
Maximum refresh period: 2 ms
Dead time still exists
Low cost, slow speed, high integration
Address pin multiplexing
ROM
ROM
Features
Simple structure
non-volatile
Classification
MROM: mask read-only memory
Unchangeable content
PROM: one-time programmable read-only memory
Once written, it cannot be changed
EPROM: erasable programmable read-only memory
Rewritable, limited number of programming cycles, long writing time
Flash memory
Long-term storage of information
Fast online erase and rewrite
CDROM
CD-ROM
SSD solid state drive
Long-term storage of information
Quick erase, rewrite
parallel memory
dual port memory
spatial parallelism
multi-body parallel memory
time parallelism
High-order interleaving (sequential mode)
serial access
sequential memory
Low-order interleaving (interleaved addressing mode)
The low-order address bits select the module, and the high-order bits are the address within the module.
Pipeline approach
The access cycle for one word is T, and the bus transfer cycle is r
Number of interleaved modules ≥ T/r
The time required to access m words continuously is T + (m-1)r
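The timing of low-order interleaving can be compared with sequential (high-order) access in a few lines. The sketch assumes the standard pipelined formula T + (m-1)r for the interleaved case and m × T for the sequential case; the numbers (T = 200 ns, r = 50 ns) are made-up examples.

```python
def min_modules(T, r):
    """Smallest number of interleaved modules satisfying: modules >= T / r."""
    return -(-T // r)          # ceiling division on integers


def interleaved_time(m, T, r):
    """Time to read m consecutive words with low-order interleaving."""
    return T + (m - 1) * r


def sequential_time(m, T):
    """Time to read m consecutive words from a single module (high-order interleaving)."""
    return m * T


if __name__ == "__main__":
    T, r = 200, 50             # hypothetical access cycle and bus cycle, in ns
    print("modules needed:", min_modules(T, r))
    print("interleaved, 4 words:", interleaved_time(4, T, r), "ns")
    print("sequential, 4 words:", sequential_time(4, T), "ns")
```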
Main memory and CPU connection
The data lines are bidirectional; the address lines are one-way
Capacity expansion
word expansion
2-to-4 decoder function: chip select
bit extension
The chip select signal CS should be connected to all chips
Memory to CPU connection
High-order address lines drive chip select; low-order address lines connect to the chips
① Address line
② Data line
③ IO line
④ Chip select line
External storage
disk
The minimum read/write unit is one sector
Features
① Low cost and large capacity
②Long-term storage
③Non-destructive readout
Classification
disk storage
storage area
Heads (number of recording surfaces)
One recording surface corresponds to one head
Cylinders
Number of tracks on each recording surface
Sectors
Number of sectors on each track
Located on the same sector → all data can be read out in one memory access
Performance
average access time
seek time
The time it takes for the head to move to the destination track
Rotational latency
The time it takes for the target sector to rotate under the head
Transmission time
Time to transfer data
Average rotational latency
Time of one rotation / 2
data transfer rate
revolutions per second × capacity of each track (n bits)
Read and write operations are serial
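The disk timing components can be combined in a short calculation. The sketch assumes average access time = seek time + half a rotation + transfer time, with transfer time taken as the fraction of a rotation covered by the sectors read; all numbers (5 ms seek, 6000 rpm, 100 sectors per track, 512-byte sectors) are invented for the example.

```python
def average_access_time_ms(seek_ms, rpm, sectors_per_track, sectors=1):
    """Average time to reach and read `sectors` consecutive sectors, in milliseconds."""
    rotation_ms = 60_000 / rpm                  # time of one full rotation
    latency_ms = rotation_ms / 2                # average rotational latency
    transfer_ms = rotation_ms * sectors / sectors_per_track
    return seek_ms + latency_ms + transfer_ms


def data_transfer_rate(rpm, track_capacity_bytes):
    """Bytes per second: revolutions per second x capacity of each track."""
    return rpm / 60 * track_capacity_bytes


if __name__ == "__main__":
    print("access time (ms):", average_access_time_ms(5, 6000, 100))
    print("transfer rate (B/s):", data_transfer_rate(6000, 100 * 512))
```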
Disk Array
RAID0
No redundancy, no checksum
RAID1
Mirror disk array
mutual backup
RAID2
Error correcting Hamming code disk array
RAID3
Bit-interleaved parity
RAID4
Block-interleaved parity
RAID5
Distributed parity, with no dedicated parity disk
SSD flash memory
Works on the same principle as a USB flash drive
Random writing is slow
Wears out with repeated erasing and writing
Cache
working principle
The CPU issues a memory access request. If the block containing the main memory physical address is in the cache, it is a hit and the cache is read directly. If the cache misses, main memory must still be accessed and the block is loaded from main memory into the cache.
Cache block length, also known as line length
Invisible to all programmers
Mapping method of cache and main memory
direct mapping
The lowest hit rate and the shortest time required
set associative mapping
The tag and the set number together form the main memory block number
If each set has r cache lines, the cache is called r-way set associative (r lines form one set)
The number of comparators in the cache is also r
Cache capacity (bit) = number of lines × (data bits per line + valid bit + dirty bit + replacement control bits + tag bits)
The data of each line is counted in bits
Dirty bit (consistency maintenance)
Present with the write-back method
Replacement control bits
Present when a replacement algorithm such as LRU is used; absent when the random replacement strategy is used
replacement algorithm
① Random algorithm
② First in, first out algorithm
Replace the line that was loaded earliest
③ Least recently used algorithm (principle of locality) LRU
Replace the line that has gone unaccessed for the longest time
④ Least frequently used algorithm LFU
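The LRU policy can be simulated for a single set (or a fully associative cache) with an ordered dictionary. This is a behavioural sketch, not how the hardware tracks recency; the function name and the reference string are assumptions for the example.

```python
from collections import OrderedDict


def simulate_lru(block_refs, lines):
    """Return the number of hits for a sequence of block numbers with `lines` cache lines."""
    cache = OrderedDict()              # oldest access first, most recent last
    hits = 0
    for block in block_refs:
        if block in cache:
            hits += 1
            cache.move_to_end(block)   # mark as most recently used
        else:
            if len(cache) == lines:
                cache.popitem(last=False)   # evict the least recently used line
            cache[block] = True
    return hits


if __name__ == "__main__":
    refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
    for lines in (3, 4):
        hits = simulate_lru(refs, lines)
        print(f"{lines} lines: {hits} hits, {len(refs) - hits} misses")
```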
fully associative mapping
The highest hit rate and the longest time required
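The address split and the cache capacity formula can be worked through for the three mapping methods with one helper. The sketch assumes byte addressing, power-of-two sizes, one valid bit per line, one dirty bit under write-back, and log2(ways) replacement control bits per line for LRU (a common textbook convention, stated here as an assumption). The function name and the 32-bit, 64-byte-line, 128-line configuration are made up.

```python
def cache_layout(addr_bits, block_bytes, num_lines, ways, write_back=True, lru=True):
    """Return the tag/set/offset field widths and the total cache capacity in bits."""
    assert block_bytes & (block_bytes - 1) == 0 and num_lines % ways == 0
    offset_bits = block_bytes.bit_length() - 1       # address within the block
    sets = num_lines // ways
    index_bits = sets.bit_length() - 1               # set number (0 if fully associative)
    tag_bits = addr_bits - index_bits - offset_bits
    control_bits = 1                                 # valid bit
    control_bits += 1 if write_back else 0           # dirty bit
    control_bits += (ways.bit_length() - 1) if lru else 0   # assumed LRU bits
    bits_per_line = block_bytes * 8 + tag_bits + control_bits
    return {"tag": tag_bits, "set": index_bits, "offset": offset_bits,
            "total_bits": num_lines * bits_per_line}


if __name__ == "__main__":
    print("direct mapped    :", cache_layout(32, 64, 128, ways=1))
    print("4-way set assoc. :", cache_layout(32, 64, 128, ways=4))
    print("fully associative:", cache_layout(32, 64, 128, ways=128))
```

Moving from direct mapping to full associativity shortens the set field and lengthens the tag, which is why more comparators and more tag storage are needed.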
cache write strategy
①
Write-through method (direct writing)
High security
CPU write hit on cache
Data must be written to main memory and the cache at the same time. When a block needs to be replaced, it is simply overwritten by the new block
write buffer
Reduces the time lost by writing directly to main memory
The CPU writes data to the cache and the write buffer at the same time; the write buffer then writes the contents to main memory
Non-write-allocate method
CPU write miss on cache
Write only to main memory; the block is not loaded into the cache
②
Write-back method
for intensive
CPU write hit on cache
Write data only to the cache, not to main memory immediately; write to main memory only when the block is swapped out
Write-allocate method
CPU write miss on cache
Load the main memory block into the cache, and then update the cache block
Disadvantage: Every time a miss occurs, a block must be read from main memory.
virtual memory
Users program with logical addresses (virtual addresses); main memory units have physical addresses (real addresses)
Virtual address = virtual page number + address within the page
Real address = main memory page number + address within the page
The CPU uses the virtual address, and auxiliary hardware finds the mapping between virtual and real addresses. If the page is in main memory, the address is translated and the unit is accessed. If it is not in main memory, the page is loaded into main memory and then accessed by the CPU. If main memory is full, a replacement algorithm is used
Misses have a great impact on system performance.
Use fully associative mapping to improve hit rate
Uses the write-back method
Invisible to application programmers, visible to system programmers
Classification
Paged virtual memory
In units of pages
page table
Find the corresponding page table entry based on the virtual page number in the high bits of the virtual address.
Fast table (TLB)
Consists of associative memory
Content addressing
Uses set associative or fully associative mapping
TLB tag
Fully associative: the virtual page number
Set associative: the high bits of the virtual page number
multi-level storage system
The relationship between TLB, page table and cache
The TLB stores a copy of part of the page table
The cache stores a copy of a portion of main memory
If the TLB hits, the page must be in main memory. If the TLB misses, the page may still be in main memory. If the page is missing, both the TLB and the cache must miss.
Cache misses are handled by hardware; page faults are handled by software (the operating system's page fault handler); TLB misses can be handled by either hardware or software
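The lookup order (TLB first, then the page table, then a page fault) can be sketched with two dictionaries. This assumes 4 KiB pages (12 offset bits) and models the TLB and page table as plain Python mappings from virtual page number to main memory page number; all names and addresses are invented for the example, and a real page fault would be handled by the operating system rather than raised as an exception.

```python
PAGE_BITS = 12                        # assumed page size: 4 KiB


def translate(vaddr, tlb, page_table):
    """Return (physical address, how it was resolved) or raise LookupError on a page fault."""
    vpn = vaddr >> PAGE_BITS                      # virtual page number (high bits)
    offset = vaddr & ((1 << PAGE_BITS) - 1)       # address within the page
    if vpn in tlb:
        frame, how = tlb[vpn], "TLB hit"
    elif vpn in page_table:
        frame, how = page_table[vpn], "TLB miss, page hit"
        tlb[vpn] = frame                          # keep a copy of the entry in the TLB
    else:
        raise LookupError("page fault")
    return (frame << PAGE_BITS) | offset, how


if __name__ == "__main__":
    tlb, page_table = {}, {0x12: 0x5}
    print(translate(0x12ABC, tlb, page_table))
    print(translate(0x12DEF, tlb, page_table))
```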
Segmented virtual memory
Virtual address = segment number + address within the segment
Segment table records: ① segment first address ② loading bit ③ segment length
Segmented-paged virtual memory
With page as the basic transmission unit
① Segment number ② Page number within the segment ③ Address within the page
Chapter Four Instruction System
Instruction set architecture ISA
Main content: instruction format, data types and formats, operand storage method, number of registers accessible to the program and their numbering, storage space size and addressing method, addressing modes, instruction execution control mode, etc.
Instruction format
Operation code OP + address code A
The instruction word length is an integer multiple of bytes
Classification
zero address instruction
Only opcode op
Instructions without operands: no-operation, halt, disable interrupts. Arithmetic instructions in stack computers (the operands come from the top of the stack and the element below it)
One-address instructions
Single-operand instruction with only a destination operand
The result is saved back to the original address
Two-operand instruction with an implicitly agreed operand
The other operand is provided by ACC through implicit addressing, and the operation result is stored in ACC
Two address instructions
Give the destination operand and source operand
The result is saved to the destination operand address
Three address instructions
Give the destination operand, source operand and result
Access memory 4 times: fetch instruction once, fetch operand 2 times, store result once
Four address instructions
op, destination operand, source operand, result, next address
The operation code is 8 bits, and the 4 address codes are 6 bits each.
Extended opcode instruction format
① A short opcode cannot be a prefix of a long opcode
② The operation codes of different instructions must not repeat.
Expansion method
All-1 codes are reserved as the expansion flag for instructions with one fewer address.
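The effect of reserving codes for expansion can be counted directly. The sketch assumes a 16-bit instruction with a 4-bit opcode and three 4-bit address fields, and that a fixed number of codes (`reserved`, 1 by default for the all-1s pattern) is held back at each level; the function name and parameters are hypothetical.

```python
def expanding_opcode_counts(op_bits=4, addr_bits=4, max_addresses=3, reserved=1):
    """Return {number of addresses: number of instructions} for an expanding opcode scheme."""
    counts = {}
    prefixes = 1                                   # reserved codes carried from the level above
    for k in range(max_addresses, -1, -1):
        field_bits = op_bits if k == max_addresses else addr_bits
        codes = prefixes * (1 << field_bits)       # codes available at this level
        if k == 0:
            counts[k] = codes                      # last level: nothing left to reserve
        else:
            counts[k] = codes - reserved
            prefixes = reserved
    return counts


if __name__ == "__main__":
    print(expanding_opcode_counts())               # only the all-1s code reserved per level
    print(expanding_opcode_counts(reserved=2))     # two codes reserved per level
```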
Operation type
① Data transmission
② Arithmetic and logical operations
③ Shift operation
④ Transfer (branch) operations
unconditional transfer
Executed in any case
conditional transfer
Executed under specific conditions
The difference between transfer instructions and call instructions
A call instruction saves the address of the next instruction (the return address); a transfer instruction does not return
⑤ Input and output operations
Program control instructions
unconditional transfer
conditional transfer
subroutine call
Return instruction
Loop instructions
Privileged instructions
For operating systems and system software
Not available to user programs
Instruction addressing mode
instruction addressing
sequential addressing
(PC) + 1 → PC forms the address of the next instruction
Jump addressing
Implemented through transfer instructions
The current instruction modifies the PC value, and the address of the next instruction is still given by the PC.
Data addressing
Classification
implicit addressing
The other operand of a single-address instruction can be obtained through implicit addressing, derived from ACC
program specification
immediate addressing
The address field directly gives the operand itself, using two's complement representation.
Convenient
Direct addressing
EA=A
The number of bits in A determines the addressing range of the operand
Reduced command length
1 access
Indirect addressing
EA = (A)
One indirect address requires two memory accesses
Facilitates expansion of addressing range
Register addressing
EA = Ri
Directly give the register number
The execution phase does not access main memory, only registers.
The address code length is small (shortened)
register indirect addressing
EA = (Ri)
Ri holds not the operand itself but the address of the main memory unit where the operand is located.
Need to access memory
Operands are in main memory
relative addressing
EA = (PC) + A
A is the displacement relative to the current PC value, expressed in two's complement
for transfer instructions
1 memory access
base addressing
EA = (BR) + A
Used for multiprogramming; BR is the base address register (visible)
1 memory access
indexed addressing
Effective address = formal address A + contents of index register IX, EA = (IX) + A
User-oriented
Dealing with array problems
1 memory access
Stack addressing
last in first out
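The effective-address rules for the memory addressing modes can be collected in one function. This sketch models main memory and the register file as dictionaries; the mode names, register names (`PC`, `BR`, `IX`, `R1`), and sample contents are all assumptions made for the example.

```python
def effective_address(mode, A, mem, regs):
    """Return the effective address EA for the given addressing mode and address field A."""
    if mode == "direct":
        return A                          # EA = A
    if mode == "indirect":
        return mem[A]                     # EA = (A), one extra memory access
    if mode == "register_indirect":
        return regs[f"R{A}"]              # EA = (Ri), A is the register number
    if mode == "relative":
        return regs["PC"] + A             # EA = (PC) + A, A may be negative
    if mode == "base":
        return regs["BR"] + A             # EA = (BR) + A
    if mode == "indexed":
        return regs["IX"] + A             # EA = (IX) + A
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    mem = {100: 500}
    regs = {"PC": 2000, "BR": 4000, "IX": 3, "R1": 800}
    for mode, A in [("direct", 100), ("indirect", 100), ("register_indirect", 1),
                    ("relative", -8), ("base", 100), ("indexed", 100)]:
        print(f"{mode:18s} EA = {effective_address(mode, A, mem, regs)}")
```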
Machine level code representation
Assembly format
AT&T format
The first is the source operand, the second is the destination operand; the direction is from left to right, which is natural
Registers are prefixed with %, and immediate numbers are prefixed with $.
Memory addressing uses ()
Intel format
The first is the destination operand, the second is the source operand; the direction is from right to left
Registers and immediate numbers do not need to be prefixed
Memory addressing uses [ ]
Comparison
Common instructions
Intel format
Read from right to left
Data transfer class
mov instruction
Copy the second operand to the first
eg: mov eax, ebx    copies the value of ebx to eax
push instruction
Push onto the stack
pop instruction
Pop from the stack
Arithmetic and logical operations
add/sub instructions
Addition and subtraction
The result is saved in the first operand
eg: sub eax, 10    eax - 10 → eax
inc/dec instructions (increment/decrement)
Add one and subtract one
imul instruction (multiplication)
Signed multiplication
The result is stored in the first operand, which must be a register
eg: imul eax, [var]    eax × [var] → eax
eg: imul esi, edx, 25    25 × edx → esi
idiv instruction (division)
Signed division
There is only one operand, the divisor
eg: idiv ebx
and/or/xor instructions
AND, OR, XOR
The result is placed in the first operand
eg: and eax, 0fH    the upper 28 bits of eax become 0, and the lowest 4 bits remain unchanged.
not instruction
Bit flip
0→1, 1→0
neg instruction (negation)
Take the negative
eg: neg eax    -eax → eax
shl/shr instructions (shift)
Logical shift left, logical shift right
The first operand is the value to shift; the second operand is the number of bit positions
eg: shl eax, 1    eax logically shifts left by 1 bit; shr ebx, cl    ebx logically shifts right by n bits (n is the value in cl)
control flow class
jmp instruction
Unconditional transfer instruction
jcondition instruction
conditional transfer instructions
cmp/test instructions
cmp compares two values; test performs a bitwise AND operation on the operands
call/ret instructions
Subroutine call and return
CISC and RISC
CISC
Complex instruction set computer
RISC
Reduced instruction set computer
Comparison
Chapter Five CPU
CPU structure
arithmetic unit
Arithmetic logic unit ALU
Arithmetic and logical operations
Temporary register
Temporarily stores data read from main memory
Accumulator register ACC
Temporarily stores ALU result information
General register X
Allows users to program freely and can store data and addresses.
Same as machine word length
visible
Program status word register PSW
Keep various status information of the results
visible
shifter
Perform shift operations on operands or operation results
Counter CT
Control the number of steps for multiplication and division operations
controller
program counter PC
Indicates the storage address in main memory of the instruction to be executed.
The number of bits is the same as the number of memory address bits Memory address depends on storage capacity
visible
Instruction register IR
Holds the instruction currently being executed
The same length as the instruction word length
cannot be replaced by general purpose registers
instruction decoder
Decode the opcode
Invisible
memory address register MAR
Store the address of the main memory unit
Invisible
Memory Data Register MDR
Store information written to or read from main memory
Invisible
timing system
Used to generate various timing signals
Signal generator
CPU function
The controller coordinates and controls the instruction sequence (fetch, analysis, execution) of the program executed by the components of the computer. The arithmetic unit processes data
① Instruction control, completing the operations of fetching, analyzing, and executing instructions
② Operation control
③ Time control
④ Data processing
⑤ Interrupt processing
Instruction execution process
instruction cycle
The time it takes to fetch and execute an instruction
Represented by several machine cycles
machine cycle
Fixed length
Variable length
Instruction fetch, indirect address, execution, interrupt
fetch cycle
According to the PC content, the instruction code is retrieved from the main memory and placed in the IR
The PC stores the address of the instruction. According to this address, the instruction is fetched from the corresponding memory unit and placed in the IR.
(PC) + 1 → PC while fetching
Instructions are fetched automatically by the machine
indirect address cycle
Get the effective address of the operand
The address code of the instruction is sent to MAR and onto the address bus. The CU issues a read command to obtain the effective address, which is finally stored in MDR.
Two accesses
execution cycle
Take the operand and produce the result through the ALU operation according to the opcode of the instruction word of IR
interrupt cycle
Handle interrupt request
Instruction execution schemes
single instruction cycle
multiple instruction cycles
Pipelined scheme
data path
Function
Data transmission path among functional components
Describes where the information starts, which register or multiplexer it passes through, and which register it is finally transmitted to.
controlled by control unit
basic structure
CPU internal single bus mode
CPU internal multi-bus mode
The input and output ports of all registers are connected to multiple common paths
Dedicated data path approach
Instruction fetch
(PC)→MAR, 1→R, MEM(MAR)→MDR, (MDR)→IR
(MDR)→MAR, 1→R, MEM(MAR)→MDR, (MDR)→Y, (ACC)+(Y)→Z, (Z)→ACC
controller
hardwired controller
control unit
IR
The opcode field of the instruction is the input signal to the control unit
FR (Flag Register)
Feedback information from the execution unit
Timing
beat generator
Generate machine period signals and beat signals
The control unit also accepts control signals from the system bus, such as interrupts, DMA
Micro operations
Multiple micro-operations can be completed in one machine cycle
fetch
(PC)→MAR, 1→R, M(MAR)→MDR, (MDR)→IR, (PC)+1→PC
indirect address
Ad(IR)→MAR, 1→R, M(MAR)→MDR
Execute
Non-memory-access instructions
CLA
Clear ACC
COM
Complement ACC
SHR
Arithmetic right shift
CSL
Circular left shift
STP
Stop (halt)
Memory-access instructions
ADD X (add): Ad(IR)→MAR; M(MAR)→MDR; (ACC)+(MDR)→ACC
STA X (store): Ad(IR)→MAR, 1→W; (ACC)→MDR; (MDR)→M(MAR)
LDA X (load): Ad(IR)→MAR, 1→R; M(MAR)→MDR; (MDR)→ACC
Branch instructions
JMP X (unconditional branch): Ad(IR)→PC
BAN X (conditional branch)
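The fetch and execute micro-operations listed above can be traced with a small simulation. This is only a sketch of a toy accumulator machine: the numeric opcodes, the tuple instruction format, and the function name `run` are assumptions made for illustration, not part of the source.

```python
# Toy accumulator machine. Opcode numbering and the (opcode, address)
# instruction format are illustrative assumptions.
LDA, ADD, STA, JMP, STP = range(5)

def run(mem, max_steps=100):
    """Run a program held in mem; instructions are (opcode, address) tuples,
    data words are plain integers. Returns the final (ACC, PC)."""
    pc, acc = 0, 0
    for _ in range(max_steps):
        # Fetch cycle: (PC)->MAR, 1->R, M(MAR)->MDR, (MDR)->IR, (PC)+1->PC
        mar = pc
        mdr = mem[mar]
        ir = mdr
        pc += 1
        opcode, addr = ir
        # Execute cycle, one branch per instruction type
        if opcode == LDA:      # Ad(IR)->MAR, M(MAR)->MDR, (MDR)->ACC
            mar = addr
            mdr = mem[mar]
            acc = mdr
        elif opcode == ADD:    # Ad(IR)->MAR, M(MAR)->MDR, (ACC)+(MDR)->ACC
            mar = addr
            mdr = mem[mar]
            acc = acc + mdr
        elif opcode == STA:    # Ad(IR)->MAR, (ACC)->MDR, (MDR)->M(MAR)
            mar = addr
            mdr = acc
            mem[mar] = mdr
        elif opcode == JMP:    # Ad(IR)->PC
            pc = addr
        elif opcode == STP:    # stop the machine
            break
    return acc, pc

# Program: load mem[5], add mem[6], store to mem[7], stop.
program = [(LDA, 5), (ADD, 6), (STA, 7), (STP, 0), 0, 2, 3, 0]
acc, pc = run(program)
```

After the run, ACC holds the sum of the two data words and the same value has been written back to memory word 7.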
control method
Synchronous control method
unified clock
Asynchronous control mode
Each component works at its own inherent speed
Combined control method
Synchronous control for most operations, asynchronous control for a few
microprogrammed controller
Use storage logic to code micro-operation signals
A machine instruction is written as a microprogram, and each microprogram contains multiple microinstructions. A machine instruction can be decomposed into a sequence of micro-operations (the most basic and cannot be further divided)
Normally, one microprogram cycle corresponds to one instruction cycle
It is to ensure the synchronization of the entire machine control signal.
Microinstructions are stored in the control storage unit
Microcommands are the control signals of microoperations, and microoperations are the execution processes of microcommands.
Main memory is used to store data and programs, and is implemented outside the CPU using RAM. The control memory CM is used to store microprograms. It is implemented with ROM (EPROM) inside the CPU and is accessed according to the address of the microinstruction.
structure
work process
fetch microinstructions
From the opcode field of the machine instruction, the micro-address forming component generates the microprogram entry address corresponding to the machine instruction and sends it to CMAR.
Fetch the corresponding microinstructions one by one from the CM and execute them
Return to the microprogram entry address and repeat the operation
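The work process above can be sketched as a loop over a control memory. The layout below is an assumption for illustration: the control-memory addresses, the microcommand strings, and the use of a next-address field (with `None` marking the end of a microprogram) are all hypothetical.

```python
# Hypothetical control memory (CM): address -> (microcommands, next address).
# A next address of None marks the end of a microprogram.
CM = {
    0: (["(PC)->MAR", "1->R"], 1),
    1: (["M(MAR)->MDR", "(PC)+1->PC"], 2),
    2: (["(MDR)->IR"], None),           # end of the fetch microprogram
    10: (["Ad(IR)->MAR", "1->R"], 11),
    11: (["M(MAR)->MDR"], 12),
    12: (["(MDR)->ACC"], None),         # end of the LDA microprogram
}
FETCH_ENTRY = 0
ENTRY = {"LDA": 10}  # stand-in for the micro-address forming component

def run_microprogram(entry):
    """Read microinstructions one by one, starting at the entry address."""
    cmar = entry
    issued = []
    while cmar is not None:
        microcommands, next_address = CM[cmar]
        issued.extend(microcommands)
        cmar = next_address
    return issued

def execute(opcode):
    """Run the fetch microprogram, then the microprogram selected by the opcode."""
    return run_microprogram(FETCH_ENTRY) + run_microprogram(ENTRY[opcode])

trace = execute("LDA")
```

Here `trace` lists the microcommands in the order they would be issued for one LDA instruction.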
Encoding
direct control
Field direct encoding method
The microcommand part of the microinstruction is divided into several fields. Mutually exclusive microcommands are placed in the same field, and compatible microcommands are placed in different fields.
The all-zero code in a field means no operation
Field indirect encoding method
A microcommand in one field is interpreted with the help of another field, which further shortens the microinstruction word length
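A short calculation shows why field direct encoding shortens the microinstruction compared with direct control. Because the all-zero code is reserved for "no operation", a field holding n mutually exclusive microcommands needs enough bits for n + 1 codes. The group sizes below are made-up numbers for illustration.

```python
def field_bits(n_microcommands):
    """Bits needed for a field of n mutually exclusive microcommands,
    with the all-zero code reserved for 'no operation' (n + 1 codes)."""
    return n_microcommands.bit_length()

def encoded_width(groups):
    """Total control-field width under field direct encoding."""
    return sum(field_bits(n) for n in groups)

groups = [3, 7, 12]                # hypothetical sizes of three exclusive groups
encoded = encoded_width(groups)    # 2 + 3 + 4 bits
direct = sum(groups)               # direct control: one bit per microcommand
```

With these sizes the encoded control field is 9 bits wide, against 22 bits for direct control.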
Microinstruction address formation method
Next-address field method
The next address is given by the next-address field of the microinstruction
Formed from the opcode of the machine instruction
Incrementing counter
Flag method
Test-network method
Hardware method
Microinstruction format
horizontal microinstructions
Strong parallel operation capability
Short execution time
vertical microinstructions
Each microinstruction specifies only one basic operation
long execution time
hybrid microinstructions
Comparison
Exceptions and interrupts
Exceptions
Unexpected events generated inside the CPU are called exceptions (internal interrupts)
Examples: hardware failures (memory check error, bus error) and program-related exceptions (division by 0, overflow, breakpoint, single-step tracing, illegal instruction, stack overflow, address out of bounds, page fault)
Internal interrupts are non-maskable
Exception detection is done by the CPU itself
Classification
Program-related exceptions (software interrupts)
Fault
Missing segment
Missing page (page fault)
Load the required segment or page from disk into main memory, then return to the faulting instruction and continue execution.
Illegal opcode
Divisor is 0
These faults cannot be repaired by the exception handler (the handler is still called); execution cannot return to the breakpoint, and the process is terminated directly.
Trap
Used, for example, to set breakpoints for program debugging. When a trap instruction is executed, the operating system kernel is called automatically, either unconditionally or conditionally.
Abort (hardware failure)
Controller error, memory check error, etc.
Call the interrupt service routine and restart the system
External interrupt (hardware interrupt)
An interrupt request issued to the CPU by a device outside the CPU is called an interrupt (external interrupt)
For example: IO interrupts issued by IO devices (keyboard input, printer out of paper) and special events (the user presses the Esc key, a timer expires)
The CPU must obtain interrupt source information through the interrupt request line
Classification
Maskable interrupt (low priority)
The interrupt request is issued to the CPU through the maskable interrupt request line INTR. The CPU can mask or unmask it by setting the corresponding mask word in the interrupt controller. Masked interrupt requests are not sent to the CPU.
Non-maskable interrupt (high priority)
For example: power outage
Interrupt request issued to the CPU through the dedicated non-maskable interrupt request line NMI
response process
The entire response process cannot be interrupted
Turn off interrupts
Disable responses to new interrupts by setting the interrupt-enable flip-flop (IF) to the disabled state.
Save breakpoints and program state
This makes it possible to return to the interrupted point and continue execution after the exception or interrupt has been handled.
Identify exceptions and interrupts and go to the appropriate program
Identification method
Software identification
The CPU sets an exception status register to record the cause of the exception. Query the exception status register in priority order and then go to the corresponding handler in the kernel
Hardware identification
vectored interrupt
The entry address of an exception or interrupt handler is called an interrupt vector. All interrupt vectors are placed in the interrupt vector table. Each interrupt or exception corresponds to an interrupt type number.
instruction pipeline
definition
An instruction is divided into multiple stages, each stage is completed by the corresponding functional component
Requirements
All instructions have the same length. Instruction formats are regular. Only load/store instructions access memory. Data and instructions are aligned in memory.
Pipeline hazards
① Structural hazard
Multiple instructions compete for the same resource at the same time
Solutions: 1) Stall the later instruction 2) Use separate data memory and instruction memory
② Data hazard
A later instruction uses the result of an earlier instruction, so the two instructions conflict over data.
Read after write (RAW)
Write after read (WAR)
Write after write (WAW)
Solutions: 1) Stall the instruction with the data dependency, and the instructions after it, for one or more cycles 2) Add dedicated forwarding paths (data bypassing) so that a result is passed directly to the instruction that needs it as soon as it is produced 3) Reorder instructions
③ Control hazard
Branches, calls, returns, and similar instructions change the PC value and break the flow of the pipeline.
Solutions: 1) Predict branches and generate the branch target address as early as possible 2) Prefetch instructions along both control-flow directions, taken and not taken
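The three data-hazard types above can be told apart by comparing the register sets of two instructions in program order. The sketch below assumes a made-up representation (a dict with `reads` and `writes` sets); the register names and the two sample instructions are hypothetical.

```python
def classify_hazards(first, second):
    """Return the data-hazard types between two instructions in program order.
    Each instruction is a dict with 'reads' and 'writes' sets of register names."""
    hazards = set()
    if first["writes"] & second["reads"]:
        hazards.add("RAW")   # second reads what first writes
    if first["reads"] & second["writes"]:
        hazards.add("WAR")   # second writes what first reads
    if first["writes"] & second["writes"]:
        hazards.add("WAW")   # both write the same register
    return hazards

add = {"reads": {"r2", "r3"}, "writes": {"r1"}}   # add r1, r2, r3
sub = {"reads": {"r1", "r5"}, "writes": {"r4"}}   # sub r4, r1, r5
found = classify_hazards(add, sub)
```

For this pair only a read-after-write dependency on `r1` is reported, which is the case that stalling or forwarding has to resolve in a simple in-order pipeline.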
Pipeline performance indicators
Pipeline throughput
Pipeline speedup
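Both indicators can be worked out for an ideal pipeline, assuming k stages of equal duration Δt and n instructions with no stalls: the pipelined time is (k + n − 1)Δt, against n·k·Δt without pipelining. The function name and the sample numbers are illustrative.

```python
def pipeline_metrics(k, n, dt):
    """Ideal k-stage pipeline with stage time dt running n instructions, no stalls."""
    pipelined_time = (k + n - 1) * dt
    throughput = n / pipelined_time          # instructions completed per unit time
    speedup = (n * k * dt) / pipelined_time  # time without pipelining / time with it
    return pipelined_time, throughput, speedup

total, throughput, speedup = pipeline_metrics(k=5, n=100, dt=1.0)
```

As n grows, the throughput approaches 1/Δt and the speedup approaches k.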
Advanced pipeline technology
superscalar pipeline
Dynamic multiple-issue technique
Combined with dynamic pipeline scheduling and techniques such as dynamic branch prediction, multiple independent instructions can be executed concurrently in each clock cycle
Very long instruction word (VLIW) technique
Static multiple-issue technique
Multiple instructions that can run in parallel are combined into one very long instruction word with multiple opcodes
Superpipelining
The pipeline is divided into more functional stages
multiprocessor
Single instruction stream single data stream SISD
serial
One processor and one memory
sequential execution
Single instruction stream multiple data stream SIMD
Data parallel technology
vector processor
Multiple instruction stream single data stream MISD
does not exist
Multiple Instruction Multiple Data Streams MIMD
parallel computing
Hardware multi-threading
Fine-grained multithreading
Threads take turns, with their instructions interleaved
Similar to time slice rotation
Coarse-grained multithreading
Switch threads only when the running thread hits a costly stall
Simultaneous multi-threading
In the same clock cycle, multiple instructions in multiple different threads are issued for execution.
shared memory multiprocessor (SMP)
The processors share one physical address space, while each program can still run independently in its own virtual address space
Uniform memory access (UMA)
How the processor is connected to shared memory
Non-uniform memory access (NUMA)
Combinational logic circuits and sequential logic circuits
Combinational logic circuits have no unified clock control; sequential logic circuits must work under clock pulses.
Combinational logic circuits contain no memory cells; sequential logic circuits contain memory cells that store signals.
Chapter 6 Bus
A public information transmission line that can be shared by multiple components in a time-shared manner
Sending is time-shared (only one sender at a time); receiving can be simultaneous
Bus classification
On-chip bus
The bus inside the chip
system bus
address bus
A one-way bus used by the CPU to select the main memory unit address and IO port address, and cannot be transmitted back.
Data Bus
Transmit data information, two-way line
control bus
I/O bus
Connect low to medium speed IO devices
Data lines
Transfer the contents of the data buffer register and the command/status registers
Address lines
Carry the address of the port that exchanges data with the CPU
Control lines
Send read and write signals
communication bus
external bus
A bus that transmits information between computer systems or between computer systems and other systems
System bus structure
Single bus structure
Dual bus structure
The main memory bus transfers data among the CPU, main memory, and channels. The IO bus transfers data between external devices and channels.
Three bus structure
Main memory bus, IO bus, DMA bus
bus standard
ISA
Industry standard architecture
EISA
Extended ISA, for 32-bit CPUs
VESA
Video Electronics Standards Association 32-bit local bus
PCI
Peripheral Component Interconnect
High-performance 32 or 64-bit bus
plug and play
Belongs to local bus
AGP
accelerated graphics port
Belongs to local bus
PCI-E
The latest bus interface standards
serial
Replace PCI, AGP
RS-232C
serial communication bus
USB
universal serial bus
PCMCIA
Laptop interface
IDE
Integrated Drive Electronics
SCSI
small computer system interface
SATA
Serial Advanced Technology Attachment
Performance
Bus transfer cycle
Consists of several bus clock cycles
Bus clock cycle
Bus operating frequency
= clock frequency/N
bus clock frequency
The reciprocal of the bus clock period
bus bandwidth
= Bus operating frequency × (bus width/8)
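The two formulas above combine into a one-line calculation. In this sketch N is taken to be the number of bus clock cycles per bus transfer cycle; the function name and the sample figures (100 MHz clock, N = 2, 32-bit bus) are illustrative assumptions.

```python
def bus_bandwidth(clock_hz, clocks_per_transfer, width_bits):
    """Bandwidth in bytes per second.
    operating frequency = clock frequency / N
    bandwidth = operating frequency * (bus width / 8)"""
    operating_frequency = clock_hz / clocks_per_transfer
    return operating_frequency * (width_bits / 8)

bandwidth = bus_bandwidth(clock_hz=100_000_000, clocks_per_transfer=2, width_bits=32)
```

With these figures the bus performs 50 million transfers per second of 4 bytes each, giving 200 MB/s.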
bus transaction
Request
Arbitration
Addressing
Transfer
Release
Burst transfer
The address is sent first, then the data. The address is transmitted only once, and the subsequent data is sent continuously.
Timing mode
Synchronous timing mode
The system uses a unified clock signal
High transmission efficiency
Poor reliability
Asynchronous timing mode
Timing by handshake signal
No interlocking mode
Semi-interlocking method
Full interlocking method
Bus arbitration control (bus arbitration)
centralized
Daisy chain (chained polling)
circuit design
The device closest to the bus controller has the highest priority
Easy to expand equipment
Sensitive to circuit faults
Only two lines are needed to determine which device gets the bus
Counter-timed polling
circuit design
Counting starts from 0
Fixed priority
Counting resumes from where the previous count ended
Priorities are equal
The counter value is set by the program
variable priority
Complex control
Requires about log₂n device-address lines
independent request
circuit design
quick response
Flexible priority control
Bus control is complicated
Requires 2n lines (one request line and one grant line per device)
distributed
Control logic is dispersed across various devices connected to the bus
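The priority behaviour of the daisy-chain and counter-timed schemes described above can be compared with a small model. This is only a sketch: devices are indexed by their position (0 is nearest the bus controller), requests are a list of booleans, and both function names are made up for illustration.

```python
def daisy_chain(requests):
    """The grant signal passes down the chain; the first requesting device keeps it."""
    for device, requesting in enumerate(requests):
        if requesting:
            return device
    return None

def counter_poll(requests, start=0):
    """The counter steps through device addresses beginning at `start`.
    start=0 every time gives fixed priority; resuming from the last count
    gives every device an equal chance."""
    n = len(requests)
    for offset in range(n):
        device = (start + offset) % n
        if requests[device]:
            return device
    return None

requests = [False, True, True]
chain_winner = daisy_chain(requests)
poll_winner = counter_poll(requests, start=2)
```

With the same pending requests, the daisy chain always grants device 1, while the counter scheme grants device 2 when counting starts there.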
Chapter 7 Input/Output
IO system
IO hardware
IO devices are connected to the mainboard system bus through the device controller
external device
input device
keyboard mouse
output device
monitor
screen size
resolution
Grayscale
The number of distinguishable brightness or color levels; the more gray levels, the more realistic the image
8-bit (256 levels) → 256 colors
refresh
refresh rate
display memory
VRAM capacity = resolution × number of grayscale bits
VRAM bandwidth = resolution × number of grayscale bits × frame rate
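As a worked example of the two VRAM formulas, the sketch below computes the storage for one frame and the bandwidth needed to refresh it. The function names and the sample display (1024 × 768, 8 bits per pixel, 60 frames per second) are illustrative assumptions.

```python
def vram_capacity_bytes(width, height, bits_per_pixel):
    """Storage for one frame: resolution * grayscale bits, converted to bytes."""
    return width * height * bits_per_pixel // 8

def vram_bandwidth_bytes_per_s(width, height, bits_per_pixel, frame_rate):
    """Bandwidth: one frame of data for every refresh."""
    return vram_capacity_bytes(width, height, bits_per_pixel) * frame_rate

capacity = vram_capacity_bytes(1024, 768, 8)
bandwidth = vram_bandwidth_bytes_per_s(1024, 768, 8, 60)
```

This display needs 768 KB for one frame and about 45 MB/s of refresh bandwidth.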
printer
Classification by working method
dot matrix printer
Dot matrix printer
Inkjet Printers
laser printer
In a computer, a Chinese character internal code occupies 2B in the main memory.
interface
A logical device that coordinates the transfer of data between peripherals and the host
input device
output device
External storage
IO software
driver
User program
management program
IO control method
Programmed polling
The CPU repeatedly checks, under program control, whether the IO device is ready
Once the CPU starts IO, it must stop the running of the current program.
Features
The CPU busy-waits ("marks time")
It keeps repeating the same check until the operation completes
The CPU and IO work serially
Program interrupt mode
When the IO device is ready, issue an interrupt request to the CPU
Function
CPU and IO parallelism
Handling hardware failures, software errors
human-computer interaction
Multiple programs, time-sharing operations
work process
Interrupt response occurs at the end of an instruction execution
interrupt request
Interrupt response arbitration
Non-maskable interrupt > Internal exception > Maskable interrupt
Interrupt response conditions
The interrupt source has an interrupt request
The CPU has interrupts enabled
The current instruction has finished executing
interrupt vector
The entry address of the interrupt program
Determine interrupt type
After the CPU responds to the interrupt, it obtains the interrupt type number by identifying the interrupt source and calculates the address of the corresponding interrupt vector. Then according to this address, the entry address of the interrupt service program is taken out from the interrupt vector table, and then sent to the PC, and then the interrupt service program is executed.
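The lookup described above can be sketched as a two-step address calculation. The 4-byte vector size, the table contents, and the function names are assumptions made for illustration; real machines differ in how the table is laid out.

```python
VECTOR_SIZE = 4  # bytes per table entry (an assumption for this sketch)

def vector_address(type_number, table_base=0):
    """Address of the interrupt vector for a given interrupt type number."""
    return table_base + type_number * VECTOR_SIZE

def dispatch(type_number, vector_table, table_base=0):
    """Read the service routine's entry address from the table; the CPU
    would load this value into the PC."""
    return vector_table[vector_address(type_number, table_base)]

# Hypothetical vector table: table address -> entry address of the service routine
vector_table = {0: 0x1000, 4: 0x1200, 8: 0x1400}
pc = dispatch(2, vector_table)
```

For type number 2 the vector sits at table address 8, so the PC is loaded with 0x1400.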
response priority
Implemented by a hardware priority-queuing circuit
processing priority
Dynamically adjust using interrupt masking technology
interrupt mask word
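How mask words set the processing priority can be shown with a small model. It assumes the common convention that a source listed in a handler's mask word is blocked while that handler runs, so a source that masks more sources has higher processing priority. The three sources and their mask words are made up for illustration.

```python
# Hypothetical mask words: for each source, the set of sources that are
# masked (blocked) while that source's service routine is running.
MASK = {
    "A": {"A", "B", "C"},
    "B": {"B"},
    "C": {"B", "C"},
}

def can_preempt(running, requester):
    """A request gets through only if the running handler does not mask it."""
    return requester not in MASK[running]

def processing_priority(mask):
    """Order the sources from highest to lowest processing priority:
    the more sources a handler masks, the higher its priority."""
    return sorted(mask, key=lambda source: len(mask[source]), reverse=True)

order = processing_priority(MASK)
```

Here the processing order is A, C, B: a request from C can interrupt B's handler, but a request from B cannot interrupt C's.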
Interrupt handling process
multiple interrupts
DMA (direct memory access) mode
Direct data path, between main memory and IO devices
There is a direct data path between main memory and the DMA interface
logical path
A method of controlling information transmission entirely by hardware
During data transfer, the determination of the main memory address is directly completed by the hardware circuit.
The main memory must open a dedicated buffer to provide and receive peripheral data in a timely manner.
DMA needs to be pre-processed by the program before transmission, and post-processed by interrupt after the transmission is completed.
A DMA request is answered at the end of a bus transaction, and the DMA controller has higher priority than the CPU for use of the bus.
DMA transfer method
1) Stop CPU access to memory
2) Cycle stealing
3) DMA and CPU alternately access memory
DMA transfer process
1) Preprocessing (uses the CPU): the CPU completes the necessary preparations
The DMA controller cannot interact with the user program directly; the device driver acts as the intermediary.
2) Data transfer (does not use the CPU): fully controlled by the DMA hardware
3) Post-processing (uses the CPU): the DMA controller sends an interrupt request to the CPU
channel mode
When the host executes an IO command, it starts the relevant channel, executes the channel program, and completes the IO operation.
Channel processor, a coprocessor specially used for IO management, with interrupt, DMA, and program control functions
The channel program is in main memory, executed by the channel, and can only be executed in an IO system with a channel
Most efficient
IO interface (IO controller)
Function
Address decoding, device selection
Communication between host and peripherals
data buffer
Signal format conversion
Transmit control commands and status information
basic structure
The IO interface is connected to the memory and CPU through the IO bus on the host side.
Ports are registers used for reading and writing in interface circuits. Multiple ports and control logic together form an interface
The IO instruction is a privileged instruction that is used by the underlying IO software of the operating system kernel.
type
Divided by data transmission method
serial
parallel
Divided by control method
program interface
Interrupt interface
DMA interface
Divided by function
Programmable
Not programmable
IO port
Registers in the interface circuit that the CPU can access directly
Addressing
Unified addressing
memory mapping
IO ports and memory cells are distinguished by their address codes
Treat IO ports as storage units for address allocation
Use unified memory access instructions to access IO ports
Independent addressing
IO mapping method
Set up special IO instructions to access
When executing the instruction, the CPU uses the address lines to select the IO port and the data lines to transfer data between CPU registers and the IO port.