Performance and MIPS
CPU Time for a program:
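A minimal sketch of the standard textbook relation, CPU time = instruction count × CPI × clock cycle time (equivalently, instruction count × CPI / clock rate). The function name `cpu_time` and the example numbers are illustrative assumptions.

```python
def cpu_time(instruction_count: int, cpi: float, clock_rate_hz: float) -> float:
    """Return CPU time in seconds: instruction count * CPI / clock rate."""
    clock_cycles = instruction_count * cpi  # total cycles the program needs
    return clock_cycles / clock_rate_hz     # cycles / (cycles per second)


# Hypothetical program: 2 million instructions, CPI of 2.0, 1 GHz clock.
print(cpu_time(2_000_000, 2.0, 1e9))
```

Halving the CPI or doubling the clock rate halves the result, which is the usual way to compare two machines running the same program.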
Geometric mean:
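The geometric mean of n values is usually defined as the n-th root of their product. A small sketch, assuming strictly positive inputs such as normalized execution-time ratios; `geometric_mean` is a hypothetical helper name.

```python
import math


def geometric_mean(ratios):
    """Return the n-th root of the product of the given positive values."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    # Averaging the logarithms avoids overflow when multiplying many values.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical speedup ratios for three benchmarks.
print(geometric_mean([2.0, 8.0, 4.0]))
```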
MIPS Cheat Sheet:
Register convention:
$zero - Always 0
$v0-1 - Result registers
$a0-3 - Argument registers
$t0-7 - Temporary registers
$s0-7 - Saved registers, preserved across function calls
$sp - Stack pointer
$ra - Return address
Arithmetic:
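add $t0, $t0, $t1 # t0 = t0 + t1
addi $t0, $t0, 5 # t0 = t0 + 5
addu $t0, $t0, $t1 # t0 = t0 + t1 (unsigned: no overflow trap)
addiu $t0, $t0, 5 # t0 = t0 + 5 (unsigned: no overflow trap)
sub $t0, $t0, $t1 # t0 = t0 - t1
subi $t0, $t0, 5 # t0 = t0 - 5 (pseudo-instruction in some assemblers; same as addi $t0, $t0, -5)
subu $t0, $t0, $t1 # t0 = t0 - t1 (unsigned: no overflow trap)
subiu $t0, $t0, 5 # t0 = t0 - 5 (pseudo-instruction in some assemblers; same as addiu $t0, $t0, -5)
Memory: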
lw $s0, 0($a0) # s0 = MEM[a0] (word)
sw $s0, 0($a0) # MEM[a0] = s0 (word)
l.d $f0, 0($a0) # f0, f1 = MEM[a0] (double word, floating-point register pair)
s.d $f0, 0($a0) # MEM[a0] = f0, f1 (double word, floating-point register pair)
Logical:
sll $t0, $t1, 2 # t0 := t1 * 4
srl $t0, $t1, 2 # t0 := t1 / 4 (unsigned)
and $t0, $t1, $t2 # t0 := t1 & t2
andi $t0, $t1, 2 # t0 := t1 & 2
or $t0, $t1, $t2 # t0 := t1 | t2
ori $t0, $t1, 2 # t0 := t1 | 2
nor $t0, $t1, $zero # t0 := ~t1
Conditionals:
beq rs, rt, L1 # if (rs == rt) jump to L1
bne rs, rt, L1 # if (rs != rt) jump to L1
j L1 # jump to L1
Remember:
- Common mistake: incrementing an address by 1 instead of 4 when stepping through words.
- MIPS uses byte addressing, so consecutive words are 4 bytes apart.
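The byte-addressing rule can be sketched as a small address calculation. This assumes 4-byte words; `word_address` and the base address are hypothetical values chosen for illustration.

```python
WORD_SIZE = 4  # bytes per MIPS word


def word_address(base: int, index: int) -> int:
    """Byte address of element `index` in a word array starting at `base`."""
    return base + index * WORD_SIZE


# Hypothetical array base; each step moves 4 bytes, not 1.
for i in range(3):
    print(i, hex(word_address(0x10010000, i)))
```

In assembly this corresponds to scaling the index with `sll` by 2 before adding it to the base register.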
Pipeline
- Fetch instruction. PC → Instruction memory.
- Decode the instruction and read from registers.
- Execute the instruction.
  - Arithmetic/logical computation.
  - Computation of effective memory address.
  - Computation of jump address/conditional address.
- Read / Write from / to memory for load/store instructions.
- Write back the result into the result register (RF).
- Structural hazards
  - Two instructions need the same hardware resource in the same cycle.
- Data hazards
  - An instruction depends on the result of an earlier instruction that is not yet available.
- Control hazards
  - The branch condition and target address are not yet known when the next instruction has to be fetched.
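A rough sketch of a pipeline diagram for the five stages listed above, using the common abbreviations IF, ID, EX, MEM and WB. It assumes an ideal pipeline in which none of the hazards occur, so no stalls are inserted; `pipeline_rows` is a hypothetical helper.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_rows(num_instructions):
    """One row per instruction; column c holds the stage active in cycle c + 1."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage  # instruction i enters the pipeline in cycle i + 1
        rows.append(row)
    return rows


# Three back-to-back instructions with no stalls.
for row in pipeline_rows(3):
    print(" ".join(f"{cell:<3}" for cell in row))
```

Without stalls, n instructions finish in n + 4 cycles. Each stall caused by a hazard would shift the affected row, and every row after it, one column to the right.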
Caches
Average Memory Access Time (AMAT):
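A minimal sketch of the usual single-level definition, AMAT = hit time + miss rate × miss penalty. The function name `amat` and the numbers are illustrative assumptions.

```python
def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Average memory access time, in the same unit as hit_time and miss_penalty."""
    return hit_time + miss_rate * miss_penalty


# Hypothetical cache: 1-cycle hit, 5% miss rate, 100-cycle miss penalty.
print(amat(1, 0.05, 100))
```

For a two-level hierarchy, the miss penalty of the first level can itself be computed as the AMAT of the second level.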
Prep for exam
- MIPS
  - Calculate the number of data cache accesses (reads and writes) and instruction cache accesses.
- Pipeline
  - TODO: Understand each stage in depth.
  - Learn how to draw a pipeline diagram.
  - Go over optimizations and how to avoid stalls.
- Performance calculations
- Memory and cache
  - Go over all the different types and how to calculate sizes and bit counts.
  - Go over how to calculate how long certain misses take for given sizes etc.
Notes
Page table size = Number of virtual pages × Size of each page table entry
Number of virtual pages = Total virtual address space / Page size
Number of virtual pages = (Number of processes × Size of each virtual address space) / Page size
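The page table formulas can be sketched for a single process as follows. This assumes a flat (single-level) page table and power-of-two sizes; the function and parameter names are hypothetical.

```python
def page_table_size_bytes(virtual_address_bits: int,
                          page_size_bytes: int,
                          entry_size_bytes: int) -> int:
    """Size of a flat page table for one virtual address space."""
    virtual_space = 2 ** virtual_address_bits           # bytes of virtual memory
    num_virtual_pages = virtual_space // page_size_bytes
    return num_virtual_pages * entry_size_bytes


# Hypothetical system: 32-bit virtual addresses, 4 KiB pages, 4-byte entries.
print(page_table_size_bytes(32, 4096, 4))
```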
Cache Offset: If block size is B bytes, offset = log2(B).
Cache Index: Given by how many sets there are (for a direct-mapped cache, how many blocks).
If direct mapped, meaning 1 block per set:
Number of blocks = Total Cache Size / Size per block
If set associative:
Number of sets = (Total Cache Size / Size per block) / Associativity level
Cache Index = log2(Number of sets), which is log2(Number of blocks) when direct mapped
Cache Tag: If A is the total amount of address bits, tag = A - offset - index.
A = tag + offset + index
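The offset, index and tag rules can be sketched together in one function. It assumes all sizes are powers of two; `cache_fields` and `log2_exact` are hypothetical helper names.

```python
def log2_exact(n: int) -> int:
    """log2 of a power of two; raises for anything else."""
    if n <= 0 or n & (n - 1):
        raise ValueError(f"{n} is not a power of two")
    return n.bit_length() - 1


def cache_fields(address_bits, cache_size_bytes, block_size_bytes, associativity=1):
    """Split an address into tag, index and offset bit counts."""
    num_blocks = cache_size_bytes // block_size_bytes
    num_sets = num_blocks // associativity   # equals num_blocks if direct mapped
    offset = log2_exact(block_size_bytes)
    index = log2_exact(num_sets)
    tag = address_bits - offset - index
    return {"tag": tag, "index": index, "offset": offset}


# Hypothetical 16 KiB cache, 16-byte blocks, 32-bit addresses.
print(cache_fields(32, 16 * 1024, 16))                   # direct mapped
print(cache_fields(32, 16 * 1024, 16, associativity=4))  # 4-way set associative
```

Raising the associativity while keeping the cache size fixed moves bits from the index to the tag.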
TLB Index: If the TLB has E entries (or E sets, if it is set associative), index = log2(E).
TLB Tag: Tag is the rest of the virtual page number bits. So, if we have V virtual address bits and P page offset bits, tag = V - P - index.
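A sketch of the TLB bit split, assuming the page offset bits are removed from the virtual address first, so that only the virtual page number is divided into tag and index. The names `tlb_fields` and `log2_exact` are hypothetical, and all sizes are assumed to be powers of two.

```python
def log2_exact(n: int) -> int:
    """log2 of a power of two; raises for anything else."""
    if n <= 0 or n & (n - 1):
        raise ValueError(f"{n} is not a power of two")
    return n.bit_length() - 1


def tlb_fields(virtual_address_bits: int, page_size_bytes: int, tlb_sets: int):
    """Bit counts for the TLB tag and index, plus the page offset."""
    page_offset = log2_exact(page_size_bytes)
    vpn_bits = virtual_address_bits - page_offset  # virtual page number
    index = log2_exact(tlb_sets)
    tag = vpn_bits - index
    return {"tag": tag, "index": index, "page_offset": page_offset}


# Hypothetical TLB: 32-bit virtual addresses, 4 KiB pages, 64 sets.
print(tlb_fields(32, 4096, 64))
```

A fully associative TLB corresponds to a single set, so the index is 0 bits and the whole virtual page number is the tag.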
Size of each block entry = Number of status bits (valid, dirty etc.) + Tag bits + Data bits (usually block size in bytes * 8).
Size of the cache = Number of blocks * Size of each block entry.
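A short sketch of the storage calculation, assuming the total cache size is the per-entry size multiplied by the number of blocks. The function names and the single valid bit are illustrative assumptions.

```python
def entry_bits(tag_bits: int, block_size_bytes: int, status_bits: int = 1) -> int:
    """Bits stored per block: status bits + tag bits + data bits."""
    return status_bits + tag_bits + block_size_bytes * 8


def cache_total_bits(num_blocks: int, tag_bits: int,
                     block_size_bytes: int, status_bits: int = 1) -> int:
    """Total storage in bits, including tag and status overhead."""
    return num_blocks * entry_bits(tag_bits, block_size_bytes, status_bits)


# Hypothetical cache: 1024 blocks, 18-bit tags, 16-byte blocks, 1 valid bit.
print(entry_bits(18, 16))
print(cache_total_bits(1024, 18, 16))
```

The result is larger than the nominal data capacity because the tag and status bits are stored alongside the data.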