Page 1
Instruction Pip elining and Hazards F orm ula Sheet
Instruction Pip elining Basics
• Definition : Ov erlapping execution of instructions b y dividing in to stages (e.g., F etc h, Deco de,
Execute, Memory , W rite-Bac k).
• Pip eline Stages : k , t ypically 5 for RISC, eac h stage time T
stage
.
• Clo c k Cycle Time : T
cycle
= max(T
stage
i
)+T
setup
, where T
setup
is latc h o v erhead (e.g., 5-10 ns).
• Pip eline Throughput : IPS˜
f
1+ stalls/CPI
, w here f =
1
T
cycle
.
• Pip eline Latency : T
instr
=k·T
cycle
, for one ins truction.
• Ideal Sp eedup : S
pip eline
=k , compared to single-cycle (T
single
=
?
T
stage
i
).
• Execution Time : T
exec
= [(k+(N -1))·T
cycle
]+T
stall
, where N is n um b er of instructions.
Pip eline Hazards
• Structural Hazard : Resourc e conflict (e.g., single memory p ort).
– Probabilit y: P
structural
?
N
conflicts
N instr
.
– Stall Time: T
stall
=n
conflict
·T
cycle
, t ypically n
conflict
= 1 .
• Data Hazard : Dep endency b et w een instruction s (RA W, W AR, W A W).
– RA W (Read After W rite): T
stall
=d·T
cycle
, where d is dep endency distance (e.g., 1-2 cycles).
– Probabilit y: P
data
?
N
dep endencies
N instr
.
– F orw arding: Reduces T
stall
to 0 if data a v ailable p ost-ALU, T
forw ard
˜T
m ux
.
• Con trol Hazard : Branc h instruction disrupts pip eline flo w.
– Stall Time: T
stall
= (k
flush
-1)·T
cycle
, where k
flush
is stages flushed (e.g., 2-3).
– Branc h P enalt y: T
branc h
=P
branc h
·P
mispredict
·T
stall
, where P
branc h
is branc h frequency (e.g.,
0.2), P
mispredict
is mis prediction rate (e.g., 0.1).
Hazard Mitigation T ec hniques
• Structural Hazard Solutions :
– Separate Resources: e.g., Instruction/Data memory , S
mem
=S
inst
+S
data
.
– Stall: T
stall
=T
cycle
, lo w o v erhead but reduces throughput.
• Data Hazard Solutions :
– F orw arding/Bypassing: T
forw ard
˜T
m ux
, eliminates stall if data ready .
– Stalls: T
stall
=d·T
cycle
, used if forw arding not p ossible (e.g., m emory load).
– Compiler Sc heduling: Reorder instructions, T
reorder
=O(N) , reduces P
data
.
• Con trol Hazard Solutions :
– Branc h Prediction: Static/Dynamic, accuracy A
predict
= 1-P
mispredict
, reduce s T
branc h
.
– Dela y ed Branc h: Fill d slots with useful instructions, T
dela y
= 0 if no stall.
– Branc h T arget Buffer (BTB): T
BTB
˜T
cac he
, hit rate h
BTB
˜ 0.9 .
1
Page 2
Instruction Pip elining and Hazards F orm ula Sheet
Instruction Pip elining Basics
• Definition : Ov erlapping execution of instructions b y dividing in to stages (e.g., F etc h, Deco de,
Execute, Memory , W rite-Bac k).
• Pip eline Stages : k , t ypically 5 for RISC, eac h stage time T
stage
.
• Clo c k Cycle Time : T
cycle
= max(T
stage
i
)+T
setup
, where T
setup
is latc h o v erhead (e.g., 5-10 ns).
• Pip eline Throughput : IPS˜
f
1+ stalls/CPI
, w here f =
1
T
cycle
.
• Pip eline Latency : T
instr
=k·T
cycle
, for one ins truction.
• Ideal Sp eedup : S
pip eline
=k , compared to single-cycle (T
single
=
?
T
stage
i
).
• Execution Time : T
exec
= [(k+(N -1))·T
cycle
]+T
stall
, where N is n um b er of instructions.
Pip eline Hazards
• Structural Hazard : Resourc e conflict (e.g., single memory p ort).
– Probabilit y: P
structural
?
N
conflicts
N instr
.
– Stall Time: T
stall
=n
conflict
·T
cycle
, t ypically n
conflict
= 1 .
• Data Hazard : Dep endency b et w een instruction s (RA W, W AR, W A W).
– RA W (Read After W rite): T
stall
=d·T
cycle
, where d is dep endency distance (e.g., 1-2 cycles).
– Probabilit y: P
data
?
N
dep endencies
N instr
.
– F orw arding: Reduces T
stall
to 0 if data a v ailable p ost-ALU, T
forw ard
˜T
m ux
.
• Con trol Hazard : Branc h instruction disrupts pip eline flo w.
– Stall Time: T
stall
= (k
flush
-1)·T
cycle
, where k
flush
is stages flushed (e.g., 2-3).
– Branc h P enalt y: T
branc h
=P
branc h
·P
mispredict
·T
stall
, where P
branc h
is branc h frequency (e.g.,
0.2), P
mispredict
is mis prediction rate (e.g., 0.1).
Hazard Mitigation T ec hniques
• Structural Hazard Solutions :
– Separate Resources: e.g., Instruction/Data memory , S
mem
=S
inst
+S
data
.
– Stall: T
stall
=T
cycle
, lo w o v erhead but reduces throughput.
• Data Hazard Solutions :
– F orw arding/Bypassing: T
forw ard
˜T
m ux
, eliminates stall if data ready .
– Stalls: T
stall
=d·T
cycle
, used if forw arding not p ossible (e.g., m emory load).
– Compiler Sc heduling: Reorder instructions, T
reorder
=O(N) , reduces P
data
.
• Con trol Hazard Solutions :
– Branc h Prediction: Static/Dynamic, accuracy A
predict
= 1-P
mispredict
, reduce s T
branc h
.
– Dela y ed Branc h: Fill d slots with useful instructions, T
dela y
= 0 if no stall.
– Branc h T arget Buffer (BTB): T
BTB
˜T
cac he
, hit rate h
BTB
˜ 0.9 .
1
RISC Pip eline Characteristics
• Design: Fixed-length instructions (L
instr
= 32 bits), simple ops, CPI˜ 1 .
• Stage Balance : T
stage
i
˜T
ALU
˜T
mem
, minimizes T
cycle
.
• Load/Store Arc hitecture : Only LO AD/STORE access memory , T
mem
in one stage.
• Pip eline E?iciency : Throughput˜f , reduced b y hazards, T
stall
?P
hazard
.
• Branc h Handling : Early branc h resolution, T
branc h-resolv e
˜T
deco de
, minimizes flush.
V ector Pro cessing and Pip elining
• Definition : Single instruction op erates on v ector data, e.g., SIMD (Single Instruction, Multiple
Data).
• V ector Length : V , t ypically 4-64 elemen ts, pro c essed in parallel.
• Execution Time : T
v ector
=T
init
+?
V
P
?·T
cycle
, where P is pip eline width (e.g., 4).
• Sp eedup : S
v ector
˜P , limited b y v ector setup T
init
˜ 10-20 cycles.
• Memory Bandwidth : B
v ector
=V ·w·f , where w is elemen t width (e.g., 32 bits).
• Pip eline Depth : k
v ector
, deep er for v ector ops, T
cycle
?k
v ector
.
In terrupt Pro cessing in Pip elining
• In terrupt Latency : T
in terrupt
=T
flush
+T
handler-fetc h
+T
con text-sa v e
.
• Pip eline Flush : T
flush
= (k-1)·T
cycle
, cle ars in-fligh t instructions.
• Con text Sa v e Time : T
con text
=N
regs
·T
mem
, where N
regs
is registers sa v ed.
• In terrupt F requency : f
in t
, impacts throughput, IPS˜
f
1+f in t·T in terrupt
.
• Precise In terrupts : Ensure correct state, T
precise
˜T
flush
+T
commit
.
P erformance Metrics
• Effectiv e CPI : CPI
eff
= 1+
?
P
hazardi
·n
stalli
, where P
hazardi
is hazard probabilit y , n
stalli
is stall
cycles.
• Execution Time : T
exec
=N · CPI
eff
·T
cycle
, where N is instruction coun t.
• Throughput : IPS =
f
CPI
eff
, MIPS =
IPS
10
6
.
• Sp eedup (Pip elined vs. Single-Cycle) : S =
T
single
T
pip eline
˜
?
T stage
i
T
cycle
·(1+ stalls/CPI)
.
• Amdahl’s La w : S =
1
(1-P)+
P
k
, where P is parallelizable fraction, k is pip eline stages or v ector
width.
• Hazard Impact : T
hazard
=
?
(P
hazardi
·T
stalli
) , minimized b y forw arding, prediction.
Applications and Concepts
• Pip elining : Increases throughput in RISC pro cessors, critical for high f (GHz).
• Hazard Mitigation : F orw arding, branc h prediction, BTB reduce CPI
eff
.
• V ector Pro cessing : Used in SIMD, GPUs, S
v ector
?P , for arra y pro cessing.
• VLIW (V ery Long Instruction W ord) : Multiple ops p er instruction, T
cycle
? N
ops
, but
hazard-prone.
• In terrupt Handling : Critical for real-time systems, T
in terrupt
affects resp onsiv eness.
• Pip eline Conflicts : Managed b y c ompiler sc heduling, hardw are in terlo c ks, T
sc hedule
=O(N) .
2
Read More