
















































CS203 Advanced Computer Architecture
Figure: memory holds both the program's instructions and its data. Example program:
#include <stdio.h>
int main(){ printf("Hello, world!\n"); }
Processor components: Program Counter, Instruction Fetch, Instruction Decode, Registers, Arithmetic Logical Units (ALU), Complex Arithmetic Operations (Mul/div), Branch/Jump, Memory Operations
By loading different programs into memory, your computer can perform different functions
Pareto principle (https://en.wikipedia.org/wiki/Pareto_principle):
Top 10% own 67% of the wealth in the U.S.
80% of users use only 20% of features
You only need to know 2% of English words to understand 90% of conversations
Source: https://www.anandtech.com/show/16143/insights-into-ddr5-subtimings-and-latencies
Standard     Year  Data Rate (MT/s)  Bandwidth (GB/s)  CAS (clk)  Latency (ns)
(unlabeled)  1992   100               0.80              3         24.00
                    133               1.07              3         22.50
(unlabeled)  1998   400               3.20              5         25.00
                    667               5.33              5         15.
                    800               6.40              6         15.
DDR 2        2003   400               3.20              5         25.00
                    667               5.33              5         15.00
                    800               6.40              6         15.00
DDR 3        2007   800               6.40              6         15.00
                   1066               8.53              8         15.
                   1333              10.67              9         13.
                   1600              12.80             11         13.
                   1866              14.93             13         13.
                   2133              17.07             14         13.
DDR 4        2014  1600              12.80             11         13.75
                   1866              14.93             13         13.92
                   2133              17.07             15         14.06
                   2400              19.20             17         14.17
                   2666              21.33             19         14.25
                   2933              23.46             21         14.32
                   3200              25.60             22         13.75
DDR 5        2020  3200              25.60             22         13.75
                   3600              28.80             26         14.
                   4000              32.00             28         14.
                   4400              35.20             32         14.
                   4800              38.40             34         14.
                   5200              41.60             38         14.
                   5600              44.80             40         14.
                   6000              48.00             42         14.
                   6400              51.20             46         14.
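The bandwidth and latency columns in the rows labeled DDR 2 and later appear to follow directly from the data rate and the CAS count. The sketch below shows that arithmetic; it assumes a 64-bit (8-byte) data bus and two transfers per clock, neither of which is stated in the notes, and the function names are hypothetical.

```python
def bandwidth_gb_s(data_rate_mt_s, bus_bytes=8):
    """Peak bandwidth in GB/s, assuming an 8-byte-wide data bus (an assumption)."""
    return data_rate_mt_s * bus_bytes / 1000


def cas_latency_ns(data_rate_mt_s, cas_clk):
    """CAS latency in ns, assuming two transfers per clock cycle (an assumption)."""
    clock_mhz = data_rate_mt_s / 2  # the clock runs at half the transfer rate
    return cas_clk * 1000 / clock_mhz


# The 1600 MT/s, CAS 11 row of the table lists 12.80 GB/s and 13.75 ns.
print(f"{bandwidth_gb_s(1600):.2f} GB/s, {cas_latency_ns(1600, 11):.2f} ns")
```

Under these assumptions, a higher data rate raises bandwidth in proportion, while the latency in ns stays roughly flat because the CAS count grows along with the clock.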
Figure: The "latency" gap between CPU and DRAM. Axes: Latency (ns) and DRAM/CPU Ratio, by year (1992, 1998, 2003, 2007, 2014, 2020). Series: CPU, DRAM, DRAM/CPU Ratio. CPU models: i486, Pentium II, Pentium 4, Core 2, Core i7-4790K, Core i5-10600K.
Fetching an instruction from memory is 50x slower than other CPU operations! It is even worse when the instruction also needs to access data: that costs another 50+ cycles.
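One way to see where a figure of about 50 cycles can come from is to divide a DRAM latency by the CPU cycle time. This is a minimal sketch: the 13.75 ns latency is taken from the DRAM table, the 0.25 ns cycle time from the CPI example in these notes, and the function name is hypothetical.

```python
def latency_in_cycles(latency_ns, cycle_time_ns):
    """Express a memory latency as a number of CPU clock cycles."""
    return latency_ns / cycle_time_ns


# 13.75 ns of DRAM latency at a 0.25 ns cycle time
print(latency_in_cycles(13.75, 0.25))  # prints 55.0
```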
$$\mathrm{Speedup}_{max}(f_1, \infty) = \frac{1}{1 - f_1}$$
$$\mathrm{Speedup}_{max}(f_2, \infty) = \frac{1}{1 - f_2}$$
$$\mathrm{Speedup}_{max}(f_3, \infty) = \frac{1}{1 - f_3}$$
$$\mathrm{Speedup}_{max}(f_4, \infty) = \frac{1}{1 - f_4}$$
$$\mathrm{Speedup}_{max}(f, \infty) = \frac{1}{1 - f}$$
$$\mathrm{Speedup}_{parallel}(f_{parallelizable}, \infty) = \frac{1}{1 - f_{parallelizable}}$$
$$\mathrm{Speedup}_{enhanced}(f, s, r) = \frac{1}{(1 - f) + \mathrm{perf}(r) + \frac{f}{s}}$$
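The speedup formulas above can be evaluated numerically. The sketch below treats the perf(r) term as a fixed overhead expressed as a fraction of the original execution time, which is one reading of the slide rather than something it states; the function and parameter names are hypothetical.

```python
import math


def speedup(f, s, overhead=0.0):
    """Speedup when a fraction f of execution time is accelerated by a factor s.

    overhead stands in for the perf(r) term, as a fraction of the original
    execution time (an assumption about what that term means).
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    denominator = (1.0 - f) + overhead + f / s
    return math.inf if denominator == 0.0 else 1.0 / denominator


def speedup_max(f):
    """Upper bound on speedup: the accelerated fraction takes no time at all."""
    return speedup(f, math.inf)


print(speedup(0.5, 2))                 # half the program made 2x faster
print(speedup(0.5, 2, overhead=0.25))  # prints 1.0: the overhead cancels the gain
print(speedup_max(0.9))                # close to 10, however large s becomes
```

The last call shows why the untouched fraction dominates: with 90% of the execution accelerated without limit, the remaining 10% still caps the speedup at about 10x.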
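Figure: memory hierarchy below the processor core, from fastest at the top to larger at the bottom.
Registers: 32 or 64 words, < 1 ns
Next levels, in order of capacity: KBs ~ MBs, GBs, TBs
Next levels, in order of latency: a few ns, tens of ns, tens of us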
Cycle time = 0.25 ns
Each $ access = 2 cycles
Each DRAM access = 55 cycles
$$\mathrm{CPI}_{average} = 1 + 100\% \times [2 + (1 - 90\%) \times 55] + 20\% \times [2 + (1 - 90\%) \times 55] = 10 \text{ cycles}$$
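The average CPI calculation can be reproduced step by step. This sketch reads the numbers as follows, which is an interpretation rather than something the slide spells out: every instruction (100%) is fetched through the cache, 20% of instructions also access data, 90% is the cache hit rate, and a miss adds a DRAM access on top of the cache access. All names are hypothetical.

```python
def average_cpi(base_cpi, cache_cycles, dram_cycles, hit_rate,
                fetch_fraction=1.0, data_fraction=0.2):
    """Average cycles per instruction, including memory access cost."""
    # Every access pays the cache latency; misses also pay the DRAM latency.
    access_cost = cache_cycles + (1.0 - hit_rate) * dram_cycles
    return base_cpi + fetch_fraction * access_cost + data_fraction * access_cost


cpi = average_cpi(base_cpi=1, cache_cycles=2, dram_cycles=55, hit_rate=0.90)
cycle_time_ns = 0.25
print(f"{cpi:.2f} cycles per instruction, {cpi * cycle_time_ns:.2f} ns")
# prints: 10.00 cycles per instruction, 2.50 ns
```

Raising the hit rate shrinks only the miss term: with `hit_rate=1.0` the same call gives 1 + 2 + 0.2 × 2 = 3.4 cycles.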