CIS 501 (Martin/Roth): Technology 1
CIS 501
Computer Architecture
Unit 3: Technology
Slides originally developed by Amir Roth with contributions by Milo Martin at University of Pennsylvania with sources that included University of Wisconsin slides by Mark Hill, Guri Sohi, Jim Smith, and David Wood.
This Unit
- Technology basis
- Transistors & wires
- Cost & fabrication
- Implications of transistor scaling (Moore’s Law)
Readings
- Chapter 1.1 of MA:FSPTCM
- Paper
- G. Moore, “Cramming More Components onto Integrated Circuits”
CSE 371 (Roth): Performance 4
Review: Simple Datapath
- How are instructions executed?
- Fetch instruction (Program counter into instruction memory)
- Read registers
- Calculate values (adds, subtracts, address generation, etc.)
- Access memory (optional)
- Calculate next program counter (PC)
- Repeat
- Clock period = longest delay through datapath
[Figure: simple datapath, labeled PC, Insn Mem, Register File (s1, s2, d), Data Mem, and the constant 4]
Recall: Processor Performance
- Programs consist of simple operations (instructions)
- Add two numbers, fetch data value from memory, etc.
- Program runtime = “seconds per program” = (instructions/program) * (cycles/instruction) * (seconds/cycle)
- Instructions per program: “dynamic instruction count”
- Runtime count of instructions executed by the program
- Determined by program, compiler, instruction set architecture (ISA)
- Cycles per instruction: “CPI” (typical range: 2 to 0.5)
- On average, how many cycles does an instruction take to execute?
- Determined by program, compiler, ISA, micro-architecture
- Seconds per cycle: clock period, length of each cycle
- Inverse metric: cycles per second (Hertz) or cycles per ns (GHz)
- Determined by micro-architecture, technology parameters
- This unit: transistors & semiconductor technology
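The runtime equation above can be sketched in a few lines of Python. This is only an illustration: the function name `runtime_seconds` and the workload numbers (1 billion instructions, CPI of 2, 1 GHz) are made up for the example and do not come from the slides.

```python
def runtime_seconds(instructions, cpi, clock_hz):
    """Seconds per program = instructions * (cycles/instruction) * (seconds/cycle)."""
    total_cycles = instructions * cpi
    # Dividing by cycles/second is the same as multiplying by seconds/cycle.
    return total_cycles / clock_hz

# Hypothetical workload: 1 billion dynamic instructions, CPI of 2, 1 GHz clock.
baseline = runtime_seconds(1_000_000_000, 2.0, 1e9)
# Halving CPI (a micro-architecture change) halves the runtime.
lower_cpi = runtime_seconds(1_000_000_000, 1.0, 1e9)
# Doubling the clock frequency (a technology change) also halves it.
faster_clock = runtime_seconds(1_000_000_000, 2.0, 2e9)

print(baseline, lower_cpi, faster_clock)
```

The three factors multiply, so an improvement in any one of them only helps if it does not make another one worse by the same amount.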
Semiconductor Technology
- Basic technology element: MOSFET
- Solid-state component acts like electrical switch
- MOS: metal-oxide-semiconductor
- Conductor, insulator, semi-conductor
- FET: field-effect transistor
- Channel conducts source→drain only when voltage applied to gate
- Channel length: characteristic parameter (short → fast)
- Aka “feature size” or “technology”
- Currently: 0.032 micron (μm), 32 nanometers (nm)
- Continued miniaturization (scaling) known as “Moore’s Law”
- Won’t last forever, physical limits approaching (or are they?)
[Figure: MOSFET diagrams, labeled source, channel, drain, insulator, gate, and substrate]
Complementary MOS (CMOS)
- Voltages as values
- Power (VDD) = “1”, Ground = “0”
- Two kinds of MOSFETs
- N-transistors
- Conduct when gate voltage is 1
- Good at passing 0s
- P-transistors
- Conduct when gate voltage is 0
- Good at passing 1s
- CMOS
- Complementary n-/p- networks form boolean logic (i.e., gates)
- And some non-gate elements too (important example: RAMs)
[Figure: CMOS circuit schematic, labeled power (1), ground (0), input, output (“node”), n-transistor, and p-transistor]
Basic CMOS Logic Gate
- Inverter : NOT gate
- One p-transistor, one n-transistor
- Basic operation
- Input = 0
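- P-transistor closed (conducting), n-transistor open (not conducting)
- Power charges output (1)
- Input = 1
- P-transistor open (not conducting), n-transistor closed (conducting)
- Output discharges to ground (0)
[Figure: inverter shown for input 0 → output 1 and input 1 → output 0]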
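As a rough illustration of the inverter's operation, the switch-level behavior described above can be modeled in Python. This is a logical sketch only (no voltages, delays, or capacitance), and the helper names `n_conducts`, `p_conducts`, and `inverter` are hypothetical.

```python
VDD, GND = 1, 0  # power is logic "1", ground is logic "0"

def n_conducts(gate):
    """An n-transistor conducts when its gate voltage is 1."""
    return gate == 1

def p_conducts(gate):
    """A p-transistor conducts when its gate voltage is 0."""
    return gate == 0

def inverter(inp):
    """One p-transistor between power and output, one n-transistor between output and ground."""
    pull_up = p_conducts(inp)    # p-transistor connects output to power
    pull_down = n_conducts(inp)  # n-transistor connects output to ground
    # In a complementary pair exactly one of the two conducts,
    # so the output is never floating and never shorted.
    assert pull_up != pull_down
    return VDD if pull_up else GND

for value in (0, 1):
    print(value, "->", inverter(value))
```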
Capacitance
- Gate capacitance
- Source/drain capacitance
- Wire capacitance
- Negligible for short wires
- Implication: number of “outputs” of gate matters
Transistor Geometry: Width
- Transistor width, set by designer for each transistor
- Wider transistors:
- Lower resistance of channel (increases drive strength) – good!
- But, increases capacitance of gate/source/drain – bad!
- Result: set width to balance these conflicting effects
[Figure: transistor geometry, labeled gate, source, drain, bulk Si, width, and length. Diagrams © Krste Asanovic, MIT]
Transistor Geometry: Length & Scaling
- Transistor length: characteristic of “process generation”
- 45nm refers to the transistor gate length, same for all transistors
- Shrink transistor length:
- Lower resistance of channel (shorter) – good!
- Lower gate/source/drain capacitance – good!
- Result: switching speed improves linearly as gate length shrinks
[Figure: transistor geometry, labeled gate, source, drain, bulk Si, width, and length. Diagrams © Krste Asanovic, MIT]
Wire Geometry
[Figure: wire geometry, labeled pitch, width, length, and height]
- Transistors 1-dimensional for design purposes: width
- Wires 4-dimensional: length, width, height, “pitch”
- Longer wires have more resistance
- “Thinner” wires have more resistance
- Closer wire spacing (“pitch”) increases capacitance
[Figure: IBM CMOS7, 6 layers of copper wiring. From slides © Krste Asanovic, MIT]
Increasing Problem: Wire Delay
- RC Delay of wires
- Resistance proportional to: resistivity * length / (cross section)
- Wires with smaller cross section have higher resistance
- Resistivity (type of metal, copper vs aluminum)
- Capacitance proportional to length
- And wire spacing (closer wires have larger capacitance)
- Permittivity or “dielectric constant” (of material between wires)
- Result: delay of a wire is quadratic in length
- Insert “inverter” repeaters for long wires
- Why? To bring it back to linear delay… but repeaters still add delay
- Trend: wires are getting slower relative to transistors
- And take relatively longer to cross relatively larger chips
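The quadratic wire delay and the effect of repeaters can be sketched as follows. This is a proportionality model under stated assumptions, not a calibrated one: the per-unit resistance and capacitance, the repeater delay of 0.5, and the function names are all hypothetical, and every segment is assumed to be driven by one repeater.

```python
def wire_delay(length, r_per_unit=1.0, c_per_unit=1.0):
    """RC delay of an unbroken wire: R and C each grow with length, so R*C grows with length**2."""
    resistance = r_per_unit * length
    capacitance = c_per_unit * length
    return resistance * capacitance

def repeated_wire_delay(length, segments, repeater_delay,
                        r_per_unit=1.0, c_per_unit=1.0):
    """Split the wire into equal segments, each driven by its own inverter (repeater)."""
    segment_length = length / segments
    per_segment = wire_delay(segment_length, r_per_unit, c_per_unit) + repeater_delay
    return segments * per_segment

# Doubling the length of an unbroken wire quadruples its delay.
print(wire_delay(10), wire_delay(20))

# With one repeater per unit of length, delay grows linearly with length,
# but every repeater contributes its own delay.
print(repeated_wire_delay(10, 10, 0.5), repeated_wire_delay(20, 20, 0.5))
```

Keeping the segment length fixed is what restores linear growth; too many repeaters make the added repeater delay dominate, so in practice the segment count is a trade-off.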
RC Delay Model Ramifications
- Want to reduce resistance
- Wide drive transistors (width specified per device)
- Short gate length
- Short wires
- Want to reduce capacitance
- Number of connected devices
- Less-wide transistors (gate capacitance of next stage)
- Short wires
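The competing goals above can be illustrated with a unitless sketch of a single gate driving a set of loads. It assumes that channel resistance is inversely proportional to the driver's width and that each connected gate contributes capacitance proportional to its own width; `stage_delay` and its parameters are hypothetical names, and the numbers carry no physical units.

```python
def stage_delay(driver_width, load_widths, wire_cap=0.0):
    """Delay of one stage, proportional to (driver resistance) * (total load capacitance)."""
    resistance = 1.0 / driver_width            # wider driver -> lower channel resistance
    capacitance = sum(load_widths) + wire_cap  # gate capacitance of each connected device, plus the wire
    return resistance * capacitance

# Baseline: unit-width driver, one unit-width load.
print(stage_delay(1.0, [1.0]))
# Wider driver: lower resistance, so the stage is faster.
print(stage_delay(2.0, [1.0]))
# More connected devices: more capacitance, so the stage is slower.
print(stage_delay(1.0, [1.0, 1.0, 1.0, 1.0]))
# Wider loads: the next stage drives harder, but this stage sees more capacitance.
print(stage_delay(1.0, [2.0]))
```

The last case is the conflict named on the width slide: widening a transistor helps the stage it drives and hurts the stage that drives it.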
Fabrication & Cost
Cost
- Metric: $
- In the grand scheme: the CPU accounts for only a fraction of total system cost
- Some of that is profit (Intel’s, Dell’s)
- We are concerned about chip cost
- Unit cost: costs to manufacture individual chips
- Startup cost: cost to design chip, build the manufacturing facility

              Desktop      Laptop       Netbook     Phone
CPU cost ($)  $100–$300    $150–$350    $50–$100    $10–$
% of total    10–30%       10–20%       20–30%      20–30%
Other costs: memory, display, power supply/battery, storage, software
Manufacturing Defects
- Defects can arise
- Under-/over-doping
- Over-/under-dissolved insulator
- Mask mis-alignment
- Particle contaminants
- Try to minimize defects
- Process margins
- Design rules
- Minimal transistor size, separation
- Or, tolerate defects
- Redundant or “spare” memory cells
- Can substantially improve yield
[Figure: example circuits labeled defective, defective, slow, and correct]
Unit Cost: Integrated Circuit (IC)
- Chips built in multi-step chemical processes on wafers
- Cost / wafer is constant, f(wafer size, number of steps)
- Chip (die) cost is related to area
- Larger chips mean fewer of them per wafer
- Cost is more than linear in area
- Why? Random defects
- Larger chips mean fewer working ones
- Chip cost ~ (chip area)^α, with α > 1
- Wafer yield: % of wafer area that is chips
- Die yield: % of chips that work
- Yield is increasingly non-binary - fast vs slow chips
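A minimal sketch of why cost grows faster than area, under assumptions that are not on the slide: dies per wafer is approximated as usable wafer area divided by die area, and die yield follows a simple Poisson defect model, exp(-defect density × die area). The wafer cost, wafer diameter, and defect density below are hypothetical values chosen only to show the trend.

```python
import math

def dies_per_wafer(wafer_diameter_mm, die_area_mm2, wafer_yield=1.0):
    """Crude estimate: usable wafer area divided by die area (wafer_yield = fraction of wafer that is chips)."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    return int((wafer_area * wafer_yield) // die_area_mm2)

def die_yield(die_area_mm2, defects_per_mm2):
    """Assumed Poisson model: probability that a die of this area has zero random defects."""
    return math.exp(-defects_per_mm2 * die_area_mm2)

def cost_per_good_die(wafer_cost, wafer_diameter_mm, die_area_mm2, defects_per_mm2):
    """Wafer cost is fixed, so it is spread over only the dies that work."""
    good_dies = dies_per_wafer(wafer_diameter_mm, die_area_mm2) * die_yield(die_area_mm2, defects_per_mm2)
    return wafer_cost / good_dies

# Hypothetical process: $5000 wafer, 300 mm diameter, 0.001 defects per mm^2.
small = cost_per_good_die(5000, 300, 100, 0.001)
large = cost_per_good_die(5000, 300, 200, 0.001)
print(small, large, large / small)
```

Doubling the die area halves the number of dies and also lowers the fraction that work, so the cost ratio comes out larger than 2.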
Additional Unit Cost
- After manufacturing, there are additional unit costs
- Testing: how do you know chip is working?
- Packaging: high-performance packages are expensive
- Determined by maximum operating temperature
- And number of external pins (off-chip bandwidth)
- Burn-in: stress test chip (detects unreliable chips early)
- Re-testing: how do you know packaging/burn-in didn’t damage chip?
Fixed Costs
- For new chip design
- Design & verification: ~$100M (500 person-years @ $200K per)
- Amortized over “proliferations”, e.g., Core i3, i5, i7 variants
- For new (smaller) technology generation
- ~$3B for a new fab
- Amortized over multiple designs
- Amortized by “rent” from companies that don’t fab themselves
- Moore’s Law generally increases startup cost
- More expensive fabrication equipment
- More complex chips take longer to design and verify
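The design-cost estimate above (500 person-years at $200K each) and the idea of amortization can be worked through in a few lines. The sales volumes and the $50 unit cost are hypothetical, as are the function names; only the 500 and $200K figures come from the slide.

```python
def design_cost(person_years, cost_per_person_year):
    """Fixed design and verification cost."""
    return person_years * cost_per_person_year

def cost_per_chip(fixed_cost, units_sold, unit_cost):
    """Fixed cost is amortized over every unit sold, then added to the per-unit cost."""
    return fixed_cost / units_sold + unit_cost

fixed = design_cost(500, 200_000)  # the slide's ~$100M estimate
print(fixed)

# Hypothetical volumes: the same design sold as 1 million vs 10 million units,
# each with an assumed $50 manufacturing (unit) cost.
print(cost_per_chip(fixed, 1_000_000, 50))
print(cost_per_chip(fixed, 10_000_000, 50))
```

This is why spreading one design over several product variants matters: the fixed cost per chip falls as total volume rises, while the unit cost does not.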
All Roads Lead To Multi-Core
+ Multi-cores reduce unit costs
- Higher yield than same-area single-cores
- Why? Defect on one of the cores? Sell remaining cores for less
- IBM manufactures CBE (“cell processor”) with eight cores
- But PlayStation3 software is written for seven cores
- Yield for eight working cores is too low
- Sun manufactures Niagaras (UltraSparc T1) with eight cores
- Also sells six- and four- core versions (for less)
+ Multi-cores can reduce design costs too
- Replicate existing designs rather than re-design larger single-cores
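The yield argument can be made concrete with a binomial sketch. It assumes each core works independently with the same probability and ignores defects in shared logic; the 90% per-core yield is a hypothetical number, not data for any real chip.

```python
from math import comb

def prob_at_least(total_cores, needed_cores, core_yield):
    """Probability that at least `needed_cores` of `total_cores` independent cores work."""
    return sum(
        comb(total_cores, working)
        * core_yield ** working
        * (1 - core_yield) ** (total_cores - working)
        for working in range(needed_cores, total_cores + 1)
    )

core_yield = 0.9  # hypothetical per-core yield
all_eight = prob_at_least(8, 8, core_yield)
seven_of_eight = prob_at_least(8, 7, core_yield)
print(all_eight, seven_of_eight)
```

Under these assumptions, requiring only seven working cores out of eight makes far more dies sellable than requiring all eight, which matches the motivation given for the seven-core software target.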
Technology Scaling
Moore’s Law: Technology Scaling
- Moore’s Law: aka “technology scaling”
- Continued miniaturization (esp. reduction in channel length)
- Improves switching speed, power/transistor, area(cost)/transistor
- Reduces transistor reliability
- Literally: DRAM density (transistors/area) doubles every 18 months
- Public interpretation: performance doubles every 18 months
- Not quite right, but helps performance in three ways
[Figure: transistor, labeled channel, source, drain, and gate]
Moore’s Effect #1: Transistor Count
- Linear shrink in each dimension
- 180nm, 130nm, 90nm, 65nm, 45nm, 32nm, …
- Each generation is a 1.414 linear shrink
- Shrink each dimension (2D)
- Results in 2x more transistors (1.414*1.414)
- Reduces cost per transistor
- More transistors can increase performance
- Job of a computer architect: use the ever-increasing number of transistors
- Examples: caches, exploiting parallelism at all levels
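The arithmetic behind the generation list can be checked with a short sketch. It takes the slide's shrink factor as the square root of 2 (about 1.414) and compares the computed sizes with the named generations; the function names are hypothetical, and the 5% tolerance only reflects that generation names are rounded.

```python
import math

SHRINK = math.sqrt(2)  # ~1.414 linear shrink per generation

def next_generation(feature_nm):
    """Feature size after one generation of linear shrink."""
    return feature_nm / SHRINK

def density_gain(generations):
    """Shrinking both dimensions by 1.414 doubles transistors per unit area each generation."""
    return (SHRINK * SHRINK) ** generations

named = [180, 130, 90, 65, 45, 32]
for current, following in zip(named, named[1:]):
    predicted = next_generation(current)
    error = abs(predicted - following) / following
    print(current, "->", round(predicted, 1), "vs named", following, "within 5%:", error < 0.05)

# Five generations (180nm to 32nm) give roughly 2**5 = 32x the transistors in the same area.
print(density_gain(5))
```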
Summary
Technology Summary
- Has a first-order impact on computer architecture
- Cost (die area)
- Performance (transistor delay, wire delay)
- Changing rapidly
- Most significant trends for architects (and thus CIS501)
- More and more transistors
- What to do with them? → integration → parallelism
- Logic is improving faster than memory & cross-chip wires
- “Memory wall” → caches, more integration
- This unit: a quick overview, just scratching the surface
[Slide callout: rest of semester]