Cache Design & Performance: Miss Penalties, Write Strategies, & CPU Impact, Slides of Advanced Computer Architecture

An in-depth analysis of cache design, covering cache performance, techniques for reducing miss penalty, memory hierarchy design concerns, write buffer strategies, and write miss policies. It also discusses the impact of caches on CPU performance, with worked examples and calculations.

Typology: Slides

2011/2012

Uploaded on 08/06/2012

amrusha


Today's Topics

Recap: Cache Design

Cache Performance

Reducing Miss Penalty

Summary

Recap: Memory Hierarchy Designer's Concerns

Block placement: Where can a block be placed in the upper level?

Block identification: How is a block found if it is in the upper level?

Block replacement: Which block should be replaced on a miss?

Write strategy: What happens on a write?

Recap: Write Buffer for Write Through

A level-2 cache is introduced between the level-1 cache and the DRAM main memory.

There are two policies for handling a write miss:

  • Write Allocate
  • No-Write Allocate

Recap: Write Miss Policies

Write Allocate:

  • A block is allocated in the cache on a write miss, i.e., the block being written is brought into the cache and the write then proceeds as a write hit

No-Write Allocate:

  • The block stays out of the cache until the program tries to read it, i.e., on a write miss the block is modified only in the lower-level memory
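The difference between the two policies can be seen on a short access trace. The following is a minimal sketch, assuming a toy cache modeled as an unbounded set of block addresses (no capacity limit, no replacement); the function name `count_hits` and the trace itself are hypothetical and only serve as an illustration.

```python
# Toy model (an assumption for illustration): the cache is a set of block
# addresses with unlimited capacity, so no block is ever replaced.

def count_hits(accesses, write_allocate):
    """Return the number of cache hits for a list of (op, block) accesses."""
    cache = set()
    hits = 0
    for op, block in accesses:
        if block in cache:
            hits += 1
        elif op == "read" or write_allocate:
            # Read misses always allocate the block; write misses allocate
            # it only under the write-allocate policy.
            cache.add(block)
        # Otherwise (write miss under no-write allocate) the write goes
        # straight to the lower-level memory and the cache is unchanged.
    return hits

# Hypothetical trace: two writes to block 100, a read and a write to
# block 200, then another write to block 100.
trace = [("write", 100), ("write", 100), ("read", 200),
         ("write", 200), ("write", 100)]

hits_write_allocate = count_hits(trace, write_allocate=True)
hits_no_write_allocate = count_hits(trace, write_allocate=False)
print(hits_write_allocate, hits_no_write_allocate)  # prints: 3 1
```

With write allocate, the first write to block 100 brings the block in, so the later writes to it hit. With no-write allocate, block 100 is never read and therefore never enters the cache, so every write to it misses.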

Impact of Caches on CPU Performance:

Example

Assumptions:

  • The cache miss penalty is 100 clock cycles
  • All instructions normally take 1 clock cycle
  • Average miss rate is 2%
  • Average memory references per instruction = 1.5
  • Average number of cache misses per 1000 instructions = 30

Find the impact of the cache on CPU performance, using both the misses per instruction and the miss rate.

Impact of Caches on CPU Performance:

Example

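$$\text{CPU time} = (\text{CPU execution clock cycles} + \text{Memory stall clock cycles}) \times \text{Clock cycle time}$$

CPU time with cache, using misses per instruction:

$$\text{CPU time} = IC \times \left(1.0 + \frac{30}{1000} \times 100\right) \times \text{Clock cycle time} = IC \times 4.00 \times \text{Clock cycle time}$$

CPU time with cache, using the miss rate:

$$\text{CPU time} = IC \times \left(1.0 + 1.5 \times 2\% \times 100\right) \times \text{Clock cycle time} = IC \times 4.00 \times \text{Clock cycle time}$$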
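The two calculations can be checked numerically. The sketch below uses the figures from the example (1.5 memory references per instruction, as used in the miss-rate calculation); the variable names are chosen here for illustration only. It computes the effective cycles per instruction, i.e., the factor multiplying IC and the clock cycle time.

```python
# Figures from the example; variable names are illustrative assumptions.
miss_penalty = 100            # clock cycles per miss
base_cpi = 1.0                # cycles per instruction without memory stalls
miss_rate = 0.02              # misses per memory reference
mem_refs_per_instr = 1.5      # memory references per instruction
misses_per_1000_instr = 30    # misses per 1000 instructions

# Method 1: memory stall cycles from misses per instruction.
cpi_from_misses_per_instr = base_cpi + (misses_per_1000_instr / 1000) * miss_penalty

# Method 2: memory stall cycles from the miss rate.
cpi_from_miss_rate = base_cpi + mem_refs_per_instr * miss_rate * miss_penalty

# Slowdown relative to a cache that never misses (CPI = base_cpi).
slowdown = cpi_from_miss_rate / base_cpi

print(round(cpi_from_misses_per_instr, 2),
      round(cpi_from_miss_rate, 2),
      round(slowdown, 2))
```

Both methods agree because 1.5 references per instruction at a 2% miss rate is exactly 30 misses per 1000 instructions; either way the CPU time is about four times that of a cache that never misses.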

Cache Performance (Review)

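$$\text{Memory stall clock cycles} = \text{Number of reads} \times \text{Read miss rate} \times \text{Read miss penalty} + \text{Number of writes} \times \text{Write miss rate} \times \text{Write miss penalty}$$

Averaging the read and write miss rates and miss penalties:

$$\text{Memory stall clock cycles} = \text{Number of memory accesses} \times \text{Miss rate} \times \text{Miss penalty}$$

$$\text{Average memory access time} = \text{Hit time} + \text{Miss rate} \times \text{Miss penalty}$$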
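As a quick illustration, these review formulas can be written as small helper functions. This is a sketch only: the function names are hypothetical, and the average memory access time uses the standard definition, hit time plus miss rate times miss penalty. The sample values reuse the earlier example (1-cycle hit, 2% miss rate, 100-cycle penalty), with an assumed count of 1500 memory accesses.

```python
def memory_stall_cycles(accesses, miss_rate, miss_penalty):
    """Stall cycles using a single averaged miss rate and miss penalty."""
    return accesses * miss_rate * miss_penalty


def memory_stall_cycles_split(reads, read_miss_rate, read_miss_penalty,
                              writes, write_miss_rate, write_miss_penalty):
    """Stall cycles with reads and writes accounted for separately."""
    return (reads * read_miss_rate * read_miss_penalty
            + writes * write_miss_rate * write_miss_penalty)


def average_memory_access_time(hit_time, miss_rate, miss_penalty):
    """Average memory access time: hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty


# Illustrative values: 1500 accesses is an assumption, not from the slides.
stalls = memory_stall_cycles(accesses=1500, miss_rate=0.02, miss_penalty=100)
amat = average_memory_access_time(hit_time=1, miss_rate=0.02, miss_penalty=100)
print(round(stalls, 2), round(amat, 2))
```

When reads and writes share the same miss rate and penalty, the split form reduces to the averaged form, which is why the simpler formula is normally used.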

Cache Performance (Review)

Note that the average memory access time is an indirect measure of CPU performance and is not a substitute for the execution time.

However, this formula can be used to decide between split caches (i.e., an instruction cache and a data cache) and a unified cache.

E.g., to find out which of these two cache organizations has the lower miss rate, we can use this formula as follows:

Cache Performance: Example

  • A hit takes 1 clock cycle and the miss penalty is 100 clock cycles
  • A load or store takes one extra clock cycle on the unified cache

Assume write-through caches with a write buffer, and ignore stalls due to the write buffer. Find the average memory access time in each case.

Note: to solve this problem, we first find the miss rate and then the average memory access time.

Cache Performance: Solution

1: Miss Rate

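$$\text{Miss rate} = \frac{\text{Misses per 1000 instructions} / 1000}{\text{Memory accesses per instruction}}$$

$$\text{Miss rate}_{\text{16KB inst}} = \frac{3.82/1000}{1.0} \approx 0.0038$$

$$\text{Miss rate}_{\text{16KB data}} = \frac{40.9/1000}{0.36} \approx 0.114$$

As about 74% of the memory accesses are instruction references (1.0 out of 1.36 accesses per instruction), the overall miss rate for the split caches is

$$(74\% \times 0.0038) + (26\% \times 0.114) \approx 0.032$$

$$\text{Miss rate}_{\text{32KB unified}} = \frac{43.3/1000}{1 + 0.36} \approx 0.0318$$

i.e., the unified cache has a slightly lower miss rate.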
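The miss-rate step can be reproduced with a few lines of arithmetic. The sketch below takes the misses per 1000 instructions, the 0.36 data accesses per instruction, and the 74%/26% access mix from the solution; the variable names are illustrative assumptions. It keeps the unrounded intermediate values, so the last digit may differ slightly from a hand calculation with rounded inputs.

```python
# Inputs taken from the solution; names are illustrative assumptions.
misses_per_1000_instr_icache = 3.82    # 16KB instruction cache
misses_per_1000_instr_dcache = 40.9    # 16KB data cache
misses_per_1000_instr_unified = 43.3   # 32KB unified cache

instr_accesses_per_instr = 1.0   # every instruction is fetched once
data_accesses_per_instr = 0.36   # loads and stores per instruction

# Miss rate = (misses per 1000 instructions / 1000) / (accesses per instruction)
miss_rate_icache = (misses_per_1000_instr_icache / 1000) / instr_accesses_per_instr
miss_rate_dcache = (misses_per_1000_instr_dcache / 1000) / data_accesses_per_instr
miss_rate_unified = (misses_per_1000_instr_unified / 1000) / (
    instr_accesses_per_instr + data_accesses_per_instr)

# Access mix used in the solution: about 74% instruction, 26% data references.
frac_instr, frac_data = 0.74, 0.26
miss_rate_split = frac_instr * miss_rate_icache + frac_data * miss_rate_dcache

print(f"instruction cache: {miss_rate_icache:.4f}")
print(f"data cache:        {miss_rate_dcache:.4f}")
print(f"split (overall):   {miss_rate_split:.4f}")
print(f"unified:           {miss_rate_unified:.4f}")
```

The comparison `miss_rate_unified < miss_rate_split` holds for these inputs, matching the conclusion that the unified cache has the slightly lower miss rate.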