Lab on cache direct mapping and set associativity

These exercises explain the cache principle, with simulations as well.


Review

How is this cache different if…

- the block is 4 words?

- the index field is 12 bits?

[Figure: a direct-mapped cache with 1024 entries (indexes 0–1023). The 32-bit address splits into a 20-bit tag, a 10-bit index, and a 2-bit byte offset; a comparator on the tag and the valid bit produces Hit, and a mux uses the offset to select the addressed byte of the data word.]
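To make the review concrete, here is a minimal Python sketch (the field_widths helper is ours, not from the slides) that computes the tag/index/offset split for the cache in the figure and for the two variations asked about:

```python
def field_widths(addr_bits, num_entries, block_bytes):
    """Return (tag, index, block-offset) bit widths for a direct-mapped cache."""
    offset_bits = (block_bytes - 1).bit_length()    # log2(block size in bytes)
    index_bits = (num_entries - 1).bit_length()     # log2(number of entries)
    return addr_bits - index_bits - offset_bits, index_bits, offset_bits

# The cache in the figure: 1024 entries, one-word (4-byte) blocks.
print(field_widths(32, 1024, 4))     # (20, 10, 2)

# ...with 4-word (16-byte) blocks: two more offset bits, two fewer tag bits.
print(field_widths(32, 1024, 16))    # (18, 10, 4)

# ...with a 12-bit index (4096 entries): again the tag shrinks to 18 bits.
print(field_widths(32, 4096, 4))     # (18, 12, 2)
```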

2-way set associative implementation

[Figure: a 2-way set-associative cache with 2^k sets. An m-bit address splits into an (m-k-n)-bit tag, a k-bit index, and an n-bit block offset selecting within a 2^n-byte block; both ways' tags are compared in parallel, and a 2-to-1 mux picks the data from the way that hits.]
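The lookup path in this figure can be sketched in a few lines of Python; the names below are illustrative, and the replacement policy is not modeled:

```python
# Sketch of a 2-way set-associative lookup: the index picks a set, both
# ways' tags are compared (in parallel, in hardware), and the data of the
# matching way is selected -- the "2-to-1 mux" in the figure.

def make_cache(num_sets):
    return [[{"valid": False, "tag": None, "data": None} for _ in range(2)]
            for _ in range(num_sets)]

def lookup(cache, addr, index_bits, offset_bits):
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    for way in cache[index]:              # two comparators, one per way
        if way["valid"] and way["tag"] == tag:
            return True, way["data"]      # hit: the mux selects this way
    return False, None                    # miss in both ways
```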

Compare a 2-way set-associative cache with a fully-associative cache:

- Only 2 comparators are needed, rather than one per block.
- Cache tags are a little shorter too.
- Deciding what to replace is also simpler, since each set holds only two candidates.

Set associative caches are a general idea.

By now you have noticed that a 1-way set associative cache is the same as a direct-mapped cache.

Similarly, if a cache has 2^k blocks, a 2^k-way set associative cache would be the same as a fully-associative cache.

For example, a cache with 8 blocks can be organized as:

- 1-way: 8 sets, 1 block each (direct mapped)
- 2-way: 4 sets, 2 blocks each
- 4-way: 2 sets, 4 blocks each
- 8-way: 1 set, 8 blocks (fully associative)
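The sets-versus-ways arithmetic can be checked with a tiny Python loop (illustrative only): at every point on the spectrum, sets times ways equals the total number of blocks.

```python
# For a fixed number of blocks, associativity just trades sets for ways.
blocks = 8
for ways in (1, 2, 4, 8):
    sets = blocks // ways
    label = {1: "direct mapped", blocks: "fully associative"}.get(ways, "set associative")
    print(f"{ways}-way: {sets} sets of {ways} block(s)  ({label})")
```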

Summary

Larger block sizes can take advantage of spatial locality by loading data from not just one address, but also nearby addresses, into the cache.

Associative caches assign each memory address to a particular set within the cache, but not to any specific block within that set.

 Set sizes range from 1 (direct-mapped) to 2^k (fully associative).
 Larger sets and higher associativity lead to fewer cache conflicts and lower miss rates, but they also increase the hardware cost.
 In practice, 2-way through 16-way set-associative caches strike a good balance between lower miss rates and higher costs.

Next, we'll talk more about measuring cache performance, and also discuss the issue of writing data to a cache.

Inconsistent memory

But now the cache and memory contain different, inconsistent data!

First Rule of Data Management: No inconsistent data.
Second Rule: Don't Even Think About Violating the 1st Rule.

How can we ensure that subsequent loads will return the right value?

This is also problematic if other devices are sharing the main memory, as in I/O or a multiprocessor system.

[Figure: after the store, the cache line at index 110 holds the new value while main memory still holds the old 42803.]

Write-through caches

A write-through cache solves the inconsistency problem by forcing all writes to update both the cache and the main memory.

This is simple to implement and keeps the cache and memory consistent.

Why might this not be so good?

[Figure: the store Mem[214] = 21763 updates both the cache line at index 110 and main memory.]

The bad thing is that by forcing every write to go to main memory, we use up bandwidth between the cache and the memory.
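As a rough sketch of the policy (dict-based stand-ins for the cache and memory, with our own helper name; not a hardware model), every store goes to memory whether or not it hits in the cache:

```python
memory = {}    # address -> value (stand-in for main memory)
cache = {}     # index -> {"tag": ..., "data": ...}

def store_write_through(addr, value, index_bits=3):
    index = addr & ((1 << index_bits) - 1)
    tag = addr >> index_bits
    line = cache.get(index)
    if line is not None and line["tag"] == tag:
        line["data"] = value     # update the cache on a write hit...
    memory[addr] = value         # ...but always update main memory

store_write_through(0b11010110, 21763)   # Mem[214] = 21763
print(memory[214])                       # 21763 is in memory immediately
```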

Write buffers

Write-through caches can result in slow writes, so processors typically include a write buffer, which queues pending writes to main memory and permits the CPU to continue …

Buffers are commonly used when two devices run at different speeds.

 If a producer generates data too quickly for a consumer to handle, the extra data is stored in a buffer and the producer can continue on with other tasks, without waiting for the consumer.
 Conversely, if the producer slows down, the consumer can continue running at full speed as long as there is excess data in the buffer.

For us, the producer is the CPU and the consumer is the main memory.

Producer → Buffer → Consumer
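A write buffer is just a queue between the fast producer and the slow consumer; here is a minimal Python sketch (the function names are ours):

```python
from collections import deque

memory = {}
write_buffer = deque()

def cpu_store(addr, value):
    write_buffer.append((addr, value))   # the CPU continues immediately

def memory_drain_one():
    if write_buffer:                     # the slower memory empties the queue
        addr, value = write_buffer.popleft()
        memory[addr] = value

cpu_store(214, 21763)    # returns right away, no stall
memory_drain_one()       # memory catches up later
print(memory)            # {214: 21763}
```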

Finishing the write back

We don't need to store the new value back to main memory unless the cache block gets replaced.

For example, on a read from Mem[142], which maps to the same cache block, the modified cache contents will first be written to main memory.

Only then can the cache block be replaced with data from address 142.

[Figure: the cache line at index 110 holding 21763 with its dirty bit set; after the write back to main memory, the dirty bit is clear.]

Write-back cache discussion

The advantage of write-back caches is that not all write operations need to access main memory, as they do with write-through caches.

 If a single address is frequently written to, then it doesn't pay to keep writing that data through to main memory.
 If several bytes within the same cache block are modified, they will only force one memory write operation, at write-back time.

Write-back cache discussion

Each block in a write-back cache needs a dirty bit to indicate whether or not it must be saved to main memory before being replaced; otherwise we might perform unnecessary writebacks.

Notice the penalty for the main memory access will not be applied until the execution of some subsequent instruction following the write.

 In our example, the write to Mem[214] affected only the cache.
 But the load from Mem[142] resulted in two memory accesses: one to save data to address 214, and one to load data from address 142.
  • The write can be "buffered" as was shown for write-through.
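Putting these slides together, here is a minimal write-back sketch (direct-mapped with a 3-bit index, dict-based, helper names ours) that reproduces the Mem[214]/Mem[142] example; both addresses share index 110 but have different tags:

```python
INDEX_BITS = 3
memory = {}
cache = {}    # index -> {"tag": ..., "data": ..., "dirty": ...}

def access(addr, store_value=None):
    tag, index = addr >> INDEX_BITS, addr & ((1 << INDEX_BITS) - 1)
    line = cache.get(index)
    if line is None or line["tag"] != tag:           # miss
        if line is not None and line["dirty"]:       # finish the write back first
            memory[(line["tag"] << INDEX_BITS) | index] = line["data"]
        line = {"tag": tag, "data": memory.get(addr, 0), "dirty": False}
        cache[index] = line
    if store_value is not None:                      # a store only dirties the line
        line["data"], line["dirty"] = store_value, True
    return line["data"]

access(0b11010110, 21763)   # store to Mem[214]: cache only, memory untouched
access(0b10001110)          # load Mem[142], same index: forces the write back
print(memory[214])          # 21763 reached memory only at replacement time
```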

Write misses

A second scenario is if we try to write to an address that is not already contained in the cache; this is called a write miss.

Let's say we want to store 21763 into Mem[1101 0110], but we find that address is not currently in the cache.

When we update Mem[1101 0110], should we also load it into the cache? A write-allocate cache loads the block into the cache on a write miss; a write-no-allocate cache writes only to main memory and leaves the cache unchanged.

[Figure: the cache line at index 110 currently holds unrelated data (6378), so the store to Mem[1101 0110] misses.]

Which is it?

Given the following trace of accesses, can you determine whether the cache is write-allocate or write-no-allocate?

 Assume A and B are distinct, and can be in the cache simultaneously.

Load A   → Miss
Store B  → Miss
Store A  → Hit
Load A   → Hit
Load B   → Miss
Load B   → Hit
Load A   → Hit

The first Load B misses even though B was just stored: the earlier Store B never put B in the cache. On a write-allocate cache this would be a hit.

Answer: Write-no-allocate
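The quiz can be checked with a few lines of Python; the cache is modeled as a bare set of allocated addresses, which suffices because A and B never conflict:

```python
def run(trace, write_allocate):
    cached, results = set(), []
    for op, addr in trace:
        hit = addr in cached
        results.append("Hit" if hit else "Miss")
        if not hit and (op == "Load" or write_allocate):
            cached.add(addr)   # loads always allocate; stores only if write-allocate
    return results

trace = [("Load", "A"), ("Store", "B"), ("Store", "A"), ("Load", "A"),
         ("Load", "B"), ("Load", "B"), ("Load", "A")]

print(run(trace, write_allocate=False))  # matches the trace above: first Load B misses
print(run(trace, write_allocate=True))   # Store B allocates, so every Load B hits
```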

First Observations

Split Instruction/Data caches:

 Pro: No structural hazard between IF & MEM stages.
  • A single-ported unified cache stalls instruction fetch during a load or store.
 Con: Static partitioning of the cache between instructions & data.
  • Bad if the working sets are unequal: e.g., code/DATA or CODE/data.

Cache Hierarchies:

 Trade-off between access time & hit rate.
  • L1 cache can focus on fast access time (okay hit rate).
  • L2 cache can focus on good hit rate (okay access time).
 Such hierarchical design is another "big idea".

CPU ↔ L1 cache ↔ L2 cache ↔ Main Memory

Opteron Vital Statistics

L1 Caches: Instruction & Data

 64 kB
 64 byte blocks
 2-way set associative
 2 cycle access time

L2 Cache:

 1 MB
 64 byte blocks
 4-way set associative
 16 cycle access time (total, not just miss penalty)

Memory:

 200+ cycle access time
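These latencies make the usual average memory access time (AMAT) calculation easy to sketch. The hit rates below are made-up illustrative numbers, not Opteron measurements; only the cycle counts come from the list above:

```python
L1_TIME, L2_TIME, MEM_TIME = 2, 16, 200   # cycles, from the statistics above
l1_hit, l2_hit = 0.95, 0.80               # hypothetical hit rates

# On an L1 miss we pay the full 16-cycle L2 access; on an L2 miss we
# additionally pay the trip to main memory.
amat = L1_TIME + (1 - l1_hit) * (L2_TIME + (1 - l2_hit) * MEM_TIME)
print(f"AMAT = {amat:.1f} cycles")        # 2 + 0.05 * (16 + 0.2 * 200) = 4.8
```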
