Synchronization in Parallel Architecture-Advance Computer Architecture-Lecture Slides, Slides of Advanced Computer Architecture

Gujarat University Advanced Computer Architecture

This course focuses on the quantitative principles of computer design, instruction set architectures, datapath and control, memory hierarchy design, main memory, cache, hard drives, multiprocessor architectures, storage and I/O systems, and computer clusters. This lecture includes: Synchronization, Parallel, Architecture, Shared, Memory, Performance, Multiprocessor, Symmetric, Distributed

Typology: Slides

2011/2012

Uploaded on 08/06/2012

amrusha 🇮🇳



Today’s Topics

Recap: Performance of Multiprocessors with

– Symmetric Shared-Memory

– Distributed Shared Memory

Synchronization in Parallel Architecture

Conclusion


Recap: Cache Coherence Problem

So far we have discussed the sharing of caches for multiprocessing in the:

– Symmetric shared-memory architecture

– Distributed shared-memory architecture

We have studied the cache coherence problem in symmetric and distributed shared-memory multiprocessors, and have seen that this problem is indeed performance-critical

Recap: Snooping Protocols

Snooping protocols employ write-invalidate and write-broadcast techniques

Here, each cached block of memory is in one of three states, and the cache tracks that state for every block it holds

The controller responds to read/write requests for a block, both from the processor and from the bus
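The controller behaviour recapped above can be sketched as a small state machine. This is a minimal illustration of a write-invalidate controller for a single cache block, not the exact FSM from the slides: the function names `on_processor_request` and `on_bus_request` and the action strings are hypothetical, the state names Invalid, Shared and Exclusive are those of the basic three-state FSM, and replacement misses to other blocks are ignored.

```python
from enum import Enum


class State(Enum):
    INVALID = "Invalid"
    SHARED = "Shared"
    EXCLUSIVE = "Exclusive"


def on_processor_request(state, op):
    """Return (next_state, bus_action) for a request from the local processor.

    op is "read" or "write"; bus_action is None when no bus traffic is needed.
    """
    if op == "read":
        if state is State.INVALID:
            return State.SHARED, "read_miss"   # fetch the block
        return state, None                     # read hit in Shared or Exclusive
    if op == "write":
        if state is State.EXCLUSIVE:
            return State.EXCLUSIVE, None       # write hit, already the owner
        return State.EXCLUSIVE, "write_miss"   # Invalid or Shared: gain ownership
    raise ValueError(f"unknown processor operation: {op}")


def on_bus_request(state, bus_op):
    """Return (next_state, action) when another cache's request for this block is snooped."""
    if state is State.INVALID:
        return State.INVALID, None             # nothing cached, nothing to do
    if bus_op == "read_miss":
        # An exclusive (dirty) copy must be written back before it is shared.
        action = "write_back" if state is State.EXCLUSIVE else None
        return State.SHARED, action
    if bus_op == "write_miss":
        # Another processor wants to write: give up the local copy.
        action = "write_back" if state is State.EXCLUSIVE else None
        return State.INVALID, action
    raise ValueError(f"unknown bus operation: {bus_op}")
```

Splitting the controller into a processor-side function and a bus-side function mirrors the two sources of requests named above.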

Recap: Implementation Complications of snoopy protocols

The three states of the basic FSM are: Shared, Exclusive and Invalid

However, complications such as write races, interventions and invalidations have been observed in the implementation of snoopy protocols

To overcome these complications, a number of variations of the FSM controller have been suggested

These variations are: the MESI Protocol, the Berkeley Protocol and the Illinois Protocol

Recap: Directory based Protocols

Larger multiprocessor systems employ distributed shared memory, i.e., a separate memory is provided per processor

Here, cache coherence is achieved either by using non-cached pages or by a directory containing information for every block in memory

Recap: Directory Based Protocol

The directory-based protocol tracks the state of every block in every cache and finds which caches hold copies of the block, and whether each copy is dirty or clean

Similar to the snoopy protocol, directory-based protocols are implemented by an FSM with three states: Shared, Uncached and Exclusive

Recap: Directory Based Protocols

These protocols involve three kinds of processors or nodes, namely: local, home and remote nodes

Local node: originates the request
Home node: holds the memory location of an address, together with its directory entry
Remote node: holds a copy of the cache block, whether exclusive or shared

Recap: Directory-based Protocol

Transactions are triggered by messages such as: read misses, write misses, invalidates and data fetch requests

These messages are sent to the directory to cause actions such as updating the directory state and satisfying requests

The controller tracks all copies of each memory block and, for each message, indicates an action that updates the sharing set
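How the directory reacts to these messages can be sketched for a single memory block. This is a simplified illustration under stated assumptions: the class `DirectoryEntry`, its method names and the message tuples are hypothetical, write-backs of replaced blocks are not modelled, and only the three directory states Uncached, Shared and Exclusive are used.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryEntry:
    """Directory state and sharing set for one memory block (illustrative only)."""
    state: str = "Uncached"
    sharers: set = field(default_factory=set)

    def read_miss(self, requester):
        """Handle a read miss; return the messages the directory sends out."""
        msgs = []
        if self.state == "Exclusive":
            owner = next(iter(self.sharers))
            msgs.append(("fetch", owner))          # owner supplies the current data
        self.sharers.add(requester)
        self.state = "Shared"
        msgs.append(("data_value_reply", requester))
        return msgs

    def write_miss(self, requester):
        """Handle a write miss; return the messages the directory sends out."""
        msgs = []
        if self.state == "Shared":
            for proc in sorted(self.sharers - {requester}):
                msgs.append(("invalidate", proc))  # remove all other copies
        elif self.state == "Exclusive":
            owner = next(iter(self.sharers))
            msgs.append(("fetch_invalidate", owner))
        self.sharers = {requester}                 # requester becomes sole owner
        self.state = "Exclusive"
        msgs.append(("data_value_reply", requester))
        return msgs


entry = DirectoryEntry()
entry.write_miss("P1")
entry.read_miss("P2")
print(entry.state, sorted(entry.sharers))  # prints: Shared ['P1', 'P2']
```

Every handler ends by updating the sharing set, which is the only bookkeeping the directory needs in order to know whom to invalidate or fetch from later.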

Example: Working of Finite State Machine Controller

Here, if the required data is not in any cache and is available only in the memory associated with the respective processor, the block is said to be in the Uncached state; transitions to the other states are caused by messages such as: read miss, write miss, invalidate and data fetch request

Example: Dealing with read/write misses

A1 and A2 map to the same cache block

step | Processor 1 (State Addr Value) | Processor 2 (State Addr Value) | Interconnect (Action Proc. Addr Value) | Directory (Addr State {Procs}) | Memory (Value)
P1: Write 10 to A1 | | | | |
P1: Read A1 | | | | |
P2: Read A1 | | | | |
P2: Write 20 to A1 | | | | |
P2: Write 40 to A2 | | | | |

Example: Working of Finite State Machine Controller

At Step 1, P1 writes 10 to A1 and the state transition from Uncached to Exclusive takes place; on the original slide these operations are shown in red

step | Processor 1 (State Addr Value) | Processor 2 (State Addr Value) | Interconnect (Action Proc. Addr Value) | Directory (Addr State {Procs}) | Memory (Value)
P1: Write 10 to A1 | | | WrMs P1 A1 | A1 Ex {P1} |
 | Excl. A1 10 | | DaRp P1 A1 0 | |
P1: Read A1 | | | | |
P2: Read A1 | | | | |
P2: Write 20 to A1 | | | | |
P2: Write 40 to A2 | | | | |

Example: Working of Finite State Machine Controller

At Step 2, P1 reads A1; a CPU read hit occurs, hence the FSM stays in the Exclusive state

Example: Working of FSM Controller

A1 and A2 map to the same cache block

step | Processor 1 (State Addr Value) | Processor 2 (State Addr Value) | Interconnect (Action Proc. Addr Value) | Directory (Addr State {Procs}) | Memory (Value)
P1: Write 10 to A1 | | | WrMs P1 A1 | A1 Ex {P1} |
 | Excl. A1 10 | | DaRp P1 A1 0 | |
P1: Read A1 | Excl. A1 10 | | | |
P2: Read A1 | | Shar. A1 | RdMs P2 A1 | |
 | Shar. A1 10 | | Ftch P1 A1 10 | | 10
 | | Shar. A1 10 | DaRp P2 A1 10 | A1 Shar. {P1,P2} | 10
P2: Write 20 to A1 | | | | | 10
 | | | | | 10
P2: Write 40 to A2 | | | | | 10

Example: Working of Finite State Machine Controller

At Step 4: P2 writes 20 to A1

i) P1 observes a remote write to A1, which it holds in the Shared state, so the state of its controller changes from Shared to Invalid
ii) P2 sees a CPU write, so it places a write miss on the bus, changes its state from Shared to Exclusive and writes the value 20 to A1
iii) The directory entry for A1 is updated so that its sharer set contains only {P2}
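The first four steps of the example can be replayed with a short simulation. This is a rough sketch rather than the lecture's own model: the function `run_trace` and its data layout are hypothetical, only the single block A1 is modelled (so the later write to A2 is left out), step 4 is assumed to write to A1, and the initial memory value of 0 is taken from the data reply in step 1 of the trace.

```python
def run_trace(ops):
    """Replay (proc, op, value) accesses to one block, A1, and return the final state."""
    # Per-processor cache entry for A1: (state, value).
    cache = {"P1": ("Invalid", None), "P2": ("Invalid", None)}
    # Directory state, sharing set and memory value for A1 (initial value 0 assumed).
    dir_state, sharers, memory = "Uncached", set(), 0

    for proc, op, value in ops:
        state, _ = cache[proc]
        if op == "read":
            if state != "Invalid":
                continue  # read hit: no messages, no state change
            if dir_state == "Exclusive":
                owner = next(iter(sharers))
                memory = cache[owner][1]          # fetch: owner writes the block back
                cache[owner] = ("Shared", memory)
            cache[proc] = ("Shared", memory)      # data value reply
            sharers.add(proc)
            dir_state = "Shared"
        elif op == "write":
            if state != "Exclusive":              # write miss goes to the directory
                for other in sharers - {proc}:
                    if dir_state == "Exclusive":
                        memory = cache[other][1]  # fetch/invalidate writes back
                    cache[other] = ("Invalid", None)
                sharers = {proc}
                dir_state = "Exclusive"
            cache[proc] = ("Exclusive", value)
    return cache, dir_state, sharers, memory


steps = [
    ("P1", "write", 10),   # step 1: Uncached -> Exclusive {P1}
    ("P1", "read", None),  # step 2: read hit, stays Exclusive
    ("P2", "read", None),  # step 3: Exclusive -> Shared {P1, P2}, memory becomes 10
    ("P2", "write", 20),   # step 4: P1 invalidated, Exclusive {P2}
]
cache, dir_state, sharers, memory = run_trace(steps)
print(cache, dir_state, sharers, memory)
```

After the four steps, P1 holds the block as Invalid, P2 holds it as Exclusive with value 20, the directory is Exclusive with sharer set {P2}, and memory still holds 10 because the new value has not been written back.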
