MIPS Rating - Computer Architecture and Engineering - Exams, Exams of Computer Architecture and Organization

Main points of this past exam are: MIPS Rating, Yield Runtime, Floating Point, Lower-Performance Version, Original Processor, Original Cost, New Processor, Parallel Prefix, Possible Speedup

Typology: Exams

2012/2013

Uploaded on 04/02/2013

shashikanth_0p3 🇮🇳

University of California, Berkeley
College of Engineering
Computer Science Division EECS
Spring 2001
John Kubiatowicz
Midterm I
March 1, 2001
CS152 Computer Architecture and Engineering
Your Name:
SID Number:
Discussion Section:
Problem Possible Score
1 20
2 20
3 30
4 30
Total

[ This page left for π ]

Problem 1c: What is the CPI and MIPS rating of the new processor?

Problem 1d: What is the original cost per (working) processor?

Problem 1e: What is the new cost per (working) processor?

Problem 1f: Assume that we are considering the other direction of improving the original processor by increasing the speed of floating point. What is the best possible speedup that we could get, and what would the CPI and MIPS rating be of the new processor?
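The quantities asked for in Problems 1c through 1f follow from a few standard relations: average CPI as a weighted sum over the instruction mix, MIPS rating as clock rate divided by (CPI × 10^6), cost per working die as wafer cost divided by the number of good dies, and the Amdahl's-law bound for the best possible speedup. The sketch below only encodes those relations. The function names are illustrative, and the processor data for Problem 1 is not part of this excerpt, so any numbers fed in are hypothetical.

```python
def average_cpi(mix):
    """Average CPI for an instruction mix.

    mix is an iterable of (fraction_of_instructions, cycles) pairs
    whose fractions are assumed to sum to 1.
    """
    return sum(fraction * cycles for fraction, cycles in mix)


def mips_rating(clock_hz, cpi):
    """Millions of instructions per second for a given clock rate and CPI."""
    return clock_hz / (cpi * 1e6)


def cost_per_working_die(wafer_cost, dies_per_wafer, die_yield):
    """Wafer cost spread over only the dies that work."""
    return wafer_cost / (dies_per_wafer * die_yield)


def best_case_speedup(fraction_of_time_improved):
    """Amdahl's-law limit: the improved portion takes zero time."""
    return 1.0 / (1.0 - fraction_of_time_improved)
```

Note that the Amdahl bound takes the fraction of execution time spent in the improved feature, not the fraction of instructions.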

Problem 2: Parallel Prefix

Assume the following characteristics for NAND gates:
Input load: 120 fF
Internal delay: TPlh = 0.3 ns, TPhl = 0.6 ns
Load-dependent delay: TPlhf = 0.0020 ns/fF, TPhlf = 0.0021 ns/fF
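As a reminder of how a linear delay model is evaluated, here is a minimal sketch using the NAND parameters above. It assumes the load-dependent terms are per femtofarad of load (ns/fF) and ignores wire capacitance; the class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearDelayGate:
    """Linear delay model: delay = internal delay + slope * load capacitance."""
    input_load_ff: float    # capacitance presented by one input, in fF
    tplh_ns: float          # internal low-to-high delay, in ns
    tphl_ns: float          # internal high-to-low delay, in ns
    tplhf_ns_per_ff: float  # load-dependent low-to-high slope, in ns/fF
    tphlf_ns_per_ff: float  # load-dependent high-to-low slope, in ns/fF

    def delay_lh(self, load_ff):
        return self.tplh_ns + self.tplhf_ns_per_ff * load_ff

    def delay_hl(self, load_ff):
        return self.tphl_ns + self.tphlf_ns_per_ff * load_ff


# Parameters taken from the problem statement.
NAND = LinearDelayGate(120.0, 0.3, 0.6, 0.0020, 0.0021)


def nand_driving_nands(fanout):
    """(low-to-high, high-to-low) delay of a NAND driving `fanout` NAND inputs."""
    load_ff = fanout * NAND.input_load_ff
    return NAND.delay_lh(load_ff), NAND.delay_hl(load_ff)
```

For a single NAND input as load (120 fF), this gives roughly 0.54 ns rising and 0.852 ns falling.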

Problem 2a: Suppose that we construct an XOR, as follows:

Compute the standard parameters for the linear delay models for this complex gate, assuming the parameters given above for the NAND gate. Assume that a wire doubles the input capacitance of the gate that it is attached to:

A Input Capacitance:
B Input Capacitance:

Load-dependent Delays: TPAYlhf: TPAYhlf: TPBYlhf: TPBYhlf:

Maximum Internal delays for A⇒Y: TPAYlh: TPAYhl:

[Figure: XOR gate schematic with inputs A and B and output Y]

Problem 2c: Now, put these 2-input blocks together to produce a 4-input block that takes I_0, I_1, I_2, I_3, and C_down and produces:

O_0 = I_0 ⊕ C_down
O_1 = I_1 ⊕ I_0 ⊕ C_down
O_2 = I_2 ⊕ I_1 ⊕ I_0 ⊕ C_down
O_3 = I_3 ⊕ I_2 ⊕ I_1 ⊕ I_0 ⊕ C_down
C_up = I_3 ⊕ I_2 ⊕ I_1 ⊕ I_0

Your goal is to minimize the output delay of each block.

Compute the input capacitance for each input:

Identify the critical path of your circuit and compute the unloaded delay for this path.
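The input/output behavior required of the 4-input block in Problem 2c can be written down as a small reference model, which is handy for checking a gate-level design against. This is a behavioral sketch only: it computes the outputs sequentially and says nothing about gate arrangement, capacitance, or delay. The function name is illustrative.

```python
from functools import reduce
from operator import xor


def prefix_xor_block(inputs, c_down):
    """Behavioral model of the prefix block.

    inputs is [I_0, I_1, ..., I_{n-1}] as 0/1 values.
    Returns ([O_0, ..., O_{n-1}], C_up), where
    O_k = I_k ^ ... ^ I_0 ^ C_down and C_up = I_{n-1} ^ ... ^ I_0.
    """
    outputs = []
    running = c_down
    for bit in inputs:
        running ^= bit           # fold in the next input
        outputs.append(running)  # O_k includes C_down
    c_up = reduce(xor, inputs, 0)  # C_up excludes C_down
    return outputs, c_up
```

For example, `prefix_xor_block([1, 0, 1, 1], 0)` returns `([1, 1, 0, 1], 1)`.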

Problem 2d: Finally, show how the 4-input prefix circuit can be used as a building block to produce a 16-element prefix circuit that minimizes gate reuse and which has minimal delay. What is the critical path and how many XOR gates are in it?

Hint: this is very similar to a carry-lookahead adder.

Problem 2e: How many XOR gates are in the critical path of a 64-bit parallel-prefix circuit?

Recall how division (in base 10) works. The following shows a division of 1 by 23:

Suppose we had a procedure that produced each of the digits (zeros) in the dividend, one at a time. Consider the remainders as integers from the current decimal point. So, for instance, we have the remainders 1, 10, 100, 80, 110, 180, etc. At each stage, we multiply by ten, add the incoming digit (zero in the example), then

This could be combined with the current remainder by multiplying the remainder by 10, adding the new digit (which is zero in this case), then seeing how many times the divisor goes into the result.
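The digit-at-a-time division described above can be sketched as follows: each incoming dividend digit is folded into the running remainder, and one quotient digit comes out per step. The function name is illustrative and not part of the exam.

```python
def long_division_stream(dividend_digits, divisor):
    """Yield (quotient_digit, remainder) for each incoming dividend digit."""
    remainder = 0
    for incoming in dividend_digits:
        # Multiply the remainder by ten and add the incoming digit ...
        remainder = remainder * 10 + incoming
        # ... then take one quotient digit and keep the new remainder.
        digit, remainder = divmod(remainder, divisor)
        yield digit, remainder


def quotient_digits(dividend_digits, divisor):
    """Collect only the quotient digits."""
    return [digit for digit, _ in long_division_stream(dividend_digits, divisor)]
```

Feeding the digits of 1.000000 with divisor 23, that is `quotient_digits([1, 0, 0, 0, 0, 0, 0], 23)`, gives `[0, 0, 4, 3, 4, 7, 8]`, the leading digits of 1/23 = 0.043478...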

Here is complete pseudo code for computing one of the streams:

Stream (digitnum, incoming, oddnum, sign, xsquared, termID, maxtermID) {
    ARemainder = A_REMARRAY[termID];
    ARemainder = ARemainder × 10 + incoming;

    ; This is a quotient/remainder operation
    (ADigit, ARemainder) = ARemainder / xsquared;
    A_REMARRAY[termID] = ARemainder;

    BRemainder = B_REMARRAY[termID];
    BRemainder = BRemainder × 10 + ADigit;
    (BDigit, BRemainder) = BRemainder / oddnum;
    B_REMARRAY[termID] = BRemainder;

    AddInDigit (BDigit, digitnum, sign);

    If ((termID == maxtermID) && (ADigit != 0)) {
        A_REMARRAY[termID+1] = 0;
        maxtermID++;
    }

    If (termID < maxtermID) {
        Stream (digitnum, ADigit, (oddnum+2), -sign, xsquared, (termID+1), maxtermID);
    }
}


Problem 3a: Write MIPS assembly for this pseudo code. Make sure to adhere to MIPS conventions. Assume that A_REMARRAY[] and B_REMARRAY[] are word arrays that are addressed via constants (assume that you can use the la pseudo-instruction to load their addresses into registers). Also, assume that there are 7 argument registers ($a0 - $a6) for the sake of this problem. Note that AddInDigit is a procedure call.

Problem 3c: Explain the initialization of the A_REMVALUE[] and B_REMVALUE[] arrays if we were going to compute 4 arctan(…). What is the purpose of the termID and maxtermID parameters?

Problem 3d: Explain the initialization of the FINALVALUE array:

Problem 3e: Write pseudo-code to compute 4 arctan(…) using Stream(). Assume that the initialization in (3c) and (3d) has been accomplished.

[ This page intentionally left blank]

In class, we made our multicycle machine support the following six MIPS instructions:

op | rs | rt | rd | shamt | funct = MEM[PC]
op | rs | rt | Imm16 = MEM[PC]

INST   Register Transfers
ADDU   R[rd] ← R[rs] + R[rt]; PC ← PC + 4
SUBU   R[rd] ← R[rs] - R[rt]; PC ← PC + 4
ORI    R[rt] ← R[rs] OR zero_ext(Imm16); PC ← PC + 4
LW     R[rt] ← MEM[R[rs] + sign_ext(Imm16)]; PC ← PC + 4
SW     MEM[R[rs] + sign_ext(Imm16)] ← R[rt]; PC ← PC + 4
BEQ    if (R[rs] == R[rt]) then PC ← PC + 4 + (sign_ext(Imm16) || 00) else PC ← PC + 4

For your reference, here is the microcode for two of the 6 MIPS instructions:

Label    | ALU  | SRC1 | SRC2    | ALUDest | Memory | MemReg | PCWrite    | Sequence
Fetch    | Add  | PC   | 4       |         | ReadPC | IR     | ALU        | Seq
Dispatch | Add  | PC   | ExtShft |         |        |        |            | Dispatch
RType    | Func | rs   | rt      |         |        |        |            | Seq
         |      |      |         | rd-ALU  |        |        |            | Fetch
BEQ      | Sub  | rs   | rt      |         |        |        | ALUoutCond | Fetch
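The register transfers for the six original instructions can be checked with a tiny instruction-level model. The sketch below is an assumption-laden illustration, not the class datapath: it uses the standard MIPS opcode and funct encodings, takes ORI to be a bitwise OR and SW to store R[rt] (standard MIPS semantics), models memory as a dictionary of words keyed by byte address, and all helper names are made up for the example.

```python
MASK32 = 0xFFFFFFFF


def sign_ext16(imm16):
    """Interpret a 16-bit field as a signed integer."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def r_type(rs, rt, rd, funct):
    """Encode an R-format instruction (op = 0, shamt = 0)."""
    return (rs << 21) | (rt << 16) | (rd << 11) | funct


def i_type(op, rs, rt, imm16):
    """Encode an I-format instruction."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm16 & 0xFFFF)


def step(regs, mem, pc):
    """Apply one instruction's register transfers; return the next PC."""
    inst = mem[pc]
    op, funct = (inst >> 26) & 0x3F, inst & 0x3F
    rs, rt, rd = (inst >> 21) & 0x1F, (inst >> 16) & 0x1F, (inst >> 11) & 0x1F
    imm16 = inst & 0xFFFF
    next_pc = pc + 4
    if op == 0x00 and funct == 0x21:      # ADDU
        regs[rd] = (regs[rs] + regs[rt]) & MASK32
    elif op == 0x00 and funct == 0x23:    # SUBU
        regs[rd] = (regs[rs] - regs[rt]) & MASK32
    elif op == 0x0D:                      # ORI
        regs[rt] = regs[rs] | imm16
    elif op == 0x23:                      # LW
        regs[rt] = mem[(regs[rs] + sign_ext16(imm16)) & MASK32]
    elif op == 0x2B:                      # SW
        mem[(regs[rs] + sign_ext16(imm16)) & MASK32] = regs[rt]
    elif op == 0x04:                      # BEQ
        if regs[rs] == regs[rt]:
            next_pc = (pc + 4 + (sign_ext16(imm16) << 2)) & MASK32
    else:
        raise ValueError(f"unsupported instruction {inst:#010x}")
    regs[0] = 0                           # $0 always reads as zero
    return next_pc


def run(program, max_steps=1000):
    """Load program words at address 0 and run until PC leaves the program."""
    mem = {4 * i: word for i, word in enumerate(program)}
    regs, pc, steps = [0] * 32, 0, 0
    while 0 <= pc < 4 * len(program) and steps < max_steps:
        pc = step(regs, mem, pc)
        steps += 1
    return regs, mem
```

A program is a list of encoded words, for example `run([i_type(0x0D, 0, 1, 5), r_type(1, 1, 2, 0x21)])` executes `ori $1, $0, 5` followed by `addu $2, $1, $1`.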

In this problem, we are going to add four new instructions to this data path:

jal ⇒ PC ← zero_ext(Instr[25:0]) || R[31] ← PC + 4

add $rd, $rs, $rt ⇒
    if (R[rs] + R[rt] doesn’t overflow) then R[rd] ← R[rs] + R[rt]; PC ← PC + 4
    else EPC ← PC; Cause ← 12; PC ← 0x…

mfc0 $rd, $rt ⇒
    if ($rt == 13) then R[rd] ← Cause
    else if ($rt == 14) then R[rd] ← EPC
    PC ← PC + 4

compmul $rd, $rs, $rt ⇒
    R[rd] = (R[rs]×R[rt]) - (R[rs+1]×R[rt+1])
    R[rd+1] = (R[rs]×R[rt]) + (R[rs+1]×R[rt+1])
    PC ← PC + 4

This math was a typo. The real way to compute complex multiply is:

compmul $rd, $rs, $rt ⇒
    R[rd] = (R[rs]×R[rt]) - (R[rs+1]×R[rt+1])
    R[rd+1] = (R[rs]×R[rt+1]) + (R[rs+1]×R[rt])
    PC ← PC + 4

  1. The jal instruction is familiar to you from the normal MIPS instruction set.
  2. The add instruction is a normal add except that it causes an overflow exception if there is overflow. You need to implement the EPC (error PC) and Cause registers. Just assume that EPC gets the PC of the bad instruction and Cause gets the number 12.
  3. The mfc0 instruction is used to get the EPC and Cause values into normal registers.
  4. The compmul instruction does a complex multiply. It is assumed that the registers rd, rs, and rt are even registers and that the two source complex values are in R[rs], R[rs+1] (real, imaginary) and R[rt], R[rt+1] (real, imaginary), and that the results are put into R[rd] and R[rd+1] (real,imaginary).
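Two of the new instructions have semantics that are easy to get subtly wrong: the overflow condition for add and the cross terms in compmul. The sketch below is a behavioral illustration only. It treats register values as signed Python integers, ignores 32-bit wraparound in the multiplies, and uses the corrected compmul formula, with real parts in even registers and imaginary parts in the following odd registers. Function names are illustrative.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def add_overflows(a, b):
    """True if a + b does not fit in a signed 32-bit register.

    a and b are assumed to already be in the signed 32-bit range.
    """
    return not (INT32_MIN <= a + b <= INT32_MAX)


def compmul(regs, rd, rs, rt):
    """Complex multiply: (R[rs] + i R[rs+1]) * (R[rt] + i R[rt+1])."""
    a_re, a_im = regs[rs], regs[rs + 1]
    b_re, b_im = regs[rt], regs[rt + 1]
    # Read all sources before writing, in case rd overlaps rs or rt.
    real = a_re * b_re - a_im * b_im
    imag = a_re * b_im + a_im * b_re
    regs[rd], regs[rd + 1] = real, imag
```

As a check, (1 + 2i)(3 + 4i) = -5 + 10i, which matches what `compmul` writes into R[rd] and R[rd+1] for those inputs.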

Problem 4a: How wide are microinstructions in the original datapath (answer in bits and show some work!)?

Problem 4b: Draw a block diagram of a microcontroller that will support the new instructions (it will be slightly different than that required for the original instructions). Include sequencing hardware, the dispatch ROM, the microcode ROM, and decode blocks to turn the fields of the microcode into control signals. Make sure to show all of the control signals coming from somewhere. (Hint: The PCWr, PCWrCond, and PCSrc signals must come out of a block connected to the PCWrite field of the microinstruction.)

Problem 4d: Describe changes to the microinstruction assembly language for these new instructions. How wide are your microinstructions now?

Problem 4e: Write complete microcode for the new instructions. Include the Fetch and Dispatch microinstructions. If any of the microcode for the original instructions must change, explain how. (Hint: since the original instructions did not use R[rd] as a register input, you must make sure that your changes do not mess up the original instructions.)