CHAPTER 2
Data Representation in Computer Systems
2.1 Introduction 47
2.2 Positional Numbering Systems 48
2.3 Converting Between Bases 48
2.3.1 Converting Unsigned Whole Numbers 49
2.3.2 Converting Fractions 51
2.3.3 Converting between Power-of-Two Radices 54
2.4 Signed Integer Representation 54
2.4.1 Signed Magnitude 54
2.4.2 Complement Systems 60
2.4.3 Unsigned Versus Signed Numbers 66
2.4.4 Computers, Arithmetic, and Booth’s Algorithm 66
2.4.5 Carry Versus Overflow 70
2.4.6 Binary Multiplication and Division Using Shifting 71
2.5 Floating-Point Representation 73
2.5.1 A Simple Model 74
2.5.2 Floating-Point Arithmetic 76
2.5.3 Floating-Point Errors 78
2.5.4 The IEEE-754 Floating-Point Standard 79
2.5.5 Range, Precision, and Accuracy 81
2.5.6 Additional Problems with Floating-Point Numbers 82
2.6 Character Codes 85
2.6.1 Binary-Coded Decimal 86
2.6.2 EBCDIC 87
2.6.3 ASCII 88
2.6.4 Unicode 88
2.7 Error Detection and Correction 92
2.7.1 Cyclic Redundancy Check 92
2.7.2 Hamming Codes 95
2.7.3 Reed-Solomon 102
Chapter Summary 103
CMPS375 Class Notes (Chap02) by Kuo-pao Yang

2.1 Introduction 47

  • This chapter describes the various ways in which computers can store and manipulate numbers and characters.
  • Bit: The most basic unit of information in a digital computer is called a bit, which is a contraction of binary digit.
  • Byte: In 1964, the designers of the IBM System/360 mainframe computer established a convention of using groups of 8 bits as the basic unit of addressable computer storage. They called this collection of 8 bits a byte.
  • Word: Computer words consist of two or more adjacent bytes that are sometimes addressed and almost always are manipulated collectively. The word size represents the data size that is handled most efficiently by a particular architecture. Words are commonly 16, 32, or 64 bits.
  • Nibbles: An 8-bit byte can be divided into two 4-bit halves called nibbles.

2.2 Positional Numbering Systems 48

  • Radix (or Base): The general idea behind positional numbering systems is that a numeric value is represented through increasing powers of a radix (or base).

System      | Radix | Allowable Digits
------------|-------|------------------------------------------------
Decimal     | 10    | 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Binary      | 2     | 0, 1
Octal       | 8     | 0, 1, 2, 3, 4, 5, 6, 7
Hexadecimal | 16    | 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F

TABLE 2.1 Some Numbers to Remember

  • EXAMPLE 2.1 Numbers represented as powers of a radix:

243.51_10 = 2 * 10^2 + 4 * 10^1 + 3 * 10^0 + 5 * 10^-1 + 1 * 10^-2
212_3 = 2 * 3^2 + 1 * 3^1 + 2 * 3^0 = 23_10
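The positional rule above translates directly into code. Here is a small Python sketch (not part of the original notes; the function name is illustrative) that evaluates a digit string in any radix by summing digit * radix^position:

```python
def positional_value(digits: str, radix: int) -> float:
    """Evaluate a number string (optionally with a radix point) in a given base."""
    if "." in digits:
        whole, frac = digits.split(".")
    else:
        whole, frac = digits, ""
    value = 0.0
    for power, d in enumerate(reversed(whole)):
        value += int(d, radix) * radix ** power    # radix^0, radix^1, ...
    for power, d in enumerate(frac, start=1):
        value += int(d, radix) * radix ** -power   # radix^-1, radix^-2, ...
    return value

print(positional_value("243.51", 10))  # 243.51
print(positional_value("212", 3))      # 23.0
```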

2.3.2 Converting Fractions 51

  • EXAMPLE 2.6 Convert 0.4304_10 to base 5.

0.4304_10 = 0.2034_5

  • EXAMPLE 2.7 Convert 0.34375_10 to binary with 4 bits to the right of the binary point.

0.34375 x 2 = 0.6875  -> 0
0.6875  x 2 = 1.375   -> 1
0.375   x 2 = 0.75    -> 0
0.75    x 2 = 1.5     -> 1

Reading the integer parts from top to bottom, 0.34375_10 = 0.0101_2 to four binary places. We simply discard (or truncate) the remaining bits when the desired accuracy has been achieved.

  • EXAMPLE 2.8 Convert 3121_4 to base 3.

First, convert to decimal: 3121_4 = 217_10. Then convert to base 3: 217_10 = 22001_3. We have 3121_4 = 22001_3.
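The repeated-multiplication method of Examples 2.6 and 2.7 can be sketched in Python as follows (illustrative code, not from the notes; Fraction keeps the arithmetic exact, so truncation is the only source of error):

```python
from fractions import Fraction

def fraction_to_base(value: Fraction, base: int, places: int) -> str:
    """Convert 0 <= value < 1 to `places` digits in `base`, truncating."""
    digits = []
    for _ in range(places):
        value *= base
        digit = int(value)                       # integer part is the next digit
        digits.append("0123456789ABCDEF"[digit])
        value -= digit                           # keep only the fractional part
    return "0." + "".join(digits)

print(fraction_to_base(Fraction(34375, 100000), 2, 4))  # 0.0101 (Example 2.7)
print(fraction_to_base(Fraction(4304, 10000), 5, 4))    # 0.2034 (Example 2.6)
```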

2.3.3 Converting between Power-of-Two Radices 54

  • EXAMPLE 2.9 Convert 110010011101_2 to octal and hexadecimal.

110 010 011 101_2 = 6235_8 (separate into groups of 3 for octal conversion)

1100 1001 1101_2 = C9D_16 (separate into groups of 4 for hexadecimal conversion)
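As a hedged aside (not from the notes), Python's int parsing and format specifiers perform the same regrouping, which makes Example 2.9 easy to check:

```python
# Parse the binary string, then re-emit it in octal and hexadecimal.
n = int("110010011101", 2)
print(format(n, "o"))  # 6235 (groups of 3 bits)
print(format(n, "X"))  # C9D  (groups of 4 bits)
```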

2.4 Signed Integer Representation 54

  • By convention, a “1” in the high-order bit indicates a negative number.

2.4.1 Signed Magnitude 54

  • A signed-magnitude number has a sign as its left-most bit (also referred to as the high-order bit or the most significant bit), while the remaining bits represent the magnitude (or absolute value) of the numeric value.
  • N bits can represent -(2^(n-1) - 1) to 2^(n-1) - 1.
  • EXAMPLE 2.10 Add 01001111_2 to 00100011_2 using signed-magnitude arithmetic.

01001111_2 (79) + 00100011_2 (35) = 01110010_2 (114). There is no overflow in this example.

  • EXAMPLE 2.11 Add 01001111_2 to 01100011_2 using signed-magnitude arithmetic.
  • This is an overflow condition: the carry out of the magnitude is discarded, resulting in an incorrect sum.

We obtain the erroneous result 01001111_2 (79) + 01100011_2 (99) = 00110010_2 (50).

  • EXAMPLE 2.12 Subtract 01001111_2 from 01100011_2 using signed-magnitude arithmetic.

We find 01100011_2 (99) - 01001111_2 (79) = 00010100_2 (20) in signed-magnitude representation.

  • Signed magnitude has two representations for zero, 10000000 and 00000000 (and mathematically speaking, this simply shouldn’t happen!).
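As an illustration (not from the notes; the function names are made up), encoding and decoding 8-bit signed-magnitude values in Python makes the two zeros visible:

```python
def to_signed_magnitude(value: int, bits: int = 8) -> str:
    """Sign bit followed by the magnitude in bits-1 binary digits."""
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{bits - 1}b")

def from_signed_magnitude(s: str) -> int:
    magnitude = int(s[1:], 2)
    return -magnitude if s[0] == "1" else magnitude

print(to_signed_magnitude(79))            # 01001111
print(to_signed_magnitude(-9))            # 10001001
print(from_signed_magnitude("10000000"))  # 0: "negative zero" decodes to zero
```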

2.4.2 Complement Systems 60

  • One’s Complement
    o This sort of bit-flipping is very simple to implement in computer hardware.
    o EXAMPLE 2.16 Express 23_10 and -9_10 in 8-bit binary one’s complement form.

23_10 = +(00010111_2) = 00010111_2
-9_10 = -(00001001_2) = 11110110_2


  • The primary disadvantage of one’s complement is that we still have two representations for zero: 00000000 and 11111111.

  • Two’s Complement
    o N bits can represent -2^(n-1) to 2^(n-1) - 1. With signed-magnitude numbers, for example, 4 bits allow us to represent the values -7 through +7. However, using two’s complement, we can represent the values -8 through +7.
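A small Python sketch of both complement systems (illustrative only, not from the notes; Python ints are unbounded, so a mask supplies the fixed width):

```python
def ones_complement(value: int, bits: int = 8) -> str:
    mask = (1 << bits) - 1
    if value < 0:
        return format(~(-value) & mask, f"0{bits}b")  # flip the bits of |value|
    return format(value & mask, f"0{bits}b")

def twos_complement(value: int, bits: int = 8) -> str:
    mask = (1 << bits) - 1
    return format(value & mask, f"0{bits}b")          # masking does the wrap

print(ones_complement(23))   # 00010111
print(ones_complement(-9))   # 11110110 (Example 2.16)
print(twos_complement(-9))   # 11110111
```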

  • Integer Multiplication and Division
    o For each digit in the multiplier, the multiplicand is “shifted” one bit to the left. When the multiplier digit is 1, the “shifted” multiplicand is added to a running sum of partial products.
    o EXAMPLE Find the product of 00000110_2 (6) and 00001011_2 (11). (A code sketch follows this list.)

    00000110 (6)
  x 00001011 (11)

Multiplicand | Partial products
00000110     | + 00000000 (1 in multiplier: add multiplicand, then shift it left)
00001100     | + 00000110 (1 in multiplier: add multiplicand, then shift it left)
00011000     | + 00010010 (0 in multiplier: don't add, just shift multiplicand left)
00110000     | + 00010010 (1 in multiplier: add multiplicand)
             | = 01000010 (Product: 6 x 11 = 66)

    o When the divisor is much smaller than the dividend, we get a condition known as divide underflow, which the computer sees as the equivalent of division by zero.
    o Computers make a distinction between integer division and floating-point division.
      - With integer division, the answer comes in two parts: a quotient and a remainder.
      - Floating-point division results in a number that is expressed as a binary fraction.
      - Floating-point calculations are carried out in dedicated circuits called floating-point units, or FPUs.
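Here is the promised sketch of the shift-add multiplication loop (illustrative Python, not from the notes):

```python
def shift_add_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> int:
    """For each 1 bit in the multiplier, add the correspondingly shifted multiplicand."""
    product = 0
    for i in range(bits):
        if (multiplier >> i) & 1:         # multiplier bit is 1: add
            product += multiplicand << i  # multiplicand shifted left i places
    return product

print(shift_add_multiply(0b00000110, 0b00001011))  # 66 (6 x 11)
```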

2.4.3 Unsigned Versus Signed Numbers 66

  • If the 4-bit binary value 1101 is unsigned, then it represents the decimal value 13, but as a signed two’s complement number, it represents -3.
  • The C programming language has int and unsigned int as possible types for integer variables.
  • If we are using 4-bit unsigned binary numbers and we add 1 to 1111, we get 0000 (“return to zero”).
  • If we add 1 to the largest positive 4-bit two’s complement number 0111 (+7), we get 1000 (-8).
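A short Python illustration of these reinterpretation and wraparound effects (the helper is an assumption for illustration, not from the notes):

```python
def as_signed(pattern: int, bits: int = 4) -> int:
    """Reinterpret an unsigned bit pattern as two's complement."""
    if pattern & (1 << (bits - 1)):          # sign bit set
        return pattern - (1 << bits)
    return pattern

print(0b1101, as_signed(0b1101))         # 13 -3: same bits, two meanings
print((0b1111 + 1) & 0b1111)             # 0: unsigned 1111 + 1 "returns to zero"
print(as_signed((0b0111 + 1) & 0b1111))  # -8: signed +7 + 1 overflows
```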

2.4.4 Computers, Arithmetic, and Booth’s Algorithm 66

  • Consider the following standard pencil-and-paper method for multiplying two’s complement numbers (-5 x -4):

       1011     (-5)
   x   1100     (-4)
       0000     (0 in multiplier means simple shift)
      0000      (0 in multiplier means simple shift)
     1011       (1 in multiplier means add multiplicand and shift)
  + 1011        (1 in multiplier means add multiplicand and shift)
    10000100    (this reads as -124, but -5 x -4 = +20)

  Note that “regular” multiplication clearly yields the incorrect result.
  • Research into finding better arithmetic algorithms has continued apace for over 50 years. One of the many interesting products of this work is Booth’s algorithm.
  • In most cases, Booth’s algorithm carries out multiplication faster and more accurately than naïve pencil-and-paper methods.
  • The general idea of Booth’s algorithm is to increase the speed of a multiplication when there are consecutive zeros or ones in the multiplier.
  • Consider the following standard multiplication example (3 X 6):

      0011     (3)
  x   0110     (6)
      0000     (0 in multiplier means simple shift)
     0011      (1 in multiplier means add multiplicand and shift)
    0011       (1 in multiplier means add multiplicand and shift)
 +  0000       (0 in multiplier means simple shift)
    0010010    (3 x 6 = 18)

  • EXAMPLE 2.24 Let’s look at the larger example of 53 x 126:

    00110101          (53; for subtracting, we will add the complement of 53, or 11001011)
  x 01111110          (126)
  + 0000000000000000  (00 = simple shift)
  + 111111111001011   (10 = subtract: add 11001011, sign-extended)
  + 00000000000000    (11 = simple shift)
  + 0000000000000     (11 = simple shift)
  + 000000000000      (11 = simple shift)
  + 00000000000       (11 = simple shift)
  + 0000000000        (11 = simple shift)
  + 000110101         (01 = add 00110101, sign-extended)
  10001101000010110   (53 x 126 = 6678, using the 16 rightmost bits)

  Note: ignore sign-extension bits that carry beyond 2n bits.

  • Booth’s algorithm not only allows multiplication to be performed faster in most cases, but it also has the added bonus that it works correctly on signed numbers.
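The following Python sketch implements Booth's recoding as described above, scanning (current, previous) multiplier bit pairs; it is illustrative, not the textbook's register-level formulation:

```python
def booth_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> int:
    """Booth's algorithm on two's complement values given as Python ints."""
    product, prev = 0, 0
    for i in range(bits):
        cur = (multiplier >> i) & 1
        if cur == 1 and prev == 0:        # 10: beginning of a run of 1s: subtract
            product -= multiplicand << i
        elif cur == 0 and prev == 1:      # 01: end of a run of 1s: add
            product += multiplicand << i
        prev = cur
    return product

print(booth_multiply(53, 126))  # 6678 (Example 2.24)
print(booth_multiply(-5, -4))   # 20 (the pencil-and-paper failure case, done right)
```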

2.4.5 Carry Versus Overflow 70

  • For unsigned numbers, a carry out of the leftmost bit indicates the total number of bits was not large enough to hold the resulting value, and overflow has occurred.
  • For signed numbers, if the carry into the sign bit and the carry out of the sign bit differ, then overflow has occurred.

TABLE 2.2 Examples of Carry and Overflow in Signed Numbers
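A Python sketch of these carry and overflow rules for 4-bit signed addition (the function name is illustrative, not from the notes):

```python
def add_with_flags(a: int, b: int, bits: int = 4):
    """Add two bits-wide values; report (result, carry, signed overflow)."""
    mask = (1 << bits) - 1
    raw = (a & mask) + (b & mask)
    carry = raw > mask                    # carry out of the leftmost bit
    result = raw & mask
    sign = 1 << (bits - 1)
    # Overflow: operands share a sign, but the result's sign differs
    # (equivalent to carry-in vs. carry-out of the sign bit differing).
    overflow = ((a & sign) == (b & sign)) and ((result & sign) != (a & sign))
    return result, carry, overflow

print(add_with_flags(0b0111, 0b0001))  # (8, False, True):  +7 + 1 -> 1000 (-8)
print(add_with_flags(0b1111, 0b0001))  # (0, True, False):  -1 + 1 = 0, carry only
```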

2.4.6 Binary Multiplication and Division Using Shifting 71

  • We can do binary multiplication and division by 2 very easily using an arithmetic shift operation.
  • A left arithmetic shift inserts a 0 for the rightmost bit and shifts everything else left one bit; in effect, it multiplies by 2.
  • A right arithmetic shift shifts everything one bit to the right but copies the sign bit; it divides by 2.
  • EXAMPLE 2.25: Multiply the value 11 (expressed using 8-bit signed two’s complement representation) by 2.

We start with the binary value for 11: 00001011 (+11) We shift left one place, resulting in: 00010110 (+22) The sign bit has not changed, so the value is valid.

To multiply 11 by 4, we simply perform a left shift twice.

  • EXAMPLE 2.28: Divide the value 12 (expressed using 8-bit signed two’s complement representation) by 2.

We start with the binary value for 12: 00001100 (+12) We shift right one place, resulting in: 00000110 (+6) (Remember, we copy the sign bit to the left as we shift.)

To divide 12 by 4, we right shift twice.
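In Python, >> on integers is already an arithmetic shift, so Examples 2.25 and 2.28 can be checked directly (illustrative, not from the notes):

```python
print(11 << 1)   # 22: one left shift doubles (Example 2.25)
print(11 << 2)   # 44: two left shifts multiply by 4
print(12 >> 1)   # 6: one right shift halves (Example 2.28)
print(-12 >> 1)  # -6: the sign bit is copied in, as described above
```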

2.5.2 Floating-Point Arithmetic 76

  • EXAMPLE 2.32: Add the following binary numbers as represented in a normalized 14-bit format, using the simple model with a bias of 16.

0 10010 11001000 (= 0.11001000 x 2^2)
0 10000 10011010 (= 0.10011010 x 2^0)

Aligning to the larger exponent and adding:

  11.001000
+  0.10011010
  11.10111010

Renormalizing gives 0.1110111010 x 2^2; we retain the larger exponent and truncate the low-order bits:

0 10010 11101110

  • EXAMPLE 2.33 Multiply:

  0 10010 11001000 = 0.11001000 x 2^2
x 0 10000 10011010 = 0.10011010 x 2^0

Multiplying the significands:

          11001000
        x 10011010
          00000000
         11001000
        00000000
       11001000
      11001000
     00000000
    00000000
   11001000
   111100001010000

Renormalizing: 0.0111100001010000 x 2^2 = 0.111100001010000 x 2^1. We truncate the significand to 8 bits, giving:

0 10001 11110000
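To make the simple model concrete, here is a hedged Python sketch (not from the notes; the function name is illustrative) that decodes a 14-bit word with a 5-bit excess-16 exponent and an 8-bit significand:

```python
def decode_simple(word: str) -> float:
    """Decode sign (1 bit), exponent (5 bits, bias 16), significand (8 bits)."""
    sign = -1 if word[0] == "1" else 1
    exponent = int(word[1:6], 2) - 16      # remove the bias of 16
    significand = int(word[6:], 2) / 256   # 0.xxxxxxxx as a fraction
    return sign * significand * 2 ** exponent

print(decode_simple("0" "10010" "11001000"))  # 3.125
print(decode_simple("0" "10000" "10011010"))  # 0.6015625
print(decode_simple("0" "10010" "11101110"))  # 3.71875: Example 2.32's truncated sum
```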

2.5.3 Floating-Point Errors 78

  • We intuitively understand that we are working in the system of real numbers. We know that this system is infinite.
  • Computers are finite systems, with finite storage. The more bits we use, the better the approximation. However, there is always some element of error, no matter how many bits we use.

2.5.4 The IEEE-754 Floating-Point Standard 79

  • The IEEE-754 single precision floating point standard uses a bias of 127 over its 8-bit exponent. An exponent of 255 indicates a special value.
  • The double precision standard has a bias of 1023 over its 11-bit exponent. The “special” exponent value for a double precision number is 2047, instead of the 255 used by the single precision standard.

Special bit patterns in IEEE-754
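A quick way to see these fields is to unpack a float's bit pattern with Python's struct module (a sketch, not part of the notes):

```python
import struct

# Reinterpret the 4 bytes of a single-precision float as an unsigned int.
bits = struct.unpack(">I", struct.pack(">f", -0.75))[0]
print(format(bits, "032b"))  # 10111111010000000000000000000000
# Split up: sign = 1, exponent = 01111110 (126 - 127 = -1),
# fraction = 1000... -> value = -(1.5 x 2**-1) = -0.75
```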

2.5.5 Range, Precision, and Accuracy 81

  • The range of a numeric integer format is the difference between the largest and smallest values that it can express.
  • The precision of a number indicates how much information we have about a value.
  • Accuracy refers to how closely a numeric representation approximates a true value.

2.5.6 Additional Problems with Floating-Point Numbers 82

  • Because of truncated bits, you cannot always assume that a particular floating-point operation is associative or distributive.

This means that we cannot assume: (a + b) + c = a + (b + c) or a * (b + c) = ab + ac
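This is easy to demonstrate in any IEEE-754 environment; in Python, for example:

```python
# Rounded low-order bits make floating-point addition non-associative.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False
print((0.1 + 0.2) + 0.3, 0.1 + (0.2 + 0.3))    # 0.6000000000000001 0.6
```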

2.6.3 ASCII 88

  • ASCII: American Standard Code for Information Interchange
  • In 1967, a derivative of an earlier international alphabet became the official standard that we now call ASCII.

2.6.4 Unicode 88

  • Both EBCDIC and ASCII were built around the Latin alphabet.
  • In 1991, a new international information exchange code called Unicode was introduced.
  • Unicode is a 16-bit alphabet that is downward compatible with ASCII and the Latin-1 character set.
  • Because the base coding of Unicode is 16 bits, it has the capacity to encode the majority of characters used in every language of the world.
  • Unicode is currently the default character set of the Java programming language.

TABLE 2.8 Unicode Codespace

  • The Unicode codespace is divided into six parts. The first part is for Western alphabet codes, including English, Greek, and Russian.
  • The lowest-numbered Unicode characters comprise the ASCII code.
  • The highest provide for user-defined codes.
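For instance, Python exposes code points directly, showing that the lowest Unicode values coincide with ASCII (a sketch, not from the notes):

```python
print(ord("A"), hex(ord("A")))        # 65 0x41: same as the ASCII code for 'A'
print(hex(ord("Ω")), hex(ord("你")))  # 0x3a9 0x4f60: Greek and CJK live further up
```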

2.7 Error Detection and Correction 92

  • No communications channel or storage medium can be completely error-free.

2.7.1 Cyclic Redundancy Check 92

  • Cyclic redundancy check (CRC) is a type of checksum used primarily in data communications that determines whether an error has occurred within a large block or stream of information bytes.
  • Arithmetic modulo 2. The rules are as follows:

0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 0

  • EXAMPLE 2.35 Find the sum of 1011_2 and 110_2 modulo 2.

1011_2 + 110_2 = 1101_2 (mod 2)

  • EXAMPLE 2.36 Find the quotient and remainder when 1001011_2 is divided by 1011_2.

Quotient: 1010_2; Remainder: 101_2.

  • Calculating and Using CRCs
    o Suppose we want to transmit the information string 1001011_2.
    o The receiver and sender decide to use the (arbitrary) polynomial pattern 1011.
    o The information string is shifted left by one position less than the number of positions in the divisor: I = 1001011000_2.
    o The remainder is found through modulo 2 division and appended to the information string: 1001011000_2 + 100_2 = 1001011100_2.
    o If no bits are lost or corrupted, dividing the received information string by the agreed-upon pattern will give a remainder of zero.
    o Real applications use longer polynomials to cover larger information strings.
  • A remainder other than zero indicates that an error has occurred in the transmission.
  • This method works best when a large prime polynomial is used.
  • There are four standard polynomials used widely for this purpose:
    o CRC-CCITT (ITU-T): X^16 + X^12 + X^5 + 1
    o CRC-12: X^12 + X^11 + X^3 + X^2 + X + 1
    o CRC-16 (ANSI): X^16 + X^15 + X^2 + 1
    o CRC-32: X^32 + X^26 + X^23 + X^22 + X^16 + X^12 + X^11 + X^10 + X^8 + X^7 + X^5 + X^4 + X^2 + X + 1
  • It has been proven that CRCs using these polynomials can detect over 99.8% of all single-bit errors.
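The modulo-2 long division used above can be sketched in a few lines of Python (illustrative code, not from the notes), reproducing the worked example with information string 1001011 and pattern 1011:

```python
def mod2_remainder(dividend: str, divisor: str) -> str:
    """Long division over GF(2): subtract (XOR) the divisor wherever a 1 leads."""
    d = [int(b) for b in dividend]
    for i in range(len(d) - len(divisor) + 1):
        if d[i]:                                   # leading bit 1: XOR divisor in
            for j, bit in enumerate(divisor):
                d[i + j] ^= int(bit)
    return "".join(map(str, d[-(len(divisor) - 1):]))

info, pattern = "1001011", "1011"
shifted = info + "0" * (len(pattern) - 1)          # shift left by len(pattern) - 1
print(mod2_remainder(shifted, pattern))            # 100: the CRC to append
print(mod2_remainder("1001011100", pattern))       # 000: received string checks out
```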
2.7.2 Hamming Codes 95

  • EXAMPLE 2.39 Using the Hamming code just described and even parity, encode the 8-bit ASCII character K. (The high-order bit will be zero.) Induce a single-bit error and then indicate how to locate the error.

With m = 8 data bits, we need r check bits satisfying m + r + 1 <= 2^r; choosing r = 4 works (8 + 4 + 1 = 13 <= 16). The parity bits go at positions 1, 2, 4, and 8.

The character K is 75_10 = 01001011_2.

1 = 1         5 = 1 + 4         9 = 1 + 8
2 = 2         6 = 2 + 4        10 = 2 + 8
3 = 1 + 2     7 = 1 + 2 + 4    11 = 1 + 2 + 8
4 = 4         8 = 8            12 = 4 + 8

We have the following code word as a result:

Bit:       0  1  0  0  1  1  0  1  0  1  1  0
Position: 12 11 10  9  8  7  6  5  4  3  2  1

Parity b1 = b3 + b5 + b7 + b9 + b11 = 1 + 1 + 1 + 0 + 1 = 0 (mod 2)
Parity b2 = b3 + b6 + b7 + b10 + b11 = 1 + 0 + 1 + 0 + 1 = 1 (mod 2)
Parity b4 = b5 + b6 + b7 + b12 = 1 + 0 + 1 + 0 = 0 (mod 2)
Parity b8 = b9 + b10 + b11 + b12 = 0 + 0 + 1 + 0 = 1 (mod 2)

Let’s introduce an error in bit position b9, resulting in the code word:

Bit:       0  1  0  1  1  1  0  1  0  1  1  0
Position: 12 11 10  9  8  7  6  5  4  3  2  1

Parity b1 = b3 + b5 + b7 + b9 + b11 = 1 + 1 + 1 + 1 + 1 = 1 (Error: should be 0)
Parity b2 = b3 + b6 + b7 + b10 + b11 = 1 + 0 + 1 + 0 + 1 = 1 (OK)
Parity b4 = b5 + b6 + b7 + b12 = 1 + 0 + 1 + 0 = 0 (OK)
Parity b8 = b9 + b10 + b11 + b12 = 1 + 0 + 1 + 0 = 0 (Error: should be 1)

We found that parity bits 1 and 8 produced an error, and 1 + 8 = 9, which is exactly where the error occurred.
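The whole procedure can be condensed into a Python sketch (illustrative, not from the notes): place the data bits, compute even parity over the covered positions, and locate an error by summing the failing parity positions:

```python
def hamming_encode(data_bits: str) -> dict:
    """Place 8 data bits (MSB first) into the non-power-of-two positions 3..12."""
    word = {}
    positions = [p for p in range(1, 13) if p & (p - 1) != 0]  # 3,5,6,7,9,10,11,12
    for pos, bit in zip(sorted(positions, reverse=True), data_bits):
        word[pos] = int(bit)                     # highest position gets the MSB
    for p in (1, 2, 4, 8):                       # each parity bit covers positions
        covered = [q for q in positions if q & p]  # whose index contains p
        word[p] = sum(word[q] for q in covered) % 2
    return word

def hamming_syndrome(word: dict) -> int:
    """Sum of failing parity positions: 0 means no error, else the bad bit."""
    syndrome = 0
    for p in (1, 2, 4, 8):
        covered = [q for q in word if q != p and q & p]
        if sum(word[q] for q in covered) % 2 != word[p]:
            syndrome += p
    return syndrome

word = hamming_encode("01001011")                        # ASCII 'K'
print("".join(str(word[p]) for p in range(12, 0, -1)))   # 010011010110
word[9] ^= 1                                             # flip bit 9, as above
print(hamming_syndrome(word))                            # 9: parities 1 and 8 fail
```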

2.7.3 Reed-Solomon 102

  • If we expect errors to occur in blocks, it stands to reason that we should use an error-correcting code that operates at a block level, as opposed to a Hamming code, which operates at the bit level.
  • A Reed-Solomon (RS) code can be thought of as a CRC that operates over entire characters instead of only a few bits.
  • RS codes, like CRCs, are systematic: the parity bytes are appended to a block of information bytes.
  • RS(n, k) codes are defined using the following parameters:
    o s = the number of bits in a character (or “symbol”)
    o k = the number of s-bit characters comprising the data block
    o n = the number of symbols in the code word
  • RS(n, k) can correct (n - k)/2 errors in the k information bytes.
  • Reed-Solomon error-correction algorithms lend themselves well to implementation in computer hardware.
  • They are implemented in high-performance disk drives for mainframe computers as well as compact disks used for music and data storage. These implementations will be described in Chapter 7.

Chapter Summary 103

  • Computers store data in the form of bits, bytes, and words using the binary numbering system.
  • Hexadecimal numbers are formed using four-bit groups called nibbles (or nybbles).
  • Signed integers can be stored in one’s complement, two’s complement, or signed magnitude representation.
  • Floating-point numbers are usually coded using the IEEE 754 floating-point standard.
  • Character data is stored using ASCII, EBCDIC, or Unicode.
  • Error detecting and correcting codes are necessary because we can expect no transmission or storage medium to be perfect.
  • CRC, Reed-Solomon, and Hamming codes are three important error control codes.