Cloud Computing – How did we get here?, Lecture notes of Computer Architecture and Organization

Slides and notes from a lecture on Cloud Computing given by Wes J. Lloyd at the School of Engineering and Technology, University of Washington - Tacoma. The lecture covers data, thread-level, and task-level parallelism; parallel architectures; SIMD architectures, vector processing, and multimedia extensions; graphics processing units; speed-up, Amdahl's Law, and Scaled Speedup; properties of distributed systems; and modularity. It also introduces Cloud Computing concepts, technology, and architecture, and includes feedback from students and additional resources on MapReduce.

Typology: Lecture notes

2021/2022

Uploaded on 05/11/2023

TCSS 462: Cloud Computing
TCSS 562: Software Engineering for Cloud Computing
School of Engineering and Technology, UW-Tacoma
[Fall 2022]
Slides by Wes J. Lloyd L4.1

TCSS 462/562: (Software Engineering for) Cloud Computing

Cloud Computing – How did we get here?

Wes J. Lloyd
School of Engineering and Technology
University of Washington - Tacoma
OBJECTIVES – 10/11

 Questions from 10/6
 Cloud Computing – How did we get here?
 (Marinescu Ch. 2 - 1st edition, Ch. 4 - 2nd edition)
 Data, thread-level, task-level parallelism & parallel architectures
 Class Activity 1 – Implicit vs Explicit Parallelism
 SIMD architectures, vector processing, multimedia extensions
 Graphics processing units
 Speed-up, Amdahl's Law, Scaled Speedup
 Properties of distributed systems
 Modularity
 Introduction to Cloud Computing – loosely based on book #1:
 Cloud Computing Concepts, Technology & Architecture

October 11, 2022 TCSS462/562: (Software Engineering for) Cloud Computing [Fall 2022]
School of Engineering and Technology, University of Washington - Tacoma L4.2



MATERIAL / PACE

 Please classify your perspective on material covered in today's class (47 respondents):
 ▪ 1 - mostly review, 5 - equal new/review, 10 - mostly new
 ▪ Average – 6.89 (previous 6.16)
 Please rate the pace of today's class:
 ▪ 1 - slow, 5 - just right, 10 - fast
 ▪ Average – 5.62 (previous 5.35)
 Response rates:
 ▪ TCSS 462: 25/32 – 78.1%
 ▪ TCSS 562: 22/26 – 84.6%

FEEDBACK FROM 10/

 I'm not quite clear on how bit-level and instruction-level parallelism, being implicit, happen "automatically".
 With bit-level parallelism, arithmetic operations that require multiple instructions on CPUs with a smaller word size can be accomplished with a single instruction on today's 64-bit CPUs.
 Word "size" refers to the amount of data a CPU's internal data registers can hold and process at one time. Modern desktop computers have 64-bit words. Computers embedded in appliances and consumer products have word sizes of 8, 16, or 32 bits.
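The "automatic" speed-up can be made concrete with a sketch (a pure-Python model of the hardware idea, not actual machine code): on a CPU with 16-bit words, adding two 64-bit values takes four limb additions with carry propagation, while a 64-bit CPU performs the same addition in a single instruction.

```python
MASK16 = 0xFFFF

def add64_with_16bit_words(a: int, b: int) -> int:
    """Add two 64-bit integers using only 16-bit operations (4 limbs + carry)."""
    result = 0
    carry = 0
    for limb in range(4):  # least-significant limb first
        s = ((a >> (16 * limb)) & MASK16) + ((b >> (16 * limb)) & MASK16) + carry
        result |= (s & MASK16) << (16 * limb)
        carry = s >> 16
    return result  # carry out of bit 64 is discarded, like hardware overflow

a, b = 0x0123456789ABCDEF, 0x1111111111111111
assert add64_with_16bit_words(a, b) == (a + b) & 0xFFFFFFFFFFFFFFFF
```

The four-iteration loop stands in for the four instructions a 16-bit CPU would issue; widening the word to 64 bits collapses the loop to one operation, with no change to the program's logic, which is why this parallelism is implicit.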

FEEDBACK - 4

 Am seeking some clarification for what MAP-REDUCE is besides a framework that uses lots of data processed in parallel. Are cloud computing services built using this infrastructure, which then decides how the work is broken up for servers with different system hardware (heterogeneous, homogeneous, etc.)?
 MapReduce is a programming model for writing applications that process vast amounts of data (multi-terabyte data-sets) in parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner.
 For data parallelism, we also consider data processing tasks that can be sped up using a divide-and-conquer approach.
 MapReduce provides a programming model and architecture for repeatedly applying the divide-and-conquer pattern.

MAP-REDUCE

 MapReduce consists of two sequential tasks: Map and Reduce. MAP filters and sorts data while converting it into key-value pairs. REDUCE takes this input and reduces its size by performing some kind of summary operation over the dataset.
 MapReduce drastically speeds up big data tasks by breaking down large datasets and processing them in parallel.
 The MapReduce paradigm was first proposed in 2004 by Google and later incorporated into the open-source Apache Hadoop framework for distributed processing over large datasets using files.
 Apache Spark supports MapReduce over large datasets in RAM.
 Amazon Elastic Map Reduce (EMR) provides cloud-provider-managed services for Apache Hadoop and Spark.
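A toy word count illustrates the Map and Reduce phases described above (a minimal single-process sketch in plain Python, not the Hadoop or Spark API; the function names are illustrative):

```python
from collections import defaultdict

def map_phase(document: str):
    """MAP: convert raw input into (key, value) pairs - here (word, 1)."""
    for word in document.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """REDUCE: summarize each group - here, sum the counts."""
    return {key: sum(values) for key, values in groups.items()}

docs = ["the cloud", "the data the cloud"]
pairs = [p for d in docs for p in map_phase(d)]
counts = reduce_phase(shuffle(pairs))
# counts == {"the": 3, "cloud": 2, "data": 1}
```

In a real cluster the map calls run on many nodes at once over separate input splits, and the shuffle moves each key's values to the node that reduces it; the divide-and-conquer structure is the same.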

MAP-REDUCE - ADDITIONAL RESOURCES

 Original Google paper on MapReduce:
 https://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf
 Apache Spark: https://spark.apache.org/
 Apache Hadoop: https://hadoop.apache.org/
 Amazon Elastic Map Reduce: https://aws.amazon.com/emr/

FEEDBACK - 3

 When you speak through the mic in class there's a bit of a delay and it can be somewhat distracting at times. Would you be able to change anything about that to minimize the delay?
 Is this happening on Zoom? Or in the classroom?
 In the classroom I'm able to use the Zoom audio as output and am able to speak with less microphone feedback because of the delay (as long as the volume is not too high).
 I cannot use the Zoom audio, but then it may be hard to hear questions asked verbally over Zoom.
 This is a work in progress…

TUTORIAL 2

 Introduction to Bash Scripting
 https://faculty.washington.edu/wlloyd/courses/tcss562/tutorials/TCSS462_562_f2022_tutorial_2.pdf
 Review tutorial sections:
 ▪ 1. What is a BASH script?
 ▪ 2. Variables
 ▪ 3. Input
 ▪ 4. Arithmetic
 ▪ 5. If Statements
 ▪ 6. Loops
 ▪ 7. Functions
 ▪ 8. User Interface
 Create a BASH web service client:
 ▪ Call service to obtain IP address & lat/long of computer
 ▪ Call service to obtain weather forecast for lat/long

ACTIVITY 1

 Form groups of ~3 - in class or with Zoom breakout rooms
 Each group will complete a MS Word DOCX worksheet
 Be sure to add names at the top of the document as they appear in Canvas
 Activity can be completed in class or after class
 The activity can also be completed individually
 When completed, one person should submit a PDF of the Google Doc to Canvas
 Instructor will score all group members based on the uploaded PDF file
 To get started:
 ▪ Log into your UW Google Account (https://drive.google.com) using your UW NET ID
 ▪ Follow the link: https://tinyurl.com/tcss462-562-a
 Solutions to be discussed…

PARALLELISM QUESTIONS

 7. For bit-level parallelism, should a developer be concerned with the available number of virtual CPU processing cores when choosing a cloud-based virtual machine if wanting to obtain the best possible speed-up? (Yes / No)
 8. For instruction-level parallelism, should a developer be concerned with the physical CPU's architecture used to host a cloud-based virtual machine if wanting to obtain the best possible speed-up? (Yes / No)

PARALLELISM QUESTIONS - 2

 9. For thread-level parallelism (TLP), where a programmer has spent considerable effort to parallelize their code and algorithms, what consequences result when this code is deployed on a virtual machine with too few virtual CPU processing cores?
 What happens when this code is deployed on a virtual machine with too many virtual CPU processing cores?

MICHAEL FLYNN'S COMPUTER ARCHITECTURE TAXONOMY

 Michael Flynn proposed a taxonomy of computer architectures based on concurrent instructions and number of data streams (1966)
 SISD (Single Instruction, Single Data)
 SIMD (Single Instruction, Multiple Data)
 MIMD (Multiple Instructions, Multiple Data)
 LESS COMMON: MISD (Multiple Instructions, Single Data)
 ▪ Pipeline architectures: functional units perform different operations on the same data
 ▪ For fault tolerance, may want to execute the same instructions redundantly to detect and mask errors – for task replication

FLYNN'S TAXONOMY - 2

 MIMD (Multiple Instructions, Multiple Data) - a system with several processors and/or cores that function asynchronously and independently
 At any time, different processors/cores may execute different instructions on different data
 Multi-core CPUs are MIMD
 Processors share memory via interconnection networks
 ▪ Hypercube, 2D torus, 3D torus, omega network, other topologies
 MIMD systems have different methods of sharing memory
 ▪ Uniform Memory Access (UMA)
 ▪ Cache Only Memory Access (COMA)
 ▪ Non-Uniform Memory Access (NUMA)

ARITHMETIC INTENSITY

 Arithmetic intensity: ratio of work (W) to memory traffic r/w (Q)
 ▪ Example: # of floating point ops per byte of data read
 Characterizes application scalability with SIMD support
 ▪ SIMD can perform many fast matrix operations in parallel
 High arithmetic intensity: programs with dense matrix operations scale up nicely (many calcs vs memory r/w, supports lots of parallelism)
 Low arithmetic intensity: programs with sparse matrix operations do not scale well with problem size (memory r/w becomes the bottleneck, not enough ops!)
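A back-of-the-envelope calculation shows why dense matrix code has high arithmetic intensity (a sketch assuming a naive n x n double-precision matrix multiply with no cache reuse; the counts are the textbook estimates, not measurements):

```python
def matmul_intensity(n: int, bytes_per_element: int = 8) -> float:
    """Arithmetic intensity W/Q for a dense n x n matrix multiply."""
    work = 2 * n ** 3                          # one multiply + one add per term
    traffic = 3 * n ** 2 * bytes_per_element   # read A and B, write C
    return work / traffic                      # flops per byte = n / 12

for n in (64, 512, 4096):
    print(n, matmul_intensity(n))
```

Intensity grows linearly with n, so large dense problems do plenty of arithmetic per byte moved; a sparse matrix-vector product, by contrast, does O(1) flops per element read, so its intensity stays flat and memory bandwidth becomes the bottleneck.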

ROOFLINE MODEL

 When a program reaches a given arithmetic intensity, the performance of code running on the CPU hits a "roof"
 The CPU performance bottleneck changes from memory bandwidth (left) to floating point performance (right)

Key take-aways:
 When a program has low arithmetic intensity, memory bandwidth limits performance.
 With high arithmetic intensity, the system reaches peak parallel performance…
 → performance is limited by??


PARALLEL COMPUTING

 Parallel hardware and software systems allow us to:
 ▪ Solve problems demanding resources not available on a single system
 ▪ Reduce the time required to obtain a solution
 The speed-up (S) measures the effectiveness of parallelization:
 S(N) = T(1) / T(N)
 T(1) → execution time of the total sequential computation
 T(N) → execution time for performing N parallel computations in parallel

SPEED-UP EXAMPLE

 Consider embarrassingly parallel image processing
 Eight images (multiple data)
 Apply an image transformation (greyscale) in parallel
 8-core CPU, 16 hyperthreads
 Sequential processing: perform transformations one at a time using a single program thread
 ▪ 8 images, 3 seconds each: T(1) = 24 seconds
 Parallel processing
 ▪ 8 images, 3 seconds each: T(N) = 3 seconds
 Speedup: S(N) = 24 / 3 = 8x speedup
 Called "perfect scaling"
 Must consider data transfer and computation setup time
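The example's numbers can be checked directly from the definition S(N) = T(1) / T(N):

```python
def speedup(t_sequential: float, t_parallel: float) -> float:
    """S(N) = T(1) / T(N): ratio of sequential to parallel execution time."""
    return t_sequential / t_parallel

# 8 images at 3 s each: T(1) = 24 s sequentially, T(8) = 3 s on 8 cores
assert speedup(8 * 3, 3) == 8.0  # "perfect scaling": 8 cores, 8x speedup
```

Real measurements rarely hit this ideal because T(N) also includes the data transfer and setup time the slide warns about.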

AMDAHL'S LAW

 Amdahl's law is used to estimate the speed-up of a job using parallel computing
 1. Divide the job into two parts
 2. Part A will remain sequential
 3. Part B will be sped up with parallel computing
 The portion of the computation which cannot be parallelized will determine (i.e. limit) the overall speedup
 Amdahl's law assumes jobs are of a fixed size
 Amdahl's law also assumes no overhead for distributing the work, and a perfectly even work distribution
 S = theoretical speedup of the whole task
 f = fraction of work that is parallel (ex. 25% or 0.25)
 N = proposed speed-up of the parallel part (ex. 5 times speedup)
 S = 1 / ((1 - f) + f / N)
 % improvement of task execution = 100 * (1 – (1 / S))
 Using Amdahl's law, what is the maximum possible speed-up?
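Using the S, f, and N defined above, Amdahl's Law in its standard form S = 1 / ((1 - f) + f / N) can be sketched in a few lines (the worked numbers below use the slide's own example of 25% parallel work sped up 5x):

```python
def amdahl_speedup(f: float, n: float) -> float:
    """Amdahl's Law: f = parallel fraction, n = speedup of the parallel part."""
    return 1.0 / ((1.0 - f) + f / n)

s = amdahl_speedup(0.25, 5)          # 1 / (0.75 + 0.05), roughly 1.25x
improvement = 100 * (1 - 1 / s)      # roughly 20% faster overall
```

It also answers the maximum-speed-up question: as N grows without bound, f / N vanishes and S approaches 1 / (1 - f), so the sequential portion caps the speedup no matter how many processors are added.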

GUSTAFSON'S LAW

 Calculates the scaled speed-up using "N" processors
 S(N) = N + (1 - N) α
 N: number of processors
 α: fraction of program run time which can't be parallelized (e.g. must run sequentially)
 Can be used to estimate the runtime of the parallel portion of a program
 Where α = σ / (σ + π)
 σ = sequential time, π = parallel time
 Our Amdahl's example: σ = 3s, π = 1s, α = 3 / (3 + 1) = 0.75

GUSTAFSON'S LAW

 Calculates the scaled speed-up using "N" processors
 S(N) = N + (1 - N) α
 N: number of processors
 α: fraction of program run time which can't be parallelized (e.g. must run sequentially)
 Example: consider a program that is embarrassingly parallel, but 75% cannot be parallelized: α = 0.75
 QUESTION: If deploying the job on a 2-core CPU, what scaled speedup is possible assuming the use of two processes that run in parallel?

GUSTAFSON'S EXAMPLE

 QUESTION: What is the maximum theoretical speed-up on a 2-core CPU?
 S(N) = N + (1 - N) α
 N = 2, α = 0.75
 S(N) = 2 + (1 - 2)(0.75)
 S(N) = ?
 What is the maximum theoretical speed-up on a 16-core CPU?
 S(N) = N + (1 - N) α
 N = 16, α = 0.75
 S(N) = 16 + (1 - 16)(0.75)
 S(N) = ?

GUSTAFSON'S EXAMPLE - 2

 What is the maximum theoretical speed-up on a 2-core CPU?
 S(N) = 2 + (1 - 2)(0.75) = 1.25
 What is the maximum theoretical speed-up on a 16-core CPU?
 S(N) = 16 + (1 - 16)(0.75) = 4.75

For 2 CPUs, speed-up is 1.25x
For 16 CPUs, speed-up is 4.75x
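Both answers can be verified with a few lines using the slide's formula S(N) = N + (1 - N)α and the example's α = σ / (σ + π) = 0.75:

```python
def gustafson_speedup(n: int, alpha: float) -> float:
    """Gustafson's Law: n processors, alpha = sequential fraction of run time."""
    return n + (1 - n) * alpha

alpha = 3 / (3 + 1)                       # sigma = 3 s, pi = 1 s -> 0.75
assert gustafson_speedup(2, alpha) == 1.25
assert gustafson_speedup(16, alpha) == 4.75
```

Note how slowly the scaled speedup grows when α is large: going from 2 to 16 cores (8x the hardware) yields under 4x more speedup, because 75% of the run time stays sequential.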