
Performance Engineering
Looking at Random Data & A Simulation Example

Goals:

1. Look at the nature of random data. What happens as random data is used in multiple operations?
2. Look at how network arrivals really work: are arrivals random, or do they follow some other pattern?
3. Use our simulation techniques to study these patterns (so this is really an example of simulation usage).
4. Determine the difference in behavior as a result of network arrival patterns.

Random Data

1. Let's take a very simple piece of code:

   if ( random() >= 0.5 )        /* random() here denotes a uniform [0,1) value */
       HeadsGreaterThanTails++;
   else
       HeadsGreaterThanTails--;

2. When we run the program, we collect the value of the variable every 100 million iterations, and do it for a total of 1 billion iterations.

3. Here's a sample run.

Iterations      Proc 0
100,000,000     -10299
200,000,000     -4245
300,000,000     5141
400,000,000     3197
500,000,000     -1313
600,000,000     -25941
700,000,000     -24093
800,000,000     -24661
900,000,000     -27123
1,000,000,000   -23997

After 400 million iterations, there were 3,197 more "heads" than "tails".
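The slides show only the inner if/else; a minimal self-contained sketch of the whole experiment, assuming drand48() as the uniform [0,1) generator and an arbitrary seed, might look like this:

/*
 * Sketch of the coin-flip experiment described above.
 * drand48() and the seed are our assumptions, not from the slides.
 */
#include <stdio.h>
#include <stdlib.h>

int main( void )
{
    long long HeadsGreaterThanTails = 0;
    long long Iteration;

    srand48( 12345 );                          /* arbitrary seed */
    for ( Iteration = 1; Iteration <= 1000000000LL; Iteration++ ) {
        if ( drand48() >= 0.5 )
            HeadsGreaterThanTails++;
        else
            HeadsGreaterThanTails--;
        if ( Iteration % 100000000LL == 0 )    /* report every 100 million flips */
            printf( "%13lld  %8lld\n", Iteration, HeadsGreaterThanTails );
    }
    return 0;
}

Each run produces one column like the "Proc 0" column above. The running count is a one-dimensional random walk, so its typical deviation after n flips grows like the square root of n rather than shrinking toward zero.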

Random Data

1. Now let's do the same thing for 8 processes; a hypothetical launcher is sketched after the table below.

2. What do you think will happen to the numbers?
   • Will some process always have more heads than tails?
   • Will the difference between results for processes depend on how many iterations have been done?

3. Here's the result for 8 processes:

Iterations     Proc 0   Proc 1   Proc 2   Proc 3   Proc 4   Proc 5   Proc 6   Proc 7
100,000,000    -10299   -9319    -1063    6743     8633     -4421    8123     -
200,000,000    -4245    -10227   3657     -23059   24885    -26655   25865    -
300,000,000    5141     -6819    255      -20175   14469    -33389   27077    -
400,000,000    3197     -8155    -5379    -6633    27387    -50509   24531    2339
500,000,000    -1313    -10547   -153     -14679   29335    -51963   23097    -
600,000,000    -25941   -29847   -26371   5027     32857    -49505   27089    -
700,000,000    -24093   -26331   -43401   13153    24471    -26899   4561     -
800,000,000    -24661   -35315   -31233   41       20425    -11861   13837    -
900,000,000    -27123   -33049   -44461   -11769   -3283    -12477   15865    -
1,000,000,000  -23997   -15483   -44535   22889    -8447    -13671   15743    6023
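The slides don't show how the 8 processes are launched; here is a minimal hypothetical sketch using fork() with a per-process seed so each stream is independent (all names and constants here are ours):

/* Hypothetical launcher: fork 8 children, each running the coin-flip loop. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define PROCESSES 8

int main( void )
{
    int p;

    for ( p = 0; p < PROCESSES; p++ ) {
        if ( fork() == 0 ) {                    /* child process */
            long long Count = 0, i;
            srand48( getpid() );                /* per-process seed */
            for ( i = 1; i <= 1000000000LL; i++ ) {
                Count += ( drand48() >= 0.5 ) ? 1 : -1;
                if ( i % 100000000LL == 0 )
                    printf( "Proc %d  %13lld  %8lld\n", p, i, Count );
            }
            exit( 0 );
        }
    }
    for ( p = 0; p < PROCESSES; p++ )
        wait( NULL );                           /* reap all children */
    return 0;
}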

Random Data

As you can see from the numbers above, the statistics are terrible; it's hard to determine a pattern from individual runs. So the program was run 10,000 times, and the minimum and maximum counts were taken at each time interval across those 10,000 runs.

[Figure: The Max and Min values in All Runs. X axis: Iterations, 0 to 1,200,000,000; Y axis: Max/Min + 125000, 0 to 250,000. Two series: Min of all runs, Max of all runs.]

Random Data

But what happens if the processes doing random events interact with each other? This is the case if the programs are all accessing the same disk: we randomly choose which block in a large file is being written to, but each process must compete for the file lock and for disk access. Here's the behavior of 10 disk-writing processes over 10,000 seconds (a contention sketch follows the table). The numbers represent each process's cumulative disk writes up to that time.

Secs Proc 0 Proc 1 Proc 2 Proc 3 Proc 4 Proc 5 Proc 6 Proc 7 Proc 8 Proc 9

1000 21660 21650 21810 21800 21790 21720 21850 21740 21640 21730

2000 43000 42960 43080 43120 43220 42960 43190 43110 42900 43080

3000 64790 64650 64850 64930 65060 64680 64900 64860 64770 64940

4000 86610 86450 86620 86680 86750 86530 86640 86660 86560 86690

5000 108450 108280 108370 108450 108520 108410 108480 108380 108400 108580

6000 130010 129860 129990 129950 129980 130050 130090 130010 129910 130080

7000 151730 151600 151710 151730 151730 151770 151750 151820 151750 151800

8000 173340 173340 173400 173640 173480 173400 173520 173660 173470 173500

9000 194950 195050 195010 195300 195090 195000 195230 195440 195130 195150

10000 216760 216880 216780 217140 216860 216740 216990 217240 216880 216960
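The slides don't show the disk-contention program; here is a minimal hypothetical sketch using threads and a single mutex to model the file lock. The thread count, block range, and all names are our assumptions; compile with -lpthread.

/* Sketch: 10 writers competing for one lock (models the file lock). */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define WRITERS     10
#define WRITES_EACH 100000

static pthread_mutex_t DiskLock = PTHREAD_MUTEX_INITIALIZER;
static long WritesByThread[WRITERS];

static void *DiskWriter( void *Arg )
{
    long         Id   = (long)Arg;
    unsigned int Seed = 1234 + (unsigned int)Id;   /* per-thread RNG state */
    int          i;

    for ( i = 0; i < WRITES_EACH; i++ ) {
        int Block = rand_r( &Seed ) % 1000000;     /* random block in a large file */
        pthread_mutex_lock( &DiskLock );           /* compete for the file lock */
        (void)Block;                               /* a real program would write Block here */
        WritesByThread[Id]++;
        pthread_mutex_unlock( &DiskLock );
    }
    return NULL;
}

int main( void )
{
    pthread_t Tid[WRITERS];
    long      t;

    for ( t = 0; t < WRITERS; t++ )
        pthread_create( &Tid[t], NULL, DiskWriter, (void *)t );
    for ( t = 0; t < WRITERS; t++ )
        pthread_join( Tid[t], NULL );
    for ( t = 0; t < WRITERS; t++ )
        printf( "Proc %ld: %ld writes\n", t, WritesByThread[t] );
    return 0;
}

Because every writer serializes on the same lock, the per-thread totals stay tightly bunched, which is exactly the behavior the table above shows.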

Random Data

Comparing the 10 processes. This is the spread: the maximum minus the minimum access count across the processes.

[Figure: Disk Access Rates With Time (Max Access - Min Access). X axis: Time (seconds), 0 to 12,000; Y axis: Difference in Accesses, 0 to 600.]

Random Data

Comparing the 10 processes. Here's how their relative performance varies over time. Note that no one process is always the minimum or the maximum performer.

[Figure: Process Writes, and how they deviate from the minimum value. X axis: Time (seconds), 0 to 10,000; Y axis: Process writes compared to the minimum process, 0 to 600. One series per process, Proc 0 through Proc 9.]

Another Numerical Example

[Figure: two overlapping circles for the problem below; labels X, A, and B mark the centers and the intersection points.]

Another Numerical Example

//////////////////////////////////////////////////////////////////////////
// We're trying to solve the following problem:
// Given two circles, how close should the centers of the circles be such
// that the area subtended by the arcs of the two circles is exactly one
// half the total area of the circle?
//
// See example 2.3.8 in Leemis & Park.
// We use the book's definition for Uniform - see 2.3.
// Here's how this works. Try a number of different distances between
// the two circle centers. Then, for the ones that are most successful,
// zoom in to do them in more detail.
//////////////////////////////////////////////////////////////////////////
#include <math.h>
#include <stdlib.h>

#define PI    3.14159265358979
#define TRUE  1
#define FALSE 0

// Prototypes
double GetRandomNumber( void );
void   InitializeRandomNumber( void );
double ModelTwoCircles( double, int );

double Uniform( double min, double max )
{
    return( min + (max - min) * GetRandomNumber() );
}

double ModelTwoCircles( double Distance, int NumberOfSamples )
{
    double HitsInOneCircle = 0, HitsInTwoCircles = 0;
    double x, y, SecondDistance;
    int    Samples;

    for ( Samples = 0; Samples < NumberOfSamples; Samples++ ) {
        do {
            x = Uniform( -1, 1 );
            y = Uniform( -1, 1 );
        } while ( (x * x) + (y * y) >= 1 );   // Loop until the point falls in the first circle

        HitsInOneCircle++;
        SecondDistance = sqrt( (x - Distance) * (x - Distance) + (y * y) );
        if ( SecondDistance < 1.0 ) {
            HitsInTwoCircles++;
            // printf( "Samples: Second Distance = %8.6f\n", SecondDistance );
        }
    }   // End of for
    return( HitsInTwoCircles / HitsInOneCircle );
}
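The slides omit the driver and the random-number routines. Here is a minimal hypothetical harness, with GetRandomNumber() backed by the C library's rand() purely so the sketch runs, that sweeps the center distance looking for an overlap fraction of 0.5:

// Hypothetical driver for ModelTwoCircles() above; not from the slides.
#include <stdio.h>
#include <stdlib.h>

double ModelTwoCircles( double, int );     // defined above

double GetRandomNumber( void )             // assumed uniform [0,1) generator
{
    return (double)rand() / ((double)RAND_MAX + 1.0);
}

void InitializeRandomNumber( void )
{
    srand( 12345 );                        // arbitrary seed
}

int main( void )
{
    int d;

    InitializeRandomNumber();
    // Sweep center-to-center distances; we want the distance where the
    // overlap fraction crosses 0.5.
    for ( d = 0; d <= 10; d++ ) {
        double Distance = d / 10.0;
        printf( "Distance %4.2f: overlap fraction = %8.6f\n",
                Distance, ModelTwoCircles( Distance, 1000000 ) );
    }
    return 0;
}

Once the coarse sweep brackets the crossing, the same loop can be rerun over a narrower range of distances, which is the "zoom in" step the comments describe.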

Network Arrivals

1. In our queueing analysis, we've assumed random arrivals (a Poisson arrival process, with exponentially distributed inter-arrival times).

2. This leads to our analysis of M/M/1 queues, with
   • Utilization U = Service Time / Inter-Arrival Time, and
   • Queue Length = U / ( 1 - U ).

3. We generated uniformly distributed random numbers and, based on those, derived the exponential inter-arrival times and Poisson distributions (see the sketch below).
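The derivation alluded to in step 3 is the inverse-CDF (inverse transform) method: if u is Uniform(0,1), then t = -mean * log(1 - u) is exponentially distributed with the given mean. A minimal sketch, where the name Exponential is ours:

// Inverse-CDF sketch: turn a uniform [0,1) sample into an
// exponentially distributed inter-arrival time.
#include <math.h>

double GetRandomNumber( void );   // assumed uniform [0,1) generator

double Exponential( double mean )
{
    return( -mean * log( 1.0 - GetRandomNumber() ) );
}

Summing successive Exponential(mean) samples gives the arrival instants of a Poisson process with rate 1/mean.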

But is this how networks behave?

Random Arrivals

What Did Leland et al. Measure?

Millions of packets from many workstations, as recorded on Bellcore internal networks.

What Did Leland et al. Measure?

Significance of self-similarity

  • The nature of traffic generated by individual Ethernet users. The aggregate traffic study provides insights into the traffic generated by individual users. The nature of congestion produced by self-similar models differs drastically from that predicted by standard formal models. We will show this by the simulation we perform here.

Why is Ethernet traffic self-similar?

  • A plausible physical explanation of self-similarity in Ethernet traffic: people don't generate traffic randomly. They come to work at the same time, get tired at the same time, etc.

Mathematical Result

  • The superposition of many ON/OFF sources whose ON-periods and OFF-periods have high variability (infinite variance) produces aggregate network traffic that is self-similar, or long-range dependent. (Infinite variance here means that some samples have a very long inter-arrival time; lunch hour is a very long time!) A construction is sketched below.
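To make the ON/OFF result concrete, here is a minimal sketch (our construction, not code from the slides or from Leland et al.) that superposes ON/OFF sources whose period lengths are Pareto-distributed with shape 1 < ALPHA < 2, so they have a finite mean but infinite variance:

// Sketch: superpose ON/OFF sources with heavy-tailed period lengths.
// SOURCES, TICKS, and ALPHA are illustrative choices.
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define SOURCES 100
#define TICKS   10000
#define ALPHA   1.5      /* 1 < ALPHA < 2: finite mean, infinite variance */

static double Uniform01( void )
{
    return ( rand() + 1.0 ) / ( (double)RAND_MAX + 2.0 );   /* never 0 */
}

/* Inverse-CDF Pareto sample: heavy-tailed period length, minimum 1. */
static double ParetoPeriod( void )
{
    return pow( Uniform01(), -1.0 / ALPHA );
}

int main( void )
{
    static int Traffic[TICKS];       /* packets per tick, all sources */
    int s, t;

    srand( 12345 );
    for ( s = 0; s < SOURCES; s++ ) {
        int On = rand() & 1;         /* random initial ON/OFF state */
        t = 0;
        while ( t < TICKS ) {
            int Length = (int)ceil( ParetoPeriod() );
            int End    = ( t + Length < TICKS ) ? t + Length : TICKS;
            for ( ; t < End; t++ )
                if ( On )
                    Traffic[t]++;    /* one packet per tick while ON */
            On = !On;                /* toggle ON <-> OFF */
        }
    }
    for ( t = 0; t < TICKS; t += 1000 )
        printf( "tick %5d: %d packets\n", t, Traffic[t] );
    return 0;
}

Plotting Traffic[] at several aggregation scales should show the burstiness persisting rather than smoothing out, which is the self-similar signature; replacing ParetoPeriod() with an exponential period makes the aggregate smooth out quickly, as the standard Poisson models predict.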