The probability of occurrence of specific values in a ...

Page 1: The probability of occurrence of specific values in a ...

tom.h.wilson [email protected]

Department of Geology and Geography, West Virginia University

Morgantown, WV

The probability of occurrence of specific values in a sample often takes on a bell-shaped appearance as in the case of our pebble mass distribution.

[Figure: probability histogram, "Pebble masses collected from beach A". x-axis: Mass (grams), 150 to 550; y-axis: Probability, 0.00 to 0.40]

Page 2: The probability of occurrence of specific values in a ...


The Gaussian or normal distribution p(x) is a mathematical representation of this bell-shaped characteristic.

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-[(x-\bar{x})^2 / 2\sigma^2]}$$

$$p(x) = \frac{1}{\sqrt{2\pi\hat{s}^2}}\, e^{-[(x-\bar{x})^2 / 2\hat{s}^2]}$$

This mathematical representation yields a bell-shaped curve of probabilities whose form and extent are uniquely defined by the mean and variance derived from a sample.

The Gaussian distribution can also be written using the standard normal variable z.

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-[z^2 / 2]}$$

The standard normal variable $z = (x - \bar{x})/\sigma$ represents the number of standard deviations the value x is from the mean value.
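As a minimal sketch of the two expressions above, the density can be evaluated through the standard normal variable z. The function names are illustrative, and the mean of 350 g and standard deviation of 48 g are the rounded values quoted later in these notes.

```python
import math

def z_score(x, mean, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sigma

def gaussian_pdf(x, mean, sigma):
    """Normal probability density p(x), written in terms of z."""
    z = z_score(x, mean, sigma)
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi * sigma ** 2)

# Assumed illustrative values: mean 350 g, standard deviation 48 g
peak = gaussian_pdf(350.0, 350.0, 48.0)
print(f"p(350 g) = {peak:.5f}")
print(f"z(401 g) = {z_score(401.0, 350.0, 48.0):.2f}")
```

The density peaks at the mean, where z = 0 and the exponential term equals 1.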

Page 3: The probability of occurrence of specific values in a ...

[Figure: "Probability Distribution of Pebble Masses". x-axis: Pebble Mass (grams), 0 to 800; y-axis: Probability, 0 to 0.01; two plotted series (Series1, Series2)]

The Gaussian (normal) distribution of pebble masses looks a bit different from the probability distribution we derived directly from the sample.

$$p(m) = \frac{1}{\sqrt{2\pi\hat{s}^2}}\, e^{-[(m-\bar{m})^2 / 2\hat{s}^2]}$$

224  322  353  384
242  324  355  386
256  324  355  389
256  326  355  389
265  327  357  393
269  329  358  394
277  330  359  394
283  331  359  395
283  331  364  397
283  331  366  400
284  334  367  401
287  335  368  403
290  338  369  403
294  338  370  403
301  338  370  407
301  340  371  408
302  340  373  409
303  341  374  420
307  342  374  422
307  342  375  423
311  343  379  432
314  346  380  433
317  346  383  435
318  350  384  450
318  352  384  454
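The mean and standard deviation that enter the Gaussian can be computed directly from the listed masses. This is a sketch only: it assumes the standard deviation is the sample estimate with an n - 1 denominator, which the slides do not state.

```python
import statistics

# The 100 pebble masses (grams) listed above
masses = [
    224, 242, 256, 256, 265, 269, 277, 283, 283, 283,
    284, 287, 290, 294, 301, 301, 302, 303, 307, 307,
    311, 314, 317, 318, 318, 322, 324, 324, 326, 327,
    329, 330, 331, 331, 331, 334, 335, 338, 338, 338,
    340, 340, 341, 342, 342, 343, 346, 346, 350, 352,
    353, 355, 355, 355, 357, 358, 359, 359, 364, 366,
    367, 368, 369, 370, 370, 371, 373, 374, 374, 375,
    379, 380, 383, 384, 384, 384, 386, 389, 389, 393,
    394, 394, 395, 397, 400, 401, 403, 403, 403, 407,
    408, 409, 420, 422, 423, 432, 433, 435, 450, 454,
]

n = len(masses)
mean = statistics.mean(masses)
s = statistics.stdev(masses)  # assumption: n - 1 denominator

print(f"n = {n}, mean = {mean:.2f} g, s = {s:.2f} g")
```

Rounded to whole grams, these values should be close to the 350 g and 48 g used in the worked example.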

[Figure: probability histogram, "Pebble masses collected from beach A". x-axis: Mass (grams); y-axis: Probability]

Page 4: The probability of occurrence of specific values in a ...


In the probability histogram, each bar represents the number of pebbles whose masses fall within a 50 gram range, divided by the total number of pebbles.
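The bar heights can be reproduced by counting pebbles in each 50 gram range. The sketch below uses the ranges that appear in Waltham's Table 7.7 (201-250 g through 451-500 g); the variable names are illustrative.

```python
# The 100 pebble masses (grams) from the sample
masses = [
    224, 242, 256, 256, 265, 269, 277, 283, 283, 283,
    284, 287, 290, 294, 301, 301, 302, 303, 307, 307,
    311, 314, 317, 318, 318, 322, 324, 324, 326, 327,
    329, 330, 331, 331, 331, 334, 335, 338, 338, 338,
    340, 340, 341, 342, 342, 343, 346, 346, 350, 352,
    353, 355, 355, 355, 357, 358, 359, 359, 364, 366,
    367, 368, 369, 370, 370, 371, 373, 374, 374, 375,
    379, 380, 383, 384, 384, 384, 386, 389, 389, 393,
    394, 394, 395, 397, 400, 401, 403, 403, 403, 407,
    408, 409, 420, 422, 423, 432, 433, 435, 450, 454,
]

# Inclusive 50 gram ranges, as in Table 7.7
ranges = [(201, 250), (251, 300), (301, 350),
          (351, 400), (401, 450), (451, 500)]

n = len(masses)
probabilities = []
for low, high in ranges:
    count = sum(1 for m in masses if low <= m <= high)
    probabilities.append(count / n)  # fraction of pebbles in this range
    print(f"{low}-{high} g: {count / n:.2f}")
```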

[Figure: probability histogram, "Pebble masses collected from beach A". x-axis: Mass (grams), 150 to 550; y-axis: Probability, 0.00 to 0.40]

Number of             Number of             Number of
standard              standard              standard
deviations   Area     deviations   Area     deviations   Area
0.0          0.000    1.1          0.729    2.1          0.964
0.1          0.080    1.2          0.770    2.2          0.972
0.2          0.159    1.3          0.806    2.3          0.979
0.3          0.236    1.4          0.838    2.4          0.984
0.4          0.311    1.5          0.866    2.5          0.988
0.5          0.383    1.6          0.890    2.6          0.991
0.6          0.451    1.7          0.911    2.7          0.993
0.7          0.516    1.8          0.928    2.8          0.995
0.8          0.576    1.9          0.943    2.9          0.996
0.9          0.632    2.0          0.954    3.0          0.997
1.0          0.683

Page 5: The probability of occurrence of specific values in a ...

[Table repeated from Page 4: number of standard deviations and corresponding area]

Page 6: The probability of occurrence of specific values in a ...

[Figure: probability histogram, "Pebble masses collected from beach A". x-axis: Mass (grams), 150 to 550; y-axis: Probability, 0.00 to 0.40]

Consider question 7.4 (page 124) of Waltham. In this question Waltham evaluates the equivalent probability that a pebble with a mass somewhere between 401 and 450 grams will be drawn from a normal distribution having the same mean and standard deviation as the sample.

Note that 401 grams lies (401-350)/48 or +1.06 standard deviations from the mean.

450 grams lies (450-350)/48 or +2.08 standard deviations from the mean value.

1.06 and 2.08 are z-values or standard normal representations of the pebble masses associated with this sample. Note that Waltham rounds off the mean and standard deviation to 350 and 48, respectively.
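The two z-values follow directly from the rounded mean and standard deviation. A short sketch of that arithmetic, with illustrative variable names:

```python
mean = 350.0   # rounded sample mean (grams), as used by Waltham
sigma = 48.0   # rounded sample standard deviation (grams)

z_low = (401.0 - mean) / sigma    # lower limit of the 401-450 g range
z_high = (450.0 - mean) / sigma   # upper limit of the 401-450 g range

print(f"z(401 g) = {z_low:.2f}")
print(f"z(450 g) = {z_high:.2f}")
```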

Page 7: The probability of occurrence of specific values in a ...


How can we estimate the area under the curve between z = 1.06 and z = 2.08? Note that this area corresponds to the probability that a sample drawn at random from this population will have a value somewhere between 401 and 450 grams.

Note that we can express that area as one half the difference of areas.

[Figure: normal curves with shaded regions labeled "This Area", "The area we want to find" and "The areas we get from tables"]

Page 8: The probability of occurrence of specific values in a ...

[Figure: normal-curve diagrams annotated "yields" and "... one half the combined areas"]

[Table repeated from Page 4: number of standard deviations and corresponding area]

As an example, from our table, we have the areas within ±1 and ±2 standard deviations of the mean.

Page 9: The probability of occurrence of specific values in a ...

Areas taken from the table: 0.954 - 0.683 = 0.271

This difference equals the sum of the two remaining areas: one between -1 and -2 standard deviations from the mean and the other between +1 and +2 standard deviations from the mean.

Page 10: The probability of occurrence of specific values in a ...


We’re only after the area on the positive side of the bell between 1 and 2 standard deviations - so take 1/2 the difference. Of course, it will work for either side.
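The half-difference step for the 1 to 2 standard deviation example can be written out as a short sketch. The two areas are the tabulated values for ±1 and ±2 standard deviations; the variable names are illustrative.

```python
area_within_1 = 0.683   # tabulated area between -1 and +1 standard deviations
area_within_2 = 0.954   # tabulated area between -2 and +2 standard deviations

# Area in both tails of the band: (-2 to -1) plus (+1 to +2)
both_sides = area_within_2 - area_within_1

# The curve is symmetric, so one side is half of that
one_side = both_sides / 2.0

print(f"both sides: {both_sides:.3f}, one side: {one_side:.4f}")
```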

Question 7.4 is a little more complicated. The z-values 1.06 and 2.08 are not listed in the table.

Waltham goes through a weighted average determination of the area under the curve between -1.06 and +1.06 standard deviations. He obtains the area 0.71.

Confirm for yourself that the area between -2.08 and +2.08 standard deviations is 0.962.

The difference is 0.252.

Now we take one-half of that to get 0.126.

Page 11: The probability of occurrence of specific values in a ...

[Table repeated from Page 4: number of standard deviations and corresponding area]

P(±1σ) = 0.683

P(±1.1σ) = 0.729

P(±1.06σ) is six-tenths of the way from P(±1σ) to P(±1.1σ), or 0.683 plus 0.6 times the difference (0.046)

= 0.683 + 0.0276 = 0.71

[Figure: Probability (0.0 to 1.0) plotted against σ (0.0 to 3.0)]

This method of linear interpolation assumes linearity in the curve between 1 and 1.1 standard deviations.
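The interpolation can be sketched as a small lookup over the tabulated areas. The function name and the list layout are illustrative choices, not part of Waltham's text; entry i of the list holds the area for i/10 standard deviations.

```python
# Tabulated areas within +/- z standard deviations, for z = 0.0, 0.1, ..., 3.0
AREAS = [
    0.000, 0.080, 0.159, 0.236, 0.311, 0.383, 0.451, 0.516, 0.576, 0.632,
    0.683, 0.729, 0.770, 0.806, 0.838, 0.866, 0.890, 0.911, 0.928, 0.943,
    0.954, 0.964, 0.972, 0.979, 0.984, 0.988, 0.991, 0.993, 0.995, 0.996,
    0.997,
]

def area_within(z):
    """Linearly interpolate the tabulated area within +/- z (0 <= z <= 3)."""
    if not 0.0 <= z <= 3.0:
        raise ValueError("z must lie between 0 and 3")
    i = min(int(z * 10), len(AREAS) - 2)   # table entry at or below z
    fraction = z * 10 - i                  # fractional distance to the next entry
    return AREAS[i] + fraction * (AREAS[i + 1] - AREAS[i])

print(f"P(+/-1.06) = {area_within(1.06):.3f}")
print(f"P(+/-2.08) = {area_within(2.08):.3f}")
```

Because the true curve bends slightly between table entries, the interpolated value is an approximation, which is adequate at the two or three decimal places used here.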


P(±2σ) = 0.954

P(±2.1σ) = 0.964

P(±2.08σ) is eight-tenths of the way from P(±2σ) to P(±2.1σ), or 0.954 plus 0.8 times the difference (0.01)

= 0.954 + 0.008 = 0.962
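As a cross-check on the interpolated areas, the area within ±z of a standard normal curve can also be computed from the error function, since it equals erf(z/√2). This is a sketch for comparison only, not Waltham's method.

```python
import math

def exact_area_within(z):
    """Area under the standard normal curve between -z and +z."""
    return math.erf(z / math.sqrt(2.0))

# Interpolated areas from the table: 0.71 (z = 1.06) and 0.962 (z = 2.08)
interpolated = 0.5 * (0.962 - 0.71)

# Same quantity from the error function
exact = 0.5 * (exact_area_within(2.08) - exact_area_within(1.06))

print(f"interpolated: {interpolated:.3f}")
print(f"erf-based:    {exact:.3f}")
```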

[Figure: Probability (0.0 to 1.0) plotted against σ (0.0 to 3.0)]

Page 12: The probability of occurrence of specific values in a ...


0.126 is the normal probability of obtaining a pebble with mass between 401 and 450 grams from the beach under investigation.

[Figure: probability histogram, "Pebble masses collected from beach A". x-axis: Mass (grams), 150 to 550; y-axis: Probability, 0.00 to 0.40]

Note that the value derived from the normal distribution compares nicely with that observed in the sample (0.126 vs. 0.14).

Read section 7.5 carefully and be prepared to confirm the probabilities listed in Table 7.7 from Waltham.

Range (g)   Measured      Range               Gaussian (normal)
            probability   (multiple of s)     probability
201-250     0.02          -3.10 to -2.06      0.019
251-300     0.12          -2.06 to -1.02      0.134
301-350     0.35          -1.02 to 0.02       0.354
351-400     0.36          0.02 to 1.06        0.347
401-450     0.14          1.06 to 2.10        0.127
451-500     0.01          2.10 to 3.13        0.017
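One way to confirm the Gaussian column is to evaluate the cumulative normal distribution at each pair of z limits and take the difference. The sketch below does this with the error function rather than by table interpolation, so small differences in the third decimal place are possible; the helper name `phi` is illustrative.

```python
import math

def phi(z):
    """Cumulative standard normal probability up to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# (range in grams, lower z, upper z, Gaussian probability listed in the table)
rows = [
    ("201-250", -3.10, -2.06, 0.019),
    ("251-300", -2.06, -1.02, 0.134),
    ("301-350", -1.02,  0.02, 0.354),
    ("351-400",  0.02,  1.06, 0.347),
    ("401-450",  1.06,  2.10, 0.127),
    ("451-500",  2.10,  3.13, 0.017),
]

computed = []
for label, z_low, z_high, listed in rows:
    p = phi(z_high) - phi(z_low)   # area between the two z limits
    computed.append(p)
    print(f"{label} g: computed {p:.3f}, listed {listed:.3f}")
```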
