The probability of occurrence of specific values in a ...
Department of Geology and Geography, West Virginia University
Morgantown, WV
The probability of occurrence of specific values in a sample often takes on a bell-shaped appearance as in the case of our pebble mass distribution.
[Figure: histogram of pebble masses collected from beach A. Horizontal axis: Mass (grams), 150 to 550; vertical axis: Probability, 0.00 to 0.40.]
The Gaussian or normal distribution p(x) is a mathematical representation of this bell-shaped characteristic.
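$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\left[(x-\bar{x})^2 / 2\sigma^2\right]}$$
or, in terms of the sample standard deviation $\hat{s}$:
$$p(x) = \frac{1}{\sqrt{2\pi\hat{s}^2}}\, e^{-\left[(x-\bar{x})^2 / 2\hat{s}^2\right]}$$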
This mathematical representation yields a bell-shaped curve of probabilities whose form and extent are uniquely defined by the mean and variance derived from a sample.
The Gaussian distribution can also be written using the standard normal variable z.
$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\left[z^2/2\right]}$$
The standard normal variable $z = (x-\bar{x})/\sigma$ represents the number of standard deviations the value x is from the mean value.
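As a rough numerical sketch of the two forms above, the Python below evaluates p(x) directly and through z. The function names are illustrative only, and the mean of 350 g and standard deviation of 48 g are the rounded values attributed to Waltham later in these notes, used here simply as example inputs.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Normal density p(x) written directly in terms of x."""
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / math.sqrt(2 * math.pi * sd ** 2)

def gaussian_pdf_from_z(z, sd):
    """The same density written with the standard normal variable z."""
    return math.exp(-(z ** 2) / 2) / math.sqrt(2 * math.pi * sd ** 2)

# Assumed example inputs: rounded mean and standard deviation of the pebble sample.
mean, sd = 350.0, 48.0
x = 401.0
z = (x - mean) / sd  # number of standard deviations from the mean

print(round(z, 2))
print(gaussian_pdf(x, mean, sd), gaussian_pdf_from_z(z, sd))
```

Both calls should print the same density, since the second form only substitutes z for (x - mean)/sd.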
[Figure: Probability Distribution of Pebble Masses. Horizontal axis: Pebble Mass (grams), 0 to 800; vertical axis: Probability, 0 to 0.01; two plotted series.]
The Gaussian (normal) distribution of pebble masses looks a bit different from the probability distribution we derived directly from the sample.
$$p(m) = \frac{1}{\sqrt{2\pi\hat{s}^2}}\, e^{-\left[(m-\bar{m})^2 / 2\hat{s}^2\right]}$$
Pebble masses (grams):
224  322  353  384
242  324  355  386
256  324  355  389
256  326  355  389
265  327  357  393
269  329  358  394
277  330  359  394
283  331  359  395
283  331  364  397
283  331  366  400
284  334  367  401
287  335  368  403
290  338  369  403
294  338  370  403
301  338  370  407
301  340  371  408
302  340  373  409
303  341  374  420
307  342  374  422
307  342  375  423
311  343  379  432
314  346  380  433
317  346  383  435
318  350  384  450
318  352  384  454
[Figure: histogram of pebble masses collected from beach A (Probability vs. Mass in grams).]
In the probability histogram, each bar represents the number of pebbles whose masses fall within a 50-gram range, divided by the total number of pebbles.
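The bar heights can be checked by counting directly. The sketch below is an illustration, not Waltham's own procedure: it transcribes the 100 pebble masses listed earlier, uses the 50-gram ranges from Table 7.7 (201-250 g and so on) as an assumed binning, and also reports the sample mean and standard deviation using Python's `statistics` module.

```python
from statistics import mean, stdev

# Pebble masses in grams, transcribed from the sample listed earlier (100 pebbles).
masses = [
    224, 242, 256, 256, 265, 269, 277, 283, 283, 283, 284, 287, 290,
    294, 301, 301, 302, 303, 307, 307, 311, 314, 317, 318, 318,
    322, 324, 324, 326, 327, 329, 330, 331, 331, 331, 334, 335, 338,
    338, 338, 340, 340, 341, 342, 342, 343, 346, 346, 350, 352,
    353, 355, 355, 355, 357, 358, 359, 359, 364, 366, 367, 368, 369,
    370, 370, 371, 373, 374, 374, 375, 379, 380, 383, 384, 384,
    384, 386, 389, 389, 393, 394, 394, 395, 397, 400, 401, 403, 403,
    403, 407, 408, 409, 420, 422, 423, 432, 433, 435, 450, 454,
]

# Assumed binning: the 50-gram ranges used in Table 7.7.
bins = [(201, 250), (251, 300), (301, 350), (351, 400), (401, 450), (451, 500)]

n = len(masses)
counts = [sum(low <= m <= high for m in masses) for low, high in bins]

for (low, high), count in zip(bins, counts):
    # Bar height = pebbles in the range divided by the total number of pebbles.
    print(f"{low}-{high} g: {count / n:.2f}")

sample_mean = mean(masses)
sample_sd = stdev(masses)  # sample standard deviation (n - 1 in the denominator)
print(f"mean = {sample_mean:.1f} g, standard deviation = {sample_sd:.1f} g")
```

The printed probabilities should match the measured column of Table 7.7, and the mean should come out close to the rounded value of 350 g used below.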
Number of               Number of               Number of
standard                standard                standard
deviations   Area       deviations   Area       deviations   Area
0.0          0.000      1.1          0.729      2.1          0.964
0.1          0.080      1.2          0.770      2.2          0.972
0.2          0.159      1.3          0.806      2.3          0.979
0.3          0.236      1.4          0.838      2.4          0.984
0.4          0.311      1.5          0.866      2.5          0.988
0.5          0.383      1.6          0.890      2.6          0.991
0.6          0.451      1.7          0.911      2.7          0.993
0.7          0.516      1.8          0.928      2.8          0.995
0.8          0.576      1.9          0.943      2.9          0.996
0.9          0.632      2.0          0.954      3.0          0.997
1.0          0.683
[Figure: histogram of pebble masses collected from beach A (Probability vs. Mass in grams).]
Consider question 7.4 (page 124) of Waltham. In this question Waltham evaluates the equivalent probability that a pebble having a mass somewhere between 401 and 450 grams will be drawn from a normal distribution having the same mean and standard deviation as the sample.
Note that 401 grams lies (401-350)/48 or +1.06 standard deviations from the mean.
450 grams lies (450-350)/48 or +2.08 standard deviations from the mean value.
1.06 and 2.08 are z-values or standard normal representations of the pebble masses associated with this sample. Note that Waltham rounds
off the mean and standard deviation to 350 and 48, respectively.
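The two conversions above are quick to reproduce. This is a minimal sketch using the rounded mean (350) and standard deviation (48) quoted above; the helper name `z_value` is hypothetical.

```python
def z_value(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Rounded sample statistics quoted in the text.
mean, sd = 350.0, 48.0

z_low = z_value(401.0, mean, sd)
z_high = z_value(450.0, mean, sd)

print(round(z_low, 2))   # 1.06
print(round(z_high, 2))  # 2.08
```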
How can we estimate the area under p(z) between z = 1.06 and z = 2.08? Note that this area corresponds to the probability that a sample drawn at random from this population will have a value somewhere between 401 and 450 grams.
Note that we can express that area as one half the difference of areas.
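[Figure: normal curve with the area we want to find shaded, alongside the areas we get from tables. Subtracting the tabulated areas and taking one half of the combined areas that remain yields the area we want.]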
As an example, our table gives the areas within ±1 and ±2 standard deviations of the mean.
Areas taken from the table:
0.954 - 0.683 = 0.271
This difference equals the sum of the two remaining areas: one between -1 and -2 standard deviations from the mean and the other between +1 and +2 standard deviations from the mean.
We’re only after the area on the positive side of the bell between 1 and 2 standard deviations - so take 1/2 the difference. Of course, it will work for either side.
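As a worked instance of this halving step, using only the two table values quoted above (the inequality notation is added here for illustration):

$$P(+1 \le z \le +2) = \tfrac{1}{2}\left[P(\pm 2\sigma) - P(\pm 1\sigma)\right] = \tfrac{1}{2}\,(0.954 - 0.683) = \tfrac{1}{2}\,(0.271) = 0.1355$$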
Question 7.4 is a little more complicated because 1.06 and 2.08 are not listed in the table.
Waltham goes through a weighted average determination of the area under the curve between +1.06 and -1.06 standard deviations and obtains an area of 0.71.
Confirm for yourself that the area out to ±2.08 standard deviations is 0.962.
The difference is 0.252.
Now we take one-half of that to get 0.126.
P(±1σ) = 0.683
P(±1.1σ) = 0.729
P(±1.06σ) is six-tenths of the way from P(±1σ) to P(±1.1σ), or 0.683 plus 0.6 times the difference (0.046)
= 0.683 + 0.0276 = 0.71
[Figure: tabulated area (Probability, 0.0 to 1.0) plotted against the number of standard deviations σ, 0.0 to 3.0.]
This method of linear interpolation assumes that the curve is linear between 1 and 1.1 standard deviations.
P(±2σ) = 0.954
P(±2.1σ) = 0.964
P(±2.08σ) is eight-tenths of the way from P(±2σ) to P(±2.1σ), or 0.954 plus 0.8 times the difference (0.01)
= 0.954 + 0.008 = 0.962
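The two interpolations and the final halving step can be sketched in a few lines. The list below copies the Area column of the table at steps of 0.1 standard deviations; the names `AREA_BY_TENTH` and `table_area` are made up for this illustration, and the straight-line assumption between neighbouring entries is the same one made in the hand calculation.

```python
# Area within +/- n standard deviations of the mean, for n = 0.0, 0.1, ..., 3.0,
# copied from the table of areas.
AREA_BY_TENTH = [
    0.000, 0.080, 0.159, 0.236, 0.311, 0.383, 0.451, 0.516, 0.576, 0.632, 0.683,
    0.729, 0.770, 0.806, 0.838, 0.866, 0.890, 0.911, 0.928, 0.943, 0.954,
    0.964, 0.972, 0.979, 0.984, 0.988, 0.991, 0.993, 0.995, 0.996, 0.997,
]

def table_area(z):
    """Linearly interpolate the tabulated two-sided area for 0 <= z <= 3."""
    if not 0.0 <= z <= 3.0:
        raise ValueError("z is outside the tabulated range")
    # Index of the table entry at or just below z (capped so i + 1 stays valid).
    i = min(int(z * 10), len(AREA_BY_TENTH) - 2)
    frac = z * 10 - i  # fraction of the way to the next entry
    return AREA_BY_TENTH[i] + frac * (AREA_BY_TENTH[i + 1] - AREA_BY_TENTH[i])

inner = table_area(1.06)  # area within +/- 1.06 standard deviations
outer = table_area(2.08)  # area within +/- 2.08 standard deviations
one_sided = 0.5 * (outer - inner)  # area between +1.06 and +2.08 only

print(round(inner, 3), round(outer, 3), round(one_sided, 3))
```

Carrying the unrounded 0.7106 through instead of 0.71 changes the final figure only in the fourth decimal place, so the result still rounds to 0.126.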
0.126 is the normal probability of obtaining a pebble with mass between 401 and 450 grams from the beach under investigation.
[Figure: histogram of pebble masses collected from beach A (Probability vs. Mass in grams).]
Note that the value derived from the normal distribution compares nicely with that observed in the sample (0.126 vs. 0.14).
Read section 7.5 carefully and be prepared to confirm the probabilities listed in Table 7.7 from Waltham.
Range (g)   Measured      Range              Gaussian (normal)
            probability   (multiple of s)    probability
201-250     0.02          -3.10 to -2.06     0.019
251-300     0.12          -2.06 to -1.02     0.134
301-350     0.35          -1.02 to 0.02      0.354
351-400     0.36          0.02 to 1.06       0.347
401-450     0.14          1.06 to 2.10       0.127
451-500     0.01          2.10 to 3.13       0.017
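One way to confirm the Gaussian column without table lookups is to evaluate the normal cumulative distribution with the error function. This is a sketch of an alternative check, not the interpolation method Waltham uses; `normal_cdf` is a helper defined here, built on Python's standard `math.erf`, and the z ranges are those listed in Table 7.7.

```python
import math

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution up to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# (mass range in grams, lower z, upper z) as listed in Table 7.7.
ranges = [
    ("201-250", -3.10, -2.06),
    ("251-300", -2.06, -1.02),
    ("301-350", -1.02, 0.02),
    ("351-400", 0.02, 1.06),
    ("401-450", 1.06, 2.10),
    ("451-500", 2.10, 3.13),
]

# Probability of each range = difference of the cumulative probabilities at its ends.
gaussian = [normal_cdf(z_high) - normal_cdf(z_low) for _, z_low, z_high in ranges]

for (label, _, _), p in zip(ranges, gaussian):
    print(f"{label} g: {p:.3f}")
```

The printed values should agree with the Gaussian column of the table to within rounding in the third decimal place.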