Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest...

28
Summary: Parameters of a Probability Distribution Probability Mass Function (PMF): f(x)=Prob(X=x) Cumulative Distribution Function (CDF): F(x)=Prob(X≤x) Complementary Cumulative Distribution Function (CCDF): F > (x)=Prob(X>x) The mean, μ=E[X], is a measure of the center of mass of a random variable The variance, V(X)=E[(X‐ μ) 2 ], is a measure of the dispersion of a random variable around its mean The standard deviation, σ=[V(X)] 1/2 , is another measure of the dispersion around mean. Has the same units as X The skewness, γ1=E[(X‐μ) 3 3 ], a measure of asymmetry around mean The geometric mean, exp(E[log X]) is useful for very broad distributions 1

Transcript of Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest...

Page 1: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Summary: Parameters of a Probability Distribution• Probability Mass Function (PMF): f(x)=Prob(X=x)• Cumulative Distribution Function (CDF): F(x)=Prob(X≤x)• Complementary Cumulative Distribution Function (CCDF): 

F>(x)=Prob(X>x)• The mean, μ=E[X], is a measure of the 

center of mass of a random variable• The variance, V(X)=E[(X‐ μ)2], is a measure of the dispersion 

of a random variable around its mean• The standard deviation, σ=[V(X)]1/2, is another measure of 

the dispersion around mean. Has the same units as X• The skewness, γ1=E[(X‐μ)3/σ3], a  measure of asymmetry 

around mean• The geometric mean, exp(E[log X]) is useful for very broad 

distributions1

Page 2: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

A gallery of usefuldiscrete probability distributions

Page 3: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Discrete Uniform Distribution• Simplest discrete distribution. • The random variable X assumes only a finite number of values, each with equal probability.

• A random variable X has a discrete uniform distribution if each of the n values in its range, say x1, x2, …, xn, has equal probability.

f(xi) = 1/n

3

Page 4: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Uniform Distribution of Consecutive Integers

• Let X be a discrete uniform random variable all integers from a to b (inclusive). There are b – a +1 integers.  Therefore each one gets:

f(x) = 1/(b‐a+1)• Its measures are:

μ = E(x) = (b+a)/2σ2 = V(x) = [(b‐a+1)2–1]/12           

Note that the mean is the midpoint of a & b.

4

Page 5: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

5

A random variable X has the same probability for integer numbers 

x =1:10 What is the behavior of its Probability

Mass Function (PMF): P(X=x)?A. does not change with x=1:10B. linearly  increases with x=1:10 C. linearly  decreases with x=1:10 D. is a quadratic function of x=1:10

Get your i‐clickers

Page 6: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

6

A random variable X has the same probability for integer numbers 

x =1:10 What is the behavior of its Cumulative Distribution Function (CDF): P(X≤x)?

A. does not change with x=1:10B. linearly  increases with x=1:10 C. linearly  decreases with x=1:10 D. is a quadratic function of x=1:10

Get your i‐clickers

Page 7: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

7

A random variable X has the same probability for integer numbers 

x =1:10 What is its mean value?

A. 0.5B. 5.5C. 5D. 0.1

Get your i‐clickers

Page 8: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

8

A random variable X has the same probability for integer numbers 

x =1:10 What is its skewness?

A. 0.5B. 1C. 0D. 0.1

Get your i‐clickers

Page 9: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Matlab exercise: Uniform distribution• Generate a sample of size 100,000 for uniform random variable X taking values 1,2,3,…10

• Plot the approximation to the probability mass function based on this sample

• Calculate mean and variance of this sample and compare it to infinite sample predictions:E[X]=(a+b)/2 and V[X]=((a‐b+1)2‐1)/12

Page 10: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Matlab template: Uniform distribution• b=10; a=1; % b= upper bound; a= lower bound (inclusive)'• Stats=100000; % sample size to generate• r1=rand(Stats,1); • r2=floor(??*r1)+??; • mean(r2)• var(r2)• std(r2)• [hy,hx]=hist(r2, 1:10); % hist generates histogram in bins 

1,2,3...,10• % hy ‐ number of counts in each bin; hx ‐ coordinates of 

bins• p_f=hy./??; % normalize counts to add up to 1• figure; plot(??,p_f, 'ko‐'); ylim([0, max(p_f)+0.01]); % plot 

the PMF

Page 11: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Matlab exercise: Uniform distribution• b=10; a=1; % b= upper bound; a= lower bound (inclusive)'• Stats=100000; % sample size to generate• r1=rand(Stats,1); • r2=floor(b*r1)+a; • mean(r2)• var(r2)• std(r2)• [hy,hx]=hist(r2, 1:10); % hist generates histogram in bins 

1,2,3...,10• % hy ‐ number of counts in each bin; hx ‐ coordinates of 

bins• p_f=hy./sum(hy); % normalize counts to add up to 1• figure; plot(hx,p_f, 'ko‐'); ylim([0, max(p_f)+0.01]); % plot 

the PMF

Page 12: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Bernoulli distribution

Jacob Bernoulli (1654‐1705)Swiss mathematician (Basel)

• Law of large numbers• Mathematical constant e=2.718…

The simplest non‐uniform distributionp – probability of success (1)1‐p – probability of failure (0)

Page 13: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,
Page 14: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Bernoulli distribution

Page 15: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Refresher: Binomial Coefficients

103

Number of ways to choose k objects out of n and where the .

Called bin

! , called cho

o

o

10 10! 10 9

h

7

w

3

i

7

t out replacement rder does not

s

m

8 7! 120 3 ! ! 3 2

e

att r

1 !

!

e

!( )nk

n

C

nC n kk k n k

0

omial coefficients because of the binomial formula

( ) ( ) ( )... ( )n

n n x n xx

xp q p q p q p q C p q

Page 16: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,
Page 17: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Binomial Distribution• Binomially‐distributed random variable X equals sum (number of successes) of n independent Bernoulli trials 

• The probability mass function is:

• Based on the binomial expansion:

Sec  3=6 Binomial Distribution 17

1 for 0,1,... (3-7)n xn xxf x C p p x n

Page 18: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Binomial Mean and VarianceX is a binomial random variable with parameters p and n

Mean:μ = E(X) = np

Variance:  σ2 = V(X) = np(1‐p)

Standard deviation:

σ =  np(1−p)Sec  3=6 Binomial Distribution 18

Page 19: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Matlab exercise: Binomial distribution• Generate a sample of size 100,000 for binomially‐distributed random variable X with n=100, p=0.2

• Tip: generate n Bernoulli random variables and use sum to add them up

• Plot the approximation to the Probability Mass Function based on this sample

• Calculate the mean and variance of this sample and compare it to theoretical calculations:E[X]=n*p and V[X]=n*p*(1‐p)

Page 20: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Matlab template: Binomial distribution• n=100; p=0.2;• Stats=100000;• r1=rand(Stats,n) ?? < or > ?? p;• r2=sum(r1, ?? 1 rows or 2 columns ?? ); • mean(r2)• var(r2)• [a,b]=hist(r2, 0:n);• p_b=??./sum(??);• figure; stem(??,p_b);• figure; semilogy(??,p_b,'ko‐')

Page 21: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Credit: XKCD comics 

Page 22: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

“Secretary” or a “picky bride” problem

• A future bride has a known and large number – n – of suitors whom she dates one at a time

• She can easily evaluate and rank her suitors relative to each other but has no idea of the overall distribution

• She has only one chance to choose her husband and cannot go back to the guys she dumped

• How can she maximize the probability to choose the best groom among n suitors?

Page 23: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Eugene Dynkin (1924 – 2014) solved picky bride problem in 1963.was a Russian and American mathematician, member of NAS.He has made contributions to the fields of probability and algebra. The Dynkin diagram, the Dynkinsystem, and Dynkin's lemma are all named after him.

Martin Gardner (1914 – 2010) Described the secretary problem in Scientific American 1960.was an American popular mathematics and popular science writer.  Best known for “recreational mathematics”:He was behind the “Mathematical Games” section in Scientific American.

Page 24: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

What should the future bride do?• She does not know the distribution of the quality of suitors. She has to learn it on the fly

• Algorithm: look at the first r suitors, remember the best among them

• Marry the first among next n‐r suitors who is better than the best among the first r suitors

• How to choose r?• r small – not enough information: the best among r is not very good. You are likely to marry a looser.

• r=n‐1 – you procrastinated too long! You have almost all information but you will have to marry the last schmuck who is (likely) not very good

Page 25: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,
Page 26: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

dP(x)/dx=‐ln(x)‐1‐ln(x*)‐1=0

x*=1/e=0.3679

Probability of picking the best is also 1/e=0.3679

Page 27: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Bonus matlab exercise: Picky bride• Generate a sample of size 100 of suitors with quality given by random number between 0 and 1

• Let bride select the first suitor exceeding the best suitor among the first – 10 suitors (not enough information)– 37 suitors (optimal)– 90 suitors (too much procrastination)

• Compare to the best suitor among all 100• Repeat for 100,000 brides and calculate the probability that she picked the best candidate in each case

• Test: the probability of picking the best suitor after dating 37 suitors the answer should be around 1/e=0.37. In all other cases ‐ smaller

Page 28: Summary: Parameters of a Probability Distribution...Discrete Uniform Distribution •Simplest discrete distribution. •The random variable Xassumes only a finite number of values,

Credit: XKCD comics