Page 1: Intro probability 2

Probability Theory

Random Variables

Phong [email protected]

September 11, 2010

– Typeset by FoilTEX –

Random Variables

Definition 1. A random variable is a mapping X : S → R that associates a unique numerical value X(ω) to each outcome ω.

Let X denote the random variable defined as the sum of two fair dice. Then

$$P\{X = 2\} = P(\{(1,1)\}) = \frac{1}{36},$$

$$P\{X = 3\} = P(\{(1,2), (2,1)\}) = \frac{2}{36},$$

$$P\{X = 4\} = P(\{(1,3), (2,2), (3,1)\}) = \frac{3}{36}.$$
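
The same probabilities can be checked by listing the whole sample space. The sketch below is illustrative only: the function name `dice_sum_pmf` is made up for this example, and it assumes the 36 ordered pairs of faces are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def dice_sum_pmf():
    """Return {s: P(X = s)} for X = sum of two fair six-sided dice."""
    # The sample space S is the 36 equally likely ordered pairs (d1, d2).
    counts = Counter(d1 + d2 for d1, d2 in product(range(1, 7), repeat=2))
    return {s: Fraction(c, 36) for s, c in sorted(counts.items())}

pmf = dice_sum_pmf()
for s in (2, 3, 4):
    # Fractions print in lowest terms, e.g. 2/36 is shown as 1/18.
    print(s, pmf[s])
```

Exact fractions are used instead of floats so that each value can be compared directly with the counts 1/36, 2/36 and 3/36 above.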

Distribution Functions and Probability Functions

Definition 2. The cumulative distribution function (CDF) $F_X : \mathbb{R} \to [0, 1]$ of a r.v X is defined by

$$F_X(x) = P(X \le x).$$

Example 1. Flip a fair coin twice and let X be the number of heads. Then P(X = 0) = P(X = 2) = 1/4 and P(X = 1) = 1/2. The distribution function is

$$F_X(x) = \begin{cases} 0, & x < 0 \\ 1/4, & 0 \le x < 1 \\ 3/4, & 1 \le x < 2 \\ 1, & x \ge 2. \end{cases}$$
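
A short sketch of how this step function arises from the point masses in Example 1. The names `PMF` and `cdf` are illustrative and do not come from the slides.

```python
from fractions import Fraction

# PMF of X = number of heads in two fair coin flips (from Example 1).
PMF = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def cdf(x, pmf=PMF):
    """F_X(x) = P(X <= x): add up the mass at every support point <= x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

for x in (-0.5, 0, 0.5, 1, 1.5, 2, 3):
    print(x, cdf(x))
```

The function is constant between support points and jumps exactly at 0, 1 and 2, which is why each piece of the CDF is closed on the left and open on the right.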

Discrete Random Variables

Definition 3. X is discrete if it takes countably many values {x1, x2, . . .}.

We define the probability mass function (or probability function) of a r.v X by

$$f_X(x) = P(X = x).$$

Thus, $f_X(x) \ge 0$ for all $x \in \mathbb{R}$ and $\sum_{i=1}^{\infty} f_X(x_i) = 1$. The CDF of X is related to $f_X$ by

$$F_X(x) = P(X \le x) = \sum_{x_i \le x} f_X(x_i).$$

The Bernoulli Random Variable

Suppose that a trial (or an experiment), whose outcome can be classified as either a "success" or a "failure", is performed. If we let X equal 1 if the outcome is a success and 0 if it is a failure, then the probability mass function of X is given by

$$p(0) = P(X = 0) = 1 - p, \tag{1}$$

$$p(1) = P(X = 1) = p, \tag{2}$$

where p, 0 ≤ p ≤ 1, is the probability that the trial is a "success".

The Binomial Random Variable

• Suppose that n independent trials are performed, each of which results in a "success" with probability p and in a "failure" with probability 1 − p.

• If X represents the number of successes that occur in the n trials, then X is said to be a binomial random variable with parameters (n, p).

• The probability mass function of a binomial random variable is given by

$$p(i) = \binom{n}{i} p^i (1 - p)^{n-i}, \quad i = 0, 1, \ldots, n.$$

Example 2. Four fair coins are flipped. If the outcomes are assumed independent, what is the probability that two heads and two tails are obtained?

Example 3. It is known that all items produced by a certain machine will be defective with probability 0.1, independently of each other. What is the probability that in a sample of three items, at most one will be defective?
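
Both examples can be treated as binomial problems. The sketch below assumes that reading: Example 2 as Binomial(4, 1/2) evaluated at i = 2, and Example 3 as Binomial(3, 0.1) summed over i = 0 and i = 1. The helper name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(i, n, p):
    """P(X = i) for X ~ Binomial(n, p): C(n, i) * p**i * (1 - p)**(n - i)."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

# Example 2: exactly two heads among four fair coins.
two_heads = binomial_pmf(2, 4, 0.5)

# Example 3: at most one defective among three items, each defective w.p. 0.1.
at_most_one = binomial_pmf(0, 3, 0.1) + binomial_pmf(1, 3, 0.1)

print(two_heads, at_most_one)
```

By hand these are 6/16 = 3/8 for Example 2 and 0.729 + 0.243 = 0.972 for Example 3; the printed floats should agree up to rounding.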

The Geometric Random Variable

• Suppose that independent trials, each having a probability p of being a success, are performed until a success occurs.

• Let X be the number of trials required until the first success; then X is said to be a geometric random variable with parameter p.

• Its probability mass function is given by

$$p(n) = P(X = n) = (1 - p)^{n-1} p, \quad n = 1, 2, \ldots$$
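
As a rough check of this mass function, one can simulate the "repeat until first success" experiment and compare frequencies with the formula. The value p = 0.3, the seed and the function names below are arbitrary choices for illustration, not part of the slides.

```python
import random

def geometric_pmf(n, p):
    """P(X = n) = (1 - p)**(n - 1) * p for n = 1, 2, ..."""
    return (1 - p) ** (n - 1) * p

def sample_geometric(p, rng):
    """Run Bernoulli(p) trials until the first success; return the trial count."""
    n = 1
    while rng.random() >= p:  # a failure, which happens with probability 1 - p
        n += 1
    return n

p = 0.3  # illustrative value, not from the slides
rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_geometric(p, rng) for _ in range(100_000)]
for n in (1, 2, 3):
    empirical = draws.count(n) / len(draws)
    print(n, round(geometric_pmf(n, p), 4), round(empirical, 4))
```

The two printed columns should be close but not identical, since the second is a Monte Carlo estimate.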

The Poisson Random Variable

A random variable X, taking on one of the values 0, 1, 2, . . ., is said to be a Poisson random variable with parameter λ if, for some λ > 0,

$$p(i) = P(X = i) = e^{-\lambda} \frac{\lambda^i}{i!}, \quad i = 0, 1, \ldots$$
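
A minimal sketch that evaluates this mass function and checks numerically that the probabilities add up to 1. The rate λ = 2 and the truncation at 50 terms are assumptions made only for the illustration.

```python
from math import exp, factorial

def poisson_pmf(i, lam):
    """P(X = i) = exp(-lam) * lam**i / i! for i = 0, 1, ..."""
    return exp(-lam) * lam**i / factorial(i)

lam = 2.0  # illustrative rate, not from the slides
probs = [poisson_pmf(i, lam) for i in range(50)]

# The tail beyond i = 49 is negligible for lam = 2, so the sum is close to 1.
print(sum(probs))
```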

Continuous Random Variables

Definition 4. A r.v X is continuous if there exists a function $f_X$ such that $f_X(x) \ge 0$ for all x, $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$ and, for every a ≤ b,

$$P(a < X < b) = \int_a^b f_X(x)\,dx.$$

The function $f_X$ is called the probability density function (PDF). We have that

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$$

and $f_X(x) = F_X'(x)$ at all points x at which $F_X$ is differentiable.

• If X is continuous then P(X = x) = 0 for every x.

• f(x) is different from P(X = x) in the continuous case.

• A PDF can be bigger than 1 (unlike a mass function). For example, if

$$f(x) = \begin{cases} 5, & x \in [0, 1/5] \\ 0, & \text{otherwise} \end{cases}$$

then f(x) ≥ 0 and $\int f(x)\,dx = 1$, so this is a well-defined PDF even though f(x) = 5 in some places.
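
The claim that this density integrates to 1 can be checked numerically. The sketch below uses a plain midpoint rule; the integration range [-1, 1] and the step count are arbitrary choices, and `midpoint_integral` is a name made up for this example.

```python
def f(x):
    """Density from the slide: 5 on [0, 1/5], 0 elsewhere."""
    return 5.0 if 0.0 <= x <= 0.2 else 0.0

def midpoint_integral(g, a, b, steps=100_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

total = midpoint_integral(f, -1.0, 1.0)
print(f(0.1), total)
```

The density takes the value 5 at x = 0.1, yet the total area stays at 1 because the interval carrying the mass has length only 1/5.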

Lemma 1. Let F be the CDF for a r.v X. Then:

1. P(X = x) = F(x) − F(x−), where $F(x^-) = \lim_{y \uparrow x} F(y)$,

2. P(x < X ≤ y) = F(y) − F(x),

3. P(X > x) = 1 − F(x),

4. If X is continuous then

P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a ≤ X ≤ b).

The Uniform Random Variable

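A random variable is said to be uniformly distributed over the interval (0, 1) if its probability density function is given by

$$f(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise.} \end{cases}$$

In the general case, a random variable is uniformly distributed over the interval (α, β) if

$$f(x) = \begin{cases} \dfrac{1}{\beta - \alpha}, & \alpha \le x \le \beta \\ 0, & \text{otherwise.} \end{cases}$$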

Example 4. Calculate the cumulative distribution function of a random variable uniformly distributed over (α, β).
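
One way to work Example 4 is to integrate the density piece by piece, using the relation between a CDF and its PDF. The derivation below is a sketch that assumes the uniform density equals 1/(β − α) on [α, β] and 0 elsewhere.

```latex
% Assumes the uniform density f(t) = 1/(\beta - \alpha) on [\alpha, \beta], 0 elsewhere.
F(x) = \int_{-\infty}^{x} f(t)\,dt =
\begin{cases}
0, & x < \alpha, \\[4pt]
\displaystyle\int_{\alpha}^{x} \frac{dt}{\beta - \alpha} = \frac{x - \alpha}{\beta - \alpha}, & \alpha \le x \le \beta, \\[4pt]
1, & x > \beta.
\end{cases}
```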

Exponential Random Variables

A continuous random variable whose probability density function is given, for some λ > 0, by

$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

is said to be an exponential random variable with parameter λ.

Gamma Random Variables

A continuous random variable whose density is given by

$$f(x) = \begin{cases} \dfrac{\lambda e^{-\lambda x} (\lambda x)^{\alpha - 1}}{\Gamma(\alpha)}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

for some λ > 0, α > 0 is said to be a gamma random variable with parameters (α, λ). The quantity Γ(α) is called the gamma function and is defined by

$$\Gamma(\alpha) = \int_0^{\infty} e^{-x} x^{\alpha - 1}\,dx.$$
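
Two standard facts make the gamma function easier to get a feel for: Γ(n) = (n − 1)! for positive integers n, and with α = 1 the gamma density reduces to the exponential density λe^{−λx}. The sketch below checks both numerically using Python's built-in `math.gamma`; the helper `gamma_pdf` and the value λ = 1.5 are illustrative assumptions.

```python
from math import exp, factorial, gamma

def gamma_pdf(x, alpha, lam):
    """Gamma density lam * exp(-lam*x) * (lam*x)**(alpha - 1) / Gamma(alpha) for x >= 0.

    For alpha < 1 the density is unbounded at x = 0, so call it with x > 0 there.
    """
    if x < 0:
        return 0.0
    return lam * exp(-lam * x) * (lam * x) ** (alpha - 1) / gamma(alpha)

# For positive integers n, Gamma(n) equals (n - 1)!.
for n in range(1, 6):
    print(n, gamma(n), factorial(n - 1))

# With alpha = 1 the gamma density equals the exponential density lam * exp(-lam * x).
lam = 1.5  # illustrative value, not from the slides
print(gamma_pdf(2.0, 1.0, lam), lam * exp(-lam * 2.0))
```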

Normal Random Variables

X is a normal random variable with parameters (µ, σ²) if the density of X is given by

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x - \mu)^2 / 2\sigma^2}, \quad -\infty < x < \infty. \tag{3}$$
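
A small numerical sketch showing that the normal density integrates to 1. The parameter values µ = 1, σ = 2, the trapezoid rule and the cut-off at ten standard deviations are all choices made for the illustration.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Normal density with parameters mu and sigma**2, evaluated at x."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sqrt(2 * pi) * sigma)

def trapezoid(g, a, b, steps=20_000):
    """Approximate the integral of g over [a, b] with the trapezoid rule."""
    h = (b - a) / steps
    inner = sum(g(a + k * h) for k in range(1, steps))
    return h * (0.5 * g(a) + inner + 0.5 * g(b))

mu, sigma = 1.0, 2.0  # illustrative values, not from the slides
area = trapezoid(lambda x: normal_pdf(x, mu, sigma), mu - 10 * sigma, mu + 10 * sigma)
print(area)
```

The mass outside ten standard deviations is far below floating-point resolution here, so truncating the infinite range does not affect the result visibly.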

Remarks

• Read X ∼ F as "X has distribution F".

• X is a r.v; x denotes a particular value of the r.v; n and p (e.g. in the Binomial distribution) are parameters, that is, fixed real numbers. Parameters are usually unknown and must be estimated from data.

• In practice, we think of a r.v as a random number, but formally it is a mapping defined on some sample space.

Jointly Distributed Random Variables

Given a pair of discrete r.vs X and Y, define the joint mass function by f(x, y) = P(X = x, Y = y).

Definition 5. In the continuous case, we call a function f(x, y) a PDF for the r.vs (X, Y) if

1. $f(x, y) \ge 0$ for all (x, y),

2. $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$ and, for any set $A \subset \mathbb{R} \times \mathbb{R}$, $P((X, Y) \in A) = \iint_A f(x, y)\,dx\,dy$.

In the discrete or continuous case we define the joint CDF as $F_{X,Y}(x, y) = P(X \le x, Y \le y)$.

Example 5. At a party N men throw their hats into the center of a room. The hats are mixed up and each man randomly selects one. Find the expected number of men that select their own hats.

Example 6. Suppose there are 25 different types of coupons and suppose that each time one obtains a coupon, it is equally likely to be any one of the 25 types. Compute the expected number of different types that are contained in a set of 10 coupons.
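
Both examples can be explored with exact arithmetic. The sketch below makes two assumptions that are not spelled out on the slide: for Example 5 it enumerates every permutation for small N only (brute force, not a general proof), and for Example 6 it uses the usual indicator-variable argument, in which the expected count is the sum over types of the probability that the type appears at least once. Function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def expected_own_hats(n):
    """Exact expected number of men who get their own hat, by listing all n! assignments."""
    total = 0
    count = 0
    for perm in permutations(range(n)):
        total += sum(1 for man, hat in enumerate(perm) if man == hat)
        count += 1
    return Fraction(total, count)

def expected_distinct_coupons(types, draws):
    """types * P(a given type appears at least once in `draws` independent coupons)."""
    miss = Fraction(types - 1, types) ** draws  # P(a given type never appears)
    return types * (1 - miss)

for n in range(1, 7):
    print(n, expected_own_hats(n))  # Example 5 for small N

print(float(expected_distinct_coupons(25, 10)))  # Example 6
```

For every N tried the hat answer is exactly 1, and the coupon answer is 25(1 − (24/25)^10), roughly 8.4 distinct types.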

Marginal Distributions

Definition 6. If (X, Y) have a joint distribution with mass function $f_{X,Y}$, then the marginal mass function for X is defined by

$$f_X(x) = P(X = x) = \sum_y P(X = x, Y = y) = \sum_y f(x, y)$$

and the marginal mass function for Y is defined by

$$f_Y(y) = P(Y = y) = \sum_x P(X = x, Y = y) = \sum_x f(x, y).$$

Example 7. Calculate the marginal distributions for X and Y from the table below.

        Y=0     Y=1
X=0     1/10    2/10    3/10
X=1     3/10    4/10    7/10
        4/10    6/10
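
The row and column totals in the table are exactly the marginal mass functions. A minimal sketch of that computation follows; the dictionary layout and the helper name `marginal` are choices made for this illustration.

```python
from fractions import Fraction

# Joint mass function f(x, y) read off the table in Example 7.
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(4, 10),
}

def marginal(joint, axis):
    """Sum the joint mass over the other variable; axis=0 gives f_X, axis=1 gives f_Y."""
    out = {}
    for pair, mass in joint.items():
        out[pair[axis]] = out.get(pair[axis], Fraction(0)) + mass
    return out

f_X = marginal(joint, 0)
f_Y = marginal(joint, 1)
print(f_X, f_Y)
```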

Definition 7. For continuous r.vs, the marginal densities are

$$f_X(x) = \int f(x, y)\,dy \quad \text{and} \quad f_Y(y) = \int f(x, y)\,dx.$$

The corresponding marginal distribution functions are denoted by $F_X$ and $F_Y$.

Example 8. Suppose that

$$f(x, y) = \begin{cases} x + y, & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0, & \text{otherwise.} \end{cases}$$

Then, for 0 ≤ y ≤ 1,

$$f_Y(y) = \int_0^1 (x + y)\,dx = \int_0^1 x\,dx + \int_0^1 y\,dx = \frac{1}{2} + y. \qquad \blacksquare$$

Independent Random Variables

Definition 8. Two r.vs X and Y are said to be independent if, for every A and B,

P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B).

Theorem 1. Let X and Y have joint PDF $f_{X,Y}$. Then X and Y are independent if and only if $f_{X,Y}(x, y) = f_X(x) f_Y(y)$ for all x, y.

Example 9. Suppose that X and Y are independent and both have the same density

$$f(x) = \begin{cases} 2x, & 0 \le x \le 1 \\ 0, & \text{otherwise.} \end{cases}$$

Let us find P(X + Y ≤ 1).
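
One possible route to the answer is to integrate the joint density over the triangle x + y ≤ 1 inside the unit square. The sketch below assumes, by independence, that the joint density is the product f(x)f(y) = 4xy there.

```latex
% Assumes the joint density factorises as f(x)f(y) = 4xy on the unit square (by independence).
\begin{aligned}
P(X + Y \le 1)
  &= \int_0^1 \int_0^{1-x} 4xy \,dy\,dx
   = \int_0^1 2x(1-x)^2 \,dx \\
  &= 2\int_0^1 \left(x - 2x^2 + x^3\right) dx
   = 2\left(\tfrac{1}{2} - \tfrac{2}{3} + \tfrac{1}{4}\right)
   = \tfrac{1}{6}.
\end{aligned}
```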

Theorem 2. Suppose that the range of X and Y is a rectangle (possibly infinite). If f(x, y) = g(x)h(y) for some functions g and h (not necessarily probability density functions) then X and Y are independent.

Example 10. Let X and Y have density

$$f(x, y) = \begin{cases} 2e^{-(x + 2y)}, & x > 0 \text{ and } y > 0 \\ 0, & \text{otherwise.} \end{cases}$$

The range of X and Y is the rectangle (0, ∞) × (0, ∞). We can write f(x, y) = g(x)h(y) where $g(x) = 2e^{-x}$ and $h(y) = e^{-2y}$. Thus, X and Y are independent. ∎

Conditional Distributions

• One of the most useful concepts in probability theory

• We are often interested in calculating probabilities when some partial information is available

• When calculating a desired probability or expectation, it is often useful to first "condition" on some appropriate r.v

Definition 9. The conditional probability mass function is

$$f_{X|Y}(x \mid y) = P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)} = \frac{f_{X,Y}(x, y)}{f_Y(y)}$$

if $f_Y(y) > 0$.

Definition 10. For continuous r.vs, the conditional probability density function is

$$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}$$

assuming that $f_Y(y) > 0$. Then,

$$P(X \in A \mid Y = y) = \int_A f_{X|Y}(x \mid y)\,dx.$$

Example 11. Suppose that X ∼ Unif(0, 1). After obtaining a value of X we generate Y | X = x ∼ Unif(x, 1). What is the marginal distribution of Y?
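
A sketch of one way to answer Example 11: build the joint density from the marginal of X and the conditional of Y given X, then integrate out x. It assumes the uniform densities take their usual form, 1 on (0, 1) and 1/(1 − x) on (x, 1).

```latex
% Assumes f_X(x) = 1 on (0,1) and f_{Y|X}(y|x) = 1/(1-x) on x < y < 1.
\begin{aligned}
f_{X,Y}(x, y) &= f_X(x)\, f_{Y|X}(y \mid x) = \frac{1}{1 - x}, \qquad 0 < x < y < 1, \\
f_Y(y) &= \int_0^{y} \frac{dx}{1 - x} = -\log(1 - y), \qquad 0 < y < 1.
\end{aligned}
```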

Multivariate Distributions and IID Samples

• Let us call X = (X1, . . . , Xn), where X1, . . . , Xn are r.vs, a random vector. If X1, . . . , Xn are independent and each has the same marginal distribution with density f, we say that X1, . . . , Xn are IID (independent and identically distributed).

• Much of statistical theory and practice begins with IID observations.

Transformations of Random Variables

• Suppose that X is a r.v and Y = r(X) is a function of X, e.g. Y = X² or Y = e^X. How do we compute the PDF and CDF of Y?

• In the discrete case

$$f_Y(y) = P(Y = y) = P(r(X) = y) = P(\{x : r(x) = y\}) = P(X \in r^{-1}(y))$$

• In the continuous case

1. For each y, find the set $A_y = \{x : r(x) \le y\}$.

2. Find the CDF

$$F_Y(y) = P(Y \le y) = P(r(X) \le y) \tag{4}$$

$$= P(\{x : r(x) \le y\}) = \int_{A_y} f_X(x)\,dx. \tag{5}$$

3. The PDF is $f_Y(y) = F_Y'(y)$.

Example 12. Let $f_X(x) = e^{-x}$ for x > 0. Then $F_X(x) = \int_0^x f_X(s)\,ds = 1 - e^{-x}$. Let Y = r(X) = log X. Then $A_y = \{x : x \le e^y\}$ and

$$F_Y(y) = P(Y \le y) = P(\log X \le y) = P(X \le e^y) = F_X(e^y) = 1 - e^{-e^y}.$$

Therefore, $f_Y(y) = e^y e^{-e^y}$ for y ∈ ℝ. ∎
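
The CDF derived in Example 12 can be sanity-checked by simulation: draw X from the density e^{−x}, take logs, and compare the empirical CDF of Y with 1 − e^{−e^y}. The seed, sample size and evaluation points below are arbitrary, and the helper names are illustrative.

```python
import math
import random

def cdf_Y(y):
    """F_Y(y) = 1 - exp(-exp(y)), the CDF derived in Example 12."""
    return 1.0 - math.exp(-math.exp(y))

def sample_Y(rng):
    """Draw X with density exp(-x) on x > 0 and return Y = log X."""
    x = rng.expovariate(1.0)
    while x <= 0.0:  # guard against the vanishingly rare draw x == 0
        x = rng.expovariate(1.0)
    return math.log(x)

rng = random.Random(12345)  # fixed seed so the run is repeatable
ys = [sample_Y(rng) for _ in range(200_000)]

for y in (-2.0, -1.0, 0.0, 1.0):
    empirical = sum(1 for v in ys if v <= y) / len(ys)
    print(y, round(empirical, 4), round(cdf_Y(y), 4))
```

The two printed columns should agree to about two decimal places; the remaining gap is Monte Carlo noise.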

Transformations of Several Random Variables
