ECE 302: Lecture 4.7 Gaussian Random Variable

Transcript
Page 1: ECE 302: Lecture 4.7 Gaussian Random Variable

© Stanley Chan 2020. All Rights Reserved.

ECE 302: Lecture 4.7 Gaussian Random Variable

Prof. Stanley Chan

School of Electrical and Computer Engineering, Purdue University

Page 2

Outline

Overall schedule:

Continuous random variables, PDF

CDF

Expectation

Mean, mode, median

Common random variables

Uniform, Exponential, Gaussian

Transformation of random variables

How to generate random numbers

Today’s lecture:

Definition of Gaussian

Mean and variance

Skewness and kurtosis

Origin of Gaussian

Page 3

Definition

Definition

Let X be a Gaussian random variable. The PDF of X is

f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)    (1)

where (µ, σ²) are the parameters of the distribution. We write

X ∼ Gaussian(µ, σ²) or X ∼ N(µ, σ²)

to say that X is drawn from a Gaussian distribution with parameters (µ, σ²).
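As a quick sanity check (not part of the slides), Eq. (1) can be evaluated directly; the function name `gaussian_pdf` is my own choice:

```python
import math

def gaussian_pdf(x, mu=0.0, sigma2=1.0):
    """Evaluate the Gaussian PDF of Eq. (1), N(mu, sigma2), at the point x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)
```

The peak occurs at x = µ with value 1/√(2πσ²), and the value decays as x moves away from µ.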

Page 4

Interpreting the mean and variance

[Two plots of the Gaussian PDF on x ∈ [−10, 10]. Left: µ ∈ {−3, −0.3, 0, 1.2, 4} with σ = 1 (µ changes, σ = 1). Right: µ = 0 with σ ∈ {0.8, 1, 2, 3, 4} (µ = 0, σ changes).]

Figure: A Gaussian random variable with different µ and σ.

Page 5

Proving the mean

Theorem

If X ∼ N(µ, σ²), then

E[X] = µ, and Var[X] = σ².    (2)

Proof (mean):

E[X] = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} x \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \, dx

\overset{(a)}{=} \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} (y+\mu) \, e^{-\frac{y^2}{2\sigma^2}} \, dy, by substituting y = x − µ,

\overset{(b)}{=} \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} y \, e^{-\frac{y^2}{2\sigma^2}} \, dy + \mu \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{y^2}{2\sigma^2}} \, dy

\overset{(c)}{=} 0 + \mu \cdot 1 = \mu.

In (b) the first integral vanishes because the integrand is odd, and in (c) the second integral is 1 because it integrates the N(0, σ²) PDF.
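A minimal Monte Carlo sketch (my own addition, using Python's `random.gauss`) that checks E[X] = µ and Var[X] = σ² empirically:

```python
import random

random.seed(0)  # fixed seed so the check is reproducible
mu, sigma = 1.5, 2.0
samples = [random.gauss(mu, sigma) for _ in range(200_000)]

mean_est = sum(samples) / len(samples)
var_est = sum((s - mean_est) ** 2 for s in samples) / len(samples)
# mean_est should be close to mu = 1.5, var_est close to sigma^2 = 4.0
```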

Page 6

Proving the variance

Theorem

If X ∼ N(µ, σ²), then

E[X] = µ, and Var[X] = σ².    (3)

Proof (variance):

Var[X] = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} (x-\mu)^2 \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \, dx

\overset{(a)}{=} \frac{\sigma^2}{\sqrt{2\pi}} \int_{-\infty}^{\infty} y^2 \, e^{-\frac{y^2}{2}} \, dy, by letting y = \frac{x-\mu}{\sigma},

= \frac{\sigma^2}{\sqrt{2\pi}} \left( -y \, e^{-\frac{y^2}{2}} \Big|_{-\infty}^{\infty} \right) + \frac{\sigma^2}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{y^2}{2}} \, dy, by integration by parts,

= 0 + \sigma^2 \cdot 1 = \sigma^2,

since the boundary term vanishes and the remaining integral is that of the standard Gaussian PDF, which is 1.

Page 7

Standard Gaussian PDF

Definition

A standard Gaussian (or standard Normal) random variable X has the PDF

f_X(x) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{x^2}{2}}.    (4)

That is, X ∼ N(0, 1) is a Gaussian with µ = 0 and σ² = 1.

Figure: Definition of the CDF of the standard Gaussian Φ(x).

Page 8

Standard Gaussian CDF

Definition

The CDF of the standard Gaussian is defined as the Φ(·) function

\Phi(x) \overset{\text{def}}{=} F_X(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{t^2}{2}} \, dt.    (5)

The standard Gaussian's CDF is related to the so-called error function, which is defined as

\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_{0}^{x} e^{-t^2} \, dt.    (6)

It is quite easy to link Φ(x) with erf(x):

\Phi(x) = \frac{1}{2}\left[ 1 + \mathrm{erf}\left( \frac{x}{\sqrt{2}} \right) \right], and \mathrm{erf}(x) = 2\,\Phi(x\sqrt{2}) - 1.
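Python's standard library exposes `math.erf`, so Φ can be sketched directly from the identity above:

```python
import math

def Phi(x):
    """Standard Gaussian CDF via Phi(x) = (1/2)[1 + erf(x / sqrt(2))]."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
```

For example, Φ(0) = 1/2 by symmetry, and Φ(1.96) ≈ 0.975, the familiar two-sided 95% quantile.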

Page 9

CDF of arbitrary Gaussian

Theorem (CDF of an arbitrary Gaussian)

Let X ∼ N(µ, σ²). Then,

F_X(x) = \Phi\left( \frac{x-\mu}{\sigma} \right).    (7)

We start by expressing F_X(x):

F_X(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(t-\mu)^2}{2\sigma^2}} \, dt.

Substituting y = \frac{t-\mu}{\sigma}, and using the definition of the standard Gaussian, we have

\int_{-\infty}^{x} \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(t-\mu)^2}{2\sigma^2}} \, dt = \int_{-\infty}^{\frac{x-\mu}{\sigma}} \frac{1}{\sqrt{2\pi}} \, e^{-\frac{y^2}{2}} \, dy = \Phi\left( \frac{x-\mu}{\sigma} \right).

Page 10

Other results

P[a < X ≤ b] = Φ

(b − µσ

)− Φ

(a− µσ

). (8)

To see this, note that

P[a < X ≤ b] = P[X ≤ b]− P[X ≤ a] = Φ

(b − µσ

)− Φ

(a− µσ

).

Corollary

Let X ∼ N (µ, σ2). Then, the following results hold:

Φ(y) = 1− Φ(−y).

P[X ≥ b] = 1− Φ(b−µσ

).

P[|X | ≥ b] = 1− Φ(b−µσ

)+ Φ

(−b−µ

σ

)10 / 22
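Eq. (8) and the tail result translate directly into code; a sketch (the helper names are mine):

```python
import math

def Phi(x):
    # standard Gaussian CDF, using Phi(x) = (1 + erf(x/sqrt(2)))/2
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_interval(a, b, mu, sigma):
    """P[a < X <= b] for X ~ N(mu, sigma^2), by Eq. (8)."""
    return Phi((b - mu) / sigma) - Phi((a - mu) / sigma)

def prob_tail(b, mu, sigma):
    """P[X >= b] = 1 - Phi((b - mu)/sigma)."""
    return 1.0 - Phi((b - mu) / sigma)
```

As a check, `prob_interval(-1, 1, 0, 1)` gives the one-sigma probability ≈ 0.6827, and `prob_interval(mu - 2*sigma, mu + 2*sigma, mu, sigma)` gives ≈ 0.9545 for any µ and σ.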

Page 11

Skewness and Kurtosis

Definition

For a random variable X with PDF f_X(x), define the following central moments:

mean = E[X] \overset{\text{def}}{=} \mu,

variance = E\left[ (X - \mu)^2 \right] \overset{\text{def}}{=} \sigma^2,

skewness = E\left[ \left( \frac{X-\mu}{\sigma} \right)^3 \right] \overset{\text{def}}{=} \gamma,

kurtosis = E\left[ \left( \frac{X-\mu}{\sigma} \right)^4 \right] \overset{\text{def}}{=} \kappa.
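These four definitions can be estimated from samples; a sketch of the empirical (1/n) estimators, checked on Gaussian draws where γ ≈ 0 and κ ≈ 3:

```python
import random

def sample_moments(xs):
    """Return (mean, variance, skewness, kurtosis) of a list of samples."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    sd = var ** 0.5
    skew = sum(((x - mu) / sd) ** 3 for x in xs) / n
    kurt = sum(((x - mu) / sd) ** 4 for x in xs) / n
    return mu, var, skew, kurt

random.seed(1)
xs = [random.gauss(0.0, 1.0) for _ in range(100_000)]
m, v, gamma, kappa = sample_moments(xs)
# for a Gaussian: gamma should be near 0 and kappa near 3
```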

Page 12

Skewness

What is skewness?

\gamma = E\left[ \left( \frac{X-\mu}{\sigma} \right)^3 \right].

Measures how asymmetric the distribution is. A Gaussian has skewness 0.

[Plot of three PDFs on x ∈ [0, 20]: one with positive skewness, one symmetric, one with negative skewness.]

Figure: Skewness of a distribution measures how asymmetric the distribution is. In this example, the skewness values are: orange = 0.8943, black = 0, blue = −1.414.

Page 13

Kurtosis

What is kurtosis?

\kappa = E\left[ \left( \frac{X-\mu}{\sigma} \right)^4 \right].

Measures how heavy the tails are. A Gaussian has kurtosis 3. Some people prefer the excess kurtosis κ − 3; a Gaussian has excess kurtosis 0.

[Plot of three PDFs on x ∈ [−5, 5]: one with positive, one with zero, and one with negative excess kurtosis.]

Figure: Kurtosis of a distribution measures how heavy-tailed the distribution is. In this example, the (excess) kurtosis values are: orange = 2.8567, black = 0, blue = −0.1242.

Page 14

Skewness and Kurtosis

Random variable | Mean µ    | Variance σ² | Skewness γ          | Excess kurtosis κ − 3
----------------|-----------|-------------|---------------------|----------------------
Bernoulli       | p         | p(1−p)      | (1−2p)/√(p(1−p))    | 1/(1−p) + 1/p − 6
Binomial        | np        | np(1−p)     | (1−2p)/√(np(1−p))   | (6p² − 6p + 1)/(np(1−p))
Geometric       | 1/p       | (1−p)/p²    | (2−p)/√(1−p)        | (p² − 6p + 6)/(1−p)
Poisson         | λ         | λ           | 1/√λ                | 1/λ
Uniform         | (a+b)/2   | (b−a)²/12   | 0                   | −6/5
Exponential     | 1/λ       | 1/λ²        | 2                   | 6
Gaussian        | µ         | σ²          | 0                   | 0

Table: The first few moments of commonly used random variables.
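For a discrete row like Bernoulli, the table entries can be checked exactly by enumerating the two outcomes (a sketch; the helper is my own):

```python
def bernoulli_moments(p):
    """Exact skewness and excess kurtosis of Bernoulli(p) by enumeration."""
    pmf = [(0, 1 - p), (1, p)]        # outcomes and their probabilities
    mu = p
    sd = (p * (1 - p)) ** 0.5
    skew = sum(w * ((x - mu) / sd) ** 3 for x, w in pmf)
    exkurt = sum(w * ((x - mu) / sd) ** 4 for x, w in pmf) - 3.0
    return skew, exkurt
```

At, say, p = 0.3 this reproduces the closed forms (1 − 2p)/√(p(1−p)) and 1/(1−p) + 1/p − 6 from the table.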

Page 15

Example: Titanic

On April 15, 1912, RMS Titanic sank after hitting an iceberg, killing 1502 of the 2224 passengers and crew. A hundred years later, we want to analyze the data. At https://www.kaggle.com/c/titanic/ there is a dataset collecting the identities, ages, genders, etc., of the passengers.

Statistics         | Group 1 (Died) | Group 2 (Survived)
-------------------|----------------|-------------------
Mean               | 30.6262        | 28.3437
Standard Deviation | 14.1721        | 14.9510
Skewness           | 0.5835         | 0.1795
Excess Kurtosis    | 0.2652         | −0.0772

Page 16

Example: Titanic

Mean and standard deviation cannot tell the two groups apart.

Skewness and kurtosis can tell the two groups apart.

[Two age histograms (counts vs. age, 0–80): Group 1 (died) and Group 2 (survived).]

Figure: The Titanic dataset https://www.kaggle.com/c/titanic/.

Page 17

Origin of Gaussian

Where does Gaussian come from?

Why are they so popular?

Why do they have bell shapes?

What is the origin of Gaussian?

When we sum many independent random variables, the resulting random variable is a Gaussian.

This is known as the Central Limit Theorem. The theorem applies to any random variable.

Summing random variables is equivalent to convolving the PDFs.

Convolving PDFs infinitely many times yields the bell shape.

Page 18

The experiment of throwing many dice

[Four histograms: (a) X1, (b) X1 + X2, (c) X1 + · · · + X5, (d) X1 + · · · + X100.]

Figure: When adding uniform random variables, the overall distribution becomes increasingly Gaussian.
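This experiment is easy to reproduce; a sketch with Uniform(0, 1) summands in place of dice, watching the excess kurtosis shrink toward the Gaussian value 0:

```python
import random

random.seed(0)

def sum_of_uniforms(k, n=50_000):
    """Draw n samples of X1 + ... + Xk, where each Xi ~ Uniform(0, 1)."""
    return [sum(random.random() for _ in range(k)) for _ in range(n)]

def excess_kurtosis(xs):
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return sum(((x - mu) / var ** 0.5) ** 4 for x in xs) / n - 3.0

# a single uniform has excess kurtosis -6/5; the sum of k independent
# uniforms has -6/(5k), which approaches 0 (Gaussian) as k grows
```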

Page 19

Sum of X and Y = Convolution of fX and fY

Example: Convolving two rectangles gives a triangle.

We will show this result in a later lecture:

(f_X * f_X)(x) = \int_{-\infty}^{\infty} f_X(\tau) \, f_X(x - \tau) \, d\tau.
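A numerical sketch of this convolution, discretizing the Uniform(0, 1) PDF on a grid and forming a Riemann sum:

```python
# Discretize f_X = Uniform(0, 1): f_X(x) = 1 on [0, 1].
dx = 0.01
rect = [1.0] * (int(1 / dx) + 1)  # grid points 0, 0.01, ..., 1

# Riemann-sum approximation of (f_X * f_X)(x) = int f_X(t) f_X(x - t) dt
conv = [0.0] * (2 * len(rect) - 1)
for i, a in enumerate(rect):
    for j, b in enumerate(rect):
        conv[i + j] += a * b * dx

# conv[k] approximates the triangle PDF at x = k*dx: it rises linearly
# to a peak of 1 at x = 1, then falls linearly back to 0 at x = 2
```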

Page 20

If you convolve infinitely many times

Then in the Fourier domain you will have

\mathcal{F}\{ f_X * f_X * \cdots * f_X \} = \mathcal{F}\{f_X\} \cdot \mathcal{F}\{f_X\} \cdots \mathcal{F}\{f_X\}.

[Plot on x ∈ [−10, 10] of (sin x)/x, (sin x)²/x², and (sin x)³/x³.]

Figure: Convolving the PDF of a uniform distribution with itself is equivalent to multiplying its Fourier transforms in the Fourier space. As the number of convolutions grows, the product gradually becomes Gaussian.

Page 21

Origin of Gaussian

What happens if you convolve a PDF infinitely many times?

You will get a Gaussian. This is known as the Central Limit Theorem.

Why are Gaussians everywhere?

We seldom look at individual random variables. We often look at the sum/average.

Whenever we have a sum, the Central Limit Theorem kicks in.

Summing random variables is equivalent to convolving the PDFs.

Convolving PDFs infinitely many times yields the bell shape.

This result applies to any random variable, as long as they are independently summed.

Page 22

Questions?
