FEC 512.04

54
Introduction to Sampling Distributions and Estimating Population Values Istanbul Bilgi University FEC 512 Financial Econometrics-I Dr. Orhan Erdem

description

 

Transcript of FEC 512.04

Page 1: FEC 512.04

Introduction to

Sampling Distributions and

Estimating Population Values

Istanbul Bilgi University

FEC 512 Financial Econometrics-I

Dr. Orhan Erdem

Page 2: FEC 512.04

FEC 512 Lecture 4- 2

Unbiasedness

� A point estimator is said to be an

unbiased estimator of the parameter θθθθ if

the expected value, or mean, of the

sampling distribution of is θθθθ,

� Examples:

� The sample mean is an unbiased estimator of µ

� The sample variance is an unbiased estimator of σ2

θ̂

θ̂

θ)θE( =ˆ

Page 3: FEC 512.04

FEC 512 Lecture 4- 3

� is an unbiased estimator, is biased:

1θ̂ 2θ̂

θ̂θ

1θ̂ 2θ̂

Unbiasedness(continued)

Page 4: FEC 512.04

FEC 512 Lecture 4- 4

Bias

� Let be an estimator of θθθθ

� The bias in is defined as the difference

between its mean and θθθθ

� The bias of an unbiased estimator is 0

θ̂

θ̂

θ)θE()θBias( −= ˆˆ

Page 5: FEC 512.04

FEC 512 Lecture 4- 5

Consistency

� Let be an estimator of θθθθ

� is a consistent estimator of θθθθ if the

difference between the expected value of

and θθθθ decreases as the sample size

increases

� Consistency is desired when unbiased

estimators cannot be obtained

θ̂

θ̂θ̂

Page 6: FEC 512.04

FEC 512 Lecture 4- 6

Most Efficient Estimator

� Suppose there are several unbiased estimators of θ� The most efficient estimator or the minimum variance

unbiased estimator of θ is the unbiased estimator with the smallest variance

� Let and be two unbiased estimators of θ, based on the same number of sample observations. Then,

� is said to be more efficient than if

)θVar()θVar( 21ˆˆ <

1θ̂ 2θ̂

1θ̂ 2θ̂

Page 7: FEC 512.04

FEC 512 Lecture 4- 7

Sampling Distribution

� A sampling distribution is a

distribution of the possible values

of a statistic for a given size

sample selected from a

population

Page 8: FEC 512.04

FEC 512 Lecture 4- 8

Developing a Sampling Distribution

� Assume there is a population …

� Population size N=4

� Random variable, X,

is age of individuals

� Values of X:

18, 20, 22, 24 (years)

A B CD

Page 9: FEC 512.04

FEC 512 Lecture 4- 9

.25

018 20 22 24

A B C D

Uniform Distribution

P(x)

x

(continued)

Summary Measures for the Population Distribution:

Developing a Sampling Distribution

214

24222018

N

Xµ i

=+++

=

= ∑

2.236N

µ)(Xσ

2

i =−

= ∑

Page 10: FEC 512.04

FEC 512 Lecture 4- 10

1st

2nd

Observation Obs 18 20 22 24

18 18,18 18,20 18,22 18,24

20 20,18 20,20 20,22 20,24

22 22,18 22,20 22,22 22,24

24 24,18 24,20 24,22 24,24

16 possible samples

(sampling with

replacement)

Now consider all possible samples of size n = 2

1st 2nd Observation

Obs 18 20 22 24

18 18 19 20 21

20 19 20 21 22

22 20 21 22 23

24 21 22 23 24

(continued)

16 Sample

Means

Developing a Sampling Distribution

Page 11: FEC 512.04

FEC 512 Lecture 4- 11

1st 2nd Observation

Obs 18 20 22 24

18 18 19 20 21

20 19 20 21 22

22 20 21 22 23

24 21 22 23 24

Sampling Distribution of All Sample Means

18 19 20 21 22 23 240

.1

.2

.3

P(X)

X

Sample Means

Distribution16 Sample Means

_

(continued)

(no longer uniform)

_

Developing a Sampling Distribution

Page 12: FEC 512.04

FEC 512 Lecture 4- 12

Summary Measures of this Sampling Distribution:

Developing aSampling Distribution

(continued)

µ2116

24211918

N

X)XE( i ==

++++== ∑ L

1.5816

21)-(2421)-(1921)-(18

N

µ)X(σ

222

2i

X

=+++

=

−= ∑

L

Page 13: FEC 512.04

FEC 512 Lecture 4- 13

Comparing the Population with its Sampling Distribution

18 19 20 21 22 23 240

.1

.2

.3 P(X)

X18 20 22 24

A B C D

0

.1

.2

.3

Population

N = 4

P(X)

X _

1.58σ 21µXX

==2.236σ 21µ ==

Sample Means Distributionn = 2

_

Page 14: FEC 512.04

FEC 512 Lecture 4- 14

Sampling in Excel

� Tools/Data Analysis/Sampling

Page 15: FEC 512.04

FEC 512 Lecture 4- 15

Histogram of 500 Sample Means from

Sample Size n=10

Mean of the Sample Means is 2.41 with

0.421 St.Dev. 477.010

507.1

n

σ ==where

Page 16: FEC 512.04

FEC 512 Lecture 4- 16

Mean of the Sample Means is 2.53 with

0.376 St.Dev. 337.020

507.1

n

σ ==where

Page 17: FEC 512.04

FEC 512 Lecture 4- 17

� For any population,

� the average value of all possible sample means

computed from all possible random samples of a given

size from the population is equal to the population mean:

� The standard deviation of the possible sample means

computed from all random samples of size n is equal to

the population standard deviation divided by the square

root of the sample size:

Properties of a Sampling Distribution

µµx =

n

σσx =

Theorem 1

Theorem 2

Page 18: FEC 512.04

FEC 512 Lecture 4- 18

If the Population is Normal

If a population is normal with mean µ and

standard deviation σ, the sampling distribution

of is also normally distributed with

and

x

µµx =n

σσx =

Theorem 3

Page 19: FEC 512.04

FEC 512 Lecture 4- 19

z-value for Sampling Distributionof x

� Z-value for the sampling distribution of :

where: = sample mean

= population mean

= population standard deviation

n = sample size

σ

n

σ

µ)x(z

−=

x

Page 20: FEC 512.04

FEC 512 Lecture 4- 20

Normal Population

Distribution

Normal Sampling

Distribution

(has the same mean)

Sampling Distribution Properties

� The sample mean is an unbiased estimator

x

x

µµx =

µ

Page 21: FEC 512.04

FEC 512 Lecture 4- 21

� The sample mean is a consistent estimator

(the value of x becomes closer to µ as n

increases):

Sampling Distribution Properties

Larger

sample size

Small

sample size

x

(continued)

nσ/σx =

µ

x

x

Population

As n increases,

decreases

Page 22: FEC 512.04

FEC 512 Lecture 4- 22

If the Population is not Normal

� We can apply the Central Limit Theorem:

� Even if the population is not normal,

� …sample means from the population will beapproximately normal as long as the sample size is large enough

� …and the sampling distribution will have

and

µµx =n

σσx =

Theorem 4

Page 23: FEC 512.04

FEC 512 Lecture 4- 23

n↑

Central Limit Theorem

As the

sample

size gets

large

enough…

the sampling

distribution

becomes

almost normal

regardless of

shape of

population

x

Page 24: FEC 512.04

FEC 512 Lecture 4- 24

Population Distribution

Sampling Distribution

(becomes normal as n increases)

Central Tendency

Variation

(Sampling with

replacement)

x

x

Larger

sample

size

Smaller

sample size

If the Population is not Normal(continued)

Sampling distribution

properties:

µµx =

n

σσx =

µ

Page 25: FEC 512.04

FEC 512 Lecture 4- 25

How Large is Large Enough?

� For most distributions, n > 25 will give a

sampling distribution that is nearly normal

� For fairly symmetric distributions, n > 15 is

sufficient

� For normal population distributions, the

sampling distribution of the mean is always

normally distributed

Page 26: FEC 512.04

FEC 512 Lecture 4- 26

Example

� Suppose a population has mean µ = 8 and

standard deviation σ = 3. Suppose a

random sample of size n = 36 is selected.

� What is the probability that the sample mean

is between 7.8 and 8.2?

Page 27: FEC 512.04

FEC 512 Lecture 4- 27

Example

Solution:

� Even if the population is not normally

distributed, the central limit theorem can be

used (n > 30)

� … so the sampling distribution of is

approximately normal

� … with mean = µ = 8

� …and standard deviation

(continued)

x

0.536

3

n

σσ x ===

Page 28: FEC 512.04

FEC 512 Lecture 4- 28

Example

Solution (continued) -- find z-scores:(continued)

0.31080.4)zP(-0.4

363

8-8.2

µ- µ

363

8-7.8P 8.2) µ P(7.8

x

x

=<<=

<<=<<

z7.8 8.2 -0.4 0.4

Sampling

Distribution

Standard Normal

Distribution.1554

+.1554

Population

Distribution

??

??

?????

???Sample Standardize

8µ = 8µx

= 0µz =x x

Page 29: FEC 512.04

FEC 512 Lecture 4- 29

� Suppose that Y1, Y2 Y3... Yn are i.i.d., and let µx and

σx2 denote the mean and the variance of Yi.

1

1

2 21 1 1,

2

1( ) ( )

1( ) ( )

1 1( ) ( )

n

i Y

i

n

i

i

n n n

i i j

i i j j i

Y

E Y E Yn

V a r Y V a r Yn

V a r Y C ov Y Yn n

n

µ

σ

=

=

= = = ≠

= =

=

= +

=

∑ ∑ ∑

Page 30: FEC 512.04

FEC 512 Lecture 4- 30

Point and Interval Estimates

� A point estimate is a single number,

� a confidence interval provides additional information about variability

Point Estimate

Lower

Confidence

Limit

Upper

Confidence

Limit

Width of confidence interval

Page 31: FEC 512.04

FEC 512 Lecture 4- 31

Confidence Intervals

� How much uncertainty is associated with a point estimate of a population parameter?

� An interval estimate provides more information about a population characteristic than does a point estimate

� Such interval estimates are called confidence intervals

� Never 100% sure:

“The surer we want to be, the less we have to be sure of” -Freund and Williams(1977)-

Page 32: FEC 512.04

FEC 512 Lecture 4- 32

Estimation Process

(mean, µ, is

unknown)

Population

Random Sample

Mean

x = 50

Sample

I am 95%

confident that

µ is between

40 & 60.

Page 33: FEC 512.04

FEC 512 Lecture 4- 33

General Formula

� The general formula for all

confidence intervals is:

Point Estimate ±±±± (Critical Value)(Standard Error)

Page 34: FEC 512.04

FEC 512 Lecture 4- 34

Confidence Level, (1-α)

� Suppose confidence level = 95%

� Also written (1 - α) = .95

� A relative frequency interpretation:

� In the long run, 95% of all the confidence

intervals that can be constructed will contain

the unknown true parameter

� A specific interval either will contain or

will not contain the true parameter

� No probability involved in a specific interval

(continued)

Page 35: FEC 512.04

FEC 512 Lecture 4- 35

Confidence Interval for µ(σ Known)

� Assumptions

� Population standard deviation σ is known

� Population is normally distributed

� If population is not normal, use large

sample

� Confidence interval estimate

n

σzx α/2±

Page 36: FEC 512.04

FEC 512 Lecture 4- 36

Finding the Critical Value

� Consider a 95% confidence interval:

z.025= -1.96 z.025= 1.96

.951 =α−

.0252

α= .025

2

α=

Point EstimateLower Confidence Limit

UpperConfidence Limit

z units:

x units: Point Estimate

0

1.96zα/2 ±=

Page 37: FEC 512.04

FEC 512 Lecture 4- 37

Common Levels of Confidence

� Commonly used confidence levels are

90%, 95%, and 99%

Confidence

Level

Confidence

Coefficient,z value,

1.28

1.645

1.96

2.33

2.57

3.08

3.27

.80

.90

.95

.98

.99

.998

.999

80%

90%

95%

98%

99%

99.8%

99.9%

α−1 /2zα

Page 38: FEC 512.04

FEC 512 Lecture 4- 38

µµx

=

Interval and Level of Confidence

Confidence Intervals

Intervals extend from

to

100(1-α)%of intervals constructed contain µ;

100α% do not.

Sampling Distribution of the Mean

n

σzx /2α+

n

σzx /2α−

x

x1

x2

/2α /2αα−1

Page 39: FEC 512.04

FEC 512 Lecture 4- 39

Margin of Error

� Margin of Error (e): the amount added and

subtracted to the point estimate to form the

confidence interval

n

σzx /2α±

n

σze /2α=

Example: Margin of error for estimating µ, σ known:

Page 40: FEC 512.04

FEC 512 Lecture 4- 40

Factors Affecting Margin of Error

� Data variation, σ : e as σ

� Sample size, n : e as n

� Level of confidence, 1 - α : e if 1 - α

n

σze /2α=

Page 41: FEC 512.04

FEC 512 Lecture 4- 41

� If the population standard deviation σ

is unknown, we can substitute the

sample standard deviation, s

� This introduces extra uncertainty,

since s is variable from sample to

sample

� So we use the t distribution instead of

the normal distribution

Confidence Interval for µ(σ Unknown)

Page 42: FEC 512.04

FEC 512 Lecture 4- 42

� Assumptions

� Population standard deviation is unknown

� Population is normally distributed

� If population is not normal, use large sample (if

you have a large sample you can still use z dist.)

� Use Student’s t Distribution

� Confidence Interval Estimate

Confidence Interval for µ(σ Unknown)

n

stx /2α±

(continued)

Page 43: FEC 512.04

FEC 512 Lecture 4- 43

Student’s t Distribution

� The t is a family of distributions

� The t value depends on degrees of

freedom (d.f.)

� Number of observations that are free to vary after

sample mean has been calculated

d.f. = n - 1

Page 44: FEC 512.04

FEC 512 Lecture 4- 44

If the mean of these three

values is 8.0,

then x3 must be 9

(i.e., x3 is not free to vary)

Degrees of Freedom (df)

Idea: Number of observations that are free to varyafter sample mean has been calculated

Example: Suppose the mean of 3 numbers is 8.0

Let x1 = 7

Let x2 = 8

What is x3?

Here, n = 3, so degrees of freedom = n -1 = 3 – 1 = 2

(2 values can be any numbers, but the third is not free to vary

for a given mean)

Page 45: FEC 512.04

FEC 512 Lecture 4- 45

Student’s t Distribution

t0

t (df = 5)

t (df = 13)t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal

Standard

Normal(t with df = ∞)

Note: t z as n increases

Page 46: FEC 512.04

FEC 512 Lecture 4- 46

Student’s t Table

Upper Tail Area

df .25 .10 .05

1 1.000 3.078 6.314

2 0.817 1.886 2.920

3 0.765 1.638 2.353

t0 2.920

The body of the table

contains t values, not

probabilities

Let: n = 3

df = n - 1 = 2

α = .10

α/2 =.05

α/2 = .05

Page 47: FEC 512.04

FEC 512 Lecture 4- 47

t distribution values

With comparison to the z value

Confidence t t t z

Level (10 d.f.) (20 d.f.) (30 d.f.) ____

.80 1.372 1.325 1.310 1.28

.90 1.812 1.725 1.697 1.64

.95 2.228 2.086 2.042 1.96

.99 3.169 2.845 2.750 2.57

Note: t z as n increases

Page 48: FEC 512.04

FEC 512 Lecture 4- 48

Example

A random sample of n = 25 has x = 50 and s = 8. Form a 95% confidence interval for µ

� d.f. = n – 1 = 24, so

The confidence interval is

2.0639tt 24,.025α/21,n ==−

53.302µ46.698

25

8(2.0639)50µ

25

8(2.0639)50

n

Stxµ

n

Stx α/21,-n α/21,-n

<<

+<<−

+<<−

Page 49: FEC 512.04

FEC 512 Lecture 4- 49

Example

� A money manager wants to obtain a 95% CI

for fund inflows and outflows over the future.

He calls a random sample of 10 clients

enquiring about their planned additions to

and withdrawals from the fund. He computes

that there will be an average of 5.5m cash

inflows with 10m standard deviation. A

histogram of past data looks fairly normal.

Calculate a 95% CI for the population mean.

Page 50: FEC 512.04

FEC 512 Lecture 4- 50

Solution

� The CI for the population means spans -1.65m to

12.65m. The manager can be confident at the 95%

level that this range includes the population mean

%15.75.510

10262.25.5025.0 ±=±=±

n

stx

Page 51: FEC 512.04

FEC 512 Lecture 4- 51

Approximation for Large Samples

� Since t approaches z as the sample size

increases, an approximation is sometimes

used when n ≥ 30:

n

stx /2α±

n

szx /2α±

Technically correct

Approximation for large n

Page 52: FEC 512.04

FEC 512 Lecture 4- 52

Example: Sharpe Ratio

� Suppose an investment advisor takes a

random sample of stock funds and

calculates the average Sharpe

ratio(excess return/st.dev). The sample

size is 100, and has a standard deviation

of 0.30. If the average Sharpe ratio is

0.45, determine a 90% confidence

interval for the true population mean of

Sharpe ratio.

Page 53: FEC 512.04

FEC 512 Lecture 4- 53

Solution:

(continued)

499.0:401.0

03.0*645.145.0100

30.0645.145.0025.0

=

±=±=±n

szx

Page 54: FEC 512.04

FEC 512 Lecture 4- 54

Value At Risk