STA 291 Summer 2010

STA 291 Summer 2010 Lecture 10 Dustin Lueker


Page 1: STA 291 Summer 2010

STA 291 Summer 2010

Lecture 10, Dustin Lueker

Page 2: STA 291 Summer 2010

z-Scores

The z-score for a value x of a random variable is the number of standard deviations that x is above μ
  ◦ If x is below μ, then the z-score is negative

The z-score is used to compare values from different normal distributions

Calculating
  ◦ Need to know x, μ, and σ
  ◦ z = (x − μ) / σ

Page 3: STA 291 Summer 2010

z-Scores

The z-score is used to compare values from different normal distributions
  ◦ SAT: μ=500, σ=100
  ◦ ACT: μ=18, σ=6
  ◦ What is better, 650 on the SAT or 25 on the ACT?
    What are the corresponding tail probabilities? What percentage of test takers have worse SAT or ACT scores?

SAT: z = (x − μ) / σ = (650 − 500) / 100 = 1.5
ACT: z = (x − μ) / σ = (25 − 18) / 6 ≈ 1.17
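The SAT/ACT comparison can be checked numerically. The following is a minimal sketch, assuming both score distributions are exactly normal with the parameters given on the slide; the helper name `z_score` is only illustrative, and `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above the mean mu."""
    return (x - mu) / sigma

sat_z = z_score(650, mu=500, sigma=100)
act_z = z_score(25, mu=18, sigma=6)

# Share of test takers with a lower score, assuming normality.
standard_normal = NormalDist()  # mean 0, standard deviation 1
sat_below = standard_normal.cdf(sat_z)
act_below = standard_normal.cdf(act_z)

print(f"SAT: z = {sat_z:.2f}, share below = {sat_below:.3f}, upper tail = {1 - sat_below:.3f}")
print(f"ACT: z = {act_z:.2f}, share below = {act_below:.3f}, upper tail = {1 - act_below:.3f}")
```

Under these assumptions, roughly 93% of test takers score below 650 on the SAT and roughly 88% score below 25 on the ACT, so the SAT score is the better of the two.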

Page 4: STA 291 Summer 2010

Sampling Distribution

Probability distribution that determines probabilities of the possible values of a sample statistic
  ◦ Example: sample mean (x̄)

Repeatedly taking random samples and calculating the sample mean each time, the distribution of the sample mean follows a pattern
  ◦ This pattern is the sampling distribution

Page 5: STA 291 Summer 2010

Example

If we randomly choose a student from a STA 291 class, then with probability about 0.5, he/she is majoring in Arts & Sciences or Business & Economics (AS/BE)
  ◦ We can take a random sample and find the sample proportion of AS/BE students
  ◦ Define a variable X where
    X=1 if the student is in AS/BE
    X=0 otherwise

Page 6: STA 291 Summer 2010

Example

If we take a sample of size n=4, the following 16 samples are possible:

(1,1,1,1); (1,1,1,0); (1,1,0,1); (1,0,1,1);
(0,1,1,1); (1,1,0,0); (1,0,1,0); (1,0,0,1);
(0,1,1,0); (0,1,0,1); (0,0,1,1); (1,0,0,0);
(0,1,0,0); (0,0,1,0); (0,0,0,1); (0,0,0,0)

Each of these 16 samples is equally likely because the probability of being in AS/BE is 50% in this class

Page 7: STA 291 Summer 2010

Example

We want to find the sampling distribution of the statistic “sample proportion of students in AS/BE”
  ◦ “sample proportion” is a special case of the “sample mean”

The possible sample proportions are 0/4=0, 1/4=0.25, 2/4=0.5, 3/4=0.75, 4/4=1

How likely are these different proportions?
  ◦ This is the sampling distribution of the statistic “sample proportion”

Page 8: STA 291 Summer 2010

Example

Sample Proportion of Students from AS/BE    Probability
0.00                                        1/16 = 0.0625
0.25                                        4/16 = 0.25
0.50                                        6/16 = 0.375
0.75                                        4/16 = 0.25
1.00                                        1/16 = 0.0625
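The probabilities for samples of size n=4 can be reproduced by listing all 16 samples and counting how many give each sample proportion. This is a small sketch of that counting argument; the variable names are illustrative, and it relies on the slides' model that every sample is equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

n = 4
# All 2**4 = 16 possible samples of X values (1 = AS/BE, 0 = otherwise).
samples = list(product([0, 1], repeat=n))

# Each sample is equally likely because P(X = 1) = 0.5, so the probability
# of a sample proportion is (number of samples giving it) / 16.
counts = Counter(Fraction(sum(sample), n) for sample in samples)
sampling_distribution = {
    proportion: Fraction(count, len(samples))
    for proportion, count in sorted(counts.items())
}

for proportion, probability in sampling_distribution.items():
    # Fraction prints in lowest terms, e.g. 4/16 appears as 1/4.
    print(f"{float(proportion):.2f}  {probability} = {float(probability):.4f}")
```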

Page 9: STA 291 Summer 2010

Example: Computer Simulation

Relative frequency of each sample proportion of students from AS/BE

Sample Proportion    10 samples of size n=4    100 samples of size n=4    1000 samples of size n=4
0.00                 0/10 = 0.0                8/100 = 0.08               0.060
0.25                 2/10 = 0.2                26/100 = 0.26              0.238
0.50                 5/10 = 0.5                31/100 = 0.31              0.378
0.75                 2/10 = 0.2                28/100 = 0.28              0.262
1.00                 1/10 = 0.1                7/100 = 0.07               0.062
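A simulation of this kind might look like the sketch below. It is not the program used to produce the numbers in the lecture: the function name `simulate_proportions` and the seed are arbitrary choices for illustration, so the relative frequencies it prints will differ from the table.

```python
import random
from collections import Counter

def simulate_proportions(num_samples, n=4, p=0.5, seed=0):
    """Relative frequency of each sample proportion over num_samples simulated samples."""
    rng = random.Random(seed)  # fixed seed only so that reruns give the same output
    counts = Counter()
    for _ in range(num_samples):
        # Draw n students; each is in AS/BE with probability p.
        successes = sum(rng.random() < p for _ in range(n))
        counts[successes / n] += 1
    # Report every possible proportion, including those never observed.
    return {k / n: counts[k / n] / num_samples for k in range(n + 1)}

for num_samples in (10, 100, 1000):
    freqs = simulate_proportions(num_samples)
    print(num_samples, {prop: round(freq, 3) for prop, freq in freqs.items()})
```

With more simulated samples, the relative frequencies should settle near the exact probabilities 0.0625, 0.25, 0.375, 0.25, and 0.0625.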

Page 10: STA 291 Summer 2010

Sampling Distribution: Example (contd.), Computer Simulation

[Figure: 10 samples of size n=4]

Page 11: STA 291 Summer 2010

Sampling Distribution: Example (contd.), Computer Simulation

[Figure: 100 samples of size n=4]

Page 12: STA 291 Summer 2010

Sampling Distribution: Example (contd.), Computer Simulation

[Figure: 1000 samples of size n=4]

Page 13: STA 291 Summer 2010

Computer Simulations

[Figure panels: 10 samples of size n=4; 100 samples of size n=4; 1000 samples of size n=4; Probability Distribution of the Sample Mean]

Page 14: STA 291 Summer 2010

Computer Simulation

A new simulation would lead to different results
  ◦ The reason is the same as the reason why different samples lead to different results
  ◦ Simulation is merely more samples

However, the more samples we simulate, the closer the relative frequency distribution gets to the probability distribution (sampling distribution)

Note: In the different simulations so far, we have only taken more samples of the same sample size n=4; we have not (yet) changed n.

Page 15: STA 291 Summer 2010

Reduce Sampling Variability

The larger the sample size, the smaller the sampling variability

Increasing the sample size to 25…

[Figure panels: 10 samples of size n=25; 100 samples of size n=25; 1000 samples of size n=25]

Page 16: STA 291 Summer 2010

Interpretation

If you take samples of size n=4, it may happen that nobody in the sample is in AS/BE

If you take larger samples (n=25), it is highly unlikely that nobody in the sample is in AS/BE

With larger samples, the sampling distribution is more concentrated around its mean

The mean of the sampling distribution is the population mean
  ◦ In this case, it is 0.5
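The contrast between n=4 and n=25 can be made concrete. The sketch below assumes the model P(X=1)=0.5 and independent draws; the function names are illustrative, and the standard deviation formula sqrt(p(1 − p)/n) for a sample proportion is a standard result that the slides do not state explicitly.

```python
from math import sqrt

def prob_nobody_in_group(n, p=0.5):
    """P(all n sampled students are outside AS/BE) under the model P(X = 1) = p."""
    return (1 - p) ** n

def proportion_std_dev(n, p=0.5):
    """Standard deviation of the sample proportion: sqrt(p * (1 - p) / n)."""
    return sqrt(p * (1 - p) / n)

for n in (4, 25):
    print(f"n={n}: P(nobody in AS/BE) = {prob_nobody_in_group(n):.2e}, "
          f"std. dev. of sample proportion = {proportion_std_dev(n):.3f}")
```

For n=4 the chance that nobody is in AS/BE is 1/16 = 0.0625; for n=25 it drops to about 3 in 100 million, while the standard deviation of the sample proportion falls from 0.25 to 0.1.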

Page 17: STA 291 Summer 2010

Using the Sampling Distribution

In practice, you only take one sample

The knowledge about the sampling distribution helps to determine whether the result from the sample is reasonable given the model

For example, our model was P(randomly selected student is in AS/BE)=0.5
  ◦ If the sample mean is very unreasonable given the model, then the model is probably wrong
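One way to judge whether a single sample is "reasonable given the model" is to compute how likely a result at least that far from 0.5 would be if the model were true. The sketch below does this with the exact binomial distribution; the observed data (20 of 25 students in AS/BE) are hypothetical and chosen only for illustration, as are the function names.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k of n sampled students are in AS/BE) when P(X = 1) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_least_as_far(k_observed, n, p=0.5):
    """P(sample proportion is at least as far from p as the observed proportion)."""
    observed_distance = abs(k_observed / n - p)
    return sum(
        binomial_pmf(k, n, p)
        for k in range(n + 1)
        # Small tolerance guards against floating-point rounding in the comparison.
        if abs(k / n - p) >= observed_distance - 1e-12
    )

# Hypothetical sample: 20 of 25 students are in AS/BE (sample proportion 0.8).
print(prob_at_least_as_far(20, 25))
```

Under the model, a proportion this far from 0.5 has probability well below 1%, so such a sample would cast doubt on the assumption that P(AS/BE)=0.5.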

Page 18: STA 291 Summer 2010

Effect of Sample Size

The larger the sample size n, the smaller the standard deviation of the sampling distribution for the sample mean
  ◦ σ_x̄ = σ / √n
  ◦ Larger sample size = better precision

As the sample size grows, the sampling distribution of the sample mean approaches a normal distribution
  ◦ Usually, for n of about 30 or more, the sampling distribution is close to normal
  ◦ This is called the “Central Limit Theorem”
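The effect of sample size can be checked by simulation, using the standard result that the standard deviation of the sample mean is σ/√n. The sketch below reuses the AS/BE model (X is 0 or 1 with probability 0.5 each, so σ = 0.5); the function name, the seed, and the choice of 2000 simulated samples are arbitrary assumptions for illustration.

```python
import random
from math import sqrt
from statistics import pstdev

def simulated_sample_means(n, num_samples=2000, p=0.5, seed=0):
    """Sample means of num_samples simulated samples, each of n zero/one values."""
    rng = random.Random(seed)  # fixed seed only so that reruns give the same output
    return [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(num_samples)
    ]

sigma = sqrt(0.5 * (1 - 0.5))  # population standard deviation of X, here 0.5

for n in (4, 25, 100):
    means = simulated_sample_means(n)
    print(f"n={n}: simulated sd = {pstdev(means):.3f}, "
          f"sigma/sqrt(n) = {sigma / sqrt(n):.3f}")
```

The simulated standard deviations should track σ/√n closely, shrinking as n grows. Plotting a histogram of `means` for each n would also show the shape becoming closer to a normal curve.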