Normal Distribution Lecture Notes - NIUmath.niu.edu/~richard/Math101/fall06/stats2_ho.pdf · Normal...


Normal Distribution Lecture Notes

Professor Richard Blecksmith

[email protected]

Dept. of Mathematical Sciences

Northern Illinois University

Math 101 Website: http://math.niu.edu/~richard/Math101

Section 2 Website: http://math.niu.edu/~richard/Math101/fall06

1. Normal Distribution Curve

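[Figure: normal curve with the horizontal axis marked at µ − 2σ, µ − σ, µ, µ + σ, µ + 2σ. From left to right, the six regions cut off by these marks contain 2.5%, 13.5%, 34%, 34%, 13.5%, and 2.5% of the data.]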

In a normal distribution

• Fact 1. Center = mean = median
• Fact 2. The data lies equally distributed on each side of the center.
– 50% of the data lies to the left of µ and
– 50% of the data lies to the right of µ.

2. The 68 – 95 – 99 Rule

• Fact 3.
– About 68% of the data lies within 1 standard deviation of the mean
– About 95% of the data lies within 2 standard deviations of the mean
– About 99.7% of the data lies within 3 standard deviations of the mean (rounded down to 99 in the name of the rule)
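The percentages in Fact 3 are rounded. As a rough check, the sketch below computes them from the standard normal curve using Python's standard-library `statistics.NormalDist`; the helper name `fraction_within` is made up for this illustration and is not part of the lecture notes.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def fraction_within(k: float) -> float:
    """Fraction of normally distributed data within k standard deviations of the mean."""
    return STANDARD_NORMAL.cdf(k) - STANDARD_NORMAL.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.1%}")
```

The results agree with the rule to the nearest percent for 1 and 2 standard deviations, and show that the third figure is closer to 99.7% than to 99%.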


3. Standardizing Data

Suppose we are given normally distributed data, with mean µ and standard deviation σ.

If x is a data point, we wish to know:

• How many standard deviations is x to the right (or left) of the center?
• That is, x = µ + z · σ. Solve for z.

µ + z · σ = x

z · σ = x − µ

z = (x − µ)/σ

4. The z–Rule

Original Data Value      Standardized Data Value

x                        z = (x − µ)/σ

• A negative value of z represents a data point to the left of the center
• A positive value of z represents a data point to the right of the center
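The z–Rule is a one-line computation. The following is a minimal sketch; the function name `z_score` is chosen here for illustration, and the sample numbers are the battery figures from Section 5 (µ = 370, σ = 30).

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mu) / sigma

# Battery figures from Section 5: mean 370 days, standard deviation 30 days.
z_340 = z_score(340, 370, 30)   # negative: 340 is left of the center
z_400 = z_score(400, 370, 30)   # positive: 400 is right of the center
print(z_340, z_400)
```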

5. Example from Text (page 51)

The lifetimes of 20,000 flashlight batteries are normally distributed, with a mean of µ = 370 days and a standard deviation of σ = 30 days.

1. What percentage of the batteries are expected to last more than 340 days?

Solution: z = (x − µ)/σ = (340 − 370)/30 = −1.00

• Look up z = 1 in the chart.
• (The negative sign means that this value occurs one standard deviation to the left of the center µ.)
• The corresponding P value is 34.1%.


6. Draw the picture

[Figure: normal curve with 34.1% of the area shaded between µ − 1.00σ and µ, and 50% shaded to the right of µ.]

The answer is 34.1 + 50 = 84.1%.
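The chart lookup can be cross-checked numerically. The sketch below assumes Python's standard-library `statistics.NormalDist` in place of the printed chart; the variable names are illustrative only.

```python
from statistics import NormalDist

# Battery lifetimes: mean 370 days, standard deviation 30 days.
lifetimes = NormalDist(mu=370, sigma=30)

# Area between 340 days (z = -1.00) and the center, then the half to the right.
between_340_and_mean = lifetimes.cdf(370) - lifetimes.cdf(340)
right_of_mean = 1.0 - lifetimes.cdf(370)
more_than_340 = between_340_and_mean + right_of_mean

print(f"{between_340_and_mean:.1%} + {right_of_mean:.1%} = {more_than_340:.1%}")
```

The two pieces correspond to the shaded regions in the picture: the chart value for z = 1 plus the 50% to the right of the center.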

7. Question 2

2. How many batteries can be expected to last less than 325 days?

Solution: Work with percentages.

• z = (x − µ)/σ = (325 − 370)/30 = −1.50
• Look up z = 1.5 in the chart.
• The corresponding P value is 43.3%.

8. Draw the picture

[Figure: normal curve with 43.3% of the area shaded between µ − 1.50σ and µ.]

• Fifty percent of the data lies to the left of the center.
• Since 43.3% lies between µ − 1.50σ and the center µ,
• the percentage to the left of µ − 1.50σ is 50.0 − 43.3 = 6.7%

The final answer is: 6.7 percent of 20,000 = 0.067 × 20,000 = 1340 batteries.
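As a rough check on the count, the sketch below redoes Question 2 with `statistics.NormalDist` from the Python standard library rather than the chart. Because the chart value 6.7% is rounded, the unrounded computation gives a slightly smaller count (about 1336) than the 1340 obtained above.

```python
from statistics import NormalDist

TOTAL_BATTERIES = 20_000
lifetimes = NormalDist(mu=370, sigma=30)

# Fraction of batteries lasting less than 325 days (z = -1.50).
fraction_below_325 = lifetimes.cdf(325)
expected_count = fraction_below_325 * TOTAL_BATTERIES

# The same count using the chart's rounded percentage, as in the notes.
chart_count = 0.067 * TOTAL_BATTERIES

print(f"unrounded: {fraction_below_325:.2%} -> about {expected_count:.0f} batteries")
print(f"chart:     6.7% -> {chart_count:.0f} batteries")
```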

9. SAT Example

• In 2001 a total of 1,276,320 college-bound students took the SAT exam.


• The mean and standard deviation of the test scores were µ = 506 and σ = 111.
• 68% of the students fall within 1 standard deviation of the mean,
• that is, in the range µ − σ = 506 − 111 = 395 to µ + σ = 506 + 111 = 617.
• 95% of the students fall within 2 standard deviations of the mean, that is, in the range µ − 2σ = 506 − 222 = 284 to µ + 2σ = 506 + 222 = 728.
• Where is the cutoff between the first and second quartile?

10. SAT Example Cont’d

• We want P = 25%.
• The (3-digit) chart shows the z-value corresponding to P = .25 is z = .675.
• This means that 25% of the data occurs before you get within .675 standard deviations of µ (on the left).
• Another 25% lies between µ − .675σ and µ itself.
• So the first quartile occurs at
• Q1 = µ − .675σ = 506 − (.675)111 = 431
• It turns out Q1 was exactly 430.
• The third quartile occurs at
• Q3 = µ + .675σ = 506 + (.675)111 = 581
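The quartile cutoffs can also be found by inverting the normal curve directly. This is a minimal sketch that assumes `statistics.NormalDist.inv_cdf` from the Python standard library stands in for reading the chart backwards; the variable names are illustrative.

```python
from statistics import NormalDist

# 2001 SAT scores: mean 506, standard deviation 111.
sat_scores = NormalDist(mu=506, sigma=111)

# z-value that leaves 25% of the data to its left (the chart gives .675).
z_quartile = NormalDist().inv_cdf(0.25)

q1 = sat_scores.inv_cdf(0.25)   # first quartile: 25% of scores fall below it
q3 = sat_scores.inv_cdf(0.75)   # third quartile: 75% of scores fall below it

print(f"z = {z_quartile:.3f}, Q1 = {q1:.0f}, Q3 = {q3:.0f}")
```

Rounded to whole scores, these agree with the values 431 and 581 obtained from the chart.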

11. Draw the Picture

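[Figure: 2001 SAT Scores. Normal curve centered at µ = 506, with Q1 = 431 marked at µ − 0.675σ and Q3 = 581 marked at µ + 0.675σ. Each of the four regions contains 25% of the data.]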

I.4 Sampling Lecture Notes


12. Statistical Thinking

Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write. – H. G. Wells, author of “War of the Worlds”

Definition: Statistics is the science of collecting, analyzing, and interpreting data in such a way that the conclusions can be objectively evaluated.

13. Three Phases of Statistics

• Collect the data
• Analyze the data
– order the data
– graphical displays
– numerical calculations (such as mean and standard deviation)
• Interpret the results
– use proper statistical techniques to substantiate or refute hypothesized statements
– match data to the appropriate technique
– determine whether the proper assumptions are satisfied

14. Two types of statistics

• Descriptive statistics – summarize and describe a characteristic for some group
• Inferential statistics – estimate, infer, predict, or conclude something about a larger group

15. Examples

Descriptive          Inferential
Batting Average      Polls
Yards Per Carry      Medical Studies
Test Scores          Market Surveys


16. Two types of data

• Quantitative data – values recorded on a natural numerical scale
• Qualitative data – classified into categories

17. Quantitative Data

• Weight of subjects in medical sample
• Height of buildings in Chicago
• Temperatures per day at Antarctica Weather Station

18. Qualitative Data

• Gender of subjects in medical sample
• Political affiliation of respondents in a poll
• Class (fresh, soph, jr, sr) of Math 101 students

19. Vocabulary

• The population is the entire set of objects (people or things) under consideration.
• A sample is a subset of the population that is available for the analysis.
• A bias is a favoring of certain outcomes over others.
• A census collects data from each member of the population.
• A statistic is a statement of numerical information about a sample.
• A parameter is a statement of numerical information about a population.

20. Census versus Sample

Would you use a census or a sample to determine the following:

• Project the winner of an election

• Calculate a baseball player’s batting average


• Predict whether it will rain tomorrow

• Test whether the soup is too salty

• Calculate Shaq’s free throw average

• Use a market study to determine a new flavor of toothpaste

• Report the Dow Jones Average

• Generalize a medical study to other groups

• Calculate the average score on the first test

21. Dealing with bias

Bias in some form occurs in the collecting of most, if not all, sets of data.

The bias may come from

• the portion of the population surveyed
• the phrasing of the questions

22. Examples

• “Dewey defeats Truman” projection of Chicago Tribune based on 1948 telephone poll

• “Are you in favor of Illinois banning cell phones in cars? Dial *91 on your cellular phone to vote.”

• “Do you feel budget cuts are more important than humanitarian programs that would need to be cut to obtain a balanced budget?”

23. Methods for Choosing Samples

• Judgement Sample


– Use the opinion of person(s) deemed qualified to choose members of the sample.
– Example: to investigate study habits of athletes, ask their coaches and teachers.

• Simple Random Selection
– Use random numbers to select the sample.
– Page 315 Random Digit Table:

72985547555515086461

• Stratified Sampling
– Divide the population into relatively homogeneous groups, draw a sample from each group, and take their union.
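Simple random selection and stratified sampling can be sketched in a few lines. The example below is only an illustration under stated assumptions: the population of 200 students and their class labels are made up, the function names are hypothetical, and Python's pseudo-random generator stands in for the printed random digit table.

```python
import random
from collections import defaultdict

def simple_random_sample(population, k, rng):
    """Choose k distinct members, each subset of size k being equally likely."""
    return rng.sample(population, k)

def stratified_sample(population, stratum_of, k_per_stratum, rng):
    """Split the population into groups, sample each group, and take the union."""
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(k_per_stratum, len(members))))
    return sample

# Hypothetical population: 200 students labeled (id, class).
classes = ("fresh", "soph", "jr", "sr")
students = [(i, classes[i % 4]) for i in range(200)]

rng = random.Random(2006)  # fixed seed so the run is repeatable
srs = simple_random_sample(students, 20, rng)
strat = stratified_sample(students, lambda s: s[1], 5, rng)

print(sorted(student_id for student_id, _ in srs))
print(sorted(strat))
```

The stratified sample is guaranteed to contain five students from each class, while the simple random sample may over- or under-represent a class by chance.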

24. Goals of a good sample

• from the correct population
• chosen in an unbiased way
• large enough to reflect total population

25. Normal Distribution of Random Events

Toss a coin 100 times and count the number of heads.

How many heads would you expect?

• about 50
• exactly 50

It does not seem reasonable that the count will be exactly 50.

We would not be surprised if the number of heads turned out to be 48 or 51 or even 55.

We would be surprised to see 80 heads, and would begin to suspect that the coin was not fair.
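These intuitions can be tested with a short simulation and an exact count. The sketch below assumes a fair coin modeled by Python's pseudo-random generator; the function names and the seed are arbitrary choices for illustration, and a simulated run will not reproduce any particular real data set.

```python
import math
import random

def count_heads(n_tosses, rng):
    """Toss a fair coin n_tosses times and return the number of heads."""
    return sum(rng.random() < 0.5 for _ in range(n_tosses))

def prob_heads_between(lo, hi, n_tosses=100):
    """Exact probability that the number of heads is between lo and hi inclusive."""
    favorable = sum(math.comb(n_tosses, k) for k in range(lo, hi + 1))
    return favorable / 2 ** n_tosses

rng = random.Random(101)  # fixed seed so the run is repeatable
counts = [count_heads(100, rng) for _ in range(1000)]
average_heads = sum(counts) / len(counts)

p_near_50 = prob_heads_between(45, 55)      # counts such as 48, 51, or 55
p_80_or_more = prob_heads_between(80, 100)  # counts such as 80

print(f"average over 1000 runs: {average_heads:.3f}")
print(f"P(45 to 55 heads) = {p_near_50:.3f}")
print(f"P(80 or more heads) = {p_80_or_more:.2e}")
```

Counts close to 50 make up the bulk of the outcomes, while 80 or more heads from a fair coin is so improbable that suspecting the coin is the reasonable reaction.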

26. Coin Toss Data

Experiment: A coin is tossed n = 100 times.


The experiment is repeated 1000 times.

Here are the results:

27. Frequency Table: No. of Heads

Heads  Freq    Heads  Freq    Heads  Freq
  1      0      45     54      58     27
 ...     0      46     49      59     19
 34      0      47     54      60     11
 35      2      48     66      61     11
 36      2      49     89      62      5
 37      2      50     70      63      4
 38      2      51     77      64      2
 39      5      52     85      65      0
 40     14      53     62      66      0
 41     16      54     57      67      1
 42     25      55     52      68      0
 43     30      56     40     ...      0
 44     31      57     36     100      0

28. Mean and Standard Deviation

mean = 50.296

stand dev = 5.100
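The mean and standard deviation can be recomputed from the frequency table. In the sketch below the dictionary holds the nonzero rows of the table; treating the 1000 runs as a whole population (dividing by 1000 rather than 999) is an assumption, although both choices round to about 5.10.

```python
import math

# Number of heads -> how many of the 1000 runs produced that count
# (rows with frequency 0 are omitted).
frequency = {
    35: 2, 36: 2, 37: 2, 38: 2, 39: 5, 40: 14, 41: 16, 42: 25, 43: 30, 44: 31,
    45: 54, 46: 49, 47: 54, 48: 66, 49: 89, 50: 70, 51: 77, 52: 85, 53: 62,
    54: 57, 55: 52, 56: 40, 57: 36, 58: 27, 59: 19, 60: 11, 61: 11, 62: 5,
    63: 4, 64: 2, 67: 1,
}

runs = sum(frequency.values())

# Weighted mean: each head count contributes once per run in which it occurred.
mean = sum(heads * freq for heads, freq in frequency.items()) / runs

# Standard deviation: square root of the average squared distance from the mean.
variance = sum(freq * (heads - mean) ** 2 for heads, freq in frequency.items()) / runs
standard_deviation = math.sqrt(variance)

print(f"runs = {runs}, mean = {mean:.3f}, standard deviation = {standard_deviation:.3f}")
```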


29. Coin Toss Histogram

[Figure: histogram of the number of heads, with the horizontal axis running from 30 to 70.]

30. Sampling Distributions

If we could examine all possible samples of size n from a population, then the frequency distribution of the means of these samples would be approximately normal, provided n is reasonably large.

• µ = the mean over the entire population
• σ = the standard deviation over the entire population
• x̄ = the mean of the sampling distribution
• σ_x̄ = the standard deviation of the sampling distribution

31. Two Rules

Rule 1. x̄ = µ

Rule 2. σ_x̄ = σ/√n

We are assuming in Rule 2 that the size of the entire population is much larger than the sample size n.
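Rule 2 can be tried on the coin toss experiment. The sketch below rests on an assumption not stated in the notes: a single toss is scored 1 for heads and 0 for tails, so the population has mean 0.5 and standard deviation 0.5. The function name is illustrative.

```python
import math

def sampling_standard_deviation(sigma: float, n: int) -> float:
    """Rule 2: standard deviation of the sample mean for samples of size n."""
    if n <= 0:
        raise ValueError("the sample size must be positive")
    return sigma / math.sqrt(n)

# Assumed scoring of one fair toss: heads = 1, tails = 0.
mu_toss = 0.5
sigma_toss = 0.5
n = 100

# Rule 1: the sampling distribution is centered at the population mean.
mean_of_sample_means = mu_toss
sd_of_sample_means = sampling_standard_deviation(sigma_toss, n)

# Converting the proportion of heads back to a count of heads in 100 tosses.
expected_heads = n * mean_of_sample_means
sd_of_heads = n * sd_of_sample_means

print(f"proportion of heads: mean {mean_of_sample_means}, standard deviation {sd_of_sample_means}")
print(f"number of heads: mean {expected_heads}, standard deviation {sd_of_heads}")
```

The predicted values of 50 heads with a standard deviation of 5 are close to the observed mean of 50.296 and standard deviation of 5.100 in Section 28.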