Review for Final Exam

• Two statistical inference methods:
Confidence interval: Estimator ± Margin of Error
Hypothesis testing: Hypotheses (H0 vs. Ha), test statistic, P-value, conclusion

Question: Is the parameter μ or p? Is a confidence interval (C.I.) or a test required? Is it a one-sample (1-S) or two-sample (2-S) problem?

• Inference about population proportion p
Confidence interval:
A level C confidence interval for p is given by

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

where z* is the z-critical value corresponding to the confidence level C, n is the sample size, and p̂ is the sample proportion. The quantity $\sqrt{\hat{p}(1-\hat{p})/n}$ is the standard error.

• Inference about population proportion p
The level C confidence interval for a population proportion p will have margin of error approximately equal to a specified value m when the sample size is

$$n = \left(\frac{z^*}{m}\right)^2 p^*(1-p^*)$$

where p* is a guessed value for the sample proportion. The margin of error will be at most m if p* is taken to be 0.5.

• Inference about population proportion p
Hypothesis testing
Hypotheses: H0: p = p0 vs. Ha: p > p0, p < p0, or p ≠ p0
Test statistic:

$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$$

• Inference about population proportion p (continued)
Hypothesis testing:
P-value:
P-value = 1 − Φ(z), for Ha: p > p0
P-value = Φ(z), for Ha: p < p0
P-value = 2(1 − Φ(|z|)), for Ha: p ≠ p0
Here z is the value of the test statistic and Φ(z) is the probability from the normal table corresponding to z.
Conclusion:
Reject H0 if P-value < α
Do not reject H0 if P-value ≥ α

• Inference about population mean μ
Confidence interval:
A level C confidence interval for μ is given by

$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$

where t* is the t-critical value corresponding to degrees of freedom n − 1 and the confidence level C, n is the sample size, s is the sample standard deviation, and x̄ is the sample mean. The quantity $s/\sqrt{n}$ is the standard error.

• Inference about population mean μ
Hypothesis testing:
Hypotheses: H0: μ = μ0 vs. Ha: μ > μ0, μ < μ0, or μ ≠ μ0
Test statistic:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

The test statistic follows a t-distribution with degrees of freedom n − 1.

• Inference about population mean μ
Hypothesis testing:
P-value:
P-value = Tdf(t), for Ha: μ > μ0
P-value = Tdf(−t), for Ha: μ < μ0
P-value = 2Tdf(|t|), for Ha: μ ≠ μ0
Here Tdf(t) is the upper-tail probability P(T ≥ t) for a t-distribution with df = n − 1, read from the t-Critical Values Table at the test statistic t.
Conclusion:
Reject H0 if P-value < α
Do not reject H0 if P-value ≥ α

• Interpretation of hypothesis testing
The P-value is the probability, assuming the null hypothesis is true, that the test statistic will take a value as extreme as or more extreme than (meaning favoring the alternative hypothesis Ha) the value actually observed.
Caution: The P-value is NOT the probability that the null hypothesis is wrong.

• Interpretation of hypothesis testing
Type I error: reject H0 when H0 is true
Type II error: do not reject H0 when H0 is false
The significance level α is our tolerance for the probability of making a Type I error.
The P-value is the smallest significance level at which our sample would lead us to reject the null hypothesis.
If the consequences of rejecting the null hypothesis are very serious, we want to be conservative about rejecting H0. Therefore, we should choose a small α.

• In a survey conducted by a firm, 12 of 60 families in two-story houses were found to own their houses. Let p denote the population proportion of families in two-story houses who own their house.
Find a 95% confidence interval for p.
The firm came up with a confidence interval (0.1406, 0.2594) for p. What confidence level did the firm use?
Assume nothing is known about p. The firm requires a 95% confidence interval with margin of error at most 0.034 for p. What is the required sample size?
Suppose that a previous survey indicates that p is 0.28. The firm requires a 95% confidence interval with margin of error at most 0.034 for p. What is the required sample size?

Review for Final Exam – Practice

• Solution:
Find a 95% confidence interval (C.I.) for p.
In general, a level C C.I. for p is given by

$$\left(\hat{p} - z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)$$

In this case, p̂ = 12/60 = 0.2, n = 60, and z* = 1.96 (for the 95% confidence level).
Thus a 95% C.I. for p is

$$\left(0.2 - 1.96\sqrt{\frac{0.2(1-0.2)}{60}},\ 0.2 + 1.96\sqrt{\frac{0.2(1-0.2)}{60}}\right) = (0.0988,\ 0.3012)$$

• Solution:
The firm came up with a confidence interval (0.1406, 0.2594) for p. What confidence level did the firm use?
A confidence interval for p can also be written as $\hat{p} \pm \mathrm{ME}$, where ME is the margin of error:

$$\mathrm{ME} = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

In this case, ME = 0.2594 − 0.2 = 0.0594.
The standard error is

$$\mathrm{SE} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{0.2(1-0.2)}{60}} = 0.0516$$

Then z* = ME/SE = 0.0594/0.0516 = 1.15, which corresponds to a confidence level of about 75%.
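The two confidence-interval calculations above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `proportion_ci` and `implied_level` are illustrative and do not come from the slides, and the exact normal quantile is used in place of the table value 1.96.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, level):
    """Large-sample confidence interval: p_hat +/- z* * SE."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_star * se, p_hat + z_star * se

def implied_level(p_hat, n, margin):
    """Confidence level whose z* reproduces the given margin of error."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    z_star = margin / se
    return 2 * NormalDist().cdf(z_star) - 1

low, high = proportion_ci(12, 60, 0.95)
level = implied_level(0.2, 60, 0.2594 - 0.2)
print(round(low, 4), round(high, 4), round(level, 2))
```

The interval should agree with (0.0988, 0.3012) to four decimals, and the implied level should come out close to 75%.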

• Solution:
Assume nothing is known about p. The firm requires a 95% C.I. with margin of error at most 0.034 for p. What is the required sample size?
The required sample size for a level C (corresponding to z*) C.I. for p with margin of error approximately equal to m is

$$n = \left(\frac{z^*}{m}\right)^2 p^*(1-p^*)$$

In this case: z* = 1.96, p* = 0.5, m = 0.034.
Then

$$n = \left(\frac{1.96}{0.034}\right)^2 0.5(1-0.5) = 830.8 \approx 831.$$

• Solution:
Suppose that a previous survey indicates that p is 0.28. The firm requires a 95% C.I. with margin of error at most 0.034 for p. What is the required sample size?
The required sample size for a level C (corresponding to z*) C.I. for p with margin of error approximately equal to m is

$$n = \left(\frac{z^*}{m}\right)^2 p^*(1-p^*)$$

In this case: z* = 1.96, p* = 0.28, m = 0.034.
Then

$$n = \left(\frac{1.96}{0.034}\right)^2 0.28(1-0.28) = 669.95 \approx 670.$$
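Both sample-size answers follow from the same formula, rounded up to the next whole number. A minimal sketch, assuming the exact normal quantile rather than the table value 1.96 (the helper name `required_n` is illustrative):

```python
import math
from statistics import NormalDist

def required_n(m, level, p_star=0.5):
    """Smallest n with margin of error at most m, given the guess p_star."""
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    return math.ceil((z_star / m) ** 2 * p_star * (1 - p_star))

n_unknown = required_n(0.034, 0.95)             # nothing known about p
n_previous = required_n(0.034, 0.95, p_star=0.28)  # previous survey: 0.28
print(n_unknown, n_previous)
```

Taking p* = 0.5 gives the largest required sample size, which is why it is the safe choice when nothing is known about p.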

• To target the right age-group of people, a marketing consultant must find which age-group purchases from home-shopping channels on TV more frequently. According to the management of TeleSell24/7, a home-shopping store on TV, about 40% of the online-music-downloaders are in their fifties, but the marketing consultant does not believe that figure. To test this, he selects a random sample of 205 online-music-downloaders and finds that 71 of them are in their fifties.
What are the hypotheses in this case?
What is the value of the test statistic?
What is the P-value of the test?
What is your conclusion at α = 5%?

• Solution:
The sample: $\hat{p} = 71/205 = 0.346$, n = 205.
What are the hypotheses in this case?
H0: p = 0.4 vs. Ha: p ≠ 0.4
What is the value of the test statistic?

$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{0.346 - 0.40}{\sqrt{0.40(1-0.40)/205}} = -1.58$$

• Solution:
What is the P-value of the test?
According to Ha: p ≠ 0.4, P-value = 2(1 − Φ(|−1.58|)) = 0.1141.
What is your conclusion?
Since P-value > α (= 5%), we do not reject the null hypothesis.
If we concluded that 40% of the online-music-downloaders are in their fifties while in fact this proportion is 35%, then
we made a Type I Error.
we made a Type II Error.
we made a correct decision.
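The one-proportion z test above can be sketched in a few lines of standard-library Python. The function name and the `alternative` argument are illustrative assumptions. Because the code keeps p̂ = 71/205 unrounded, it gives z of roughly −1.57 and a P-value of roughly 0.117, slightly different from the slide's −1.58 and 0.1141, which round p̂ to 0.346 first; the conclusion is the same.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, P-value) for H0: p = p0 using the normal approximation."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    phi = NormalDist().cdf
    if alternative == "greater":
        p_value = 1 - phi(z)
    elif alternative == "less":
        p_value = phi(z)
    else:  # two-sided
        p_value = 2 * (1 - phi(abs(z)))
    return z, p_value

z, p_value = one_proportion_z_test(71, 205, 0.40)
reject = p_value < 0.05  # decision at alpha = 5%
print(round(z, 2), round(p_value, 3), reject)
```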

• The safety management of an offshore oil-mining corporation believes that the true average escape time would be at most 340 min. A sample of 28 offshore oil-workers took part in a simulated escape exercise. The sample yielded an average escape time of 347.68 min. and a standard deviation of 26.95 min. Do these data contradict the management's claim?
What are the hypotheses in this case?
What is the value of the test statistic?
What is the P-value of the test?
What is your conclusion at α = 5%?
What is a 98% confidence interval for the average escape time?

• Solution:
The sample: $\bar{x} = 347.68$, s = 26.95, n = 28.
What are the hypotheses in this case?
H0: μ = 340 vs. Ha: μ > 340
What is the value of the test statistic?

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{347.68 - 340}{26.95/\sqrt{28}} = 1.508$$

The test statistic follows a t-distribution with degrees of freedom 28 − 1 = 27.

• Solution:
What is the P-value of the test?
According to Ha: μ > 340, the P-value is between 0.05 and 0.10.
What is your conclusion?
Since P-value > α (= 5%), we do not reject the null hypothesis.
If we concluded that the management's claim is correct while in fact the average escape time is 340 min., then
we made a Type I Error.
we made a Type II Error.
we made a correct decision.

• Solution:
What is a 98% confidence interval for the average escape time?
A level C confidence interval for μ is given by

$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$

We have t* = 2.473 (corresponding to degrees of freedom 27 and the confidence level 98%), n = 28, s = 26.95, and x̄ = 347.68.
So a 98% confidence interval for the average escape time is

$$347.68 \pm 2.473 \times \frac{26.95}{\sqrt{28}} = (335.0848,\ 360.2752).$$
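The t statistic and the 98% interval for the escape-time data can be reproduced as follows. This is a minimal sketch: the Python standard library has no t-distribution, so the critical value t* = 2.473 is passed in from the table, as on the slide, and the P-value is not computed. The function names are illustrative.

```python
import math

def t_statistic(x_bar, mu0, s, n):
    """One-sample t statistic for H0: mu = mu0."""
    return (x_bar - mu0) / (s / math.sqrt(n))

def t_interval(x_bar, s, n, t_star):
    """Confidence interval x_bar +/- t* * s / sqrt(n); t* comes from a table."""
    margin = t_star * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

t = t_statistic(347.68, 340, 26.95, 28)
low, high = t_interval(347.68, 26.95, 28, t_star=2.473)
print(round(t, 3), round(low, 4), round(high, 4))
```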

• In a test of hypothesis, if we insist on very strong evidence against the null hypothesis, we should
choose α to be very small
choose α to be larger than the P-value
choose α to be very large
choose α to be smaller than the P-value

• Based on a random sample of 50 students from among 40,000, a 91 percent confidence interval for the mean height of all 40,000 students was found to be the interval from 66 inches to 69.2 inches. Select the correct statement below:
About 91 percent of all 40,000 students have heights between 66 and 69.2.
About 91 percent of the heights in the sample should be between 66 and 69.2.
The probability that the mean height is between 66 and 69.2 is 91 percent.
About 91 percent of all samples would produce intervals containing μ.

• In a test of hypotheses, data are deemed to be significant at level α = 0.05, but not significant at level α = 0.01. Which of the following is true about the P-value associated with this test?
P-value is greater than 0.05.
P-value is between 0.01 and 0.05.
P-value is less than 0.01.
Nothing can be said.

• Sample / Population
• Statistics / Parameters
• Random sampling design
Simple random sample (SRS)
Stratified random sample
Cluster sample
Multistage sample
• Use random digits to draw simple random samples

• Law of large numbers
• Probability: Sample space / Events
• Rules for probability model:
1. For any event A, 0 ≤ P(A) ≤ 1.
2. For sample space S, P(S) = 1.
3. If two events A and B are disjoint, then P(A or B) = P(A) + P(B).
4. For any event A, P(A does not occur) = 1 − P(A).
5. For two independent events A and B, P(A and B) = P(A) × P(B).
• Venn diagram

• General Addition Rule:
For two events A and B, P(A or B) = P(A) + P(B) − P(A and B).
• General Multiplication Rule:
For two events A and B, P(A and B) = P(B|A) × P(A).
• Conditional probability:

$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$$

• Independence: P(B|A) = P(B).

• Random variable:
A random variable is a variable whose value is a numerical outcome of a random phenomenon.
• Distribution:
The probability distribution (distribution) of a random variable tells us what values this random variable can take and how to assign probabilities to those values.

• Statistics are random variables.
Sample proportion
Sample mean
• Central limit theorem
• Sampling distributions of statistics

• Sampling distribution of the sample proportion p̂ for an SRS of size n:
the mean of p̂ equals the population proportion p;
the standard deviation of p̂ equals $\sqrt{p(1-p)/n}$.
If the sample size is large, then p̂ is approximately Normal, that is,

$$\hat{p} \sim N\left(p,\ \sqrt{\frac{p(1-p)}{n}}\right).$$

• Sampling distribution of the sample mean x̄ for an SRS of size n:
the mean of x̄ equals the population mean μ;
the standard deviation of x̄ equals $\sigma/\sqrt{n}$, where σ is the population standard deviation;
if the sample size is large, then x̄ is approximately Normal, that is,

$$\bar{x} \sim N\left(\mu,\ \frac{\sigma}{\sqrt{n}}\right);$$

if the population has a normal distribution, then the approximation is exact.

• Motor vehicles sold to individuals are classified as either cars or light trucks (including SUVs) and as either domestic or imported. In a recent year, 69% of vehicles sold were light trucks, 78% were domestic, and 55% were domestic light trucks. For a randomly selected vehicle, what is the probability that
the vehicle is a car?
the vehicle is either domestic or a light truck or both?
the vehicle is an imported light truck?
the vehicle is domestic if we know it is a car?

• 56% of all American workers have a workplace retirement plan, 66% have health insurance, and 73% have at least one of the benefits. We select a worker at random.
What is the probability that he has both health insurance and a retirement plan?
What is the probability that he has neither health insurance nor a retirement plan?
What is the probability that he only has a retirement plan?
Knowing that he has a retirement plan, what is the probability that he has health insurance?

• Solution:
Let A be the event that he has a retirement plan.
Let B be the event that he has health insurance.
Then P(A) = 0.56, P(B) = 0.66, and P(A or B) = 0.73.
(Venn diagrams of the events A and B.)

• Solution:
What is the probability that he has both health insurance and a retirement plan?
P(A and B) = ?
General addition rule: P(A or B) = P(A) + P(B) − P(A and B)
Therefore, P(A and B) = P(A) + P(B) − P(A or B) = 0.56 + 0.66 − 0.73 = 0.49

• Solution:
What is the probability that he has neither health insurance nor a retirement plan?
The probability that he has at least one benefit is 0.73.
Therefore, the probability that he has neither health insurance nor a retirement plan is 1 − 0.73 = 0.27.

• Solution:
What is the probability that he only has a retirement plan?
“Only has a retirement plan” means he has a retirement plan but no health insurance (not both).
Therefore, P(he only has a retirement plan) = P(A) − P(A and B) = 0.56 − 0.49 = 0.07

• Solution:
Knowing that he has a retirement plan, what is the probability that he has health insurance?

$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)} = \frac{0.49}{0.56} = 0.875.$$
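The four answers for the benefits problem all derive from the three given probabilities. A minimal sketch of that arithmetic (variable names are illustrative):

```python
# Given: A = has a retirement plan, B = has health insurance.
p_a, p_b, p_a_or_b = 0.56, 0.66, 0.73

p_both = p_a + p_b - p_a_or_b   # general addition rule, rearranged
p_neither = 1 - p_a_or_b        # complement of "at least one benefit"
p_only_a = p_a - p_both         # retirement plan but no health insurance
p_b_given_a = p_both / p_a      # conditional probability P(B|A)

print(round(p_both, 2), round(p_neither, 2),
      round(p_only_a, 2), round(p_b_given_a, 3))
```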

• Spell-checking software catches “nonword errors” that result in a string of letters that is not a word, as when “the” is typed as “teh.” When undergraduates are asked to type a 250-word essay (without spell-checking), the number X of nonword errors has the following distribution:

X            0    1    2    3    >=4
Probability  0.1  0.2  0.3  0.3  ?

• For a randomly selected student, what is the probability that
he made 4 or more errors?
he made at most 1 error?
• For four randomly selected students, what is the probability that
each of them made no more than 2 errors?
at least one of them made an error?

• In a large Statistics lecture, the professor reports that 52% of the students enrolled have never taken a Calculus course, 34% have taken only one semester of Calculus, and the rest have taken two or more semesters of Calculus. The professor randomly assigns students to groups of three to work on a project for the course.
What is the probability that the first group member you meet has studied some Calculus?
What is the probability that the first group member you meet has studied no more than one semester of Calculus?
What is the probability that both of your two group members have studied exactly one semester of Calculus?
What is the probability that at least one of your group members has had more than one semester of Calculus?

• Solution:
Let A denote the event that a student has never taken a Calculus course.
Let B denote the event that a student has taken only one semester of Calculus.
Let C denote the event that a student has taken two or more semesters of Calculus.
(Diagram: the sample space split into the disjoint events A, B, and C.)

• Solution:
First, we can find the probability that a student has taken two or more semesters of Calculus:
P(C) = 1 − P(A) − P(B) = 1 − 0.52 − 0.34 = 0.14.
What is the probability that the first group member you meet has studied some Calculus?
{Some Calculus} = B or C
P(Some Calculus) = P(B or C) = P(B) + P(C) = 0.34 + 0.14 = 0.48.

• Solution:
What is the probability that the first group member you meet has studied no more than one semester of Calculus?
C = {a student has taken two or more semesters of Calculus}
C^c = {a student has studied no more than one semester of Calculus}
P(no more than one semester of Calculus) = P(C^c) = 1 − P(C) = 1 − 0.14 = 0.86.

• Solution:
What is the probability that both of your two group members have studied exactly one semester of Calculus?
The two events
A1 = {first member has studied exactly one semester of Calculus}
A2 = {second member has studied exactly one semester of Calculus}
are independent.
Thus, P(both members have studied exactly one semester of Calculus) = P(A1 and A2) = P(A1) × P(A2) = 0.34 × 0.34 = 0.1156

• Solution:
What is the probability that at least one of your group members has had more than one semester of Calculus?
Let E = {at least one of your group members has had more than one semester of Calculus}
E^c = {neither of your group members has had more than one semester of Calculus}
E1 = {first member has not had more than one semester of Calculus}
E2 = {second member has not had more than one semester of Calculus}
P(E^c) = P(E1 and E2) = P(E1) × P(E2) = (1 − 0.14)².
P(E) = 1 − P(E^c) = 1 − (1 − 0.14)² = 0.2604.
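The Calculus answers combine the complement rule, the addition rule for disjoint events, and the multiplication rule for independent events. A minimal sketch, assuming (as the solution does) that the two group members are independent; variable names are illustrative:

```python
p_none, p_one = 0.52, 0.34           # P(A), P(B) from the problem
p_two_plus = 1 - p_none - p_one      # P(C): two or more semesters

p_some = p_one + p_two_plus          # B and C are disjoint
p_at_most_one = 1 - p_two_plus       # complement of C
p_both_exactly_one = p_one ** 2      # two independent members
p_at_least_one_two_plus = 1 - (1 - p_two_plus) ** 2

print(round(p_two_plus, 2), round(p_some, 2), round(p_at_most_one, 2),
      round(p_both_exactly_one, 4), round(p_at_least_one_two_plus, 4))
```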

• A North American roulette wheel has 38 slots, of which 18 are red, 18 are black, and 2 are green. If you bet on red, the probability of winning is 18/38 = .4737. The probability .4737 represents
(A) nothing important, since every spin of the wheel results in one of three outcomes (red, black, or green).
(B) the proportion of times this event will occur in a very long series of individual bets on red.
(C) the fact that you're more likely to win betting on red than you are to lose.
(D) the fact that if you make 100 wagers on red, you'll have 47 or 48 wins.

• A company has developed a new battery, but the average lifetime is unknown. In order to estimate this average, a sample of 100 batteries is tested and the average lifetime of this sample is found to be 250 hours.
Here the population of interest is:
100 batteries, which were tested / average of 250 hours / all newly developed batteries by the company / lifetime of newly developed batteries
Here the sample is:
100 batteries, which were tested / lifetime of newly developed batteries / average of 250 hours / not in the list

• A company has developed a new battery, but the average lifetime is unknown. In order to estimate this average, a sample of 100 batteries is tested and the average lifetime of this sample is found to be 250 hours.
What is the parameter of interest in this case?
average lifetime of 100 batteries tested / average of all newly developed batteries by the company / 100 batteries sampled and tested / no parameter is involved in this problem
The 250 hours is the value of:
parameter / statistic / sample / variable

• There are 30 problems in Ch12 in 4 pages and 45 problems in Ch13 in another set of 4 pages. In order to make up a homework set based on chapters 12 and 13, the instructor considers the following different schemes. Identify the sampling scheme employed.
Method 1: Label the 75 problems from 1 through 75, draw 10 numbers at random, and choose the corresponding problems.
Simple Random Sampling
Method 2: Pick 4 problems from the 30 in chapter 12 and pick 6 problems from the 45 in chapter 13.
Stratified Random Sampling
Method 3: Pick two pages at random and assign all the problems on those pages.
Cluster Sampling
Method 4: Pick two pages at random and pick 5 problems at random from each of those two pages.
Multistage Sampling

• A student group has 8 members:
1. Barrett 2. Chen 3. DeRoos 4. Maceli
5. Pagliarulo 6. Smithson 7. Williams 8. Zachary
Three of them will be selected to participate in a national conference. If we use the following random digits (starting from the left) to select a simple random sample of size 3, then who will attend the conference?
2023967 8523610 4317063 5689043 5463038 9406022
A. Barrett, Chen, DeRoos
B. Chen, Chen, DeRoos
C. Chen, DeRoos, Smithson
D. Chen, Pagliarulo, Williams
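The random-digit procedure can be traced mechanically: read one digit at a time, skip digits that are not a valid label (0 and 9 here), skip repeats, and stop after three distinct labels. A minimal sketch of that procedure (the function name is illustrative):

```python
def srs_from_digits(digits, labels, size):
    """Pick `size` distinct labels by reading single digits left to right."""
    chosen = []
    for ch in digits:
        if not ch.isdigit():
            continue                      # ignore the spaces between groups
        label = int(ch)
        if label in labels and label not in chosen:
            chosen.append(label)          # valid label, not a repeat
        if len(chosen) == size:
            break
    return [labels[k] for k in chosen]

members = {1: "Barrett", 2: "Chen", 3: "DeRoos", 4: "Maceli",
           5: "Pagliarulo", 6: "Smithson", 7: "Williams", 8: "Zachary"}
digits = "2023967 8523610 4317063 5689043 5463038 9406022"
selected = srs_from_digits(digits, members, 3)
print(selected)
```

Reading 2, 0, 2, 3, 9, 6 keeps 2, 3, and 6, which corresponds to choice C.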

• Data / Data table
• Cases
• Variables (Categorical / Quantitative)
• Display Categorical Variables
Frequency Table / Relative Frequency Table
Bar Chart / Relative Frequency Bar Chart / Pie Chart

• Graphic techniques for displaying quantitative variables:
Histograms
Stem-and-leaf displays
• Shape of distributions:
Unimodal / Bimodal / Multimodal / Uniform
Symmetric / Skewed to the left / Skewed to the right
Outlier

• Numerical descriptions for the distribution of a quantitative variable:
The center of a distribution
Mean
Median
The spread of a distribution
Standard deviation
Interquartile Range (IQR)
Five number summary / Outlier (1.5 IQR rule)
Boxplot

• Shifting and rescaling of quantitative variables
• Standardization of quantitative variables (z-score):

$$z = \frac{x - \bar{x}}{s}$$

• The Normal model
Mean and standard deviation
68-95-99.7 rule
Two types of problems:
Find percentage
Find percentiles

• Scatterplot for two quantitative variables
Direction: positive / negative
Form: linear / curved / no pattern
Strength: strong / moderate / weak
• Correlation coefficient r

• Linear models

$$\hat{y} = b_0 + b_1 x$$

• Least squares regression line

$$b_1 = r\,\frac{s_y}{s_x} \quad \text{and} \quad b_0 = \bar{y} - b_1 \bar{x}$$

• Predictions and residuals

• The mean height of American women in their early twenties is about 64.5 inches and the standard deviation is about 2.5 inches. The mean height of men the same age is about 68.5 inches, with standard deviation about 2.7 inches. If the correlation between the heights of husbands and wives is about r = 0.5, what is the equation of the regression line of the husband’s height on the wife’s height in young couples? Predict the height of the husband of a woman who is 67 inches tall. What percentage of variation in husbands’ height is explained by wives’ height?

• Michigan State University researchers want to investigate how rainfall affects the yield of crops in East Lansing. The researchers found that the average amount of rainfall over the past 20 years is about 230 inches and the standard deviation is about 10 inches. The average yield of crops in East Lansing is about 280 tons with a standard deviation of 20 tons. The correlation between the amount of rainfall and yield of crops is about 0.4.
1) What is the slope of the regression line of yield of crop on amount of rainfall?
2) What is the intercept of the appropriate regression line?
3) What is the predicted value of the yield of crop when the amount of rainfall is 240 inches? If the actual yield of crop of the year with rainfall 240 inches is 280, what is the residual?
4) What percentage of variation in crop yield is explained by the rainfall?

• Solution:
1) What is the slope of the regression line of yield of crop on amount of rainfall?
The slope is given by $b_1 = r\,\frac{s_y}{s_x}$.
Here r = 0.4, s_x = 10, s_y = 20.
Thus the slope is $b_1 = 0.4 \times \frac{20}{10} = 0.8$.
2) What is the intercept of the appropriate regression line?
The intercept is given by $b_0 = \bar{y} - b_1 \bar{x}$.
Here x̄ = 230, ȳ = 280, b1 = 0.8.
Thus the intercept is $b_0 = 280 - 230(0.8) = 96$.

• Solution:
3) What is the predicted value of the yield of crop when the amount of rainfall is 240 inches? If the actual yield of crop of the year with rainfall 240 inches is 280, what is the residual?
The predicted value is $\hat{y} = 96 + 0.8x = 96 + 0.8(240) = 288$.
The residual is $y - \hat{y} = 280 - 288 = -8$.
4) What percentage of variation in crop yield is explained by the rainfall?
The quantity r² tells us the percentage of changes in the response variable which are explained by the changes in the explanatory variable. In this case, r² = 0.4² = 0.16, that is, 16%.
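The regression answers depend only on the five summary statistics. A minimal sketch that rebuilds the line from them and then evaluates the prediction, the residual, and r² (the helper name `regression_line` is illustrative):

```python
def regression_line(x_bar, s_x, y_bar, s_y, r):
    """Least-squares line from summary statistics: returns (b0, b1)."""
    b1 = r * s_y / s_x          # slope
    b0 = y_bar - b1 * x_bar     # intercept: the line passes through the means
    return b0, b1

# Rainfall (x) and crop yield (y) summaries from the problem.
b0, b1 = regression_line(x_bar=230, s_x=10, y_bar=280, s_y=20, r=0.4)
y_hat = b0 + b1 * 240           # predicted yield at 240 inches of rainfall
residual = 280 - y_hat          # observed minus predicted
r_squared = 0.4 ** 2

print(round(b1, 2), round(b0, 2), round(y_hat, 2),
      round(residual, 2), round(r_squared, 2))
```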

• In a population of couples, the average height of the wives was 65.2 inches and that of the husbands was 68.2 inches. You use the regression line to predict the wife's height from the husband's height. Suppose a husband has height 68.2 inches. What would be the predicted height of the wife?
• Solution:
The regression line satisfies

$$\bar{y} = b_0 + b_1 \bar{x}$$

Since the husband’s height (68.2 inches) is the same as the average height of husbands, the predicted height of the wife should also be the average height of wives, that is, 65.2 inches.

• A regression study on obesity shows that doing more physical exercise reduces weight. In this study, time spent in physical exercise explained 16% of the total sample variation in weight among obese people. What is the correlation between "time spent in physical exercise" and "weight"?
• Solution:
The quantity r² tells us the percentage of changes in the response variable which are explained by the changes in the explanatory variable. In this case, r² = 0.16, so |r| = 0.4. Because more exercise goes with lower weight, the association is negative, so the correlation is r = −0.4.

• Suppose that in families with 5 children, X is the number of boys and Y is the number of girls. What is the correlation between X and Y?
• Solution:
Since X + Y = 5, or equivalently Y = 5 − X, X and Y are perfectly linearly related with a negative slope.
Therefore, the correlation between X and Y is −1.

• Which scatterplot has correlation near zero?


• In a photographic process, the developing times of prints are approximately normal with mean 15.4 seconds and standard deviation 0.4 seconds.
1) What proportion of prints will take at least 14.64 sec to develop?
2) What proportion of prints will take 14.64 sec to 16.00 sec to develop?
3) How many seconds are needed at most for the quickest 10%?

• Solution:
1) What proportion of prints will take at least 14.64 sec to develop?
The z-score corresponding to 14.64 is

$$z = \frac{x - \mu}{\sigma} = \frac{14.64 - 15.4}{0.4} = -1.9.$$

The probability corresponding to z-score −1.9 is 0.0287.
Therefore, the proportion of prints that will take at least 14.64 sec to develop is 1 − 0.0287 = 0.9713.

• Solution:
2) What proportion of prints will take 14.64 sec to 16.00 sec to develop?
The z-score corresponding to 16 is

$$z = \frac{x - \mu}{\sigma} = \frac{16 - 15.4}{0.4} = 1.5.$$

The probability corresponding to z-score 1.5 is 0.9332.
Therefore, the proportion of prints that will take 14.64 sec to 16.00 sec to develop is 0.9332 − 0.0287 = 0.9045.

• Solution:
3) How many seconds are needed at most for the quickest 10%?
The quickest 10% corresponds to the smallest 10% (less time).
The z-score corresponding to probability 0.1 is −1.28.
Therefore, the number of seconds needed at most for the quickest 10% is

$$x = \mu + z\sigma = 15.4 + (-1.28)(0.4) = 14.888 \approx 14.89.$$
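All three Normal-model answers can be checked with `statistics.NormalDist` from the Python standard library. This is a minimal sketch; it uses exact normal probabilities rather than table values, so results may differ from the slide in the fourth decimal.

```python
from statistics import NormalDist

develop = NormalDist(mu=15.4, sigma=0.4)   # developing time in seconds

p_at_least = 1 - develop.cdf(14.64)                  # P(X >= 14.64)
p_between = develop.cdf(16.00) - develop.cdf(14.64)  # P(14.64 <= X <= 16)
quickest_10 = develop.inv_cdf(0.10)                  # 10th percentile

print(round(p_at_least, 4), round(p_between, 4), round(quickest_10, 2))
```

The 10th percentile must lie above 14.64, since only about 2.9% of prints develop faster than that, which is consistent with a cutoff near 14.89 seconds.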

(Boxplot for the following questions.)
• Which seems to be the likely value of Q1 (the first quartile)?
22
• Which seems to be the likely value of the median?
48
• What percentage of the observations is lying outside the box?
50%
• What is the approximate value of the range?
110 − 5 = 105

• The following stem-and-leaf display shows the number of patients attended by a house-physician in 15 randomly selected weeks:

Stem | Leaf
-----------------------
   0 | 8 9
   1 | 3 4 6 6 6 8 8
   2 | 0 1 2 4
   3 | 0 6

Here 0|8 means 8, 1|3 means 13, etc. (i.e. the stem represents tens and the leaf represents units).
1) Which observation occurred most often?
16
2) In how many weeks did the physician attend between 15 and 25 patients?
9
3) What are the median, Q1, and Q3?
Median: 18; Q1: 14; Q3: 22
4) What is the IQR?
IQR = Q3 − Q1 = 22 − 14 = 8
5) Are there any outliers?
36 is an outlier
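The stem-and-leaf answers can be verified from the raw data. This sketch assumes the quartile convention the answers imply: Q1 and Q3 are the medians of the lower and upper halves, with the overall median excluded when n is odd. Other software may use a different quartile rule.

```python
from statistics import median, mode

data = [8, 9, 13, 14, 16, 16, 16, 18, 18, 20, 21, 22, 24, 30, 36]

def quartiles(values):
    """Q1, median, Q3 using medians of the lower and upper halves."""
    v = sorted(values)
    half = len(v) // 2
    lower = v[:half]
    upper = v[half + 1:] if len(v) % 2 else v[half:]
    return median(lower), median(v), median(upper)

q1, med, q3 = quartiles(data)
iqr = q3 - q1
# 1.5 IQR rule: flag points beyond the fences.
outliers = [x for x in data if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]
most_common = mode(data)
weeks_15_to_25 = sum(15 <= x <= 25 for x in data)

print(most_common, weeks_15_to_25, med, q1, q3, iqr, outliers)
```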

• What are the mean and standard deviation of the data set {34, 40, 43, 55}?
• Solution:
Mean:

$$\bar{x} = \frac{34 + 40 + 43 + 55}{4} = 43.$$

Standard deviation:

x            34   40   43   55   sum
x − x̄        −9   −3    0   12
(x − x̄)²     81    9    0  144   234

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{234}{4 - 1}} = 8.832.$$
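The same mean and sample standard deviation can be reproduced with the standard library, both directly and by following the table of squared deviations. A minimal sketch:

```python
import math
from statistics import mean, stdev

data = [34, 40, 43, 55]

x_bar = mean(data)
squared_devs = [(x - x_bar) ** 2 for x in data]       # 81, 9, 0, 144
s_by_hand = math.sqrt(sum(squared_devs) / (len(data) - 1))
s = stdev(data)                                       # divides by n - 1

print(x_bar, sum(squared_devs), round(s, 3))
```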

• An airline company keeps track of the delays in its flights. Generally, most flights have small delays, but there are a few flights with very long delays. A consumer group claims that the "average" delay is 740 minutes while the airline company claims that the average is only 260 minutes. Why is there a difference?
• Solution:
The consumer group refers to the mean while the company refers to the median.
The distribution is skewed to the right, so the mean is larger than the median.

• To decide whether to provide electrical power using overhead lines or underground lines, the state administration has to consider the total length of street (measured in miles) in each subdivision of the respective state. Below is the histogram of street lengths of 47 subdivisions in a state.
(Histogram: number of subdivisions versus total street length.)
• What is plotted along the Y-axis (the vertical axis)?
Number of subdivisions
• How many subdivisions have total length of street between 2000 and 4000 miles?
10 + 7 = 17
• What percent of subdivisions have total length less than 1000 miles?
12/47 = 25.5%
• Which seems more likely to be true?
Mean = Median; Mean < Median; Mean > Median
• Which class will the median street length be in?
The median is the 24th observation

• In order to plan transportation and parking needs, the administration of a private high school asked students how they get to school. Some rode a school bus, some rode in with parents or friends, and others used "personal" transportation: bikes, skateboards, or just walking. The following table summarizes the responses from boys and girls.

        Boy   Girl
Bus     35    32
Ride    35    47

1) How many students took part in the survey?
2) What percentage of students surveyed are girls?
3) What percentage of students take the school bus?
4) What percent of the students are girls who ride the bus?
5) What percent of girls ride the bus?
6) What percent of bus riders are girls?

• Solution:
1) How many students took part in the survey?
35 + 35 + 32 + 47 = 149.
2) What percentage of students surveyed are girls?
(32 + 47)/149 = 53.0%.
3) What percentage of students take the school bus?
(35 + 32)/149 = 45.0%.
4) What percent of the students are girls who take the bus?
32/149 = 21.5%.
5) What percent of girls ride the bus?
32/(32 + 47) = 40.5%.
6) What percent of bus riders are girls?
32/(32 + 35) = 47.8%.
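The six percentages differ only in which total sits in the denominator: the grand total, the girls' column total, or the bus row total. A minimal sketch using the four counts shown in the table (the dictionary layout and variable names are illustrative):

```python
counts = {
    ("Bus", "Boy"): 35, ("Bus", "Girl"): 32,
    ("Ride", "Boy"): 35, ("Ride", "Girl"): 47,
}

total = sum(counts.values())
girls = sum(v for (how, sex), v in counts.items() if sex == "Girl")
bus = sum(v for (how, sex), v in counts.items() if how == "Bus")
girl_bus = counts[("Bus", "Girl")]

pct_girls = 100 * girls / total               # marginal: girls
pct_bus = 100 * bus / total                   # marginal: bus riders
pct_girl_bus = 100 * girl_bus / total         # joint: girls who ride the bus
pct_bus_given_girl = 100 * girl_bus / girls   # conditional on being a girl
pct_girl_given_bus = 100 * girl_bus / bus     # conditional on riding the bus

print(total, round(pct_girls, 1), round(pct_bus, 1), round(pct_girl_bus, 1),
      round(pct_bus_given_girl, 1), round(pct_girl_given_bus, 1))
```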