
### Transcript of Statistics homework help, statistics tutoring, statistics tutor by onlinetutorsite

Solutions: Given that µ = 20.8 and SD σ = 4.8

(a) Required Prob: P( X >= 23 )

P( X >= 23 ) = 1 - P( X < 23 )

P( X < 23 ) = 0.6766 (using Excel's NORMDIST() function)

P( X >= 23 ) = 1 - P( X < 23 ) = 1-0.6766 = 0.3234
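The lookup in (a) can be reproduced outside Excel. The following is a minimal sketch that assumes Python's standard-library `statistics.NormalDist` as a stand-in for NORMDIST(); the variable names are illustrative and not part of the original worksheet.

```python
from statistics import NormalDist

# Population parameters taken from the exercise
mu, sigma = 20.8, 4.8

# P(X < 23): counterpart of Excel's NORMDIST(23, mu, sigma, TRUE)
p_below_23 = NormalDist(mu, sigma).cdf(23)

# Complement rule: P(X >= 23) = 1 - P(X < 23)
p_at_least_23 = 1 - p_below_23

print(f"P(X < 23)  = {p_below_23:.4f}")
print(f"P(X >= 23) = {p_at_least_23:.4f}")
```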

(b) By the properties of means and variances of random variables, the mean and variance of the sample mean are the following:

mean of x-bar = µ, variance of x-bar = σ² / n, so SD of x-bar = σ / sqrt(n)

Note: Here x-bar is also referred to as M.

Now, when n = 25, SD of M = σ / sqrt(25) = σ / 5.

(c)

Required Prob: P( sample mean(M) >= 23 )

P( M >= 23 ) = 1 - P( M < 23 )

P( M < 23 ) = 0.9890 (using Excel's NORMDIST() function)

P( M >= 23 ) = 1 - P( M < 23 ) = 1-0.9890 = 0.0110
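The only change from (a) is that the spread is the standard error σ / sqrt(n) rather than σ. A short sketch of that step, again assuming `statistics.NormalDist` in place of the Excel function and using illustrative names:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 20.8, 4.8, 25

# Standard deviation of the sample mean M (the standard error)
sd_mean = sigma / sqrt(n)

# P(M < 23) and its complement, using the sampling distribution of M
p_mean_below_23 = NormalDist(mu, sd_mean).cdf(23)
p_mean_at_least_23 = 1 - p_mean_below_23

print(f"SD of M    = {sd_mean:.2f}")
print(f"P(M < 23)  = {p_mean_below_23:.4f}")
print(f"P(M >= 23) = {p_mean_at_least_23:.4f}")
```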

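(d)

The Normal probability calculation in (c) is more accurate. The distribution of individual scores is only roughly Normal, so the calculation in (a) is a rough approximation, whereas by the central limit theorem the distribution of the mean of 25 scores is closer to Normal.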

#5.48 ACT scores of high school seniors. The scores of high school seniors on the ACT college entrance examination in 2003 had mean µ = 20.8 and SD σ = 4.8. The distribution of scores is only roughly Normal.

(a) What is the approximate probability that a single student randomly chosen from all those taking the test scores 23 or higher?

(b) Now take an SRS of 25 students who took the test. What are the mean and standard deviation of the sample mean score x-bar of these 25 students?

(c) What is the approximate probability that the mean score of these students is 23 or higher?

(d) Which of your two Normal probability calculations in (a) and (c) is more accurate? Why?

Expected value of M = 20.8

Standard deviation of M = σ / sqrt(n) = 4.8 / sqrt(25) = 4.8 / 5 = 0.96


the winning number, which is drawn at random. Here is the distribution of the payoff X:

| Payoff X | \$0 | \$500 |
|---|---|---|
| Probability | 0.999 | 0.001 |

Each day’s drawing is independent of other drawings.

What is the probability that Joe ends the year ahead?

Solutions:

(a)

| x | \$0 | \$500 |
|---|---|---|
| P(x) | 0.999 | 0.001 |

Important note: Generally there should be a loss factor for the \$1 ticket price, so the first payoff would be -1 in place of 0. The distribution is used as given to avoid confusion.

mean = Σxp(x) = 0*0.999 + 500*0.001 = 0.5

SD = sqrt ( V(x) )

V(x) = Σ x*x*p(x) - (mean*mean) = 0*0*0.999 + 500*500*0.001 - 0.5*0.5 = 249.75

SD = sqrt ( 249.75 ) = 15.803
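The arithmetic in (a) can be checked with a few lines of code. This is a minimal sketch of the mean and SD of a discrete random variable; the list names are illustrative, and the payoffs are used exactly as tabulated (0 and 500, ignoring the ticket price).

```python
from math import sqrt

# Payoff distribution as given in the exercise
payoffs = [0, 500]
probs = [0.999, 0.001]

# mean = sum of x * p(x)
mean = sum(x * p for x, p in zip(payoffs, probs))

# V(x) = sum of x^2 * p(x) minus mean^2
variance = sum(x * x * p for x, p in zip(payoffs, probs)) - mean ** 2
sd = sqrt(variance)

print(f"mean = {mean:.2f}, variance = {variance:.2f}, SD = {sd:.3f}")
```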

(b) By the law of large numbers, the average payoff Joe receives from his bets will be close to the population's average payoff (\$0.50 per bet) when the number of bets is large.

(c) The central limit theorem (CLT) states conditions under which the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately Normally distributed. Here it says that Joe's average payoff after 104 bets is approximately Normal.

(d) Required prob: P( x-bar > 1 ), where x-bar is Joe's average payoff.

Here, n = 104

Mean of sample mean = 0.5

SD of sample mean = σ / sqrt(n) = 15.803 / sqrt(104) = 1.5496

Therefore x-bar approximately follows a Normal distribution with mean = 0.5 and SD = 1.5496

P( x-bar > 1 ) = 1 - P( x-bar <= 1 ) = 1 - 0.63 = 0.37
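The Normal approximation in (d) can be sketched as follows. The code assumes the same payoff mean and variance computed in (a) and uses `statistics.NormalDist` for the Normal CDF; it carries unrounded intermediate values, so the final figure shows more digits than the rounded 0.37.

```python
from math import sqrt
from statistics import NormalDist

# Single-bet payoff mean and SD from part (a)
payoff_mean = 0.5
payoff_sd = sqrt(249.75)
n_bets = 104

# Sampling distribution of the average payoff under the CLT approximation
sd_of_average = payoff_sd / sqrt(n_bets)
p_ahead = 1 - NormalDist(payoff_mean, sd_of_average).cdf(1)

print(f"SD of average payoff = {sd_of_average:.4f}")
print(f"P(average payoff > 1) = {p_ahead:.4f}")
```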

#5.52 A Lottery payoff. A \$1 bet in a state lottery’s Pick 3 game pays \$500 if the three-digit number you choose exactly matches

(a) What are the mean and SD of X?

(b) Joe buys a Pick 3 ticket twice a week. What does the law of large numbers say about the average payoff Joe receives from his bets?

(c) What does the central limit theorem say about the distribution of Joe’s average payoff after 104 bets in a year?

(d) Joe comes out ahead for the year if his average payoff is greater than \$1 (the amount he spent each day on a ticket).


Solutions:

(a)

The central limit theorem (CLT) states conditions under which the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately Normally distributed. Based on the CLT, the mean score for a group of 28 will be close to Normal even though individual scores are discrete.

(b)

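Given that the Journal's scores have mean 4.8 and SD 1.5, and the Enquirer's scores have mean 2.4 and SD 1.6.

Here the sample size is n = 28.

By the properties of means and variances of random variables, the mean and variance of the sample mean are the following:

mean of sample mean = µ, variance of sample mean = σ² / n, so SD of sample mean = σ / sqrt(n)

Now, for the Journal when n = 28:

Standard deviation of y-bar = 0.283

Now, for the Enquirer when n = 28:

Expected value of x-bar = 2.4 (Note: here x-bar is also referred to as M)

Standard deviation of x-bar = 0.302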

(c)

The distribution of y-bar - x-bar is approximately Normal, based on the central limit theorem and the properties of the Normal distribution.

From the CLT, we already know that y-bar and x-bar are each approximately Normal. From the properties of the Normal distribution, if x1 and x2 are independent and follow Normal distributions, then x1 + x2 and x1 - x2 also follow Normal distributions.

(d)

For the required probability, we first need to calculate the mean and SD of y-bar - x-bar.

Using (b) and (c), y-bar mean = 4.8 and var = 1.5² / 28 = 0.0804

#5.60 Advertisements and brand image. Many companies place advertisements to improve the image of their brand rather than to promote specific products. In a randomized comparative experiment, business students read ads that cited either the Wall Street Journal or the National Enquirer for important facts about a fictitious company. The students then rated the trustworthiness of the source on a 7-point scale. Suppose that in the population of all students, scores for the Journal have mean 4.8 and SD 1.5, while scores for the Enquirer have mean 2.4 and SD 1.6.

(a) There are 28 students in each group. Although individual scores are discrete, the mean score for a group of 28 will be close to Normal. Why?

(b) What are the means and SDs of the sample mean scores ӯ for the Journal group and x-bar for the Enquirer group?

(c) We can take all 56 scores to be independent because students are not told each other’s scores. What is the distribution of the difference ӯ - x-bar between the mean scores in the two groups?

(d) Find P( ӯ - x-bar ≥ 1 ).

Expected value of ӯ (Journal) = 4.8

Standard deviation of ӯ = σ / sqrt(n) = 1.5 / sqrt(28) = 0.283

Standard deviation of x-bar (Enquirer) = σ / sqrt(n) = 1.6 / sqrt(28) = 0.302

x-bar mean = 2.4 and var = 1.6² / 28 = 0.0914

ӯ - x-bar follows a Normal distribution with mean = 4.8 - 2.4 = 2.4 and var = 0.0804 + 0.0914 = 0.1718

SD = sqrt(0.1718) = 0.4145

P( ӯ - x-bar ≥ 1 ) = 1 - P( ӯ - x-bar < 1 ) = 1 - 0.0004 = 0.9996
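The steps for the difference of two independent sample means can be sketched in code. This is an illustrative check, assuming `statistics.NormalDist` for the Normal CDF; the variable names are not from the original worksheet, and variances add because the two groups are independent.

```python
from math import sqrt
from statistics import NormalDist

n = 28
mean_y, sd_y = 4.8, 1.5  # Journal group
mean_x, sd_x = 2.4, 1.6  # Enquirer group

# Mean and variance of y-bar - x-bar for independent samples
mean_diff = mean_y - mean_x
var_diff = sd_y ** 2 / n + sd_x ** 2 / n
sd_diff = sqrt(var_diff)

# P(y-bar - x-bar >= 1) under the Normal approximation
p_diff_at_least_1 = 1 - NormalDist(mean_diff, sd_diff).cdf(1)

print(f"mean = {mean_diff:.1f}, var = {var_diff:.4f}, SD = {sd_diff:.4f}")
print(f"P(difference >= 1) = {p_diff_at_least_1:.4f}")
```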


Confidence Interval Estimate for the Mean

| Data | Value |
|---|---|
| Population Standard Deviation | 19.6 |
| Sample Mean | 33.4 |
| Sample Size | 31 |
| Confidence Level | 95% |

| Intermediate Calculations | Value |
|---|---|
| Standard Error of the Mean | 3.5203 |
| Z Value | -1.9600 |
| Interval Half Width | 6.8996 |

| Confidence Interval | Value |
|---|---|
| Interval Lower Limit | 26.5004 |
| Interval Upper Limit | 40.2996 |
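The worksheet values above follow from the usual z interval, sample mean ± z · σ / sqrt(n). A minimal sketch, assuming a hypothetical helper named `z_interval` and Python's `statistics.NormalDist.inv_cdf` for the critical value (the spreadsheet reports the lower-tail value -1.96; the code uses its positive counterpart):

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence):
    """Two-sided z confidence interval for a mean with known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # upper critical value
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


lower, upper = z_interval(sample_mean=33.4, sigma=19.6, n=31, confidence=0.95)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```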

#6.18 Mean OC in young women. Refer to the previous exercises. A biomarker for bone formation measured in the same study was osteocalcin (OC), measured in the blood. The units are nanograms per milliliter (ng/ml). For the 31 subjects in the study the mean was 33.4 ng/ml. Assume that the SD is known to be 19.6 ng/ml. Report the 95% confidence interval.

#6.32 Accuracy of a laboratory scale. To assess the accuracy of a laboratory scale, a standard weight known to weigh 10 grams is weighed repeatedly. The scale readings are Normally distributed with unknown mean (this mean is 10 grams if the scale has no bias). The SD of the scale readings is known to be 0.0002 gram.

(a)

Confidence Interval Estimate for the Mean

| Data | Value |
|---|---|
| Population Standard Deviation | 0.0002 |
| Sample Mean | 10.0023 |
| Sample Size | 5 |
| Confidence Level | 98% |

| Intermediate Calculations | Value |
|---|---|
| Standard Error of the Mean | 0.0001 |
| Z Value | -2.3263 |
| Interval Half Width | 0.0002 |

| Confidence Interval | Value |
|---|---|
| Interval Lower Limit | 10.0021 |
| Interval Upper Limit | 10.0025 |

(b)

Sample Size Determination

| Data | Value |
|---|---|
| Population Standard Deviation | 0.0002 |
| Sampling Error | 0.0001 |
| Confidence Level | 98% |

| Intermediate Calculations | Value |
|---|---|
| Z Value | -2.326348 |
| Calculated Sample Size | 21.64758 |

| Result | Value |
|---|---|
| Sample Size Needed | 22 |
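Both worksheet results for this exercise can be reproduced from one critical value. The sketch below assumes `statistics.NormalDist.inv_cdf` for the 98% z value and uses illustrative names; the sample size comes from solving z · σ / sqrt(n) ≤ m for n and rounding up.

```python
from math import ceil, sqrt
from statistics import NormalDist

sigma = 0.0002
confidence = 0.98
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 2.3263

# (a) margin of error and interval for the mean of n = 5 readings
margin_n5 = z * sigma / sqrt(5)
interval_n5 = (10.0023 - margin_n5, 10.0023 + margin_n5)

# (b) smallest n whose margin of error is at most 0.0001
target_margin = 0.0001
n_raw = (z * sigma / target_margin) ** 2
n_needed = ceil(n_raw)

print(f"98% CI: ({interval_n5[0]:.4f}, {interval_n5[1]:.4f})")
print(f"calculated n = {n_raw:.5f}, sample size needed = {n_needed}")
```

Rounding up rather than to the nearest integer matters here: 21 measurements would leave the margin of error slightly above the target.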

(a) The weight is measured five times. The mean result is 10.0023 grams. Give a 98% confidence interval for the mean of repeated measurements of the weight.

(b) How many measurements must be averaged to get a margin of error of ±0.0001 with 98% confidence?

Solutions:

(a)

No, the 95% confidence interval does not include the value 30. One rejects the null hypothesis if the P-value is smaller than or equal to the significance level. Here the P-value 0.04 < 0.05 (significance level), so the test rejects the null hypothesis at the 5% level, which means the 95% confidence interval doesn't include 30.

(b) No, the 90% confidence interval does not include the value 30. Here the P-value 0.04 < 0.10 (significance level), so the test rejects the null hypothesis at the 10% level, which means the 90% confidence interval doesn't include 30.
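The rule used in both parts can be written as a one-line check. This is an illustrative sketch with a hypothetical helper name; it relies on the fact that a level C two-sided z interval contains the null value exactly when the matching two-sided test does not reject at significance level 1 - C.

```python
def interval_contains_null(p_value, confidence):
    """True when the two-sided test fails to reject at alpha = 1 - confidence."""
    alpha = 1 - confidence
    return p_value > alpha  # reject (and exclude the null value) when p <= alpha


p_value = 0.04
in_95 = interval_contains_null(p_value, 0.95)
in_90 = interval_contains_null(p_value, 0.90)
in_99 = interval_contains_null(p_value, 0.99)

print(f"95% CI includes 30: {in_95}")
print(f"90% CI includes 30: {in_90}")
print(f"99% CI includes 30: {in_99}")
```

The 99% case is added only for contrast: with P = 0.04 the test would not reject at the 1% level, so that wider interval would include 30.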

#6.58 A two-sided test and the confidence interval. The P-value for a two-sided test of the null hypothesis H0: µ = 30 is 0.04.

(a) Does the 95% confidence interval include the value 30? Why?

(b) Does the 90% confidence interval include the value 30? Why?

One way to formulate hypotheses about whether or not the trees are randomly distributed in the tract is to examine the average location in the north-south direction. The values range from 0 to 200, so if the trees are uniformly distributed in this direction, any difference from the middle value (100) should be due to chance variation. The sample mean for the 584 trees in the tract is 99.74. A theoretical calculation based on the assumption that the trees are uniformly distributed gives a SD of 58. Carefully state the null and alternative hypotheses in terms of this variable. Note that this requires that you translate the research question about the random distribution of the trees into specific statements about the mean of a probability distribution. Test your hypotheses, report your results, and write a short summary of what you have found.

Solution:

Null hypothesis H0: The pine trees are randomly distributed north to south (µ = 100)

Alternative hypothesis H1: The pine trees are not randomly distributed north to south (µ ≠ 100)

Z Test of Hypothesis for the Mean

| Data | Value |
|---|---|
| Null Hypothesis µ = | 100 |
| Level of Significance | 0.05 |
| Population Standard Deviation | 58 |
| Sample Size | 584 |
| Sample Mean | 99.74 |

| Intermediate Calculations | Value |
|---|---|
| Standard Error of the Mean | 2.400057 |
| Z Test Statistic | -0.108331 |

| Two-Tail Test | Value |
|---|---|
| Lower Critical Value | -1.959964 |
| Upper Critical Value | 1.959964 |
| p-Value | 0.913733 |

Do not reject the null hypothesis

As the p-value (0.9137) is greater than 0.05 (alpha), we do not reject the null hypothesis. That is, the data give no evidence against the pine trees being randomly distributed north to south.
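The two-tail test above can be reproduced in a few lines. A minimal sketch, assuming `statistics.NormalDist` for the standard Normal CDF and illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

mu0 = 100          # null hypothesis value
sigma = 58         # theoretical SD under uniformity
n = 584
sample_mean = 99.74
alpha = 0.05

standard_error = sigma / sqrt(n)
z_stat = (sample_mean - mu0) / standard_error

# Two-sided p-value: probability of a statistic at least this far from 0
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z_stat)))
reject_null = p_two_sided <= alpha

print(f"SE = {standard_error:.6f}, z = {z_stat:.6f}, p = {p_two_sided:.6f}")
print("Reject H0" if reject_null else "Do not reject H0")
```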

#6.66 Are the pine trees randomly distributed north to south? In example 6.1 we looked at the distribution of longleaf pine trees in the Wade Tract.


Sonnets by a certain Elizabethan poet are known to contain an average of µ = 8.9 new words (words not used in the poet’s other works). The SD of the number of new words is σ = 2.5. Now a manuscript with debating whether it is the poet’s work. The new sonnets contain an average of x-bar = 10.2 words not used in the poet’s known works. We expect poems by another author to contain more new words, so to see if we have evidence that the new sonnets are not by our poet we test

Give the z test statistic and its P-value. What do you conclude about the authorship of the new poems?

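Solution:

Z Test of Hypothesis for the Mean

| Data | Value |
|---|---|
| Null Hypothesis µ = | 8.9 |
| Level of Significance | 0.05 |
| Population Standard Deviation | 2.5 |
| Sample Size | 1 |
| Sample Mean | 10.2 |

| Intermediate Calculations | Value |
|---|---|
| Standard Error of the Mean | 2.5 |
| Z Test Statistic | 0.52 |

| Upper-Tail Test | Value |
|---|---|
| Upper Critical Value | 1.6449 |
| p-Value | 0.3015 |

Do not reject the null hypothesis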

As we do not reject the null hypothesis (P-value 0.3015 > 0.05), there is no evidence that µ differs from 8.9, so the data are consistent with the new poems being by the Elizabethan poet.
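The upper-tail test can be sketched the same way. The code below uses the inputs exactly as they appear in the solution's worksheet, including its sample size of 1, which is an assumption carried over from that table rather than something confirmed by the exercise text; a different n would change the standard error, the z statistic, and the P-value.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 8.9
sigma = 2.5
n = 1              # sample size as entered in the solution's worksheet
sample_mean = 10.2
alpha = 0.05

standard_error = sigma / sqrt(n)
z_stat = (sample_mean - mu0) / standard_error

# Upper-tail test of Ha: mu > 8.9
upper_critical = NormalDist().inv_cdf(1 - alpha)
p_upper = 1 - NormalDist().cdf(z_stat)
reject_null = z_stat > upper_critical

print(f"z = {z_stat:.2f}, critical value = {upper_critical:.4f}, p = {p_upper:.4f}")
print("Reject H0" if reject_null else "Do not reject H0")
```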

#6.68 Who is the author? Statistics can help decide the authorship of literary works.

H0:µ = 8.9

Ha: µ > 8.9
