Probability: Hypothesis


  • date post

    20-Oct-2015

description

Hypothesis testing

Transcript of Probability: Hypothesis

  • Interval Estimation; Estimation of Proportion; Test of Hypotheses; Null Hypotheses and Tests of Hypotheses; Hypotheses Concerning One Mean; Hypotheses Concerning One Proportion

  • Interval Estimation

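    Large sample confidence interval for $\mu$ ($\sigma$ known)

    \[ \bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]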

  • Interval Estimation (contd)

    Example: To estimate the average time it takes to assemble a certain computer component, the industrial engineer at an electronics firm timed 40 technicians in the performance of this task, getting a mean of 12.73 minutes and a standard deviation of 2.06 minutes.

    (a) What can we say with 99% confidence about the maximum error if $\bar{x} = 12.73$ is used as a point estimate of the actual average time required to do this job?

    (b) Use the given data to construct a 98% confidence interval for the true average time it takes to assemble the computer component.

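  • Interval Estimation (contd)

    Solution: Given $\bar{x} = 12.73$, $s = 2.06$ and $n = 40$.

    (a) Here $1 - \alpha = 0.99$, so $\alpha = 0.01$. Since the sample is large ($n = 40$), the maximum error of estimate with 99% confidence is

    \[ E = z_{\alpha/2}\,\frac{s}{\sqrt{n}} = z_{0.005}\,\frac{s}{\sqrt{n}} = 2.575 \cdot \frac{2.06}{\sqrt{40}} = 0.839 \]

    (b) The 98% confidence interval (i.e. $\alpha = 0.02$, $z_{0.01} = 2.33$) is given by

    \[ 12.73 - 2.33 \cdot \frac{2.06}{\sqrt{40}} < \mu < 12.73 + 2.33 \cdot \frac{2.06}{\sqrt{40}}, \quad \text{i.e.} \quad 11.971 < \mu < 13.489 \]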
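The two calculations above can be checked numerically. The following is a minimal sketch, assuming Python's standard `statistics.NormalDist` as a stand-in for the normal table; the function names `max_error` and `mean_confidence_interval` are illustrative and do not come from the slides. Because the code uses unrounded values of $z_{\alpha/2}$, the interval limits may differ from the table-based ones in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def max_error(s, n, confidence):
    """Maximum error E = z_{alpha/2} * s / sqrt(n) for a large sample."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 point
    return z * s / sqrt(n)


def mean_confidence_interval(xbar, s, n, confidence):
    """Large-sample interval xbar - E < mu < xbar + E."""
    e = max_error(s, n, confidence)
    return xbar - e, xbar + e


# Data from the assembly-time example: n = 40, xbar = 12.73, s = 2.06.
E_99 = max_error(2.06, 40, 0.99)
low, high = mean_confidence_interval(12.73, 2.06, 40, 0.98)
print(f"E (99%) = {E_99:.3f}")
print(f"98% interval: {low:.3f} < mu < {high:.3f}")
```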

  • Interval Estimation (contd)

    Example: With reference to the previous example, with what confidence can we assert that the sample mean does not differ from the true mean by more than 30 seconds?

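    Solution: Given $E = 30$ seconds $= 0.5$ minute, $s = 2.06$ and $n = 40$, we have to find the value of $1 - \alpha$.

    \[ E = z_{\alpha/2}\,\frac{s}{\sqrt{n}} \;\Longrightarrow\; z_{\alpha/2} = \frac{E\sqrt{n}}{s} = \frac{(0.5)\sqrt{40}}{2.06} = 1.54 \]

    \[ F(z_{\alpha/2}) = F(1.54) = 0.9382 \quad \text{(from Table 3)} \]

    \[ 1 - \frac{\alpha}{2} = 0.9382 \;\Longrightarrow\; \alpha = 0.1236 \;\Longrightarrow\; 1 - \alpha = 1 - 0.1236 = 0.8764 \]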

    Thus, we have 87.64% confidence that the sample mean does not differ from the true mean by more than 30 seconds.
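The same question (which confidence level goes with a given maximum error?) can be answered in code by inverting $E = z_{\alpha/2}\, s/\sqrt{n}$. This is a sketch under the assumption that `statistics.NormalDist().cdf` plays the role of the table function $F$; the name `confidence_for_error` is hypothetical. The code does not round $z_{\alpha/2}$ to 1.54 first, so its result is slightly below the table-based 0.8764.

```python
from math import sqrt
from statistics import NormalDist


def confidence_for_error(E, s, n):
    """Confidence 1 - alpha such that z_{alpha/2} * s / sqrt(n) equals E."""
    z = E * sqrt(n) / s               # solve E = z * s / sqrt(n) for z
    upper = NormalDist().cdf(z)       # F(z) = 1 - alpha/2
    return 2.0 * upper - 1.0          # 1 - alpha


# E = 0.5 minute, s = 2.06, n = 40, as in the example.
conf = confidence_for_error(0.5, 2.06, 40)
print(f"confidence = {conf:.4f}")
```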

  • Estimation of Proportion (contd)

    When n is large, we can construct approximate confidence intervals for the binomial parameter p by using the normal approximation to the binomial distribution. Accordingly, we can assert with probability $1 - \alpha$ that the inequality

    \[ -z_{\alpha/2} < \frac{X - np}{\sqrt{np(1-p)}} < z_{\alpha/2} \]

    will be satisfied. Solving this quadratic inequality for p, we can obtain a corresponding set of approximate confidence limits for p in terms of the observed value of x; but since the necessary calculations are complex, we shall make the further approximation of substituting $x/n$ for $p$ in $\sqrt{np(1-p)}$.

  • Estimation of Proportion (contd)

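    Large sample confidence interval for p

    \[ \frac{x}{n} - z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}} < p < \frac{x}{n} + z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}} \]

    where the degree of confidence is $(1 - \alpha)100\%$.

    Maximum error of estimate

    \[ E = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \]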
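As an illustration of the large-sample interval for p, here is a short sketch. The data (36 successes in 100 trials) are hypothetical and not taken from the slides, and the function name `proportion_confidence_interval` is likewise an assumption made for this example.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(x, n, confidence):
    """Interval x/n -/+ z_{alpha/2} * sqrt((x/n)(1 - x/n)/n)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical data (not from the slides): 36 successes in 100 trials.
low, high = proportion_confidence_interval(36, 100, 0.95)
print(f"{low:.3f} < p < {high:.3f}")
```

The approximation relies on n being large, as stated above; for small samples the normal approximation to the binomial may be poor.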

  • Estimation of Proportion (contd)

    Sample size determination

    \[ n = p(1-p)\left[\frac{z_{\alpha/2}}{E}\right]^2 \]

    But this formula cannot be used as it stands unless we have some information about the possible size of p. If not much information is available, we can make use of the fact that $p(1 - p)$ is at most 1/4, corresponding to $p = 1/2$, as can be shown by elementary calculus. If a range for p is known, the value closest to 1/2 should be used.

    Sample size (p unknown)

    \[ n = \frac{1}{4}\left[\frac{z_{\alpha/2}}{E}\right]^2 \]
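Both sample-size formulas can be wrapped in one function whose default $p = 1/2$ gives the conservative "p unknown" case. This is a minimal sketch; the target error of 0.04 and the guess $p = 0.2$ are hypothetical values chosen only for illustration, and rounding up to the next integer is an assumption about how a fractional n should be handled.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(E, confidence, p=0.5):
    """n = p(1 - p) * (z_{alpha/2} / E)^2, rounded up.

    With the default p = 0.5 this is the conservative bound
    n = (1/4) * (z_{alpha/2} / E)^2 used when p is unknown.
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return ceil(p * (1.0 - p) * (z / E) ** 2)


# Hypothetical targets (not from the slides): E = 0.04 at 95% confidence.
n_unknown_p = sample_size_for_proportion(0.04, 0.95)
n_known_p = sample_size_for_proportion(0.04, 0.95, p=0.2)
print(n_unknown_p, n_known_p)
```

Any prior information that moves p away from 1/2 reduces the required sample size, which is why the value closest to 1/2 in a known range is the safe choice.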

  • Test of Hypotheses

    There are many problems in which, rather than estimating the value of a parameter, we are interested in knowing whether a statement concerning a parameter is true or false; that is, we test a hypothesis about a parameter.

    To illustrate the general concepts involved in deciding whether a statement about the population is true or false, suppose that a consumer protection agency wants to test a paint manufacturer's claim that the average drying time of his new fast-drying paint is 20 minutes. It instructs a member of its research staff to take 36 boards and paint them with paint from 36 different 1-gallon cans of the paint, with the intention of rejecting the claim if the mean drying time exceeds 20.75 minutes; otherwise, it will accept the claim. In either case it will take whatever action is called for in its plans.

  • Test of Hypotheses (contd)

    This provides a clear-cut criterion for accepting or rejecting the claim, but unfortunately it is not infallible. Since the decision is based on a sample, there is the possibility that the sample mean may exceed 20.75 minutes even though the true mean drying time is $\mu = 20$ minutes, and there is also the possibility that the sample mean may be 20.75 minutes or less even though the true mean drying time is, say, $\mu = 21$ minutes.

    Thus, before adopting the criterion, it would seem wise to investigate the chances that the criterion may lead to a wrong decision.

    Assuming that it is known from past experience that $\sigma = 2.4$ minutes, let us first investigate the probability that the sample mean may exceed 20.75 minutes even though the true mean drying time is $\mu = 20$ minutes, assuming the population is large enough to be treated as infinite.

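  • Test of Hypotheses (contd)

    \[ P(\bar{X} > 20.75) = P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} > \frac{20.75 - 20}{2.4/\sqrt{36}}\right) = P(Z > 1.875) = 1 - F(1.875) = 1 - 0.9696 = 0.0304 \]

    Hence the probability of erroneously rejecting the hypothesis $\mu = 20$ minutes is approximately 0.0304.

    Figure: Probability of falsely rejecting claim (sampling distribution of $\bar{x}$ centered at $\mu = 20$ minutes; accept the claim that $\mu = 20$ to the left of 20.75, reject it to the right; tail area 0.0304)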
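The probability of falsely rejecting the claim can be reproduced by standardizing the cutoff, exactly as in the derivation above. This is a sketch that assumes `statistics.NormalDist().cdf` in place of the table function $F$; the function name `type_one_error_upper` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def type_one_error_upper(cutoff, mu0, sigma, n):
    """P(sample mean > cutoff) when the true mean is mu0."""
    z = (cutoff - mu0) / (sigma / sqrt(n))  # standardized cutoff
    return 1.0 - NormalDist().cdf(z)        # upper-tail area


# Paint example: reject if the mean of n = 36 drying times exceeds 20.75.
alpha = type_one_error_upper(cutoff=20.75, mu0=20.0, sigma=2.4, n=36)
print(f"alpha = {alpha:.4f}")
```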

  • Test of Hypotheses (contd)

    Consider the other possibility, where the procedure fails to detect that $\mu > 20$ minutes. Suppose that the true mean drying time is $\mu = 21$ minutes; we then calculate the probability of getting a sample mean less than or equal to 20.75 minutes and hence erroneously accepting the claim that $\mu = 20$ minutes.

    \[ P(\bar{X} \le 20.75) = P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le \frac{20.75 - 21}{2.4/\sqrt{36}}\right) = P(Z \le -0.625) = F(-0.625) = 0.2660 \]

    Figure: Probability of failing to reject claim (sampling distribution of $\bar{x}$ centered at $\mu = 21$ minutes; accept the claim that $\mu = 20$ to the left of 20.75, reject it to the right; area to the left of 20.75 is 0.2660)
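The probability of erroneously accepting the claim can be computed in two equivalent ways: by standardizing the cutoff, as in the derivation above, or by evaluating the sampling distribution of the mean directly. The sketch below shows both, assuming `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_true, sigma, n, cutoff = 21.0, 2.4, 36, 20.75
std_error = sigma / sqrt(n)  # standard deviation of the sample mean

# Route 1: standardize, then use the standard normal distribution.
z = (cutoff - mu_true) / std_error
beta_standardized = NormalDist().cdf(z)

# Route 2: use the sampling distribution N(mu_true, sigma/sqrt(n)) directly.
beta_direct = NormalDist(mu=mu_true, sigma=std_error).cdf(cutoff)

print(f"z = {z:.3f}")
print(f"beta = {beta_standardized:.4f} (standardized), {beta_direct:.4f} (direct)")
```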

  • Test of Hypotheses (contd)

    The situation described in this example is typical of testing a statistical hypothesis, and it may be summarized in the following table, where we refer to the hypothesis being tested as hypothesis H:

                      Accept H            Reject H
    H is true         Correct decision    Type I error
    H is false        Type II error       Correct decision

    If hypothesis H is true and not rejected, or false and rejected, the decision is in either case correct. If hypothesis H is true but rejected, it is rejected in error; and if hypothesis H is false but not rejected, this is also an error.

  • Test of Hypotheses (contd)

    The first of these errors is called a Type I error. The probability of committing it, when the hypothesis is true, is designated by the Greek letter $\alpha$ (alpha). The second error is called a Type II error, and the probability of committing it is designated by the Greek letter $\beta$ (beta). Thus in the above example we showed that, for the given test criterion, $\alpha = 0.03$ and $\beta = 0.27$ when $\mu = 21$ minutes.

    In calculating the probability of a Type II error in our example, we arbitrarily chose the alternative value $\mu = 21$ minutes. However, in this problem, as in most others, there are infinitely many other alternatives, and for each one of them there is a positive probability of erroneously accepting the hypothesis H.
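To see how $\beta$ depends on the chosen alternative, one can evaluate it over several values of $\mu$. The sketch below does this for the paint example; the list of alternatives other than $\mu = 21$ is hypothetical and chosen only for illustration, as is the function name `type_two_error`.

```python
from math import sqrt
from statistics import NormalDist


def type_two_error(mu_true, cutoff=20.75, sigma=2.4, n=36):
    """P(sample mean <= cutoff) when the true mean is mu_true."""
    sampling = NormalDist(mu=mu_true, sigma=sigma / sqrt(n))
    return sampling.cdf(cutoff)


# Alternatives other than 21.0 are illustrative, not from the slides.
alternatives = [20.5, 21.0, 21.5, 22.0]
betas = [type_two_error(mu) for mu in alternatives]
for mu, beta in zip(alternatives, betas):
    print(f"mu = {mu:4.1f}  beta = {beta:.4f}")
```

Each alternative gives a positive $\beta$, and $\beta$ shrinks as the true mean moves further above 20.75.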

  • Null Hypotheses and Tests of Hypotheses

    Null hypothesis: A hypothesis in which we hypothesize the opposite of what we hope to prove.

    For example, if we want to show that one method of teaching computer programming is more effective than another, we hypothesize that the two methods are equally effective. Since we hypothesize that there is no difference in the effectiveness of the two teaching methods, we call a hypothesis like this a null hypothesis and denote it by $H_0$.

  • Null Hypotheses and Tests of Hypotheses (contd)

    Guideline for selecting the null hypothesis

    When the goal of an experiment is to establish an assertion, the negation of the assertion should be taken as the null hypothesis. The assertion becomes the alternative hypothesis.

    Notation for the hypotheses

    $H_1$: The alternative hypothesis is the claim we wish to establish.

    $H_0$: The null hypothesis is the negation of the claim.

  • Null Hypotheses and Tests of Hypotheses (contd)

    Example: A process for making steel pipe is under control if the diameter of the pipe has a mean of 3.0000 inches with a standard deviation of 0.0250 inch. To check whether the process is under control, a random sample of size n = 30 is taken each day, and the null hypothesis $\mu = 3.0000$ is rejected if $\bar{X}$ is less than 2.9960 or greater than 3.0040. Find

    (a) the probability of a Type I error;

    (b) the probability of a Type II error when $\mu = 3.0050$ inches.

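  • Null Hypotheses and Tests of Hypotheses (contd)

    Solution: $n = 30$, $\mu = 3.0000$, $\sigma = 0.0250$

    (a)

    \[ \begin{aligned} \alpha = P(\text{Type I error}) &= P(\bar{X} < 2.9960 \text{ or } \bar{X} > 3.0040) \\ &= P(\bar{X} < 2.9960) + P(\bar{X} > 3.0040) \\ &= P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{2.9960 - 3.0000}{0.0250/\sqrt{30}}\right) + P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} > \frac{3.0040 - 3.0000}{0.0250/\sqrt{30}}\right) \\ &= P(Z < -0.876) + P(Z > 0.876) \\ &= F(-0.876) + 1 - F(0.876) = 2F(-0.876) = 0.381 \end{aligned} \]

    since $(\bar{X} - \mu)/(\sigma/\sqrt{n})$ will be a random variable with approximately the standard normal distribution.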
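For a two-sided rejection rule like this one, the Type I error is the sum of the two tail areas. The following sketch recomputes part (a), assuming `statistics.NormalDist` in place of the normal table; the function name `two_sided_type_one_error` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_type_one_error(lower, upper, mu0, sigma, n):
    """P(sample mean < lower or sample mean > upper) when the mean is mu0."""
    std_error = sigma / sqrt(n)
    z_low = (lower - mu0) / std_error
    z_high = (upper - mu0) / std_error
    std_normal = NormalDist()
    return std_normal.cdf(z_low) + (1.0 - std_normal.cdf(z_high))


# Steel pipe example: reject if the mean of n = 30 diameters falls
# outside the interval from 2.9960 to 3.0040 inches.
alpha = two_sided_type_one_error(2.9960, 3.0040, 3.0000, 0.0250, 30)
print(f"alpha = {alpha:.3f}")
```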

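  • Null Hypotheses and Tests of Hypotheses (contd)

    (b)

    \[ \begin{aligned} \beta = P(\text{Type II error}) &= P(2.9960 \le \bar{X} \le 3.0040 \text{ when } \mu = 3.0050) \\ &= P\left(\frac{2.9960 - 3.0050}{0.0250/\sqrt{30}} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le \frac{3.0040 - 3.0050}{0.0250/\sqrt{30}}\right) \\ &= P(-1.97 \le Z \le -0.219) \end{aligned} \]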
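The remaining probability in part (b) is the area under the sampling distribution of $\bar{X}$, centered at the alternative $\mu = 3.0050$, that falls inside the acceptance region. A minimal sketch, again assuming `statistics.NormalDist` and an illustrative function name `two_sided_type_two_error`:

```python
from math import sqrt
from statistics import NormalDist


def two_sided_type_two_error(lower, upper, mu_true, sigma, n):
    """P(lower <= sample mean <= upper) when the true mean is mu_true."""
    sampling = NormalDist(mu=mu_true, sigma=sigma / sqrt(n))
    return sampling.cdf(upper) - sampling.cdf(lower)


# Steel pipe example with the alternative mu = 3.0050 inches.
beta = two_sided_type_two_error(2.9960, 3.0040, 3.0050, 0.0250, 30)
print(f"beta = {beta:.3f}")
```

Under these assumptions the result comes out at roughly 0.39, which corresponds to $F(-0.219) - F(-1.97)$ in table notation.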

  • Null Hypotheses and Tests of Hypotheses (contd)

    Example: Suppose that for a given population with $\sigma = 8.4$ in² we want to test the null hypothesis $\mu = 80.0$ in² against the alternative hypothesis $\mu < 80.0$ in² on the basis of a random sample of size n = 100.

    (a) If the null hypothesis is rejected for $\bar{x} < 78.0$ in² and otherwise it is accepted, what is the probability of a Type I error?

    (b) What is the answer to part (a) if the null hypothesis is $\mu \ge 80.0$ in² instead