Chi Squared Test

HYPOTHESIS TESTING


Transcript of Chi Squared Test

Page 1: Chi Squared Test

HYPOTHESIS TESTING

Page 2: Chi Squared Test

Chi squared test

Page 3: Chi Squared Test

Chi squared test

• The chi squared test is a non-parametric test: it is used when no assumption is made about the parameters of the population distribution under study

• The symbol χ (the Greek letter chi) is used

• The sampling distribution of the χ ² statistic is called the χ ² distribution

• The χ ² statistic is compared with its critical value

Page 4: Chi Squared Test

ASSUMPTIONS

• 1. The experiment consists of n identical but independent trials. The outcome of each trial falls into one of k categories. The observed numbers in the categories are written as O1, O2, …, Ok

• 2. If there are only 2 cells the expected frequency in each cell should be 5 or more

• 3. For more than two cells, if more than 20% of the cells have expected frequencies less than 5 then the χ ² test should not be applied

Page 5: Chi Squared Test

ASSUMPTIONS

• 4. Samples must be drawn randomly from the population of interest

• 5. The sample should contain at least 50 observations

• 6. The data should be expressed in original units rather than in percentage or ratio
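The rules of thumb in assumptions 2, 3 and 5 can be checked mechanically before running the test. The following is a minimal sketch, not part of the original slides: the function name `check_chi_squared_assumptions` and the sample numbers are made up for illustration.

```python
def check_chi_squared_assumptions(expected, n_observations):
    """Return warnings for the rules of thumb on expected counts and sample size."""
    warnings = []
    small = [e for e in expected if e < 5]

    # Assumption 2: with only two cells, each expected frequency should be 5 or more
    if len(expected) == 2 and small:
        warnings.append("two cells: every expected frequency should be 5 or more")

    # Assumption 3: with more than two cells, at most 20% may fall below 5
    if len(expected) > 2 and len(small) / len(expected) > 0.20:
        warnings.append("more than 20% of cells have expected frequency below 5")

    # Assumption 5: at least 50 observations in the sample
    if n_observations < 50:
        warnings.append("sample has fewer than 50 observations")

    return warnings


# Hypothetical expected counts, chosen only to trigger two of the warnings
for message in check_chi_squared_assumptions([12.0, 3.5, 4.0, 20.5], n_observations=40):
    print(message)
```

An empty list means none of the three rules is violated; the remaining assumptions (random sampling, original units) concern how the data were collected and cannot be checked from the counts alone.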

Page 6: Chi Squared Test

Chi squared test statistic

• χ ² = Σ (O-E)² / E

• O = observed frequency in a particular category

• E = expected frequency in a particular category

• Decision rule

– The calculated value of the χ ² test statistic is compared with the critical value at a particular level of significance and degrees of freedom

– If χ ²cal > χ ²critical then the null hypothesis is rejected in favor of the alternate hypothesis

– The degrees of freedom for the χ ² test statistic depend on the test and certain other factors
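The statistic and the decision rule translate directly into a few lines of code. This is a small sketch under stated assumptions: the function names are illustrative, the observed and expected counts are made up, and 7.815 is the standard table value for α = 0.05 with 3 degrees of freedom, assumed here for a four-category example.

```python
def chi_squared_statistic(observed, expected):
    """Compute the sum over all categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def reject_null(chi2_cal, chi2_critical):
    """Decision rule: reject the null hypothesis when chi2_cal > chi2_critical."""
    return chi2_cal > chi2_critical


# Hypothetical counts for four categories
observed = [8, 12, 10, 10]
expected = [10, 10, 10, 10]

chi2_cal = chi_squared_statistic(observed, expected)
print(chi2_cal)                      # 0.8
print(reject_null(chi2_cal, 7.815))  # False
```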

Page 7: Chi Squared Test

APPLICATIONS OF χ ² TEST

A few important applications of χ ² are

1. Test of independence
2. Test of goodness of fit
3. Yates' correction for continuity
4. Test for population variance
5. Test for homogeneity

Page 8: Chi Squared Test

Contingency Table Analysis : χ ² test of Independence

• The χ ² test of independence is used to analyze the frequency of two qualitative variables or attributes with multiple categories to determine whether the two variables are independent

• The χ ² test of independence can be used to analyze any level of measurement, but it is particularly useful in analyzing nominal data

Page 9: Chi Squared Test

Contingency Table Analysis : χ ² test of Independence

• For example,

– Whether voters classified by gender are independent of political affiliation

– Whether university students classified by gender are independent of courses of study

– Whether wage earners classified by education level are independent of income

Page 10: Chi Squared Test

Contingency Table Analysis : χ ² test of Independence

• Contingency Table – When observations (frequencies) are classified according to two qualitative variables or attributes and arranged in a table the display is called a contingency table

Page 11: Chi Squared Test

Contingency Table Analysis : χ ² test of Independence

• The value Oij is the observed frequency for the cell in row i and column j

• The ‘Total’ row and column contain the sums of the frequencies in the respective columns and rows

• ‘N’ is total of frequencies

             Variable A
Variable B   A1    A2    …    Ac    Total
B1           O11   O12   …    O1c   R1
B2           O21   O22   …    O2c   R2
…            …     …     …    …     …
Br           Or1   Or2   …    Orc   Rr
Total        C1    C2    …    Cc    N

Page 12: Chi Squared Test

Contingency Table Analysis : χ ² test of Independence

Eij = (row i total / sample size) x (column j total / sample size) x grand total

= (Ri / N) x (Cj / N) x N

= (Ri x Cj) / N
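The expected frequency of every cell follows from the row totals, the column totals and N alone. A minimal sketch of that calculation, using a made-up 2x2 table and an illustrative function name:

```python
def expected_frequencies(observed):
    """Return the table of Eij = (Ri x Cj) / N for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical observed frequencies: row totals 30 and 70, column totals 40 and 60, N = 100
table = [[10, 20],
         [30, 40]]
print(expected_frequencies(table))  # [[12.0, 18.0], [28.0, 42.0]]
```

Note that the expected table has the same row and column totals as the observed table.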

Page 13: Chi Squared Test

Contingency Table Analysis : χ ² test of Independence

• The analysis of a two-way contingency table helps to answer the question of whether the two variables are unrelated, that is, independent of each other

• The null hypothesis for a χ ² test of Independence is that two variables are independent

Page 14: Chi Squared Test

Procedure

• Step 1 – State the null hypothesis and the alternate hypothesis

Ho : The variables are independent. No relationship exists

H1 : A relationship exists

• Step 2

– Select a random sample and record the observed frequencies (O values) in each cell of the contingency table

– Calculate the row, column and grand totals

Page 15: Chi Squared Test

Procedure

• Step 3 – Calculate the expected frequencies (E values) for each cell

E = (row total x column total) / grand total

• Step 4 – Compute the value of the test statistic

χ ² = Σ (O-E)² / E

• Step 5 – Calculate the degrees of freedom. The degrees of freedom for the χ ² test of independence are

df = (number of rows -1)(number of columns -1) = (r-1)(c-1)

Page 16: Chi Squared Test

Procedure

• Step 6 – Using the level of significance α and df, find the critical value χ ²α. This value corresponds to an area of α in the right tail of the distribution

• Step 7 – Compare the calculated and table values of χ ²

• Decision rule

– Accept Ho if χ ²cal is less than the table value of χ ² with (r-1)(c-1) degrees of freedom

– Otherwise reject Ho
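Steps 2 to 7 can be strung together in one function. The sketch below is an illustration rather than a reference implementation: the function name and the 2x2 table are made up, and the critical value is passed in as read from a χ ² table (3.841 is the standard value for α = 0.05 and df = 1).

```python
def chi_squared_independence(observed, critical_value):
    """Run steps 2-7 on a contingency table given as a list of rows."""
    # Step 2: row, column and grand totals
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Steps 3 and 4: expected frequency per cell, then the sum of (O - E)^2 / E
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            chi2 += (o - e) ** 2 / e

    # Step 5: degrees of freedom (r - 1)(c - 1)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)

    # Steps 6 and 7: compare with the critical value looked up for alpha and df
    return {"chi2": chi2, "df": df, "reject_h0": chi2 > critical_value}


# Hypothetical 2x2 table, tested at alpha = 0.05
result = chi_squared_independence([[10, 20], [30, 40]], critical_value=3.841)
print(result["df"], round(result["chi2"], 3), result["reject_h0"])  # 1 0.794 False
```

If SciPy is available, the critical value could instead be obtained with `scipy.stats.chi2.ppf(1 - alpha, df)`; the sketch avoids that dependency and mirrors the table lookup used in the slides.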

Page 17: Chi Squared Test

EXAMPLE 1

• Two hundred randomly selected adults were asked whether TV shows as a whole are primarily entertaining, educational or a waste of time (only one answer to be chosen). The respondents were categorized by gender. Their opinions are as follows

         Opinion
Gender   Entertaining   Educational   Waste of time   Total
Female   52             28            30              110
Male     28             12            50              90
Total    80             40            80              200

Page 18: Chi Squared Test

EXAMPLE 1

• Is this convincing evidence that there is a relationship between gender and opinion in the population of interest?

• The critical value of χ ² = 5.99 at α = 0.05 and df = 2

Page 19: Chi Squared Test

EXAMPLE 1 -Solution

• Let us assume the null hypothesis that the opinion of adults is independent of gender

• The contingency table is of size 2x3, so the degrees of freedom are (2-1)(3-1)=2. Therefore only two expected frequencies have to be calculated from the formula; the other four are then determined automatically by the row and column totals

Page 20: Chi Squared Test

EXAMPLE 1 -Solution

E11 = row 1 total x column 1 total / grand total = 110 x 80/200 = 44
E12 = row 1 total x column 2 total / grand total = 110 x 40/200 = 22
E13 = 110 - (44+22) = 44
E21 = 80 - E11 = 80 - 44 = 36
E22 = 40 - E12 = 40 - 22 = 18
E23 = 80 - E13 = 80 - 44 = 36

Page 21: Chi Squared Test

EXAMPLE 1 -Solution

The contingency table of expected frequencies is as follows

Gender   Entertaining   Educational   Waste of time   Total
Female   44             22            44              110
Male     36             18            36              90
Total    80             40            80              200

Page 22: Chi Squared Test

EXAMPLE 1 -Solution

Observed (O)   Expected (E)   (O-E)   (O-E)²   (O-E)²/E
52
28
30
28
12
50

Page 23: Chi Squared Test

EXAMPLE 1 -Solution

Observed (O)   Expected (E)   (O-E)   (O-E)²   (O-E)²/E
52             44
28             22
30             44
28             36
12             18
50             36

Page 24: Chi Squared Test

EXAMPLE 1 -Solution

Observed (O)   Expected (E)   (O-E)   (O-E)²   (O-E)²/E
52             44             8
28             22             6
30             44             -14
28             36             -8
12             18             -6
50             36             14

Page 25: Chi Squared Test

EXAMPLE 1 -Solution

Observed (O)   Expected (E)   (O-E)   (O-E)²   (O-E)²/E
52             44             8       64
28             22             6       36
30             44             -14     196
28             36             -8      64
12             18             -6      36
50             36             14      196

Page 26: Chi Squared Test

EXAMPLE 1 -Solution

Observed (O)   Expected (E)   (O-E)   (O-E)²   (O-E)²/E
52             44             8       64       1.455
28             22             6       36       1.636
30             44             -14     196      4.455
28             36             -8      64       1.778
12             18             -6      36       2.000
50             36             14      196      5.444

Total χ ² = 16.768

Page 27: Chi Squared Test

EXAMPLE 1 -Solution

The critical value of χ ² = 5.99 at α = 0.05 and df = 2. Since the calculated value of χ ² ≈ 16.77 is more than its critical value, the null hypothesis is rejected. Hence we conclude that the opinion of adults is not independent of gender
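The arithmetic of Example 1 can be checked in a few lines. This is a sketch for verification only; it relies on the fact that for df = 2 the right-tail area of the χ ² distribution beyond x is exp(-x/2), so the critical value has a closed form. That shortcut is an addition here and does not hold for other degrees of freedom.

```python
import math

# Observed frequencies from Example 1 (rows: Female, Male)
observed = [[52, 28, 30],
            [28, 12, 50]]

row_totals = [sum(row) for row in observed]        # [110, 90]
col_totals = [sum(col) for col in zip(*observed)]  # [80, 40, 80]
n = sum(row_totals)                                # 200

# Sum of (O - E)^2 / E over all six cells, with E = Ri x Cj / N
chi2 = sum(
    (o - row_totals[i] * col_totals[j] / n) ** 2 / (row_totals[i] * col_totals[j] / n)
    for i, row in enumerate(observed)
    for j, o in enumerate(row)
)

alpha = 0.05
critical = -2 * math.log(alpha)  # closed form, valid for df = 2 only
p_value = math.exp(-chi2 / 2)    # right-tail area beyond chi2, df = 2 only

print(round(chi2, 3), round(critical, 2), chi2 > critical)  # 16.768 5.99 True
```

The p-value computed this way is far below 0.05, which agrees with the decision to reject the null hypothesis.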