Chi-Square Analysis
Statistical analysis of two qualitative random variables

Lecture 28: Chi-Square Analysis
STAT 225 Introduction to Probability Models
April 23, 2014

Whitney Huang
Purdue University

χ² test for two qualitative random variables

- For a given contingency table, we want to test whether two variables have a relationship or not.
- To answer this question, we need to perform a statistical hypothesis test.
- A statistical hypothesis test is a method of making decisions using data from a scientific study.
- A χ² test is suitable for answering whether two qualitative variables have a relationship or not.

χ² test

The procedure of the χ² test for two qualitative random variables:

1. Define the null (H0) and alternative (HA) hypotheses:
   H0: there is no relationship between the 2 variables
   HA: there is a relationship between the 2 variables
2. (If necessary) Calculate the marginal totals and the grand total.
3. Calculate the expected cell counts:
   expected cell count = (Row Total × Column Total) / Grand Total
4. Calculate the partial χ² values (a χ² value for each cell of the table):
   partial χ² value = (observed − expected)² / expected

χ² test cont’d

5. Calculate the χ² statistic:
   χ² = ∑ partial χ² values
6. Calculate the degrees of freedom (df):
   df = (# of rows − 1) × (# of columns − 1)
7. Find the χ² critical value with respect to α from the χ² table.
8. Draw your conclusion:
   Reject H0 if your χ² statistic is bigger than the χ² critical value ⇒ there is statistical evidence of a relationship between the 2 variables at level α.
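
The procedure in steps 2 to 8 can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the function name `chi_square_test`, its return format, and the small 2×2 table in the usage lines are hypothetical, and the critical value is passed in by hand because step 7 reads it from a χ² table.

```python
def chi_square_test(observed, critical_value):
    """Apply steps 2-8 to a table of observed counts (a list of rows)."""
    # Step 2: marginal totals and grand total
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Step 3: expected count = row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Step 4: partial chi-square value for each cell
    partials = [[(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
                for obs_row, exp_row in zip(observed, expected)]

    # Step 5: the statistic is the sum of all partial values
    statistic = sum(sum(row) for row in partials)

    # Step 6: degrees of freedom
    df = (len(row_totals) - 1) * (len(col_totals) - 1)

    # Step 8: reject H0 when the statistic exceeds the critical value
    return {"expected": expected, "partials": partials,
            "statistic": statistic, "df": df,
            "reject_h0": statistic > critical_value}


# Hypothetical counts, chosen only to exercise the function.
result = chi_square_test([[10, 20], [20, 10]], critical_value=2.71)
print(f"{result['statistic']:.2f}", result["df"], result["reject_h0"])
# prints: 6.67 1 True
```

Every expected count in the hypothetical table is 15, so each partial value is 25/15 and the statistic is 20/3.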

Example 67

A 2011 study was conducted in Kalamazoo, Michigan. The objective was to determine if parents’ marital status affects children’s marital status later in their life. In total, 2,000 children were interviewed. The columns refer to the parents’ marital status. Use the contingency table below to conduct a χ² test from beginning to end. Use α = .10.

(Observed)   Married   Divorced   Total
Married        581       487
Divorced       455       477
Total

Example 67 cont’d

1. Define the null and alternative hypotheses:
   H0: there is no relationship between parents’ marital status and children’s marital status
   HA: there is a relationship between parents’ marital status and children’s marital status
2. Calculate the marginal totals and the grand total:

(Observed)   Married   Divorced   Total
Married        581       487      1068
Divorced       455       477       932
Total         1036       964      2000

Example 67 cont’d

3. Calculate the expected cell counts:

(Expected)   Married                           Divorced
Married      (1068 × 1036) / 2000 = 553.224    (1068 × 964) / 2000 = 514.776
Divorced     (932 × 1036) / 2000 = 482.776     (932 × 964) / 2000 = 449.224

4. Calculate the partial χ² values:

(Partial χ²)   Married                              Divorced
Married        (581 − 553.224)² / 553.224 = 1.39    (487 − 514.776)² / 514.776 = 1.50
Divorced       (455 − 482.776)² / 482.776 = 1.60    (477 − 449.224)² / 449.224 = 1.72
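
As a check on the arithmetic in steps 3 and 4, the two tables can be recomputed from the observed counts. The following is a minimal sketch using only the counts given in Example 67; the variable names are illustrative choices and do not come from the lecture.

```python
# Observed counts from Example 67 (rows: children, columns: parents).
observed = [[581, 487],   # child married
            [455, 477]]   # child divorced

row_totals = [sum(row) for row in observed]          # [1068, 932]
col_totals = [sum(col) for col in zip(*observed)]    # [1036, 964]
grand_total = sum(row_totals)                        # 2000

# Step 3: expected count = row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Step 4: partial chi-square = (observed - expected)^2 / expected
partials = [[(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
            for obs_row, exp_row in zip(observed, expected)]

for exp_row, part_row in zip(expected, partials):
    print([f"{e:.3f}" for e in exp_row], [f"{p:.2f}" for p in part_row])
```

Rounded to the same number of decimals, the printed values should match the two tables above.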

Example 67 cont’d

5. Calculate the χ² statistic:
   χ² = 1.39 + 1.50 + 1.60 + 1.72 = 6.21
6. Calculate the degrees of freedom (df):
   df = (2 − 1) × (2 − 1) = 1
7. Find the χ² critical value with respect to α from the χ² table:
   χ²(α = 0.1, df = 1) = 2.71
8. Draw your conclusion:
   Since 6.21 > 2.71, we reject H0 and conclude that there is a relationship between parents’ marital status and children’s marital status.
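
The table lookup in step 7 can be cross-checked without a χ² table in this particular case. The sketch below relies on a fact that is not stated in the lecture: with df = 1, a χ² variable is the square of a standard normal variable, so the critical value equals the square of the normal quantile at 1 − α/2. It uses Python's standard `statistics.NormalDist` and only works for df = 1.

```python
from statistics import NormalDist

alpha = 0.10
partials = [1.39, 1.50, 1.60, 1.72]   # rounded partial values from step 4
statistic = sum(partials)             # step 5

# Step 7 for df = 1 only: critical value = (normal quantile at 1 - alpha/2)^2
z = NormalDist().inv_cdf(1 - alpha / 2)
critical_value = z ** 2

# Step 8: compare the statistic with the critical value
reject_h0 = statistic > critical_value
print(f"chi-square = {statistic:.2f}, "
      f"critical value = {critical_value:.2f}, reject H0: {reject_h0}")
# prints: chi-square = 6.21, critical value = 2.71, reject H0: True
```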

Example 68

The following contingency table contains enrollment data for a random sample of students from several colleges at Purdue University during the 2006-2007 academic year. The table lists the number of male and female students enrolled in each college. Use the two-way table to conduct a χ² test from beginning to end. Use α = .01.

(Observed)     Female   Male   Total
Liberal Arts     378     262     640
Science           99     175     274
Engineering      104     510     614
Total            581     947    1528

Example 68 cont’d

(Expected)     Female                         Male
Liberal Arts   (640 × 581) / 1528 = 243.35    (640 × 947) / 1528 = 396.65
Science        (274 × 581) / 1528 = 104.18    (274 × 947) / 1528 = 169.82
Engineering    (614 × 581) / 1528 = 233.46    (614 × 947) / 1528 = 380.54

(Partial χ²)   Female                              Male
Liberal Arts   (378 − 243.35)² / 243.35 = 74.50    (262 − 396.65)² / 396.65 = 45.71
Science        (99 − 104.18)² / 104.18 = 0.26      (175 − 169.82)² / 169.82 = 0.16
Engineering    (104 − 233.46)² / 233.46 = 71.79    (510 − 380.54)² / 380.54 = 44.05

χ² = 74.50 + 45.71 + 0.26 + 0.16 + 71.79 + 44.05 = 236.47
df = (3 − 1) × (2 − 1) = 2
χ²(α = .01, df = 2) = 9.21
Since 236.47 > 9.21, we reject H0 and conclude that there is a relationship between gender and major.
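
The whole of Example 68 can be reproduced from the observed counts in one short script. This is a sketch, not the lecture's method: the variable names are illustrative, and the critical value is computed with a shortcut that holds only for df = 2, where P(χ² > x) = exp(−x/2) and therefore the critical value is −2 ln α. The unrounded statistic may differ slightly in the last digit from the lecture's 236.47, which sums rounded partial values.

```python
import math

# Observed counts from Example 68 (columns: female, male).
observed = [[378, 262],   # Liberal Arts
            [99, 175],    # Science
            [104, 510]]   # Engineering
alpha = 0.01

row_totals = [sum(row) for row in observed]          # [640, 274, 614]
col_totals = [sum(col) for col in zip(*observed)]    # [581, 947]
grand_total = sum(row_totals)                        # 1528

# Sum the partial chi-square values over every cell of the table.
statistic = 0.0
for obs_row, r in zip(observed, row_totals):
    for o, c in zip(obs_row, col_totals):
        e = r * c / grand_total        # expected cell count
        statistic += (o - e) ** 2 / e  # partial chi-square value

df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Shortcut valid only for df = 2: critical value = -2 * ln(alpha)
critical_value = -2 * math.log(alpha)

reject_h0 = statistic > critical_value
print(f"chi-square = {statistic:.2f}, df = {df}, "
      f"critical value = {critical_value:.2f}, reject H0: {reject_h0}")
```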