Chi-square (χ2) - Michael Hoerger · Web viewThe chi-square test for independence was...
Click here to load reader
Transcript of Chi-square (χ2) - Michael Hoerger · Web viewThe chi-square test for independence was...
1
Chi-square (χ2)PSY 2114-15-08
LAST STATISTIC OF THE SEMESTER!NEXT TIME: FOOD DAY, REVIEW, WRAP-UP
A. Introduction
1. Parametric tests: Use at least one numeric rating, so scores can be placed in a frequency distribution that usually has a normal shape
Correlation/Regression: Continuous variables only
t-test/ANOVA: One categorical variable, one continuous variable
2. Non-parametric tests: Do not use numerical values, scores cannot be placed on a frequency distribution (also sometimes used for numeric variables that have non-normal distributions)
Chi-square (χ2): Categorical variables only
Pronunciation note: Chi is pronounced “khi” (from Greece), not “chai” (India), not “chee” (China)
B. Two Types of χ2
χ2 Test for Goodness of Fito Involves a single categorical variable only
χ2 Test for Independenceo Involves 2(+) categorical variables
2
C. χ2 Test for Goodness of Fit
Generally just involves one categorical variable Null hypothesis specifies the proportion of the
population in each category Determines how well sample data conform to
proportions set forth by the null hypothesis Test statistic (χ2) examines whether the
proportions in the sample reliably differ from the null hypothesis
Examples:
Use logic or past research to guide the null hypothesis
H0 for Gender:
Female 50%Male 50%
H0 for Negative Childhood Emotion:
Shame 25%Anger 25%Anxiety 25%Sadness 25%
Often an equal percentage (proportion) is chosen for each group
3
Alternatively, null hypothesis could be based on known proportions in a larger population
Examples:
H0 for Ethnicity:
White 75%non-White 25%
H0 for Vegetarianism:
No 97%Yes 3%
H0 for Religious Affiliation:
Christianity 84%Non-religious / Don’t care 10%Agnosticism 2%Atheism 1%Other 3%
χ2 used to examine whether the proportions in a sample reliably differ from those hypothesized
Like F…χ2 ranges from 0 to ∞Is small when the null hypothesis is likely true.Is large when the null hypothesis is rejected.
4
H0 for Gender:
Female 50%Male 50%
N = 279Proportion Female = 50% = .50Hypothesized frequency Female = .50 * 279 = 139.5Proportion Male = 50% = .50Hypothesized frequency Male = .50 * 279 = 139.5
Chi-Square TestFrequencies
15. Gender
213 139.5 73.566 139.5 -73.5
279
FemaleMaleTotal
Observed N Expected N Residual
Test Statistics
77.4521
.000
Chi-Square a
dfAsymp. Sig.
15. Gender
0 cells (.0%) have expected frequencies less than5. The minimum expected cell frequency is 139.5.
a.
The Observed N indicates the actual frequency in each group
The Expected N indicates the frequency that is hypothesized, based on the null hypothesis.
The Residual just indicates the Observed frequency minus the Expected frequency
The Test Statistics box indicates that the χ2 value is 44.17, the degrees of freedom (df) are 1, and p < .001.
The chi-square test for goodness of fit was significant, χ2(1, N = 279) = 77.45, p < .001. The sample included an unexpectedly high number of females.
5
Last Semester…
Chi-Square TestFrequencies
2. Gender
223 163.0 60.0103 163.0 -60.0326
FemaleMaleTotal
Observed N Expected N Residual
Test Statistics
44.1721
.000
Chi-Square a
dfAsymp. Sig.
2. Gender
0 cells (.0%) have expected frequencies less than5. The minimum expected cell frequency is 163.0.
a.
The chi-square test for goodness of fit was significant, χ2(1, N = 326) = 44.17, p < .001. The sample included an unexpectedly high number of females.
6
H0 for Religious Affiliation:
Christianity 84%Non-religious / Don’t care 10%Agnosticism 2%Atheism 1%Other 3%
N = 279Hypothesized frequency Christian = .84 * 279 = 234.4… … …Hypothesized frequency Other = .03 * 279 = 8.4
Chi-Square TestFrequencies
31. Worldview or Religion
204 234.4 -30.427 27.9 -.910 5.6 4.435 2.8 32.23 8.4 -5.4
279
ChristianityAgnosticismAtheismDon't Know / Don't CareOtherTotal
Observed N Expected N Residual
Test Statistics
382.7674
.000
Chi-Square a
dfAsymp. Sig.
31. Worldviewor Religion
1 cells (20.0%) have expected frequencies less than5. The minimum expected cell frequency is 2.8.
a.
The chi-square test for goodness of fit was significant, χ2(4, N = 279) = 382.77, p < .001. There were fewer Christians than expected and more people who were apathetic.
7
Last Semester…
Chi-Square TestFrequencies
23. Religion
42 32.6 9.4231 273.8 -42.819 6.5 12.516 3.3 12.718 9.8 8.2
326
Non-religiousChristianityAgnosticismAtheismOtherTotal
Observed N Expected N Residual
Test Statistics
89.9974
.000
Chi-Square a
dfAsymp. Sig.
23. Religion
1 cells (20.0%) have expected frequencies less than5. The minimum expected cell frequency is 3.3.
a.
The chi-square test for goodness of fit was significant, χ2(4, N = 326) = 90.00, p < .001. There were fewer Christians than expected and more people than expected in every other religious group.
Expected frequencies based on the expected proportions identified previously. For example, for Christianity .84 x 326 = 273.84.
8
H0 for Yearly Physical:
No 50%Yes 50%
Chi-Square TestFrequencies
2. Yearly Physical
143 139.5 3.5136 139.5 -3.5279
NoYesTotal
Observed N Expected N Residual
Test Statistics
.1761
.675
Chi-Square a
dfAsymp. Sig.
2. YearlyPhysical
0 cells (.0%) have expected frequencies less than5. The minimum expected cell frequency is 139.5.
a.
The chi-square test for goodness of fit was not significant, χ2(1, N = 279) = 0.18, p = .68. There was about an equal number of people getting yearly physicals as those who were not.
9
D. χ2 Test for Independence
Examines the relationship between two (or more) categorical variables to determine if they are independent
Two variables are said to be independent if there is no relationship between them
Two variables are said to be dependent if there is a relationship between them
Similar to the correlation coefficient, except that instead of both variables being continuous, both variables are categoricalo Can’t correlate Academic Major with
Favorite Barnyard Animal, but you can do a chi-square!
Null hypothesis: no relationship Alternative hypothesis: some relationship
10
Examples:
Does Gender related to Beliefs about Human Origins?
15. Gender * 18. Belief about Human Origins Crosstabulation
124 89 213117.6 95.4 213.0
30 36 6636.4 29.6 66.0154 125 279
154.0 125.0 279.0
CountExpected CountCountExpected CountCountExpected Count
Female
Male
15. Gender
Total
ReligiousTexts
EvolutionaryTheory
18. Belief about HumanOrigins
Total
Chi-Square Tests
3.318b 1 .0692.822 1 .0933.304 1 .069
.089 .047
3.306 1 .069
279
Pearson Chi-SquareContinuity Correctiona
Likelihood RatioFisher's Exact TestLinear-by-LinearAssociationN of Valid Cases
Value dfAsymp. Sig.
(2-sided)Exact Sig.(2-sided)
Exact Sig.(1-sided)
Computed only for a 2x2 tablea.
0 cells (.0%) have expected count less than 5. The minimum expected count is 29.57.
b.
Observed frequencies compared to those that are expected, based on the null hypothesisSample sizeChi-square value, degrees of freedom, and p-value
11
MaleFemale
15. Gender
125
100
75
50
25
0
Cou
nt
Evolutionary TheoryReligious Texts
18. Belief about HumanOrigins
Bar Chart
The chi-square test for independence was non-significant, χ2(1, N = 279) = 3.32, p = .07. Gender was not reliably related to beliefs about human origins.
12
Does being a parent impact one’s political views?13. Parent * 20. Top Political Issue Crosstabulation
76 58 34 11 46 22568.5 65.3 39.5 10.5 41.1 225.0
9 23 15 2 5 5416.5 15.7 9.5 2.5 9.9 54.0
85 81 49 13 51 27985.0 81.0 49.0 13.0 51.0 279.0
CountExpected CountCountExpected CountCountExpected Count
No
Yes
13. Parent
Total
Education Economy Healthcare Foreign Policy Other20. Top Political Issue
Total
Chi-Square Tests
15.516a 4 .00415.831 4 .003
.080 1 .777
279
Pearson Chi-SquareLikelihood RatioLinear-by-LinearAssociationN of Valid Cases
Value dfAsymp. Sig.
(2-sided)
1 cells (10.0%) have expected count less than 5. Theminimum expected count is 2.52.
a.
13
YesNo
13. Parent
80
60
40
20
0
Cou
nt
OtherForeign PolicyHealthcareEconomyEducation
20. Top Political Issue
Bar Chart
The chi-square test for independence was significant, χ2(4, N = 279) = 15.52, p = .004. Non-parents were mainly concerned with education, whereas parents were mainly concerned with the economy.
14
Does ethnicity relate to musical preference?
11. White * 32. Favorite Music Crosstabulation
2 2 12 10 266.0 8.8 3.6 7.6 26.062 92 27 72 253
58.0 85.2 35.4 74.4 253.064 94 39 82 279
64.0 94.0 39.0 82.0 279.0
CountExpected CountCountExpected CountCountExpected Count
No
Yes
11. White
Total
Country RockRap, hip
hop, R&B Other
32. Favorite Music
Total
Chi-Square Tests
30.695a 3 .00026.787 3 .000
9.472 1 .002
279
Pearson Chi-SquareLikelihood RatioLinear-by-LinearAssociationN of Valid Cases
Value dfAsymp. Sig.
(2-sided)
1 cells (12.5%) have expected count less than 5. Theminimum expected count is 3.63.
a.
15
YesNo
11. White
100
80
60
40
20
0
Cou
nt
OtherRap, hip hop, R&BRockCountry
32. Favorite Music
Bar Chart
The chi-square test for independence was significant, χ2(3, N = 279) = 30.70, p < .001. Compared to other ethnic groups, white people were less likely to prefer rap, hip hop, and R&B.
16
Some Examples from Last Semester…
Is Gender related to Vegetarianism?
2. Gender * 10. Vegetarian Crosstabulation
206 17 223209.3 13.7 223.0
100 3 10396.7 6.3 103.0306 20 326
306.0 20.0 326.0
CountExpected CountCountExpected CountCountExpected Count
Female
Male
2. Gender
Total
No Yes10. Vegetarian
Total
Chi-Square Tests
2.715b 1 .0991.959 1 .1623.081 1 .079
.136 .076
2.707 1 .100
326
Pearson Chi-SquareContinuity Correctiona
Likelihood RatioFisher's Exact TestLinear-by-LinearAssociationN of Valid Cases
Value dfAsymp. Sig.
(2-sided)Exact Sig.(2-sided)
Exact Sig.(1-sided)
Computed only for a 2x2 tablea.
0 cells (.0%) have expected count less than 5. The minimum expected count is 6.32.
b.
Observed frequencies compared to those that are expected, based on the null hypothesisSample sizeChi-square value, degrees of freedom, and p-value
The chi-square test for independence was non-significant, χ2(1, N = 326) = 2.72, ns. Gender was not reliably related to vegetarianism.
The sample is 68% female, so females are expected to make up 68% of the vegetarians and 68% of the non-vegetarians.
17
Does ethnicity relate to musical preference?
20. Ethnicity * 29. Music Crosstabulation
16 12 25 82 40 66 51 292
16.1 13.4 25.1 85.1 37.6 61.8 52.8 292.0
2 3 3 13 2 3 8 34
1.9 1.6 2.9 9.9 4.4 7.2 6.2 34.0
18 15 28 95 42 69 59 326
18.0 15.0 28.0 95.0 42.0 69.0 59.0 326.0
CountExpectedCountCountExpectedCountCountExpectedCount
White
Non-White
20.Ethnicity
Total
Rap R&BHipHop
Alternative
ClassicRock Country Other
29. Music
Total
Chi-Square Tests
7.354a 6 .2897.962 6 .241
.853 1 .356
326
Pearson Chi-SquareLikelihood RatioLinear-by-LinearAssociationN of Valid Cases
Value dfAsymp. Sig.
(2-sided)
4 cells (28.6%) have expected count less than 5. Theminimum expected count is 1.56.
a.
The chi-square test for independence was non-significant, χ2(6, N = 326) = 7.35, ns. Ethnicity was not reliably related to music preference.
White people make up 90% of the sample, so they are expected to make up 90% of those who like rap, 90% of those who like R&B, etc.
18
Does being an athlete related to choice of hero?
4. Athlete * 22. Hero Crosstabulation
58 20 9 17 10 8 3 3 20 148
53.6 27.7 8.6 18.2 10.9 4.5 3.6 6.4 14.5 148.0
60 41 10 23 14 2 5 11 12 178
64.4 33.3 10.4 21.8 13.1 5.5 4.4 7.6 17.5 178.0
118 61 19 40 24 10 8 14 32 326
118.0 61.0 19.0 40.0 24.0 10.0 8.0 14.0 32.0 326.0
CountExpectedCountCountExpectedCountCountExpectedCount
No
Yes
4.Athlete
Total
Mom Dad SiblingOther
Relative FriendRomanticPartner
FamousPerson
Teacher/Coach Other
22. Hero
Total
Chi-Square Tests
16.937a 8 .03117.516 8 .025
.671 1 .413
326
Pearson Chi-SquareLikelihood RatioLinear-by-LinearAssociationN of Valid Cases
Value dfAsymp. Sig.
(2-sided)
3 cells (16.7%) have expected count less than 5. Theminimum expected count is 3.63.
a.
The chi-square test for independence was statistically significant, χ2(8, N = 326) = 16.94, p = .03. Specifically, athletes were more likely than expected to indicate that their dad was their hero.