# Chapter 16: Chi Square


Overview

z, t, ANOVA, regression, & correlation have

– Used at least one continuous variable

– Relied on underlying population parameters

– Been based on particular distributions

Chi square (χ²) is

– Based on categorical variables

– Non-parametric

– Distribution-free

Categorical Variables

Generally the count of objects falling in each of several categories.

Examples:

– number of fraternity, sorority, and nonaffiliated members of a class

– number of students choosing answers: 1, 2, 3, 4, or 5

Emphasis on frequency in each category

Contingency Tables

Two independent variables

– Each can have several levels, similar to a two-way ANOVA

– Example: gender identity, level of happiness

Intimacy and Depression

Everitt & Smith (1979)

Asked depressed and non-depressed women about intimacy with boyfriend/husband

Data on next slide

Data


What Do the Data Say?

It looks as if depressed women are more likely to report lack of intimacy.

What alternative explanations?

Is the relationship reliably different from chance?

– Chi-square test

Chi-Square on Contingency Table

The formula

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

O = Observed frequency, E = Expected frequency

Expected frequencies

$$E = \frac{RT \times CT}{GT}$$

RT = Row total, CT = Column total, GT = Grand total

Expected Frequencies

E11 = (37*138)/419 = 12.19

E12 = (37*281)/419 = 24.81

E21 = (382*138)/419 = 125.81

E22 = (382*281)/419 = 256.19

Enter these on the following table
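The four expected frequencies can be checked with a few lines of Python. This is a minimal sketch that uses only the marginal totals quoted on the slide; the variable names are illustrative and not part of the original material.

```python
# Marginal totals from the worked example (419 women in total).
row_totals = [37, 382]
col_totals = [138, 281]
grand_total = sum(row_totals)  # 419, which also equals sum(col_totals)

# E_ij = (row total i * column total j) / grand total
expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

for i, row in enumerate(expected, start=1):
    for j, e in enumerate(row, start=1):
        print(f"E{i}{j} = {e:.2f}")
```

The printed values agree with the four figures above to two decimals.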

Observed and Expected Freq.

Degrees of Freedom

For contingency table, df = (R - 1)(C - 1)

For our example this is (2 - 1)(2 - 1) = 1

– Note that if you know any one cell and the marginal totals, you can reconstruct all other cells.
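The claim that one cell plus the marginal totals determines a 2 x 2 table can be illustrated directly. The sketch below assumes the upper-left observed count is 26, the value used in the chi-square calculation for this example; `fill_2x2` is a hypothetical helper name.

```python
def fill_2x2(o11, row_totals, col_totals):
    """Rebuild a 2 x 2 table from its upper-left cell and the marginal totals."""
    o12 = row_totals[0] - o11   # rest of row 1
    o21 = col_totals[0] - o11   # rest of column 1
    o22 = row_totals[1] - o21   # rest of row 2
    return [[o11, o12], [o21, o22]]


observed = fill_2x2(26, row_totals=[37, 382], col_totals=[138, 281])
print(observed)  # [[26, 11], [112, 270]]
```

Only one cell is free to vary, which is what df = 1 expresses.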

Chi-Square Calculation

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(26 - 12.19)^2}{12.19} + \frac{(11 - 24.81)^2}{24.81} + \frac{(112 - 125.81)^2}{125.81} + \frac{(270 - 256.19)^2}{256.19} = 25.61$$

$$\chi^2_{.05}(1) = 3.84$$
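The calculation can be reproduced cell by cell. This is a sketch, assuming the four observed counts 26, 11, 112, and 270 that appear in the calculation; the row and column labels are not recoverable from the slides, so cells are identified by position only. The critical value 3.84 is copied from the slide rather than computed.

```python
observed = [[26, 11], [112, 270]]
row_totals = [sum(row) for row in observed]        # [37, 382]
col_totals = [sum(col) for col in zip(*observed)]  # [138, 281]
n = sum(row_totals)                                # 419

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n      # expected frequency
        term = (o - e) ** 2 / e                    # this cell's contribution
        chi2 += term
        print(f"cell ({i + 1},{j + 1}): O = {o}, E = {e:.2f}, term = {term:.2f}")

CRITICAL_05_DF1 = 3.84  # chi-square cutoff for alpha = .05, df = 1 (from the slide)
print(f"chi-square = {chi2:.2f}")  # chi-square = 25.61
print("reject H0" if chi2 > CRITICAL_05_DF1 else "retain H0")
```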

Conclusions

Since 25.61 > 3.84, reject H0

Conclude that depression and intimacy are not independent.

– How one responds to “satisfaction with intimacy” depends on whether they are depressed.

– Could be depression --> dissatisfaction, lack of intimacy --> depression, depressed people see world as not meeting needs, etc.

Larger Contingency Tables

Is addiction linked to childhood experimentation?

Do adults who are, and are not, addicted to substances (alcohol or drug) differ in childhood categories of drug experimentation?

One variable = adult addiction

– yes or no

Other variable = number of experimentation categories (out of 4) as children

– Tobacco, alcohol, marijuana/hashish, or acid/cocaine/other

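Observed frequencies, with expected frequencies in parentheses:

| Number of childhood experimentation categories | Adult addiction: No | Adult addiction: Yes | Total |
|---|---|---|---|
| 0 | 512 (494.49) | 54 (71.51) | 566 |
| 1 | 227 (230.65) | 37 (33.35) | 264 |
| 2 | 59 (64.65) | 15 (9.35) | 74 |
| 3-4 | 18 (26.21) | 12 (3.79) | 30 |
| Total | 816 | 118 | 934 |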

Chi-Square Calculation

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(512 - 494.49)^2}{494.49} + \frac{(54 - 71.51)^2}{71.51} + \dots + \frac{(18 - 26.21)^2}{26.21} + \frac{(12 - 3.79)^2}{3.79} = 29.62$$

$$\chi^2_{.05}(3) = 7.82$$
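The same arithmetic generalizes to any R x C table. The function below is a minimal sketch (the name `chi_square` and its return shape are illustrative choices, not something the slides define), applied to the 4 x 2 addiction table. The critical value 7.82 is copied from the slide.

```python
def chi_square(observed):
    """Return (chi-square, df) for an R x C table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            chi2 += (o - e) ** 2 / e

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Rows: 0, 1, 2, 3-4 experimentation categories; columns: addiction No, Yes.
addiction = [[512, 54], [227, 37], [59, 15], [18, 12]]
chi2, df = chi_square(addiction)

CRITICAL_05_DF3 = 7.82  # cutoff for alpha = .05, df = 3 (from the slide)
print(f"chi-square = {chi2:.2f}, df = {df}")
print("reject H0" if chi2 > CRITICAL_05_DF3 else "retain H0")
```

Run on unrounded expected frequencies, this gives a value of about 29.6, which agrees with the 29.62 on the slide up to rounding.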

Conclusions

29.62 > 7.82

– Reject H0

– Conclude that adult addiction is related to childhood experimentation

– Increasing levels of childhood experimentation are associated with greater levels of adult addiction.

– e.g., approximately 10% of children not experimenting later become addicted as adults.

Conclusions--cont.

Approximately 40% of highly experimenting children are later addicted as adults.

These data suggest that childhood experimentation may lead to adult addiction.
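The percentages quoted in these conclusions come from the rows of the addiction table. A short sketch that computes the addiction rate at each experimentation level (the dictionary layout is an illustrative choice):

```python
# level -> (not addicted, addicted), copied from the contingency table
table = {"0": (512, 54), "1": (227, 37), "2": (59, 15), "3-4": (18, 12)}

rates = {}
for level, (no, yes) in table.items():
    rates[level] = 100 * yes / (no + yes)  # percent addicted as adults
    print(f"{level:>3} categories: {rates[level]:.1f}% addicted")
```

The rate rises at every step, from just under 10% in the first row to 40% in the last.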

Tests on Proportions

Proportions can be converted to frequencies, and tested using χ².

Use a z test directly on the proportions if you have two proportions

From last example

– 10% of children with no experimentation were addicted as adults

– 40% of heavily experimenting children were addicted as adults

Proportions--cont.

There were 566 children with no experimentation and 30 heavily experimenting children.

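$$z = \frac{P_1 - P_2}{\sqrt{P(1 - P)\left(\frac{1}{N_1} + \frac{1}{N_2}\right)}} = \frac{.095 - .40}{\sqrt{.111(1 - .111)\left(\frac{1}{566} + \frac{1}{30}\right)}} = \frac{-.305}{\sqrt{.0035}} = \frac{-.305}{.059} = -5.17$$

where P is the pooled proportion:

$$P = \frac{N_1 P_1 + N_2 P_2}{N_1 + N_2} = \frac{566(.095) + 30(.40)}{566 + 30} = .111$$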

Proportions--cont.

z = -5.17 (|z| = 5.17)

This is a standard z score.

– Therefore .05 (2-tailed) cutoff = ±1.96

– Reject null hypothesis that the population proportions of addiction in both groups are equal.

In absolute value, this z is just the square root of the χ² you would obtain from a χ² test on those 4 cells.
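The link between the two tests can be checked numerically. The sketch below assumes the two groups are the first and last rows of the earlier 4 x 2 table (566 children with 54 positive outcomes, 30 children with 12), uses exact proportions instead of the rounded .095 and .111, and computes the Pearson χ² without a continuity correction.

```python
import math

n1, x1 = 566, 54   # group 1: size and number with the outcome
n2, x2 = 30, 12    # group 2: size and number with the outcome

p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

# Pearson chi-square on the same four cells (no continuity correction).
observed = [[n1 - x1, x1], [n2 - x2, x2]]
row_totals = [n1, n2]
col_totals = [sum(col) for col in zip(*observed)]
n = n1 + n2
chi2 = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(2)
    for j in range(2)
)

print(f"z = {z:.2f}, z^2 = {z * z:.2f}, chi-square = {chi2:.2f}")
```

With exact proportions z comes out near -5.18 instead of -5.17, and z² matches χ² to floating-point precision.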

Independent Observations

We require that observations be independent.

– Only one score from each respondent

– Sum of frequencies must equal number of respondents

If we don’t have independence of observations, the test is not valid.

Small Expected Frequencies

Assume O would be normally distributed around E over many replications of experiment.

This could not happen if E is small.

Rule of thumb: E ≥ 5 in each cell

– Not firm rule

– Violated in earlier example, but probably not a problem

Expected Frequencies--cont.

More of a problem in tables with few cells.

Never have expected frequency of 0.

Collapse adjacent cells if necessary.
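The rule of thumb and the collapsing remedy can be sketched in code. The helper names (`expected_frequencies`, `small_cells`, `collapse_rows`) are hypothetical, and merging the last two rows of the addiction table is shown only as an illustration; the slides do not say this was done.

```python
def expected_frequencies(observed):
    """Expected counts under independence for an R x C table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def small_cells(observed, threshold=5.0):
    """List (row index, column index, E) for every cell with E below threshold."""
    return [
        (i, j, e)
        for i, row in enumerate(expected_frequencies(observed))
        for j, e in enumerate(row)
        if e < threshold
    ]


def collapse_rows(observed, i):
    """Merge row i with the adjacent row i + 1 by adding their counts."""
    merged = [a + b for a, b in zip(observed[i], observed[i + 1])]
    return observed[:i] + [merged] + observed[i + 2:]


addiction = [[512, 54], [227, 37], [59, 15], [18, 12]]
flagged = small_cells(addiction)
print(flagged)                     # one cell: last row, "Yes" column, E about 3.79

collapsed = collapse_rows(addiction, 2)   # merge the "2" and "3-4" rows
print(collapsed[-1])               # [77, 27]
print(small_cells(collapsed))      # []
```

Collapsing costs one degree of freedom (df drops from 3 to 2 here), so it is a trade-off and not a free fix.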