Two Categorical Variables Sections 25.1, 25.2, 25.3, 25.4...

Two Categorical Variables Sections 25.1, 25.2, 25.3, 25.4, 25.8 Lecture 45 Robb T. Koether Hampden-Sydney College Mon, Apr 11, 2016 Robb T. Koether (Hampden-Sydney College) Two Categorical VariablesSections 25.1, 25.2, 25.3, 25.4, 25.8 Mon, Apr 11, 2016 1 / 28


Two Categorical Variables
Sections 25.1, 25.2, 25.3, 25.4, 25.8

Lecture 45

Robb T. Koether

Hampden-Sydney College

Mon, Apr 11, 2016


Outline

1 Two Categorical Variables

2 Expected Counts

3 The χ² Statistic

4 The χ² Test

5 Example

6 Assignment


Two Categorical Variables

Given two categorical variables, such as sex and political affiliation, we may wonder whether they are related.
If they are not related, then we say that they are independent.
If sex and political affiliation are independent, then we should see the same split between Republican, Democrat, and Independent among men as we see among women.
That is, the proportions should be equal.
Likewise, we should see the same male/female split whether we are looking at Republicans, Democrats, or Independents.


Two-Way Tables

Suppose we survey 1000 individuals and note their sex and their party affiliation (Rep, Dem, Ind).
We may display the results in a two-way table.

          Rep   Dem   Ind   Total
Male      108    92   200     400
Female    112   218   270     600
Total     220   310   470    1000

We have the row totals.
We have the column totals.
We have the grand total.
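The totals in the table, and the party split within each sex, can be checked with a short Python sketch. The variable names (`observed`, `row_totals`, `col_totals`, `party_split`) are illustrative choices, not part of the lecture.

```python
# Observed counts from the survey table (rows: sex, columns: party).
observed = {
    "Male":   {"Rep": 108, "Dem": 92,  "Ind": 200},
    "Female": {"Rep": 112, "Dem": 218, "Ind": 270},
}
parties = ["Rep", "Dem", "Ind"]

# Row totals, column totals, and the grand total.
row_totals = {sex: sum(counts.values()) for sex, counts in observed.items()}
col_totals = {p: sum(observed[sex][p] for sex in observed) for p in parties}
grand_total = sum(row_totals.values())

# Party split within each sex; under independence the two splits
# would be roughly equal.
party_split = {
    sex: {p: observed[sex][p] / row_totals[sex] for p in parties}
    for sex in observed
}

print(row_totals, col_totals, grand_total)
for sex, split in party_split.items():
    print(sex, {p: round(v, 3) for p, v in split.items()})
```

Comparing the two printed splits side by side is an informal version of the question that the χ² test answers formally.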


The Expected Counts

We need to compare the observed counts to the expected counts.
Consider the first column, the Republicans.
There were 220 Republicans.
Overall, the sample was 40% male and 60% female.
Assuming independence, we would expect 40% of the Republicans to be male and 60% to be female.


So the expected count of Republican males is

$$
E = 40\%\ \text{of}\ 220
  = \left(\frac{400}{1000}\right) \times 220
  = \frac{400 \times 220}{1000}
  = \frac{\text{Row total} \times \text{Column total}}{\text{Grand total}}.
$$
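The same rule applies to every cell, so the whole table of expected counts can be produced at once. The following is a minimal Python sketch; the function name `expected_counts` and the list-of-rows layout are assumptions made for illustration.

```python
def expected_counts(observed):
    """Expected counts under independence for a table given as a list of rows.

    Each expected count is (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: Male, Female. Columns: Rep, Dem, Ind.
observed = [[108, 92, 200], [112, 218, 270]]
expected = expected_counts(observed)
print(expected)
```

The first entry of the first row is the Republican-male count worked out above, 400 × 220 / 1000.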


Two-Way Tables

Computing the expected count the same way for every cell, we may display each expected count below the corresponding observed count.

                      Rep   Dem   Ind   Total
Male      observed    108    92   200     400
          expected     88   124   188
Female    observed    112   218   270     600
          expected    132   186   282
Total                 220   310   470    1000


The Chi-Square Statistic

To measure how close the observed counts (O) are to the expected counts (E), we compute the fraction

$$\frac{(O - E)^2}{E}$$

for each cell in the table.
The chi-square statistic χ² is the sum of these fractions:

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}.$$


The Chi-Square Distribution

The distribution of the χ² statistic is not symmetric.
Rather, it is skewed right.
It also has a different shape for each table size.
Thus, we must specify the number of degrees of freedom.


[Figures: density curves of the χ² distribution for 1, 2, 3, ..., 10 degrees of freedom, followed by the curves for 10, 20, and 30 degrees of freedom, shown against the normal distributions N(10, √20), N(20, √40), and N(30, √60), respectively.]

Characteristics of the χ² distributions:
The mean of χ² equals the degrees of freedom df.
The standard deviation of χ² equals $\sqrt{2\,df}$.
The shape is skewed right, but as df increases, the shape approaches the normal distribution $N(df, \sqrt{2\,df})$.
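The mean and standard deviation can be checked by simulation. The sketch below relies on a standard fact that the slides do not state: a χ² variable with df degrees of freedom has the same distribution as the sum of df squared independent standard normal values. The function name, the seed, and the number of draws are arbitrary illustrative choices.

```python
import math
import random
import statistics


def simulate_chi_square(df, n_draws, rng):
    """Draw n_draws values, each the sum of df squared standard normal values."""
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_draws)
    ]


rng = random.Random(2016)  # fixed (arbitrary) seed so the run is repeatable
for df in (1, 2, 10):
    draws = simulate_chi_square(df, 20000, rng)
    print(
        "df =", df,
        "| simulated mean, sd:",
        round(statistics.mean(draws), 2), round(statistics.stdev(draws), 2),
        "| theory:", df, round(math.sqrt(2 * df), 2),
    )
```

The simulated values should land close to df and √(2 df), though not exactly on them, since they come from a finite sample.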


Degrees of Freedom

How many degrees of freedom are there in a two-way table? And why are they called "degrees of freedom"?
Suppose we know the row and column totals, but not the counts.

          Rep   Dem   Ind   Total
Male                          400
Female                        600
Total     220   310   470    1000

How many count values can we fill in before the remaining counts are "forced"?


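In a two-way table, the number of degrees of freedom is

$$df = (\text{No. of rows} - 1) \times (\text{No. of columns} - 1).$$

For the 2 × 3 table of sex by party affiliation, df = (2 − 1) × (3 − 1) = 2: once two counts in one row are filled in, the totals force the other four.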
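To see the "forced" counts concretely, the following Python sketch fills in a two-row table from its totals and the freely chosen counts. Both function names are hypothetical, and `complete_two_row_table` handles only tables with exactly two rows, which is enough for the sex-by-party example.

```python
def degrees_of_freedom(n_rows, n_cols):
    """df = (rows - 1) * (columns - 1) for a two-way table."""
    return (n_rows - 1) * (n_cols - 1)


def complete_two_row_table(free_cells, row_totals, col_totals):
    """Fill in a 2 x c table from the first c - 1 counts of the first row.

    The last count in the first row is forced by the row total, and every
    count in the second row is forced by its column total.
    """
    first_row = list(free_cells) + [row_totals[0] - sum(free_cells)]
    second_row = [c - x for c, x in zip(col_totals, first_row)]
    return [first_row, second_row]


print(degrees_of_freedom(2, 3))
print(complete_two_row_table([108, 92], [400, 600], [220, 310, 470]))
```

Choosing the two male counts 108 and 92 is enough to recover the entire observed table, which matches df = 2.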


Computing the χ² Statistic

Example (Computing χ²)

Calculate χ² for the following table.

                      Rep   Dem   Ind   Total
Male      observed    108    92   200     400
          expected     88   124   188
Female    observed    112   218   270     600
          expected    132   186   282
Total                 220   310   470    1000


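$$
\begin{aligned}
\chi^2 &= \sum_{\text{all cells}} \frac{(O - E)^2}{E} \\
&= \frac{(108 - 88)^2}{88} + \frac{(92 - 124)^2}{124} + \frac{(200 - 188)^2}{188} \\
&\quad + \frac{(112 - 132)^2}{132} + \frac{(218 - 186)^2}{186} + \frac{(270 - 282)^2}{282} \\
&= 4.545 + 8.258 + 0.766 + 3.030 + 5.505 + 0.511 \\
&= 22.615.
\end{aligned}
$$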
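The arithmetic above can be reproduced with a short Python sketch. The function name `chi_square_statistic` and its return format (the total plus the list of per-cell contributions) are assumptions made for illustration.

```python
def chi_square_statistic(observed):
    """Return (chi-square, per-cell contributions) for a list-of-rows table.

    Each contribution is (O - E)^2 / E, where
    E = row total * column total / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    contributions = []
    for row, r in zip(observed, row_totals):
        for o, c in zip(row, col_totals):
            e = r * c / grand_total
            contributions.append((o - e) ** 2 / e)
    return sum(contributions), contributions


chi2, parts = chi_square_statistic([[108, 92, 200], [112, 218, 270]])
print([round(p, 3) for p in parts])
print(round(chi2, 3))
```

Summing the unrounded contributions gives about 22.616. The value 22.615 above is the sum of the six terms after each has been rounded to three decimal places, and the difference has no effect on the conclusion of the test.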


The χ² Test

Our procedure will follow the same 6 steps as always.
1. State the hypotheses.
2. Give the value of α.
3. Write the formula for the test statistic.
4. Calculate the value of the test statistic.
5. Calculate the p-value.
6. Draw a conclusion.


The null hypothesis says that there is no difference in the distributions among the rows or among the columns.
That is, the two variables are independent.

H0: The variables are independent

The alternative hypothesis says the opposite.

Ha: The variables are not independent


The test statistic is

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}.$$

The number of degrees of freedom is

$$df = (\text{No. of rows} - 1) \times (\text{No. of columns} - 1).$$

To find the p-value of χ², use the χ2cdf function on the TI-83.
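On the TI-83, χ2cdf takes a lower bound, an upper bound, and the degrees of freedom, so the p-value is the area from the observed χ² up to a very large upper bound such as E99. For readers without the calculator, the same upper-tail area can be sketched in Python using only the standard library. The function `chi2_upper_tail` is a hypothetical helper, not a library routine; it starts from the closed forms for 1 and 2 degrees of freedom and then applies the standard recurrence for the incomplete gamma function.

```python
import math


def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom >= x), integer df >= 1, x >= 0."""
    half = x / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-half)              # exact upper tail for df = 2
    else:
        a = 0.5
        q = math.erfc(math.sqrt(half))   # exact upper tail for df = 1
    # Recurrence Q(a + 1, h) = Q(a, h) + h^a * e^(-h) / Gamma(a + 1),
    # applied until a reaches df / 2.
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Upper-tail area beyond 22.615 with 2 degrees of freedom.
print(chi2_upper_tail(22.615, 2))
```

With 2 degrees of freedom the tail area is simply e^(−x/2), which is why a χ² value above 22 yields such a small p-value.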


Computing the χ² Statistic

Example (Computing χ²)

Test whether a person's sex and a person's political affiliation are independent.

                      Rep   Dem   Ind   Total
Male      observed    108    92   200     400
          expected     88   124   188
Female    observed    112   218   270     600
          expected    132   186   282
Total                 220   310   470    1000


(1) H0: The variables are independent
    Ha: The variables are not independent

(2) Let α = 0.05.


(3) The test statistic is

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}.$$

(4) We calculate χ² = 22.615.

(5) The p-value is

p-value = χ2cdf(22.615, E99, 2) = 1.228 × 10⁻⁵.

(6) Since the p-value is less than α = 0.05, reject H0 and conclude that sex and political affiliation are not independent.


Assignment

Read Sections 25.1, 25.2, 25.3, 25.4, 25.8.
Apply Your Knowledge: 1, 2, 3, 5, 6.
Check Your Skills: 19, 20, 21, 24, 25.
Exercises 30, 31, 32, 34, 35.
