Chi squared test



Page 1: Chi squared test

KARL PEARSON (1857-1936)

British mathematician, ‘father’ of modern statistics and a pioneer of eugenics!

(Pearson’s) chi-squared test

Page 2: Chi squared test

Chi-squared (χ²) test

• This test compares measurements relating to the frequency of individuals in defined categories e.g. the numbers of white and purple flowers in a population of pea plants.

• Chi-squared is used to test if the observed frequency fits the frequency you expected or predicted.

Page 3: Chi squared test

How do we calculate the expected frequency?

• You might expect the observed frequency of your data to match a specific ratio, e.g. a 3:1 ratio of phenotypes in a genetic cross.

• Or you may predict a homogeneous distribution of individuals in an environment, e.g. numbers of daisies counted in quadrats on a field.

Note: In some cases you might expect the observed frequencies to match the expected, in others you might hope for a difference between them.
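Both ways of predicting an expected frequency come down to sharing the total count across the categories in a fixed ratio. A minimal Python sketch of that arithmetic follows. The function name `expected_from_ratio` and the totals of 400 and 60 are hypothetical, chosen only to illustrate the calculation.

```python
def expected_from_ratio(ratio, total):
    """Split `total` observations across categories in proportion to `ratio`."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


# Hypothetical totals, used only to show the arithmetic.
print(expected_from_ratio([3, 1], 400))           # 3:1 ratio of phenotypes
print(expected_from_ratio([1, 1, 1, 1, 1], 60))   # even spread over 5 quadrats
```

An even (homogeneous) distribution is simply the special case where every part of the ratio is 1.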

Page 4: Chi squared test

Example 1: GENETICS

Comparing the observed frequency of different types of maize grains with the expected ratio calculated using a Punnett square.

Page 5: Chi squared test

The photo shows four different phenotypes for maize grain, as follows:

Purple & Smooth (A), Purple & Shrunken (B), Yellow & Smooth (C) and Yellow & Shrunken (D)

Page 6: Chi squared test

The Punnett square below shows the expected ratio of phenotypes from a cross between two maize plants heterozygous for both traits (PpSs x PpSs), each producing four types of gamete.

Gametes   PS     Ps     pS     ps
PS        PPSS   PPSs   PpSS   PpSs
Ps        PPSs   PPss   PpSs   Ppss
pS        PpSS   PpSs   ppSS   ppSs
ps        PpSs   Ppss   ppSs   ppss

A : B : C : D = 9 : 3 : 3 : 1
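The 9 : 3 : 3 : 1 ratio can be checked by counting the phenotype of each of the 16 cells in the Punnett square. The sketch below does this in Python. It assumes P (purple) is dominant over p (yellow) and S (smooth) is dominant over s (shrunken), which is what the ratio implies. The function and variable names are illustrative.

```python
from collections import Counter
from itertools import product

# Gametes produced by each parent, as listed in the Punnett square.
GAMETES = ["PS", "Ps", "pS", "ps"]


def phenotype(gamete_a, gamete_b):
    """Return the grain phenotype for one cell of the Punnett square.

    Assumes P (purple) is dominant over p (yellow) and S (smooth) is
    dominant over s (shrunken).
    """
    colour = "Purple" if "P" in (gamete_a[0], gamete_b[0]) else "Yellow"
    texture = "Smooth" if "S" in (gamete_a[1], gamete_b[1]) else "Shrunken"
    return f"{colour} & {texture}"


# Tally the phenotype of all 16 cells (4 gametes x 4 gametes).
counts = Counter(phenotype(a, b) for a, b in product(GAMETES, repeat=2))

for name, count in counts.most_common():
    print(f"{name}: {count}")
```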

Page 7: Chi squared test

What is the null hypothesis (H0)?

H0 = there is no statistically significant difference between the observed frequency of maize grains and the expected frequency (the 9:3:3:1 ratio).

HA = there is a statistically significant difference between the observed frequency of maize grains and the expected frequency.

If the value for χ² exceeds the critical value (P = 0.05), then you can reject the null hypothesis.

Page 8: Chi squared test

Calculating χ²

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

O = the observed results

E = the expected (or predicted) results

The sum is taken over all categories.
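The calculation is the same for every example: work out (O – E)² / E for each category and add the results together. A minimal Python sketch follows. The function name `chi_squared` and the two-category counts are hypothetical and serve only to show the call.

```python
def chi_squared(observed, expected):
    """Sum (O - E)^2 / E over matching categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one value per category")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for two categories, used only to show the call.
print(round(chi_squared([290, 110], [300, 100]), 2))
```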

Page 9: Chi squared test

Phenotype   O     E (9:3:3:1)   O - E     (O - E)²   (O - E)² / E
A           271   243.56        27.44     752.82     3.09
B           73    81.19         -8.19     67.04      0.83
C           63    81.19         -18.19    330.79     4.07
D           26    27.06         -1.06     1.13       0.04
Total       433   433                                χ² = 8.03

Expected values are calculated from the total without rounding, e.g. 433 x 9/16 = 243.56.

Page 10: Chi squared test

Compare your calculated value of χ² with the critical value in your stats table.

Our value of χ² = 8.03

Degrees of freedom = no. of categories - 1 = 3

D.F.   Critical value (P = 0.05)
1      3.84
2      5.99
3      7.82
4      9.49
5      11.07

Our value for χ² exceeds the critical value, so we can reject the null hypothesis.

There is a significant difference between our expected and observed ratios, i.e. they are a poor fit.
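As a cross-check on the arithmetic, here is a minimal Python sketch using the observed maize counts and the 9:3:3:1 ratio. It keeps the expected counts unrounded, so its total can differ slightly from a hand calculation that rounds E to whole numbers. The variable names are illustrative, and the critical values are copied from the table in the slides.

```python
# Observed grain counts for the four phenotypes, and the expected ratio.
observed = {"A": 271, "B": 73, "C": 63, "D": 26}
ratio = {"A": 9, "B": 3, "C": 3, "D": 1}

total = sum(observed.values())
parts = sum(ratio.values())

# Expected count per phenotype, left unrounded.
expected = {k: total * ratio[k] / parts for k in observed}

chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
df = len(observed) - 1

# Critical values at P = 0.05, as given in the slides.
CRITICAL_VALUES_P05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07}
reject_h0 = chi2 > CRITICAL_VALUES_P05[df]

print(f"chi-squared = {chi2:.2f}, df = {df}, reject H0: {reject_h0}")
```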

Page 11: Chi squared test

Example 2: ECOLOGY

• One section of a river was trawled and four species of fish counted and frequencies recorded.

• The expected frequency assumes equal numbers of each of the four fish species in the sample.

Page 12: Chi squared test

What is the null hypothesis (H0)?

H0 = there is no statistically significant difference between the observed frequency of fish species and the expected frequency.

HA = there is a statistically significant difference between the observed frequency of fish species and the expected frequency.

If the value for χ² exceeds the critical value (P = 0.05), then you can reject the null hypothesis.

Page 13: Chi squared test

Calculating χ²

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

O = the observed results

E = the expected (or predicted) results

Page 14: Chi squared test

Species   O    E    O - E   (O - E)²   (O - E)² / E
Rudd      15   10   5       25         2.5
Roach     15   10   5       25         2.5
Dace      4    10   -6      36         3.6
Bream     6    10   -4      16         1.6
Total     40   40                      χ² = 10.2

Page 15: Chi squared test

Compare your calculated value of χ² with the critical value in your table of critical values.

Our value of χ² = 10.2

Degrees of freedom = no. of categories - 1 = 3

D.F.   Critical value (P = 0.05)
1      3.84
2      5.99
3      7.82
4      9.49
5      11.07

Our value for χ² exceeds the critical value, so we can reject the null hypothesis.

There is a significant difference between our expected and observed frequencies of fish species.
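The fish example can be reproduced in a few lines of Python. The sketch below keeps the contribution of each species separate, which shows where the difference comes from, and then looks up the critical value for the degrees of freedom. The variable names are illustrative, and the critical values are copied from the table in the slides.

```python
# Critical values at P = 0.05, as given in the slides.
CRITICAL_VALUES_P05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07}

observed = {"Rudd": 15, "Roach": 15, "Dace": 4, "Bream": 6}

# Under H0, equal numbers of each species are expected.
expected_each = sum(observed.values()) / len(observed)

# Per-species contribution (O - E)^2 / E, the last column of the table.
contributions = {
    species: (count - expected_each) ** 2 / expected_each
    for species, count in observed.items()
}

chi2 = sum(contributions.values())
df = len(observed) - 1
reject_h0 = chi2 > CRITICAL_VALUES_P05[df]

print(contributions)
print(f"chi-squared = {chi2:.1f}, df = {df}, reject H0: {reject_h0}")
```

Dace contributes the most to the total (3.6), so the shortage of dace is the largest single departure from equal numbers.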

Page 16: Chi squared test

Example 3: ECOLOGY

• Do two plant species, A and B, grow independently of one another?

• Quadrats were sampled to record whether each plant species was present or absent.

• The expected frequencies are calculated from the row and column totals, assuming the two species occur independently of each other.

Page 17: Chi squared test

Observed values

                      Species A
                      Present   Absent   Totals
Species B   Present   111       9        120
            Absent    71        43       114
            Totals    182       52       234

Page 18: Chi squared test

Expected values

                      Species A
                      Present                 Absent                 Totals
Species B   Present   182/234 x 120 = 93.3    52/234 x 120 = 26.7    120
            Absent    182/234 x 114 = 88.7    52/234 x 114 = 25.3    114
            Totals    182                     52                     234
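Each expected value is the row total multiplied by the column total, divided by the grand total. A short Python sketch of that step follows, using the observed quadrat counts. The variable names are illustrative.

```python
# Observed quadrat counts.
# Rows: species B present / absent. Columns: species A present / absent.
observed = [
    [111, 9],
    [71, 43],
]

row_totals = [sum(row) for row in observed]        # [120, 114]
col_totals = [sum(col) for col in zip(*observed)]  # [182, 52]
n = sum(row_totals)                                # 234

# Expected count for each cell = row total x column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(value, 1) for value in row])
```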

Page 19: Chi squared test

So…

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

Page 20: Chi squared test

• Null hypothesis: if the plants grow independently of each other, there should be no statistically significant difference between the frequency of species A where B is present and where B is absent, and vice versa.
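The slides do not carry this example through to a final value, so the sketch below does so in Python from the observed quadrat counts. It uses a shortcut formula for 2 x 2 tables that is not in the slides, n(ad - bc)² divided by the product of the four row and column totals. With no continuity correction this gives the same value as summing (O – E)² / E over the four cells, and the code calculates it both ways to confirm that. The variable names are illustrative.

```python
# Observed quadrat counts.
# Rows: species B present / absent. Columns: species A present / absent.
a, b = 111, 9
c, d = 71, 43

n = a + b + c + d
row1, row2 = a + b, c + d
col1, col2 = a + c, b + d

# Shortcut for a 2 x 2 table (no continuity correction).
chi2_shortcut = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

# The long way: sum (O - E)^2 / E over the four cells,
# with E = row total x column total / n.
cells = [(a, row1, col1), (b, row1, col2), (c, row2, col1), (d, row2, col2)]
chi2_long = sum((o - r * k / n) ** 2 / (r * k / n) for o, r, k in cells)

df = (2 - 1) * (2 - 1)
CRITICAL_P05_DF1 = 3.84  # from the table of critical values in the slides

print(f"shortcut: {chi2_shortcut:.2f}, cell by cell: {chi2_long:.2f}")
print(f"df = {df}, reject H0: {chi2_shortcut > CRITICAL_P05_DF1}")
```

Both routes give a value of about 30.9, well above the critical value of 3.84 for one degree of freedom, which suggests that the two species do not occur independently of each other in these quadrats.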

Page 21: Chi squared test

Example 4: CONTINGENCY TABLES

You can use contingency tables to calculate expected frequencies when the relationship between two quantities is being investigated.

In this example we will look at the incidence of colour blindness in both males and females.

Page 22: Chi squared test

What is the null hypothesis (H0)?

H0 = there is no statistically significant difference between the frequency of colour blindness in males and in females.

HA = there is a statistically significant difference between the frequency of colour blindness in males and in females.

If the value for χ² exceeds the critical value (P = 0.05), then you can reject the null hypothesis.

Page 23: Chi squared test

Observed frequencies   Males   Females
Colour blind           56      14
Not colour blind       754     536

Expected cell frequency:

$$E = \frac{\text{row total} \times \text{column total}}{n}$$

e.g. the expected frequency for colour-blind males:

$$E = \frac{(56 + 14) \times (56 + 754)}{1360} \approx 41.7$$

which is 42 to the nearest whole number.

Page 24: Chi squared test

Observed:              Males   Females
Colour blind           56      14
Not colour blind       754     536

Expected:              Males   Females
Colour blind           41.7    28.3
Not colour blind       768.3   521.7

(O – E)² / E:          Males   Females
Colour blind           4.91    7.23
Not colour blind       0.27    0.39

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 4.91 + 7.23 + 0.27 + 0.39 = 12.80$$

Page 25: Chi squared test

Compare your calculated value of χ² with the critical value in your table of critical values.

Our value of χ² = 12.80

Degrees of freedom = (2 rows - 1) x (2 cols - 1) = 1

D.F.   Critical value (P = 0.05)
1      3.84
2      5.99
3      7.82
4      9.49
5      11.07

Our value for χ² exceeds the critical value, so we can reject the null hypothesis.

There is a significant difference between our expected and observed frequencies.

The fraction of males with colour blindness is greater than that in females. The difference cannot be attributed to chance alone.
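The whole contingency-table procedure can be sketched as one short Python function that works for any number of rows and columns. The function name `chi_squared_contingency` is illustrative and is not a library call. It keeps the expected counts unrounded, so its result can differ slightly from a hand calculation that rounds each expected value to a whole number.

```python
def chi_squared_contingency(table):
    """Return (chi-squared, degrees of freedom) for a table of observed counts.

    `table` is a list of rows. Each expected count is
    row total x column total / n.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Rows: colour blind / not colour blind. Columns: males / females.
observed = [[56, 14], [754, 536]]
chi2, df = chi_squared_contingency(observed)

CRITICAL_P05_DF1 = 3.84  # from the table of critical values in the slides
print(f"chi-squared = {chi2:.2f}, df = {df}")
print("reject H0:", chi2 > CRITICAL_P05_DF1)
```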