Lecture 21: Tests for Departures from Neutrality November 9, 2012.

Last Time

Introduction to neutral theory

Molecular clock

Expectations for allele frequency distributions under neutral theory

Today

Sequence data and quantification of variation

Infinite sites model

Nucleotide diversity (π)

Sequence-based tests of neutrality

Ewens-Watterson Test

Tajima’s D

Hudson-Kreitman-Aguade

Synonymous versus Nonsynonymous substitutions

McDonald-Kreitman

Expected Heterozygosity with Mutation-Drift Equilibrium under IAM

At equilibrium (set 4Neμ = θ):

$$f_e = \frac{1}{4N_e\mu + 1} = \frac{1}{\theta + 1}$$

Remembering that H = 1-f:

$$H_e = 1 - f_e = \frac{\theta}{\theta + 1}$$
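As a quick numerical check of the equilibrium relationships above, here is a minimal Python sketch. The function names and the values of Ne and μ are hypothetical, chosen only for illustration.

```python
def expected_homozygosity(ne: float, mu: float) -> float:
    """Equilibrium homozygosity f_e = 1 / (theta + 1) under the infinite alleles model."""
    theta = 4.0 * ne * mu  # population mutation rate, theta = 4 * Ne * mu
    return 1.0 / (theta + 1.0)


def expected_heterozygosity(ne: float, mu: float) -> float:
    """Equilibrium heterozygosity H_e = theta / (theta + 1) = 1 - f_e."""
    theta = 4.0 * ne * mu
    return theta / (theta + 1.0)


# Hypothetical values, for illustration only (theta = 0.4).
ne, mu = 10_000, 1e-5
print(expected_homozygosity(ne, mu), expected_heterozygosity(ne, mu))
```

Because both quantities depend on Ne and μ only through θ, doubling Ne while halving μ leaves them unchanged.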

Allele Frequency Distributions

Neutral theory allows a prediction of the frequency distribution of alleles through the process of birth and demise of alleles through time

Comparison of observed to expected distribution provides evidence of departure from Infinite Alleles model

Depends on f, effective population size, and mutation rate

[Figure: allele frequency distributions. Black: Predicted from Neutral Theory. White: Observed (hypothetical). Hartl and Clark 2007]

Ewens Sampling Formula

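Population mutation rate (index of variability of population):

$$\theta = 4N_e\mu$$

Probability that the next sampled allele is new, given i alleles already sampled:

$$\frac{\theta}{\theta + i}$$

Probability of sampling a new allele on the first sample:

$$\frac{\theta}{\theta + 0} = 1$$

Probability of observing a new allele after sampling one allele:

$$\frac{\theta}{\theta + 1} = H_e$$

Probability of sampling a new allele on the third and fourth samples:

$$\frac{\theta}{\theta + 2} \quad \text{and} \quad \frac{\theta}{\theta + 3}$$

Expected number of different alleles (k) in a sample of 2N alleles is:

$$E(k) = \sum_{i=0}^{2N-1} \frac{\theta}{\theta + i} = 1 + \frac{\theta}{\theta + 1} + \frac{\theta}{\theta + 2} + \dots + \frac{\theta}{\theta + 2N - 1}$$

Example: Expected number of alleles in a sample of 4:

$$E(k) = \sum_{i=0}^{3} \frac{\theta}{\theta + i} = 1 + \frac{\theta}{\theta + 1} + \frac{\theta}{\theta + 2} + \frac{\theta}{\theta + 3}$$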
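The sequential sampling probabilities on this slide can be illustrated with a small simulation. The sketch below is not from the lecture: it assumes the standard urn scheme in which a gene copy that is not new is a copy of one of the previously sampled genes chosen uniformly at random. The function name, seed, and parameter values are hypothetical.

```python
import random
from collections import Counter


def sample_alleles(theta: float, n: int, rng: random.Random) -> Counter:
    """Sample n gene copies one at a time and return the count of each allele.

    With i copies already sampled, the next copy is a new allele with
    probability theta / (theta + i); otherwise it copies the allele of a
    previously sampled gene chosen uniformly at random (assumed urn scheme).
    """
    labels = []       # allele label of each gene copy sampled so far
    next_label = 0
    for i in range(n):
        if rng.random() < theta / (theta + i):
            labels.append(next_label)          # previously unseen allele
            next_label += 1
        else:
            labels.append(rng.choice(labels))  # copy an already sampled gene
    return Counter(labels)


rng = random.Random(1)             # fixed seed so the run is repeatable
theta, n, reps = 1.0, 4, 20_000    # hypothetical illustration values
mean_k = sum(len(sample_alleles(theta, n, rng)) for _ in range(reps)) / reps
expected_k = sum(theta / (theta + i) for i in range(n))
print(mean_k, expected_k)
```

The simulated mean number of distinct alleles should fall close to the analytical sum of the per-draw probabilities, since each draw contributes its own probability of adding a new allele.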

Ewens Sampling Formula

Predicts number of different alleles that should be observed in a given sample size if neutrality prevails under Infinite Alleles Model

Small θ, E(n) approaches 1

Large θ, E(n) approaches 2N

θ can be predicted from number of observed alleles for given sample size

Can also predict expected homozygosity (fe) under this model

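$$E(n) = \sum_{i=0}^{2N-1} \frac{\theta}{\theta + i} = 1 + \frac{\theta}{\theta + 1} + \frac{\theta}{\theta + 2} + \dots + \frac{\theta}{\theta + 2N - 1}$$

where E(n) is the expected number of different alleles in a sample of N diploid individuals, and θ = 4Neμ.

$$f_e = \frac{1}{4N_e\mu + 1} = \frac{1}{\theta + 1}$$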
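To make the points on this slide concrete, the following Python sketch evaluates E(n) and inverts it numerically to estimate θ from an observed number of alleles. Bisection is an assumption of this sketch (any root finder would do, since E(n) increases with θ), and the function names are hypothetical. The usage line borrows the sample size and allele count from the Xanthine dehydrogenase example (15 alleles in 89 chromosomes).

```python
def expected_num_alleles(theta: float, n: int) -> float:
    """E(n): expected number of different alleles among n sampled gene copies."""
    return sum(theta / (theta + i) for i in range(n))


def estimate_theta(k_obs: int, n: int, tol: float = 1e-10) -> float:
    """Find theta such that expected_num_alleles(theta, n) equals k_obs."""
    if not 1 < k_obs < n:
        raise ValueError("need 1 < k_obs < n for a finite positive theta")
    lo, hi = 1e-12, 1.0
    while expected_num_alleles(hi, n) < k_obs:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:                    # bisect on the monotonic function
        mid = 0.5 * (lo + hi)
        if expected_num_alleles(mid, n) < k_obs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


theta_hat = estimate_theta(15, 89)
print(theta_hat, 1.0 / (theta_hat + 1.0))  # theta estimate and the implied f_e
```

The limiting behavior on the slide can be checked directly: a very small θ gives a value near 1, and a very large θ gives a value near the number of gene copies sampled.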

Ewens-Watterson Test

Compares expected homozygosity under the neutral model to expected homozygosity under Hardy-Weinberg equilibrium using observed allele frequencies

Comparison of allele frequency distributions

fe comes from infinite allele model simulations and can be found in tables for given sample sizes and observed allele numbers

$$f_{HW} = \sum_i p_i^2$$
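The Hardy-Weinberg side of the comparison is simple to compute from observed allele counts. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def hw_homozygosity(allele_counts):
    """Expected homozygosity under Hardy-Weinberg: the sum of squared allele frequencies."""
    total = sum(allele_counts)
    return sum((count / total) ** 2 for count in allele_counts)


# Hypothetical counts for 4 alleles observed in 20 chromosomes.
counts = [12, 5, 2, 1]
print(hw_homozygosity(counts))
```

For a fixed number of alleles, this value is smallest when all alleles are equally frequent and grows as one allele comes to dominate the sample.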

Ewens-Watterson Test Example

Drosophila pseudoobscura collected from a winery

Xanthine dehydrogenase alleles

15 alleles observed in 89 chromosomes

fHW = 0.366

Generated fe by simulation: mean 0.168

[Figure: axis labeled fe. Hartl and Clark 2007]

How would you interpret this result?

Most Loci Look Neutral According to Ewens-Watterson Test

[Figure: axes labeled Expected Homozygosity and fe. Hartl and Clark 2007]

DNA Sequence Polymorphisms

DNA sequence is ultimate view of standing genetic variation: no hidden alleles

Is this really true?

What about back mutation?

Signatures of past evolution are contained in DNA sequence

Neutral theory presents null model

Departures due to:

Selection

Demographic events

- Bottlenecks, founder effects
- Population admixture

Sequence Alignment

Necessary first step for comparing sequences within and between species

Many different algorithms

Tradeoff of speed and accuracy

Quantifying Divergence of Sequences

Nucleotide diversity (π) is average number of pairwise differences between sequences

$$\pi = \frac{N}{N-1} \sum_{ij} p_i p_j \pi_{ij}$$

where N is number of sequences in sample, pi and pj are frequency of sequences i and j in the sample, and πij is the proportion of sites that differ between sequences i and j

Sample Calculation of π

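A->B, 1 difference
A->C, 1 difference
B->C, 2 differences

[Figure: alignment of sequences A, B, and C, 35 sites long, with positions marked every 5 sites]

$$\pi = \frac{N}{N-1} \sum_{ij} p_i p_j \pi_{ij} = \frac{3}{2}\left[(0.33)(0.33)(1/35) + (0.33)(0.33)(1/35) + (0.33)(0.33)(2/35)\right] = 0.01867$$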

On average, there are 18.67 polymorphisms per kb between pairs of haplotypes in the population
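The sample calculation can be reproduced with a short Python sketch. The three sequences below are hypothetical stand-ins for the aligned sequences in the figure: only their pairwise differences (1, 1, and 2 out of 35 sites) match the slide. The sketch follows the worked example in counting each unordered pair once; summing over both (i, j) and (j, i) would double the result. It also uses exact frequencies of 1/3 where the slide rounds to 0.33, so it gives about 0.0190 rather than 0.01867.

```python
from itertools import combinations


def proportion_different(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites that differ between two sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)


def nucleotide_diversity(seqs) -> float:
    """pi = N/(N-1) * sum over unordered pairs of p_i * p_j * pi_ij."""
    n = len(seqs)
    p = 1.0 / n  # each sampled sequence has frequency 1/N
    total = sum(p * p * proportion_different(a, b) for a, b in combinations(seqs, 2))
    return n / (n - 1) * total


# Hypothetical 35-site sequences: B and C each differ from A at one site.
seq_a = "A" * 35
seq_b = "A" * 10 + "G" + "A" * 24
seq_c = "A" * 20 + "G" + "A" * 14
print(nucleotide_diversity([seq_a, seq_b, seq_c]))
```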

Tajima’s D Statistic

Infinite Sites Model: each new mutation affects a new site in a sequence

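$$E(\pi) = \frac{\theta}{m}$$

where m is length of sequence, and θ = 4Neμ

Expected number of polymorphic sites in all sequences:

$$E(S) = a_1 \theta$$

$$a_1 = \sum_{i=1}^{n-1} \frac{1}{i}$$

where n is number of different sequences compared

$$\theta_\pi = m\pi \qquad \theta_S = \frac{S}{a_1}$$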

Sample Calculation of θS

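Two polymorphic sites
S=2

[Figure: alignment of sequences A, B, and C, 35 sites long, with positions marked every 5 sites]

$$a_1 = \sum_{i=1}^{n-1} \frac{1}{i} = \frac{1}{1} + \frac{1}{2} = 1.5$$

$$\theta_S = \frac{S}{a_1} = \frac{2}{1.5} = 1.33$$

$$\pi = 0.01867$$

$$\theta_\pi = m\pi = (35)(0.01867) = 0.65$$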
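The arithmetic on this slide is easy to script. The sketch below uses hypothetical function names and takes its inputs (n = 3 sequences, S = 2, m = 35, π = 0.01867) from the sample calculation.

```python
def harmonic_a1(n: int) -> float:
    """a1 = sum of 1/i for i = 1 .. n-1, where n is the number of sequences."""
    return sum(1.0 / i for i in range(1, n))


def theta_s(num_polymorphic_sites: int, n: int) -> float:
    """Estimate of theta from the number of polymorphic sites: S / a1."""
    return num_polymorphic_sites / harmonic_a1(n)


def theta_pi(pi_per_site: float, m: int) -> float:
    """Estimate of theta from nucleotide diversity: m * pi."""
    return m * pi_per_site


print(harmonic_a1(3), theta_s(2, 3), theta_pi(0.01867, 35))
```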

Tajima’s D Statistic

Two different ways of estimating same parameter:

$$\theta_\pi = m\pi \qquad \theta_S = \frac{S}{a_1}$$

Deviation of these two indicates deviation from neutral expectations

$$d = \theta_\pi - \theta_S$$

$$D = \frac{d}{\sqrt{V(d)}}$$

where V(d) is variance of d
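The slides do not spell out V(d). The sketch below assumes the variance estimator from Tajima (1989), which is built from S and constants that depend only on the sample size n. The function name and the input values in the usage line are hypothetical. Under this estimator the variance is zero for fewer than four sequences, so the function refuses such samples.

```python
import math


def tajimas_d(theta_pi: float, num_polymorphic_sites: int, n: int) -> float:
    """Tajima's D from theta_pi (mean pairwise differences per sequence), S, and n."""
    s = num_polymorphic_sites
    if n < 4:
        raise ValueError("the estimated variance of d is zero for n < 4")
    if s < 1:
        raise ValueError("D is undefined without polymorphic sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    d = theta_pi - s / a1                      # theta_pi - theta_S
    variance = e1 * s + e2 * s * (s - 1)       # assumed estimator of V(d)
    return d / math.sqrt(variance)


# Hypothetical sample: 10 sequences, 16 polymorphic sites, theta_pi = 3.89.
print(tajimas_d(3.89, 16, 10))
```

In this hypothetical sample θπ is smaller than θS, so D is negative.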

Tajima’s D Expectations

D=0: Neutrality

D>0

Balancing Selection: Divergence of alleles (π) increases

OR

Bottleneck: S decreases

D<0

Purifying or Positive Selection: Divergence of alleles decreases

OR

Population expansion: Many low frequency alleles cause low average divergence

$$d = \theta_\pi - \theta_S$$

Balancing Selection

[Figure: balancing selection, showing a ‘balanced’ mutation and a neutral mutation. Slide adapted from Yoav Gilad]

Should increase nucleotide diversity (π)

Decreases polymorphic sites (S) initially.

D>0

$$d = \theta_\pi - \theta_S$$

Recent Bottleneck

Rare alleles are lost

Polymorphic sites (S) more severely affected than nucleotide diversity (π)

D>0

[Figure label: Standard neutral model]

$$d = \theta_\pi - \theta_S$$

Positive Selection and Purifying Selection

[Figure: time course with labels "sweep" and "recovery", showing an advantageous mutation and a neutral mutation. Slide adapted from Yoav Gilad]

Should decrease both nucleotide diversity (π) and polymorphic sites (S) initially.

S recovers due to mutation

π recovers slowly: insensitive to rare alleles

D<0

$$d = \theta_\pi - \theta_S$$

Rapid Population Growth will also result in an excess of rare alleles even for neutral loci

[Figure: standard neutral model (often two main haplotypes, some rare alleles) compared with a rapid population size increase through time (most alleles are rare). Slide adapted from Yoav Gilad]

$$\theta = 4N_e\mu$$

Most alleles are rare

Nucleotide diversity (π) depressed

Polymorphic sites (S) unchanged or even enhanced: 4Neμ is large

D<0

$$d = \theta_\pi - \theta_S$$

How do we distinguish these two forms of divergence (selection vs demography)?

Hudson-Kreitman-Aguade Test

Divergence between species should be of same magnitude as variation within species

Provides a correction factor for mutation rates at different sites

Complex goodness of fit test

Perform test for loci under selection and supposedly neutral loci

Hudson-Kreitman-Aguade (HKA) test

Polymorphism: Variation within species
Divergence: Variation between species

              Neutral Locus   Test Locus A
Polymorphism  8               3
Divergence    20              8

8/20 ≈ 3/8

Slide adapted from Yoav Gilad

Hudson-Kreitman-Aguade (HKA) test

              Neutral Locus   Test Locus B
Polymorphism  8               3
Divergence    20              19

8/20 >> 3/19

Conclusion: polymorphism lower than expected in Test Locus B: Selective sweep?

Slide adapted from Yoav Gilad
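The ratio comparison for Test Locus B can be sketched in Python. This is not the HKA test itself, which is described above as a complex goodness-of-fit test. As a stand-in, the sketch applies a two-sided Fisher exact test to the 2x2 table of counts, which is a simplification chosen for illustration; the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Counts from the slide: rows are polymorphism and divergence,
# columns are the neutral locus and Test Locus B.
poly_neutral, poly_test = 8, 3
div_neutral, div_test = 20, 19
print(poly_neutral / div_neutral, poly_test / div_test)
print(fisher_exact_two_sided(poly_neutral, poly_test, div_neutral, div_test))
```

With counts this small, a large difference in ratios may still be compatible with chance, which is one reason a formal test is preferred over comparing the ratios by eye.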

http://www.nsf.gov/news/mmg/media/images/corn-and-teosinte_h1.jpg

Mauricio 2001; Nature Reviews Genetics 2, 376

[Figure: Teosinte, Maize, and Maize w/TBR mutation]

HKA Example: Teosinte Branched

Lab exercise: test the Teosinte-Branched gene for a signature of purifying selection in maize compared to its Teosinte relative

Compare to patterns of polymorphism and diversity in the Alcohol Dehydrogenase gene