Microbial diversity measurement lecture - UNAM

Species Divergence and the Measurement of Microbial Diversity

Cathy Lozupone

University of Colorado, Boulder.
Washington University, St Louis.

Outline

• Classes of diversity measures
  – α vs β diversity
  – Quantitative vs Qualitative
  – Divergence/phylogenetic-based diversity vs Taxon/species
• Phylogenetic diversity measures that:
  – Compare the total amount of diversity between samples.
    • e.g. Is a polluted lake less diverse than a pristine one?
  – Test if samples have significantly different membership.
    • e.g. Do gut samples from HIV positive people have different microbes than those from healthy people?
  – Identify environmental variables associated with differences between many samples.
    • e.g. Do pH, organic carbon, soil type, etc. correlate with variability across many soils?
• These measures are not just for microbes!
• Lozupone, C.A. and R. Knight (2008) Species divergence and the measurement of microbial diversity. FEMS Microbiol Rev. 1-22.

How do we describe and compare diversity?

• α Diversity:
  – “How many species are in a sample?”
    • (e.g. 6 colors in A and 6 in B)
  – e.g.: Are polluted environments less diverse than pristine?
• β Diversity:
  – “How many species are shared between samples?”
    • (e.g. 2 shared colors between A and B)
  – e.g.: Does the microbiota differ with different disease states?

[Figure: samples A and B]

Quantitative versus Qualitative measures

• Qualitative: Considers presence/absence only.
  – α: How many species are in a sample?
    • e.g.: 6 colors in both A and B.
  – β: How many species are shared between samples?
    • e.g.: A and B are identical because the same colors are present in both.
• Quantitative: Also considers relative abundance.
  – α: Accounts for “evenness”:
    • e.g. B, where the population is evenly distributed across the 6 species, is more diverse than A, where all species are present but red dominates.
  – β: Samples will be considered more similar if the same species are numerically dominant versus rare.
    • e.g. B and A no longer look identical because of differences in abundance.

[Figure: samples A and B]
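The contrast between qualitative and quantitative measures can be sketched with taxon-based indices. This is a minimal illustration, not the lecture's own method: the species counts are hypothetical, and the Shannon index, Jaccard distance and Bray-Curtis dissimilarity are stand-ins chosen here for "quantitative α", "qualitative β" and "quantitative β"; the slides do not name them.

```python
import math

# Hypothetical counts: both samples contain the same six species ("colors").
# In A red dominates; in B the population is evenly distributed.
sample_a = {"red": 15, "blue": 1, "green": 1, "yellow": 1, "purple": 1, "orange": 1}
sample_b = {"red": 5, "blue": 5, "green": 5, "yellow": 5, "purple": 5, "orange": 5}


def richness(counts):
    """Qualitative alpha diversity: number of species present."""
    return sum(1 for n in counts.values() if n > 0)


def shannon(counts):
    """Quantitative alpha diversity: Shannon index (natural log)."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def jaccard_distance(x, y):
    """Qualitative beta diversity: 1 - (shared species / all species)."""
    present_x = {s for s, n in x.items() if n > 0}
    present_y = {s for s, n in y.items() if n > 0}
    return 1 - len(present_x & present_y) / len(present_x | present_y)


def bray_curtis(x, y):
    """Quantitative beta diversity computed on relative abundances."""
    total_x, total_y = sum(x.values()), sum(y.values())
    species = set(x) | set(y)
    return 0.5 * sum(abs(x.get(s, 0) / total_x - y.get(s, 0) / total_y) for s in species)
```

With these counts, richness is 6 for both samples and the Jaccard distance is 0, so the qualitative measures see A and B as identical. The Shannon index is higher for B than for A and the Bray-Curtis dissimilarity is greater than 0, so the quantitative measures separate them.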

What is a phylogenetic diversity measure?

• α Diversity:
  – Taxon: “How many species are in a sample?”
  – Phylogenetic: “How much phylogenetic divergence is in a sample?”
    • (e.g. B is individually more diverse than A: its colors are more divergent)
• β Diversity:
  – Taxon: “How many species are shared between samples?”
  – Phylogenetic: “How much phylogenetic distance is shared between samples?”
    • (only related colors from B are in A)

[Figure: samples A and B]

Advantages of phylogenetic techniques.

• Phylogenetically related organisms are more likely to have similar roles in a community.
• Taxon-based methods assume a “star phylogeny,” where all relationships between taxa are ignored.

Easily applied to microbial community sequence data.

Most (>99%) microbes cannot be cultured.

1. Extract DNA from environmental samples.
2. PCR amplify SSU rRNA gene.
3. Generate sequences:
   – Sanger
   – Pyrosequencing
4. Diversity evaluation.
   – Taxon (species)-based: Group sequences into OTUs based on % identity. 97% id for species.
   – Phylogeny-based

Majority of phylogenetic diversity is microbial.

Adapted from Pace 1997 Science 276:734-740.

Phylogenetic Diversity Measures

• α Diversity
  – Phylogenetic Diversity (PD)
  – Compare the total amount of diversity between samples.
• β Diversity
  – Test if samples have significantly different membership.
    • UniFrac Significance
    • P test
    • LibShuff
  – Identify environmental variables associated with differences between many samples.
    • Unweighted and Weighted UniFrac
    • DPCoA
  – Compare local and regional diversity
    • Gain in PD (G)
    • NRI-NTI

Phylogenetic Diversity (PD)

• Sum of branches leading to sequences in a sample.
• Qualitative α diversity.
• Sample with taxa spanning the most branch length in this tree represents the most phylogenetically and perhaps functionally divergent community.

Faith, D.P. (1992) Conservation evaluation and phylogenetic diversity. Biological Conservation 61, 1-10.
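A minimal sketch of the PD calculation, assuming a hypothetical four-tip tree stored as one (branch length, tips below the branch) pair per branch. The tree, the tip names and the name `faith_pd` are illustrative only. This version counts every branch from the root down to a sampled tip.

```python
# Hypothetical toy tree ((a:1,b:1):1,(c:1,d:2):0.5), stored as one
# (branch_length, tips_below_branch) pair per branch.
BRANCHES = [
    (1.0, {"a"}), (1.0, {"b"}), (1.0, {"a", "b"}),
    (1.0, {"c"}), (2.0, {"d"}), (0.5, {"c", "d"}),
]


def faith_pd(sample_tips, branches=BRANCHES):
    """Sum the lengths of all branches that lead to at least one sampled tip."""
    sample_tips = set(sample_tips)
    return sum(length for length, tips in branches if tips & sample_tips)
```

On this tree a sample holding tips a and b has PD 3.0, while a sample holding a and c has PD 3.5: the same number of taxa, but spread over more branch length.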

PD Rarefaction

• Plot the amount of branch length against the # of observations.
• Shape of curve allows for estimating how far we are from sampling all of the phylogenetic diversity.
• Allows for comparison of phylogenetic diversity between samples.

Eckburg, P.B., et al. (2005) Diversity of the human intestinal microbial flora. Science 308, 1635-1638.
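A rarefaction curve of this kind could be computed roughly as follows. This is a sketch under stated assumptions: the toy tree, the list of observations, the number of replicates and the fixed seed are all hypothetical, and subsampling is done without replacement.

```python
import random

# Hypothetical toy tree ((a:1,b:1):1,(c:1,d:2):0.5), stored as one
# (branch_length, tips_below_branch) pair per branch.
BRANCHES = [
    (1.0, {"a"}), (1.0, {"b"}), (1.0, {"a", "b"}),
    (1.0, {"c"}), (2.0, {"d"}), (0.5, {"c", "d"}),
]


def faith_pd(sample_tips, branches=BRANCHES):
    """Sum the lengths of all branches that lead to at least one sampled tip."""
    sample_tips = set(sample_tips)
    return sum(length for length, tips in branches if tips & sample_tips)


def pd_rarefaction(observations, branches=BRANCHES, n_reps=100, seed=0):
    """Mean PD after subsampling 1..N observations without replacement."""
    rng = random.Random(seed)
    curve = []
    for depth in range(1, len(observations) + 1):
        total = 0.0
        for _ in range(n_reps):
            total += faith_pd(rng.sample(observations, depth), branches)
        curve.append(total / n_reps)
    return curve


# Hypothetical sample: 8 sequence observations, dominated by tip "a".
observations = ["a", "a", "a", "a", "b", "b", "c", "d"]
curve = pd_rarefaction(observations)
```

Plotting `curve` against depths 1 to 8 gives the rarefaction curve. A curve that is still climbing at the final depth suggests that more branch length remains to be sampled.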

Phylogenetic β diversity: How is diversity partitioned among samples?

• Do two samples contain significantly different microbial populations?
• Can we see broad trends that relate many samples and explain them in terms of environmental factors?

Unique Fraction (UniFrac) metric

• Qualitative phylogenetic β diversity.
• Distance = fraction of the total branch length that is unique to any particular environment.

Lozupone and Knight, 2005, Appl Environ Microbiol 71:8228
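The definition above can be sketched for two environments as follows. The toy tree and tip names are hypothetical, and the function is an illustration of the stated definition rather than the UniFrac software itself.

```python
# Hypothetical toy tree ((a:1,b:1):1,(c:1,d:2):0.5), stored as one
# (branch_length, tips_below_branch) pair per branch.
BRANCHES = [
    (1.0, {"a"}), (1.0, {"b"}), (1.0, {"a", "b"}),
    (1.0, {"c"}), (2.0, {"d"}), (0.5, {"c", "d"}),
]


def unweighted_unifrac(tips_a, tips_b, branches=BRANCHES):
    """Fraction of the observed branch length that leads to only one environment."""
    tips_a, tips_b = set(tips_a), set(tips_b)
    unique = shared = 0.0
    for length, tips in branches:
        in_a, in_b = bool(tips & tips_a), bool(tips & tips_b)
        if in_a and in_b:
            shared += length      # branch has descendants in both environments
        elif in_a or in_b:
            unique += length      # branch has descendants in exactly one
    total = unique + shared
    return unique / total if total else 0.0
```

Two environments with identical membership score 0.0, and two that occupy entirely separate parts of the tree, such as {a, b} versus {c, d}, score 1.0.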

Phylogenetic (P) Test

• The number of changes between states (samples) required to explain the distribution of sequences on the tree (Fitch parsimony).
• Sensitive to tree topology but not to branch lengths.

Martin, A.P. (2002) Phylogenetic approaches for describing and comparing the diversity of microbial communities. Appl Environ Microbiol 68, 3673-3682.
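The parsimony count behind the P test can be sketched with the Fitch algorithm on a bifurcating tree. The nested-tuple tree encoding, the tip names and the environment labels X and Y are assumptions made for this illustration. Only topology is used, so no branch lengths appear.

```python
def fitch_changes(tree, env_of_tip):
    """Minimum number of environment changes needed on a bifurcating tree.

    tree is a nested tuple of tip names, e.g. (("a", "b"), ("c", "d")).
    env_of_tip maps each tip name to the environment it was sampled from.
    """
    changes = 0

    def states(node):
        nonlocal changes
        if isinstance(node, str):        # tip: its state is its environment
            return {env_of_tip[node]}
        left, right = (states(child) for child in node)
        common = left & right
        if common:                       # children agree on at least one state
            return common
        changes += 1                     # children disagree: one change needed
        return left | right

    states(tree)
    return changes


# Hypothetical example: four sequences from two environments, X and Y.
toy_tree = (("a", "b"), ("c", "d"))
clustered = {"a": "X", "b": "X", "c": "Y", "d": "Y"}
interleaved = {"a": "X", "b": "Y", "c": "X", "d": "Y"}
```

Here `clustered` needs a single change, because each environment forms its own clade, while `interleaved` needs two. Fewer changes indicate that sequences group on the tree by environment.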

“Is the phylogenetic diversity significantly different between samples?”

• Monte Carlo simulations: randomly permute the data (environment assignments) and determine how often the random data has a more extreme value than the real data.
• P-values:
  – P-test: fraction of random trees that have fewer parsimony changes than the real tree.
  – UniFrac: fraction of random trees that have more unique branch length than the real tree.

UniFrac Website: http://bmf.colorado.edu/unifrac/
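A rough sketch of the Monte Carlo procedure for the UniFrac statistic, assuming a hypothetical toy tree in which each tip is one sequence labeled with the environment it came from. The number of permutations, the fixed seed and the choice to count ties as "at least as extreme" are assumptions of this sketch.

```python
import random

# Hypothetical toy tree ((a:1,b:1):1,(c:1,d:2):0.5), stored as one
# (branch_length, tips_below_branch) pair per branch.
BRANCHES = [
    (1.0, {"a"}), (1.0, {"b"}), (1.0, {"a", "b"}),
    (1.0, {"c"}), (2.0, {"d"}), (0.5, {"c", "d"}),
]


def unique_fraction(env_of_tip, branches=BRANCHES):
    """Fraction of branch length whose descendant tips share a single environment."""
    unique = total = 0.0
    for length, tips in branches:
        envs = {env_of_tip[t] for t in tips if t in env_of_tip}
        if envs:
            total += length
            if len(envs) == 1:
                unique += length
    return unique / total if total else 0.0


def permutation_p_value(env_of_tip, n_perm=999, seed=0):
    """Shuffle environment labels across tips and compare with the real value."""
    rng = random.Random(seed)
    tips = sorted(env_of_tip)
    labels = [env_of_tip[t] for t in tips]
    observed = unique_fraction(env_of_tip)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        # Ties are counted as at least as extreme (an assumption of this sketch).
        if unique_fraction(dict(zip(tips, labels))) >= observed:
            extreme += 1
    return observed, extreme / n_perm


observed, p_value = permutation_p_value({"a": "X", "b": "X", "c": "Y", "d": "Y"})
```

With only four tips the p-value cannot be small, since about a third of all label shuffles reproduce the observed grouping. Real datasets with many sequences per environment give the test far more resolution.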

LibShuff

• CX: fraction of sequences in X that are not singletons after grouping through a range of sequence distances.
• CXY: fraction in X that are also in Y.
• Cramer-von Mises statistic: distance between 2 curves.
• Significance with Monte Carlo.
• Comparison of bacteria in two beetle species.

Singleton DR, Furlong MA, Rathbun SL & Whitman WB (2001). Appl Environ Microbiol 67: 4374-4376.

Clustering with the UniFrac Algorithm

Can we see broad trends that relate many samples and explain them in terms of environmental factors?

What types of environments have similar phylogenetic diversity?

• Temperature: 0-100°C
• pH: 1-12
• Nutrient availability: oligotrophic to eutrophic
• Pressure: 1-200 atm

Lozupone CA & Knight R (2007) Global patterns in bacterial diversity. Proc Natl Acad Sci U S A 104: 11436-11440.

Salinity is the most important factor

[Figure panels: PCoA of UniFrac distance matrix; hierarchical clustering (UPGMA) of the same UniFrac distance matrix]
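The hierarchical clustering step can be sketched as average-linkage (UPGMA) clustering of a pairwise distance matrix. The sample names and distance values below are hypothetical, not taken from the study, and the PCoA ordination is not covered by this sketch.

```python
from itertools import combinations


def upgma(dist):
    """Average-linkage (UPGMA) clustering.

    dist maps frozenset({x, y}) to the distance between samples x and y.
    Returns a nested tuple recording the order in which clusters were joined.
    """
    d = dict(dist)
    size = {name: 1 for pair in dist for name in pair}
    while len(size) > 1:
        # Join the two closest clusters (sorted by repr for a stable order).
        x, y = min(
            combinations(sorted(size, key=repr), 2),
            key=lambda pair: d[frozenset(pair)],
        )
        merged = (x, y)
        for other in size:
            if other not in (x, y):
                # Size-weighted mean keeps every original sample equally weighted.
                d[frozenset((merged, other))] = (
                    size[x] * d[frozenset((x, other))]
                    + size[y] * d[frozenset((y, other))]
                ) / (size[x] + size[y])
        size[merged] = size.pop(x) + size.pop(y)
    return next(iter(size))


# Hypothetical UniFrac distances between two saline and two freshwater samples.
unifrac_dist = {
    frozenset(("saline1", "saline2")): 0.20,
    frozenset(("fresh1", "fresh2")): 0.30,
    frozenset(("saline1", "fresh1")): 0.80,
    frozenset(("saline1", "fresh2")): 0.90,
    frozenset(("saline2", "fresh1")): 0.85,
    frozenset(("saline2", "fresh2")): 0.80,
}
tree = upgma(unifrac_dist)
```

With these made-up distances the two saline samples join first, then the two freshwater samples, and the two groups meet only at the root.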

Qualitative vs Quantitative measures of Phylogenetic β Diversity

• Qualitative:
  – Unweighted UniFrac
  – Detects factors restrictive for microbial growth.
  – High temperature, low pH, founder effects.
• Quantitative:
  – Weighted UniFrac, DPCoA.
  – Detects transient changes.
  – Seasonal changes, nutrient availability, response to pollution.
• Yield different, complementary results and applying both to same data can provide insight into nature of community changes.

Weighted UniFrac

[Figure panels: Qualitative; Quantitative]

Lozupone et al., 2007. Appl Environ Microbiol 73:1576
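A minimal sketch of the raw (unnormalized) weighted UniFrac calculation: each branch length is weighted by the difference between the fractions of each community's sequences that descend from that branch. The toy tree and the sequence counts are hypothetical.

```python
# Hypothetical toy tree ((a:1,b:1):1,(c:1,d:2):0.5), stored as one
# (branch_length, tips_below_branch) pair per branch.
BRANCHES = [
    (1.0, {"a"}), (1.0, {"b"}), (1.0, {"a", "b"}),
    (1.0, {"c"}), (2.0, {"d"}), (0.5, {"c", "d"}),
]


def weighted_unifrac(counts_a, counts_b, branches=BRANCHES):
    """Sum over branches of length * |fraction of A below - fraction of B below|."""
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    distance = 0.0
    for length, tips in branches:
        frac_a = sum(counts_a.get(t, 0) for t in tips) / total_a
        frac_b = sum(counts_b.get(t, 0) for t in tips) / total_b
        distance += length * abs(frac_a - frac_b)
    return distance


# Same membership (tips a and c), but opposite dominance.
community_a = {"a": 3, "c": 1}
community_b = {"a": 1, "c": 3}
```

The two toy communities contain the same tips, so a qualitative measure would treat them as identical, yet `weighted_unifrac(community_a, community_b)` is greater than zero because the dominant lineage differs.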

Obesity and Gut Microbiota

• Mice heterozygous for mutation in Leptin gene interbreed.
• 16S gene sequenced for bacteria in gut of mothers and offspring.

Ley et al. (2005) Obesity Alters Gut Microbiota, PNAS Vol 102: pp 11070-11075

Clustering of Mouse Data

Unweighted UniFrac
• Mice cluster perfectly by mother
• No obvious effects of obesity
• Robust to sampling effort

Weighted UniFrac
• Obese mice mostly cluster together
• Not robust to sampling effort

Comparison of human stool and mucosal microbes

[Figure panels: Unweighted UniFrac; Weighted UniFrac]

• Unweighted: all samples cluster by individual.
• Weighted: stool looks different.

Eckburg, P.B., et al. (2005) Diversity of the human intestinal microbial flora. Science 308, 1635-1638.

Measures in the same class cluster the data similarly

• Double principal coordinates analysis (DPCoA)
  – Another quantitative β diversity measure.
  – A matrix of species distances is first used to ordinate the species using PCoA.
  – The position of the communities in coordinate space is the average position of the species that they contain, weighted by relative abundances.
• Produces same results as weighted UniFrac.
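The second DPCoA step, placing each community at the abundance-weighted average of its species, can be sketched as below. The species coordinates are hypothetical values assumed to come from an earlier PCoA of species distances; that ordination step is not shown.

```python
def community_position(species_coords, counts):
    """Abundance-weighted mean of the coordinates of the species in a community."""
    total = sum(counts.values())
    n_axes = len(next(iter(species_coords.values())))
    return tuple(
        sum(counts[s] * species_coords[s][axis] for s in counts) / total
        for axis in range(n_axes)
    )


# Hypothetical species coordinates on two PCoA axes.
species_coords = {"sp1": (0.0, 0.0), "sp2": (1.0, 0.0), "sp3": (0.0, 2.0)}
# Hypothetical sequence counts for one community.
community = {"sp1": 1, "sp2": 1, "sp3": 2}
position = community_position(species_coords, community)
```

Because sp3 makes up half of this community, the community is pulled toward sp3 along the second axis.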

Short reads (pyrosequencing) can recapture the result.

• Unweighted UniFrac clustering with Arb parsimony insertion of 100 bp reads extending from primer R357.
• Assignment of short reads to an existing phylogeny (e.g. greengenes coreset) allows for the analysis of very large datasets.

Liu Z, Lozupone C, Hamady M, Bushman FD & Knight R (2007) Short pyrosequencing reads suffice for accurate microbial community analysis. Nucleic Acids Res 35: e120.

Comparison of Local Diversity to Regional Diversity

• β-diversity measures can also relate diversity in a single community to the total diversity in a habitat type or globally.
• Net Relatedness Index (NRI) and Nearest Taxa Index (NTI)
  – Webb CO (2000) Exploring the phylogenetic structure of ecological communities. Am Nat 156: 145-155.
  – Overdispersion of sequences in the tree: Competition important.
  – Underdispersion of sequences: Habitat Filtering important.
• Gain in PD (G)

Gain in PD (G)

• Which communities contain the most unseen diversity?
• Branches leading only to sequences in a sample.

Faith, D.P. (1992) Conservation evaluation and phylogenetic diversity. Biological Conservation 61, 1-10.
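G can be sketched as the branch length that leads to a sample's sequences and to nothing else in the tree. The toy tree, the tip names and the name `gain_in_pd` are illustrative assumptions.

```python
# Hypothetical toy tree ((a:1,b:1):1,(c:1,d:2):0.5), stored as one
# (branch_length, tips_below_branch) pair per branch.
BRANCHES = [
    (1.0, {"a"}), (1.0, {"b"}), (1.0, {"a", "b"}),
    (1.0, {"c"}), (2.0, {"d"}), (0.5, {"c", "d"}),
]


def gain_in_pd(sample_tips, other_tips, branches=BRANCHES):
    """Branch length leading to the sample's tips but to no tip seen elsewhere."""
    sample_tips, other_tips = set(sample_tips), set(other_tips)
    return sum(
        length
        for length, tips in branches
        if tips & sample_tips and not tips & other_tips
    )
```

A sample holding a and d, compared against other samples that together hold a, b and c, gains only the 2.0 units on the branch leading to d.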

Correcting G for sampling effort

• Regression of G values vs # of OTUs detected in sample.
• Culture-based studies (red) discovered little new diversity.
• Unique saline environments (e.g. hypersaline mats) discovered more.

Lozupone CA & Knight R (2007) Global patterns in bacterial diversity. Proc Natl Acad Sci U S A 104: 11436-11440.

Summary

• Phylogenetic diversity measures can be more powerful than taxon-based measures because they use information on how closely related taxa are to each other.
• Phylogenetic measures are available for both α diversity and β diversity.
• Quantitative and qualitative β diversity measures produce complementary insights into how communities are related.
• Although several different methods may exist for a particular class of diversity measure, these are likely to give similar results (e.g. DPCoA and Weighted UniFrac).

Acknowledgments

• Rob Knight
• Micah Hamady
• Knight and Gordon Labs