Single Nucleotide Polymorphisms Arthur M. Lesk Bologna Winter School 2011 1.


Single Nucleotide Polymorphisms

Arthur M. Lesk
Bologna Winter School 2011

What are SNPs and why are they important?

• SNP = single nucleotide polymorphism, an isolated change in a single nucleotide
• SNPs are one type of mutation
• Some have obvious functional consequences
  – Sickle-cell haemoglobin: gag→gtg (β6 Glu→Val)
  – Sickle-cell anaemia was the first “molecular disease”
• Some are ‘silent’
• Some are in non-coding regions
  – affect splice sites?
  – affect regulatory sites?
  – some have no known phenotypic effect

What is a SNP?

• The genomes of individuals in a population contain a particular base at some position most of the time.
• That is, there is a “normal” sequence
• A SNP is a deviation from the normal sequence.
  – Many people require that a variation occur in at least 1% of the population to be considered a SNP
• But: what population? What if two distinct populations have a consistent polymorphism?
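The 1% criterion, and the "what population?" caveat, can be made concrete with a small sketch. The allele counts and the function names below are hypothetical, chosen only to show how the same variant can pass the threshold in one population and fail it in a pooled sample.

```python
from collections import Counter

def minor_allele_frequency(alleles):
    """Return (minor_allele, frequency) for one site, given one allele per chromosome."""
    counts = Counter(alleles)
    if len(counts) < 2:
        return None, 0.0  # monomorphic site: no minor allele
    # most_common(2) gives the major allele first, then the next most frequent one
    (_, _), (minor, n_minor) = counts.most_common(2)
    return minor, n_minor / len(alleles)

def is_snp(alleles, threshold=0.01):
    """Apply the 'at least 1% of the population' convention from the slide."""
    _, maf = minor_allele_frequency(alleles)
    return maf >= threshold

# Hypothetical samples (one entry per chromosome, not per person)
population_a = ["C"] * 197 + ["T"] * 3   # T at 1.5% in population A
population_b = ["C"] * 1800              # T absent in population B
pooled = population_a + population_b     # T at 0.15% overall

print(is_snp(population_a))  # passes the 1% threshold within population A
print(is_snp(pooled))        # fails it in the pooled sample
```

The answer depends entirely on which sample is taken as "the population", which is the point of the question on the slide.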

SNPs in human genomes

• SNPs are about 90% of all inter-human variation
• Occur on average once in every 300 bases
• 2/3 of SNPs are C→T changes (perhaps because C can easily deaminate: cytosine → uracil)

SNP density varies across human genome

• Some high-density patches
• Some ‘deserts’
• SNPs in coding regions ~1/3 as many as in non-coding regions
• SNP density correlated with recombination rate (which causes which?)
• AT microsatellites: long (AT)n repeat tracts tend to appear in regions of low SNP density


Figure 14 SNP density in each 100-kbp interval as determined with Celera-PFP SNPs.

J C Venter et al. Science 2001;291:1304-1351

Published by AAAS

What is normal?

• Obviously we all differ genomically
• Swedes and Chinese have obviously different phenotypes
• Most Swedes and Chinese are healthy individuals
• Therefore genetic differences do not necessarily cause disease
• Pointless to check for differences from a single ‘reference sequence’
• Of course, many genetic differences are not just SNPs

Variation in human and other species

• Any two humans ~99.5% identical in sequence
• Chimpanzees, gorillas: twice as variable, despite much smaller population size
• Implies prehistoric bottleneck in human population, recent common origin
• Most SNPs (with frequency > 5%) are shared among human populations from around the world
• Most populations (e.g. British) contain 85-90% of all known variation

Variation in human and other species

• Some variation is population-specific
• In some cases, there is local selective pressure
• For example, adult lactose tolerance, malaria resistance
• African populations have greatest genetic diversity
• Supports ‘Out of Africa’ theory of human origin and migration

Identification of geographical origin, phenotype

• A criminal leaves a blood sample at a crime scene
• How much can we tell about him or her?
• Not perfectly, but:
  – Ethnic group
  – Eye and hair colour (hair colour easier to change)
  – Family name?

Types of SNPs

• Transitions:
  – purine ↔ purine
  – pyrimidine ↔ pyrimidine (e.g. C→T, arising from deamination cytosine → uracil)
• Transversions:
  – purine ↔ pyrimidine
• Transitions are more common than transversions
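The transition/transversion distinction above can be sketched as a small classifier. The function name is illustrative; the purine and pyrimidine sets are the standard DNA bases.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref, alt):
    """Classify a single-base change as a 'transition' or a 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical: not a substitution")
    # Both purines, or both pyrimidines: the chemical class is preserved
    if {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES:
        return "transition"
    return "transversion"

# Enumerate all 12 possible single-base substitutions
all_changes = [(r, a) for r in "ACGT" for a in "ACGT" if r != a]
n_transitions = sum(classify_substitution(r, a) == "transition" for r, a in all_changes)
print(n_transitions, len(all_changes) - n_transitions)
```

Only 4 of the 12 possible substitutions are transitions, so their observed excess is not what random base replacement would produce.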

Prevalence of SNPs in human genomes

• approximately 1 in 300 bp (≈0.3%)
• compare difference between human / chimpanzee genomes: 4% different (not all SNPs!)
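As a quick arithmetic check of the "1 in 300 bp" figure: the sketch below converts it to a percentage and to an approximate number of SNP sites. The genome size of about 3.2 billion bases is an assumption introduced here for illustration; it is not stated on the slides.

```python
snp_spacing_bp = 300        # from the slide: about one SNP every 300 bases
genome_size_bp = 3.2e9      # ASSUMPTION: approximate haploid human genome size

snp_fraction = 1 / snp_spacing_bp
expected_snp_sites = genome_size_bp / snp_spacing_bp

print(f"SNP density: {snp_fraction:.2%} of positions")
print(f"Expected SNP sites: about {expected_snp_sites / 1e6:.1f} million")
```

Under that assumption, one SNP per 300 bases corresponds to roughly ten million variable sites genome-wide.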

‘Life cycle’ of a SNP

• Generation of a mutation
• Initial survival, against ‘sampling loss’
• Increase in frequency – survival until it becomes homozygous in some individuals
  – chance of loss reduced (helped by bottlenecks, founder effects – population size dependent)
• Fixation


Initial survival of a SNP

• Suppose a person is heterozygous for a novel, selectively-neutral mutation.

• Suppose the person has 2 children that survive to reproductive age. The probability of loss of the mutation is 25%.

• If each descendant has 2 children that survive to reproductive age, probability of loss in 200 years = 94%
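The slide does not state how the 25% and 94% figures were obtained, so the following is only one possible reconstruction. With two children, each inheriting the mutant allele with probability 1/2, the chance that neither inherits it is (1/2)² = 25%. Treating each generation as an independent 25% risk of loss, and assuming 10 generations in 200 years (20-year generations, an assumption made here), gives 1 − 0.75¹⁰ ≈ 94%. For comparison, the sketch also includes a simple branching-process calculation that tracks all carrier descendants; under the same assumptions it gives a lower figure.

```python
def loss_independent(generations, p_loss_per_generation=0.25):
    """Loss probability if every generation carries an independent 25% risk
    (equivalent to following a single carrier lineage)."""
    return 1 - (1 - p_loss_per_generation) ** generations

def loss_branching(generations, children=2):
    """Loss probability when every carrier has `children` offspring, each
    inheriting the allele with probability 1/2 (Galton-Watson recursion)."""
    q = 0.0  # probability the allele is already lost at generation 0
    for _ in range(generations):
        # A carrier's line dies out if each child either does not inherit the
        # allele (prob 1/2) or inherits it and that line later dies out (prob q/2)
        q = ((1 + q) / 2) ** children
    return q

generations_in_200_years = 10  # ASSUMPTION: 20 years per generation

print(f"one generation:         {loss_independent(1):.0%}")
print(f"independent-risk model: {loss_independent(generations_in_200_years):.0%}")
print(f"branching-process model: {loss_branching(generations_in_200_years):.0%}")
```

Both models agree on 25% for the first generation; they diverge later because the branching model allows the mutation to spread to several descendants, each of whom gives it another chance to survive.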

Where do SNPs occur in the human genome?

• Distributed throughout the genome
• 50% in non-coding regions
  – NOT the same as non-functional!!!
• 25% missense mutations (amino acid substitution)
• 25% silent (amino acid unchanged)
  – silent = no change in encoded amino-acid sequence
  – NOT the same as no phenotypic effect!!!
  – would be better to call them synonymous SNPs rather than silent SNPs

SNPs in non-human genomes

• Of course other species have SNPs
• Here we will focus on human SNPs because of relevance to human disease
• However, SNPs in pathogens are sometimes associated with antibiotic resistance, and therefore related to human disease
• SNPs in some plants give clues to domestication

Organised efforts to collect SNPs

• The HapMap is a catalogue of common human genetic variants
• HapMap Project = international collaboration among Japan, the United Kingdom, Canada, China, Nigeria, and the United States
• NOT Europe
• Carry out measurements, provide database
• Other projects collect SNPs in other species

HapMap project

• International consortium: International HapMap Project
  – http://hapmap.ncbi.nlm.nih.gov/
• Catalogue of human genetic variants:
  – What sites?
  – How distributed – frequency in different populations
  – Raw material for linking genomics with disease

Origin of samples

• Total of 270 people.
• The Yoruba people of Ibadan, Nigeria
• Japan (Tokyo)
• China (Beijing)
• U.S. residents with Northern and Western European ancestry

What is a haplotype?

• Often, a set of SNPs appear nearby on the same chromosome
• In absence of recombination, they will be inherited in blocks
• Pattern of SNPs in a block is called a haplotype
• A block may contain many SNPs, but only a few are needed to identify a haplotype
• These signature SNPs within a haplotype block are called ‘tag SNPs’
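To illustrate why only a few SNPs are needed to identify a haplotype, here is a minimal brute-force sketch. The four haplotypes are invented for illustration, and exhaustive search is used only because the example is tiny; real tag-SNP selection uses linkage-disequilibrium statistics on much larger data.

```python
from itertools import combinations

def minimal_tag_snps(haplotypes):
    """Return the smallest set of site indices whose alleles distinguish
    every distinct haplotype in the block (exhaustive search)."""
    n_sites = len(haplotypes[0])
    n_distinct = len(set(haplotypes))
    for k in range(1, n_sites + 1):
        for sites in combinations(range(n_sites), k):
            # Signature of each haplotype restricted to the candidate sites
            signatures = {tuple(h[i] for i in sites) for h in haplotypes}
            if len(signatures) == n_distinct:
                return sites
    return tuple(range(n_sites))

# Hypothetical block: six SNP sites, four haplotypes observed in the population
block = ["ACGTAC", "ACGTGT", "GTGCAC", "GTACGT"]

tags = minimal_tag_snps(block)
print(tags)
print([tuple(h[i] for i in tags) for h in block])
```

In this toy block, two of the six sites are enough to tell all four haplotypes apart, so genotyping those two tag SNPs identifies the whole block.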


http://www.riken.go.jp/engn/r-world/info/release/news/2003/nov/image/frol_06.gif

http://img.medscape.com/fullsize/migrated/553/400/ncpcard553400.fig1.gif

Guide to SNP databases

• SNPlinks: http://www.snpforid.org/snpdata.html
• NCBI dbSNP: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=snp
• The SNP Consortium: http://snp.cshl.org/
• HapMap: http://www.hapmap.org/
• Applied Biosystems Assays-on-Demand: http://myscience.appliedbiosystems.com/cdsEntry/Form/assay_search_basic.jsp
• Ensembl: http://www.ensembl.org/Homo_sapiens/
• HGVBase: http://hgvbase.cgb.ki.se/
• SeattleSNPs: https://gvs.gs.washington.edu/GVS/

dbSNP database at NCBI

• non-redundant dataset
• nomenclature: rs number
• rs = reference SNP.

General human mutations

• Human Gene Mutation Database http://www.hgmd.cf.ac.uk
  – over 100000 mutations, in 3700 genes
  – ≈16% of total ~23000 genes
  – about 10000 new mutations found per year
• OMIM (Online Mendelian Inheritance in Man)
  – database of mutations associated with human disease
• OMIA (Online Mendelian Inheritance in Animals)

Databases with important related information

• Online Mendelian Inheritance in Man (OMIM) [NCBI]
  – Comprehensive compendium of human genes and associated phenotypes
  – Not limited to SNPs
• SNPs3D http://www.snps3d.org/
  – SNPs3D assigns molecular functional effects to non-synonymous SNPs based on structure and sequence analysis.
• SNPper http://snpper.chip.org/
  – Retrieve SNPs by position or gene association


Quality of sequence information is important

• SNPs appear in human genome at approximately 1 in 300 bases

• Obviously error rate in resequencing must be substantially lower than this if SNP data are to be meaningful

• Measure of DNA sequencing quality: PHRED

PHRED – measure of sequence quality

• Phred scores are widely used to characterize the quality of DNA sequences
• Originally Phred was a program that determined accurate quality scores indicating error probabilities.
• Accepted as general standard
• Phred quality score Q. Let P = probability of base error; then Q = −10 log₁₀ P


Phred quality score Q    Probability of incorrect base call    Base call accuracy
10                       1 in 10                               90%
20                       1 in 100                              99%
30                       1 in 1000                             99.9%
40                       1 in 10000                            99.99%
50                       1 in 100000                           99.999%

A method that gave an averaged Phred score Q = 30 would give approximately as many errors as there are SNPs!
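The relation between Phred score and error probability can be checked with a short sketch. The function names are illustrative; the formula is the one given for Q, and the 1-in-300 SNP spacing is the figure quoted earlier in the lecture.

```python
import math

def phred_from_error_probability(p):
    """Q = -10 * log10(P), where P is the probability of an incorrect base call."""
    return -10 * math.log10(p)

def error_probability_from_phred(q):
    """Inverse relation: P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)

# Reproduce the rows of the table
for q in (10, 20, 30, 40, 50):
    p = error_probability_from_phred(q)
    print(f"Q = {q}: 1 error in {round(1 / p)} bases, accuracy {1 - p:.3%}")

# Compare the Q = 30 error rate with the quoted SNP rate of 1 in 300 bases
errors_per_snp = error_probability_from_phred(30) / (1 / 300)
print(f"sequencing errors per true SNP at Q = 30: {errors_per_snp:.1f}")
```

At Q = 30 the error rate is within a small factor of the SNP rate itself, which is why substantially higher quality is needed before observed differences can be trusted as SNPs.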

What can SNPs tell us?

• Causes of disease -- dysfunctional protein
• Correlation with disease prognosis, success of particular treatment
• Useful genetic markers, to locate some gene of phenotypic interest; for instance, a gene correlated with a disease
• Characterise individuals
• Characterise populations (SNP distribution)
• Applications in anthropology -- tracing of migrations, human evolution

Use of SNPs as genetic markers

Before 1980, genetic maps were constructed by measuring recombination frequencies between genes giving measurable phenotypic traits

This goes back at least to Sturtevant and Morgan, if not to Mendel

At that time, phenotypes were the only visible aspect of the genome

Use of SNPs as genetic markers

• In 1980, Botstein, Davis, Skolnick & White proposed using polymorphic DNA markers for genetic mapping, even if they had no known phenotypic effect
• Example: (then) restriction sites
  – SNPs → restriction fragment length polymorphisms (RFLPs)
  – Did linkage mapping with restriction sites
• Now we can use SNPs


Traits depending on multiple loci

• Use of SNPs to identify traits, including but not limited to diseases, that depend on multiple loci

• Single genes for diseases showing simple Mendelian inheritance (for instance, cystic fibrosis) can be isolated

• Diseases that depend on interaction with multiple loci can be studied with enough SNP linkage information

SNPs tell us about human history

• Development of ability to digest lactose past infancy correlated with domestication of cattle, increased (non-fermented) dairy products in human diet
• Source of calcium and calories
• Many Asian populations retain adult lactose intolerance
• Where do they get calcium? “The soybean is the cow of Asia.”


Ability to digest lactose in adulthood

• Digestion of lactose depends on enzyme lactase-phlorizin hydrolase, which catalyzes hydrolysis of lactose → glucose + galactose

Ability to digest lactose in adulthood

• In many people, the ability to digest lactose is a juvenile characteristic
• Expression declines after age 2 – varies among individuals
• Consistent with lifestyle involving breast feeding until this age, followed by weaning, followed by diet not including (non-fermented) milk and other dairy products
  – To form yoghurt, bacteria cleave lactose

Evolution of adult lactase expression

• Domestication of cattle, with concomitant rise of milk in the diet, led to selective pressure for lactose tolerance
• Mutation arose among cattle-raising people:
  – the Funnel Beaker culture
  – north-central Europe ~5,000-6,000 years ago
• Most common mutations in Europeans: SNPs
  – C/T-13910
  – G/A-22018
• Not surprisingly, in control regions for lactase gene

Prevalence of lactose-tolerance SNP

Danes and Swedes      90%
Spanish and French    50%
Chinese               1%

Image: http://gseorlando.files.wordpress.com/2010/09/j.jpg (Group Study Exchange)

Multiple development of lactose tolerance

• Development of lactose tolerance apparently appeared four times, independently
  – Europe: C/T-13910 and G/A-22018
• Pastoral areas of Africa – three independent mutations:
  – G/C-14010 East Africa
  – T/G-13915 North Sudan
  – C/G-13907 North Kenya


http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2672153/bin/ukmss-4417-f0002.jpg

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2672153/bin/ukmss-4417-f0001.jpg

SNPs in anthropology

• Useful in tracing relationships between populations, migration routes
• Initially used mitochondrial DNA (16569 bp)
• Maternal inheritance only
  – (Y chromosome gives paternal inheritance only)
• Important argument for “out of Africa” theory of human origins and dispersal
• Can choose non-selected regions, in contrast to previous work on blood groups, MHC haplotypes


Migration routes into Asia and the Pacific based on SNPs

http://i49.tinypic.com/2d0j2py.jpg

DNA sequences and language groups

• Proposal by L. L. Cavalli-Sforza
• Showed consistency between trees based on genetic markers and trees based on linguistic groupings
• Controversial!
• In some cases, genomics has confirmed hypotheses of population affinity based on language similarity / dissimilarity
• Basques are outliers in both genes and language

Recommended reading

Tomasz Kamusella, The Politics of Language and Nationalism in Modern Central Europe. Palgrave Macmillan, 2008

What happens after invasions?

• Hungary invaded by Magyars in 896 AD. Country converted to speaking Uralic language

• Rome fell to Vandals in 476 AD, but they did NOT impose their language. (Perhaps recognising superiority of Italian culture – which their descendants don’t)

• England invaded by Anglo-Saxons in about 5th century. Anglo-Saxon pushed Celtic languages to far reaches of British Isles + Brittany

• Norman invasion of 1066 did NOT entirely replace Anglo-Saxon by French.

Possible effects of SNPs

• In protein-coding sequences
  – silent
  – missense
  – coding → stop codon
  – stop codon → coding
  – SNPs can → dysfunctional proteins
• In splice sites
  – 15% of disease-causing mutations in human genome are point mutations in vicinity of mRNA splice junctions
• In regulatory sequences

What are possible effects of SNPs in coding sequences?

• Change in amino acid
  – Example: sickle-cell anaemia
• sense codon → stop codon – protein truncated
• stop codon → sense codon – protein extended


SNPs in coding regions can do more than change one amino acid

• Change of codon for an amino acid to STOP codon produces truncated protein
  – Example: common mutation causing phenylketonuria
• Change of STOP codon to codon for an amino acid produces extended protein
• Example: haemoglobin Constant Spring
  – α-chain variant
  – termination codon TAA is mutated to CAA (glutamine)
  – produces extension of haemoglobin α-chain from 141 to 172 amino acids
  – causes mild anaemia
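The categories of coding-sequence SNP discussed here (silent, missense, sense → stop, stop → sense) can be sketched with the standard genetic code. The function name and return labels are illustrative choices, not terminology from the lecture; codons are written as DNA (T rather than U).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' marks a stop codon
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def classify_coding_snp(ref_codon, alt_codon):
    """Classify a single-base change within one codon."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must have exactly three bases")
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "silent"       # synonymous: encoded residue unchanged
    if alt_aa == "*":
        return "nonsense"     # sense codon -> stop codon, protein truncated
    if ref_aa == "*":
        return "stop-loss"    # stop codon -> sense codon, protein extended
    return "missense"         # amino acid substitution

print(classify_coding_snp("gag", "gtg"), CODON_TABLE["GAG"], "->", CODON_TABLE["GTG"])
print(classify_coding_snp("TAA", "CAA"), CODON_TABLE["TAA"], "->", CODON_TABLE["CAA"])
```

The first call is the sickle-cell change gag → gtg; the second is the haemoglobin Constant Spring change TAA → CAA.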

Possible consequences of silent (synonymous) SNPs

• Nothing detectable
• Change in proportions of variably spliced proteins
• Change in stability of mRNA
• Effect on protein folding (translational pausing)

SNPs can affect variable splicing

• Almost all multiexonic genes show variable splicing
• Change in isoform can have severe effects
• Susceptibility to West Nile Virus
  – SNP in 2',5'-oligoadenylate synthetase-like gene common in susceptible individuals
  – oligoadenylate synthetase implicated in viral resistance
  – SNP present in exonic splice enhancer
  – Increases level of truncated protein → enhanced susceptibility to virus

SNPs can affect mRNA stability

• Expression levels of proteins depend on mRNA half-life (among other things)
• ATP-binding-cassette (ABC) transporters are membrane proteins
• function in translocation of compounds out of cells
• Disease associated with SNP in this family

Dubin-Johnson syndrome

• autosomal recessive disorder
• increase in conjugated bilirubin
• defect in hepatocyte secretion of conjugated bilirubin into bile
• many patients asymptomatic
• hormonal birth control or pregnancy can → jaundice
• Some cases caused by synonymous SNP in gene for ABCC2 → increased mRNA stability → increased expression levels

Synonymous SNPs can affect protein folding and even native structure

• Synonymous SNPs do not affect amino acid sequence
• Therefore should not alter native structure
• However, they can affect the kinetics of folding
• mRNA secondary structure affects translational pausing

Cotranslational folding is affected by translational pausing

• Can affect not only kinetics but tertiary structure
• Example: SNPs in Multidrug Resistance 1 (MDR1)
  – Encodes P-glycoprotein, an ABC transporter
  – Functions to pump molecules out, including chemotherapeutic agents used in cancer
  – Haplotype C1236T, G2677T (nonsynonymous), C3435T
  – Affects interactions of protein with:
    • cyclosporine A -- fungal cyclic peptide, immunosuppressant, used post-transplant
    • verapamil -- calcium channel blocker, used in treatment of high blood pressure

Ref: Kimchi-Sarfaty C, Oh JM, Kim IW, Sauna ZE, Calcagno AM, Ambudkar SV, Gottesman MM (2007). A "silent" polymorphism in the MDR1 gene changes substrate specificity. Science. 315, 525-528.

[Figure: chemical structures of verapamil and cyclosporin A]

References

• Kimchi-Sarfaty C, Oh JM, Kim IW, Sauna ZE, Calcagno AM, Ambudkar SV, Gottesman MM (2007). A "silent" polymorphism in the MDR1 gene changes substrate specificity. Science. 315, 525-528.
  – Erratum in: Science. 2007 Nov 30;318(5855):1382-3.
  – Comment in: Science. 2007 Jan 26;315(5811):466-7; Bioessays. 2007 Jun;29(6):515-9; Epilepsia. 2007 Dec;48(12):2369-70.

Prediction of functional effects of non-synonymous SNPs

• PolyPhen (EMBL, Heidelberg): http://coot.embl.de/PolyPhen/
• SNPs3D (Baltimore): http://www.snps3d.org/
• Pmut (Barcelona): http://mmb2.pcb.ub.es:8080/PMut/
• SIFT (University of Washington): http://blocks.fhcrc.org/sift/SIFT.html
• MAPP (Stanford): http://mendel.stanford.edu/SidowLab/downloads/MAPP/index.html


Sorting Intolerant From Tolerant

• Database and server at University of Washington

• SIFT predicts whether an amino acid substitution affects protein function based on sequence homology and the physical properties of amino acids

• Limited to non-synonymous SNPs (or more generally, amino acid substitutions)


SNPs in Medicine

Genomic sequence analysis can provide a lot of information about health risks of any individual

So far, part of the problem is that sequences usually just give bad news

Indications of optimal therapy useful: the U.S. health care industry faces huge costs in treatment of side effects of medication

SNPs and disease

• Some SNPs (and of course other mutations) are consistent with a healthy life, and typical life-span, provided the individual carries on a reasonable lifestyle.
• Some SNPs directly and unavoidably cause disease
• Others cause disease only in combination with unusual lifestyle or specific events
  – Example: fever in children with Z-mutation of α1-antitrypsin
  – protein somewhat unstable, denatures and aggregates
  – Essential to keep infants free of high fever
• In many cases we can’t tell extent of genetic basis of disease or how it interacts with environmental effects

Copy-number variations may mask disease genes

• Genes in which nonsense SNPs are detected belong to gene families of higher than average size.
• Genetic robustness
• Every individual is heterozygous for some deleterious mutations that, if homozygous, would be lethal.

Interaction of SNPs with environment/experience

• α1-antitrypsin is a natural elastase inhibitor in the lung
• elastase in lung protects against bacteria
• inhibitor prevents elastase from acting on human tissues, notably elastin in the lung
• Z-mutation of α1-antitrypsin: glu342→lys
• Causes enhanced risk of emphysema
• Z-mutation + smoking = GUARANTEE of early death from emphysema

“Genetics loads the gun; environment pulls the trigger” (J. Stern)

Discussion of following diseases

• Sickle-cell anaemia
• Phenylketonuria
• Alzheimer’s disease
• Cancer

SNP causing disease: Sickle-cell anaemia

• β6 Glu→Val creates hydrophobic (sticky) patch on surface of β chains of haemoglobin
• Common SNP: gag → gtg
• causes aggregation of deoxyhaemoglobin

Phenylketonuria

• Inborn deficiency in phenylalanine hydroxylase
• Autosomal recessive (12q24.1)
• 1/10000 sufferers
• 1/50 carriers
• Subject of neonatal screening in many countries
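The two frequencies quoted for PKU are consistent with each other under Hardy-Weinberg equilibrium. That model (random mating, a single pooled "defective" allele class) is an assumption introduced here for illustration; the slide does not say how the carrier figure was obtained.

```python
import math

disease_incidence = 1 / 10000      # from the slide: frequency of affected individuals (q^2)

# ASSUMPTION: Hardy-Weinberg equilibrium, all defective alleles pooled together
q = math.sqrt(disease_incidence)   # frequency of defective alleles
p = 1 - q                          # frequency of the functional allele
carrier_frequency = 2 * p * q      # heterozygotes

print(f"defective allele frequency q = {q:.3f}")
print(f"carrier frequency 2pq = {carrier_frequency:.4f} (about 1 in {1 / carrier_frequency:.1f})")
```

An incidence of 1/10000 implies an allele frequency of 1/100 and hence roughly 1 carrier in 50, matching the slide.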

Mutations causing PKU

• Phenylalanine hydroxylase is a tetramer
• Known mutations include:
  – Over 200 affecting catalysis
  – About 50 affecting regulation
  – About 10 affecting tetramerization
  (Some involve cofactor -- tetrahydrobiopterin -- processing)
• Most common mutation in Caucasians:
  – g→a in intron 12
  – causes truncation (sense codon to stop codon)
  – fails to tetramerize
• McGill database: http://www.pahdb.mcgill.ca/

Testing for PKU

• Phenylalanine, and degradation products such as phenylpyruvate, build up in blood and urine
  (Phenylpyruvate is a ketone, hence the name of the disease.)
• Blood sample from neonate, mass spec to detect phe, tyr levels
• Can also do genomic sequencing – detection of carriers, counselling of potential parents

Symptoms if untreated

• developmental defects:
  – mental retardation
  – microcephaly
• seizures

Treatment

• Low phenylalanine diet
  – not entirely satisfactory (unpalatable?)
  – tricky to manage PKU women in pregnancy
• Gene therapy (works in mice …)
• Enzyme replacement therapy

Image: http://newenglandconsortium.org/wp-content/uploads/2009/12/PKU-Food-Diagram-copy.jpg

PKU in pregnancy

• Remember that PKU is an autosomal recessive trait
• A woman with PKU must be homozygous for defective phenylalanine hydroxylase (not necessarily same mutation)
• If such a woman becomes pregnant, it is likely that the foetus is only a carrier (unless father also a carrier)
• Tricky to control phe levels in mother to give foetus adequate nutrition but not toxic levels

Enzyme replacement therapy for PKU

(1) administer functional phenylalanine hydroxylase itself
  – But: requires cofactor, complex regulatory controls
(2) phenylalanine ammonia-lyase
• converts phenylalanine to trans-cinnamic acid
• trans-cinnamic acid:
  – has low toxicity and does not cause developmental defects
  – converted by liver to benzoic acid, detoxified and excreted in urine
  – stable
• phenylalanine ammonia-lyase found in many plants, and in fungi, including yeasts
• Anabaena variabilis enzyme in phase II clinical trials

Comparison of reactions catalysed by phenylalanine hydroxylase (PAH) and phenylalanine ammonia-lyase (PAL)

http://www.nature.com/mt/journal/v10/n2/images/mt20041219f1.jpg

Genomics of phenylalanine hydroxylase

• Rhesus macaque and chimpanzee phenylalanine hydroxylases differ from normal human PAH

• One difference: Human Y356 = H in macaque and chimp

• The mutant is in the list of mutations in the PKU database: http://www.pahdb.mcgill.ca/

• But chimps and rhesus macaques do not suffer from PKU

• Why not?

Alzheimer’s disease

• Loss of cognitive function, characterised by:
  – Loss of train of thought
  – Progressive memory problems
  – Missing important appointments
• Early-onset
  – Appears at age < 65
• Late-onset Alzheimer’s – most common type
  – Affects people over the age of 65
  – ~50% of people over age 85 suffer from it.
• Familial Alzheimer’s
  – < 1% of cases, appears at age 40-60

Alzheimer’s disease

• Early-onset -- age < 65
  – associated with mutations in presenilin 1, presenilin 2 and amyloid precursor protein
• Late-onset Alzheimer’s – age > 65
  – most common type: ~50% of people over age of 85 suffer from it.
  – Propensity associated with ApoE (apolipoprotein E) SNPs
• Familial Alzheimer’s
  – < 1% of cases, appears at age 40-60

ApoE SNPs and risk of late-onset Alzheimer disease

• ApoE = apolipoprotein E
  – Gene on chromosome 19; therefore we have two alleles
  – Basic function: remove cholesterol from blood
• Four common alleles, differ by SNPs:
  – ApoE1 [minor variant], ApoE2, ApoE3 [~55%], ApoE4
• E3 most prevalent (“normal”)
• At least one E4 allele: increased risk of Alzheimer’s
• At least one E2 allele: decreased risk of Alzheimer’s

ApoE alleles

• ApoE = 317-residue protein
• Four common ApoE alleles, differ by SNPs:
  – ApoE1 = rs429358(C) + rs7412(T) [minor variant]
  – ApoE2 = rs429358(T) + rs7412(T)
  – ApoE3 = rs429358(T) + rs7412(C) [~55%]
  – ApoE4 = rs429358(C) + rs7412(C)

Allele    112    158
E1        Arg    Cys
E2        Cys    Cys
E3        Cys    Arg
E4        Arg    Arg
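The allele definitions above amount to a lookup from the bases at the two SNPs to an ApoE allele. The sketch below encodes that table directly; the function name is illustrative, and the input is assumed to describe a single chromosome (a phased haplotype). Unphased genotype data would need an extra step, since a person heterozygous at both SNPs could in principle be E2/E4 or E1/E3.

```python
# (base at rs429358, base at rs7412) -> ApoE allele, as listed on the slide
APOE_ALLELES = {
    ("C", "T"): "E1",
    ("T", "T"): "E2",
    ("T", "C"): "E3",
    ("C", "C"): "E4",
}

# At both SNPs, C encodes arginine and T encodes cysteine (per the residue table)
RESIDUE_BY_BASE = {"C": "Arg", "T": "Cys"}

def apoe_allele(rs429358, rs7412):
    """Return (allele, first residue, second residue) for one chromosome."""
    key = (rs429358.upper(), rs7412.upper())
    if key not in APOE_ALLELES:
        raise ValueError("each SNP must be called as C or T")
    return APOE_ALLELES[key], RESIDUE_BY_BASE[key[0]], RESIDUE_BY_BASE[key[1]]

for bases in APOE_ALLELES:
    print(bases, apoe_allele(*bases))
```

Each returned tuple reproduces one row of the residue table, with the first residue determined by rs429358 and the second by rs7412.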

SNPs and Cancer

• SNPs are relevant to cancer research and treatment in several ways:
  – Mutations detectable in the genome indicate propensity for development of cancers
    • Mutations in BRCA1 and BRCA2, as indicators for likelihood of breast/ovarian cancer development, probably best known
  – Sequence analysis can predict progression and outcome
  – Sequence analysis can help choose optimal treatment
  – Progression of tumour often involves mutations and divergence of cell lines

Formation of cancer associated with loss of genome integrity

• Cancer results from accumulated mutations that break down the controls on cell growth
• Three classes of genes can promote cancer:
  – Genes that regulate cell proliferation
  – Genes required for repair of DNA damage
  – Genes that control apoptosis

Retinoblastoma

• Rare childhood tumour of eye
• Sporadic / familial (30-40%)
• Characteristics of familial retinoblastoma:
  – Early onset
  – Multiple tumours
  – Affect both eyes
  – Autosomal dominant inheritance pattern


‘Two-hit’ hypothesis

• Non-familial cases require inactivation of both copies of retinoblastoma gene
– Require separate and independent mutations
• Familial cases inherit one defective copy, one functional copy
– That is, ‘first hit’ is inherited; all that is needed is ‘second hit’
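A toy calculation may help show why needing one hit rather than two makes such a difference. The sketch below assumes each gene copy in each retinal cell is inactivated independently with the same small probability. The cell count and hit probability are hypothetical round numbers chosen only for illustration; they do not come from the slides.

```python
import math

# Hypothetical illustrative values, NOT measured quantities.
N_CELLS = 1e6    # assumed number of susceptible retinal cells
HIT_PROB = 1e-6  # assumed chance that one gene copy is inactivated in one cell


def expected_tumour_cells(n_cells, hit_prob, hits_needed):
    """Expected number of cells acquiring all required independent hits."""
    return n_cells * hit_prob ** hits_needed


def prob_at_least_one(expected):
    """Poisson approximation: chance that at least one such cell arises."""
    return 1.0 - math.exp(-expected)


# Familial: the first hit is inherited in every cell, so one more hit suffices.
familial = expected_tumour_cells(N_CELLS, HIT_PROB, hits_needed=1)
# Sporadic: both copies must be hit independently in the same cell.
sporadic = expected_tumour_cells(N_CELLS, HIT_PROB, hits_needed=2)

if __name__ == "__main__":
    print(f"familial: expected {familial:.3g}, P(>=1) = {prob_at_least_one(familial):.3g}")
    print(f"sporadic: expected {sporadic:.3g}, P(>=1) = {prob_at_least_one(sporadic):.3g}")
```

Under these assumed numbers the familial case expects about one tumour-initiating cell per eye, while the sporadic case expects about one in a million. This is in line with familial retinoblastoma showing multiple tumours in both eyes, and the sporadic form being rare and usually a single tumour.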


SNPs and cancer

• A number of genes are known as ‘tumour suppressor genes’
– Well-known examples: BRCA1, BRCA2
– Not all common mutations are SNPs

• Some SNPs in tumour suppressor genes cause predisposition to development of cancer

• Other SNPs correlated with
– Progression of disease
– Efficacy of certain drugs


Tumour suppressor genes

• Encode proteins that inhibit tumour formation
• Normal function: inhibit cell growth
• Mutations take “foot off cell-growth brake”


Mutations in BRCA1 and BRCA2

• In general population: ~12% of women will develop breast cancer
• Women with harmful mutation in BRCA1 or BRCA2: ~60% will develop breast cancer
• In general population: ~1.4% of women will develop ovarian cancer
• Women with harmful mutation in BRCA1 or BRCA2: ~15-40% will develop ovarian cancer
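The approximate lifetime risks quoted on this slide can be turned into fold-increases with simple division. The sketch below does only that arithmetic. The function name is hypothetical, and the ratio of two rounded lifetime risks is a rough illustration, not an epidemiological estimate.

```python
def fold_increase(risk_carriers, risk_general):
    """Ratio of lifetime risk in mutation carriers to that in the general population."""
    return risk_carriers / risk_general


# Approximate lifetime risks from the slide, as fractions.
BREAST_GENERAL = 0.12
BREAST_CARRIERS = 0.60
OVARIAN_GENERAL = 0.014
OVARIAN_CARRIERS_LOW = 0.15
OVARIAN_CARRIERS_HIGH = 0.40

breast_fold = fold_increase(BREAST_CARRIERS, BREAST_GENERAL)
ovarian_fold_low = fold_increase(OVARIAN_CARRIERS_LOW, OVARIAN_GENERAL)
ovarian_fold_high = fold_increase(OVARIAN_CARRIERS_HIGH, OVARIAN_GENERAL)

if __name__ == "__main__":
    print(f"breast cancer:  about {breast_fold:.0f}-fold")
    print(f"ovarian cancer: about {ovarian_fold_low:.0f}- to {ovarian_fold_high:.0f}-fold")
```

On these figures the absolute risk is higher for breast cancer, but the relative increase is larger for ovarian cancer, because the baseline ovarian risk is so much lower.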


Some common BRCA1 mutations

• Varies with population: showing strong ‘founder’ effect
• Many not SNPs
• Not all of these are necessarily harmful mutations

Population          Common mutation
Ashkenazi Jewish    185delAG, 188del11, 5382insC
Italian             5083del19
African-Americans   943ins10, M1775R
Spanish             R71G
French              3600del11, G1710X
French Canadians    C4446T
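The remark that many of these mutations are not SNPs can be checked mechanically from their names. The sketch below sorts the names in the table into insertions, deletions and single-position substitutions by pattern matching. The regular expressions are an assumption about the informal naming style used on this slide, not a parser for any official nomenclature, and the substitution pattern does not distinguish nucleotide changes (C4446T) from amino-acid changes (R71G).

```python
import re

# Mutation names as listed in the table, grouped by population.
BRCA1_MUTATIONS = {
    "Ashkenazi Jewish": ["185delAG", "188del11", "5382insC"],
    "Italian": ["5083del19"],
    "African-Americans": ["943ins10", "M1775R"],
    "Spanish": ["R71G"],
    "French": ["3600del11", "G1710X"],
    "French Canadians": ["C4446T"],
}

# position + del/ins + either the bases involved or a count of bases
INDEL = re.compile(r"^\d+(del|ins)(\d+|[ACGT]+)$")
# letter + position + letter, e.g. C4446T or R71G
SUBSTITUTION = re.compile(r"^[A-Z]\d+[A-Z]$")


def classify(name):
    """Return 'deletion', 'insertion', 'substitution' or 'unknown' (hypothetical helper)."""
    match = INDEL.match(name)
    if match:
        return "deletion" if match.group(1) == "del" else "insertion"
    if SUBSTITUTION.match(name):
        return "substitution"
    return "unknown"


if __name__ == "__main__":
    for population, names in BRCA1_MUTATIONS.items():
        for name in names:
            print(f"{population:18s} {name:10s} {classify(name)}")
```

Only the entries classified as substitutions are candidates for being SNPs; the insertions and deletions change the length of the sequence and so fall outside the definition.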


Other tumour suppressor genes correlated with predisposition to develop cancer

• TP53, PTEN, STK11/LKB1, CDH1, CHEK2, ATM, MLH1, and MSH2

• But BRCA1 and BRCA2 have the strongest correlation with predisposition to breast and ovarian cancer

• Importance of early detection in treatment of cancer

• At-risk individuals should be sure to undergo frequent checkups


Pharmacogenomics

• Tailoring of treatment to individual patient, based on genetic sequences

• Choice of optimal drug, and dosage
• Use of drugs inappropriate for the patient:
– risks side effects: discomfort or even death
– loses time in treating a condition which may become progressively worse
– at best, wastes money and health care resources; may require additional resources to cure side effects and more severe conditions


Thiopurine methyltransferase

• Acute lymphoblastic leukemia is a childhood cancer treated by thiopurines

• Thiopurine methyltransferase breaks down the drugs

• Genetic variant leading to inactive enzyme threatens toxic levels of drug in patient

• Screening patients for deficiency allows monitoring to determine appropriate dosage levels


Sensitivity to abacavir

• Abacavir used in treatment of AIDS

• 4-8% of patients have a serious, potentially fatal hypersensitivity reaction

• Hypersensitivity correlated with MHC allele HLA-B*5701

• Genetic screening can detect, guide treatment
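In software terms, a screening step of this kind is a lookup of the patient's typed alleles against a known risk allele. The sketch below is purely illustrative: the function name and the patient records are made up, and allele names are treated as plain strings written the way the slide writes them.

```python
# Risk allele as named on the slide.
ABACAVIR_RISK_ALLELE = "HLA-B*5701"


def flag_abacavir_risk(hla_b_alleles):
    """Return True if any of the patient's HLA-B alleles is the risk allele."""
    return ABACAVIR_RISK_ALLELE in set(hla_b_alleles)


# Made-up example patients, each with two typed HLA-B alleles.
EXAMPLE_PATIENTS = {
    "patient_1": ("HLA-B*0702", "HLA-B*5701"),
    "patient_2": ("HLA-B*0801", "HLA-B*4402"),
}

if __name__ == "__main__":
    for patient, alleles in EXAMPLE_PATIENTS.items():
        status = "carrier: flag for review" if flag_abacavir_risk(alleles) else "not a carrier"
        print(f"{patient}: {status}")
```

A single copy is enough to raise the flag here, since the slide describes the reaction as correlated with presence of the allele.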


Cytochrome P450 and drug metabolism

• Cytochrome P450 is a family of enzymes in the liver
• Responsible for metabolizing a wide variety of drugs
• Variations in sequences affect activity of these enzymes
• Lowered activity or loss of activity can cause drug toxicity
• Genetic tests for variations in cytochrome P450 genes warn of potential overdose dangers
• Pharmaceutical companies screen compounds for rates of metabolism by cytochrome P450 enzymes


J.D. Watson – lessons from genome

• The sequence of J.D. Watson’s DNA has been determined.

• He is homozygous for an unusual allele of CYP2D6, an important drug-metabolizing cytochrome P450 gene

• Individuals with his genotype metabolise some drugs more slowly than other people.

• Watson has been taking β blockers to lower his blood pressure.

• Side effect: made him unacceptably sleepy.

• Now he is taking a lower dosage.