Binnaz Yalçın, Jan Fullerton, Sue Miller, Richard Copley, Richard Mott and Jonathan Flint

Complex Trait Consortium Meeting, Oxford, July 1st 2003. These mice have gone off their cheese… A genetic basis for depression? Anxiety susceptibility in the HS mice: How far are we from discovering a QTG?


Page 1

Binnaz Yalçın, Jan Fullerton, Sue Miller, Richard Copley, Richard Mott and Jonathan Flint

Complex Trait Consortium Meeting, Oxford, July 1st 2003

These mice have gone off their cheese… A genetic basis for depression?

Anxiety susceptibility in the HS mice: How far are we from discovering a QTG?

Page 2

Fine-resolution mapping on mouse chromosome 1

[Figure: -log P value (0 to 9) against distance (0 to 1.4 cM) for markers D1Mit264, D1Mit394, D1Imm103, D1Mit100, D1Mit423, D1Mit198, D1Mit194, D1Mit102, D1Mit289 and D1Mit369. The 95 % CI spans 0.8 cM.]

Page 3

Mouse chromosome 1

[Figure: the 0.8 cM interval shown on cR, cM, Mb (143.0 to 148.0) and FISH scales, with markers D1Mit423, D1Mit100, D1Mit499, D1Mit395, D1Mit101, D1Mit264, D1Mit194, D1Mit102 and D1Mit198.]

Page 4

A 4.8-Mb high-resolution integrated BAC-based map

[Figure: mouse chromosome 1, the 0.8 cM interval on cR, cM, Mb (143.0 to 148.0) and FISH scales, with markers D1Mit423, D1Mit100, D1Mit499, D1Mit395, D1Mit101, D1Mit264, D1Mit194, D1Mit102 and D1Mit198.]

BAC clones shown on the map: 436B15, 146B4, 278M14, 305E1, 134A16, 445F7, 90A8, 238L2, 132C16, 185E17, 447B15, 4K20, 431N20, 231L2, 459A11, 238K21, 278P12, 329H3, 101B24, 7I3, 220K2, 174G1, 206E19, 285F13, 129N3, 480H2, 282N6, 212I24, 368O20, 311I21, 37J4.

BAC-end markers shown on the map: 278M14S, 278M14T, 90A8S, 212I24T, 37J4S, 37J4T, 101B24S, 4K20T, 231L2S, 231L2T, 278P12T, 132C16T, 329H3T.

Page 5

Approaches used to identify genes

1. Find expressed sequence tags (ESTs) using BLAST alignment

2. Compare with other species

Page 6

How many genes would you expect in a 4.8-Mb region?
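A rough expectation can be worked out from genome-wide averages. The sketch below assumes round figures of about 2.5 Gb and about 30,000 genes for the whole mouse genome. Both numbers are assumptions chosen for illustration and are not taken from the slides; only the 4.8 Mb interval size is.

```python
# Back-of-envelope estimate of the gene count expected in the mapped interval.
# ASSUMPTIONS (not from the slides): mouse genome ~2.5 Gb, ~30,000 genes.
GENOME_SIZE_BP = 2.5e9
GENOME_GENE_COUNT = 30_000
REGION_SIZE_BP = 4.8e6  # interval size quoted on the slides


def expected_genes(region_bp, genome_bp=GENOME_SIZE_BP, genes=GENOME_GENE_COUNT):
    """Expected number of genes if genes were spread evenly over the genome."""
    return region_bp / genome_bp * genes


if __name__ == "__main__":
    print(f"Expected genes in 4.8 Mb: about {expected_genes(REGION_SIZE_BP):.0f}")
```

Gene density varies widely along a chromosome, so an even-spread estimate of this kind is only a baseline against which the observed count can be compared.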

Page 7

Only 10 expressed sequences found

[Figure: positions of the expressed sequences on the physical map, 143.0 to 148.0 Mb.]

• 2 unknown ESTs (B302775 and B830045N13), homologues of CDC73 and of retinoic acid inducible neural specific protein, respectively.

• B3Galt2 (Beta 1,3-Galactosyltransferase 2).

• Glrx2 (Glutaredoxin 2), also known as thioltransferase.

• SSA2 (Sjögren Syndrome Autoantigen).

• UCHL5 (Ubiquitin C-Terminal Hydrolase L5).

• 4 RGS genes (Regulator of G protein Signalling): RGS18, RGS2, RGS13 and RGS1.

Page 8

Have we missed any expressed sequences?

Page 9

The Fugu genome is ideal for gene discovery in vertebrates

● It contains a similar number of genes to other vertebrates, with short intergenic regions.

● It spans 365 Mb, which has been sequenced to over 95% coverage.

Page 10

Mouse-Fugu comparison

● The 4.8 Mb region was aligned to the whole Fugu genome.

● Significant hits were identified.

● Are there any new matches that are explained by unidentified expressed sequences?

● All the hits found correspond to the genes previously identified.

● We haven't missed any coding sequence.

Page 11

Are there any variants in these genes?

Page 12

Identification of variants

● In each of the HS founder strains, and also in 12 HS mice, we sequenced all the genes previously identified.

● We covered coding sequences for all the genes.

● All RGS genes were fully sequenced, including 4 Kb in the 5’ UTR and 2 Kb in the 3’ UTR.

Page 13

Sequencing results for RGS2, RGS1 and RGS18

Gene    Length (bp)  Polymorphisms  Coverage
RGS18   22592        49             100 %
RGS2    7145         22             100 %
RGS1    7368         96             100 %

[Figure: for each gene, tracks for scale, structure (exons 1 to 5), coverage and polymorphisms (SNP, del/ins, repeats), with coding variants marked.]

Page 14

Summary of gene sequencing

• We sequenced 100 Kb in each of the 8 HS founders and in 12 HS mice.

• We found 296 polymorphisms.

• 81% were SNPs, 13% repeats and 6% ins/del.

• Average polymorphism rate is 1 per 200 bp.

• We observed segments of high (1 per 50 bp) and low (1 per 500 bp) polymorphism rates.

• All the polymorphisms found in the HS founders are also present in the HS mice.

Page 15

Coding variants identified in 10 genes

Symbol       Length (bp)  Coverage (%)  Total variants  5' UTR  Intronic  Coding
BC027756     93399        10            6               -       6         -
B3Galt2      1300         100           0               -       -         0
Glrx2        7150         50            8               -       7         1
SSA2         21407        50            20              -       17        3
UCHL5        29325        35            11              -       11        -
B830045N13   387717       3             7               -       7         -
RGS2         7145         100           22              13      9         -
RGS13        45502        100           77              18      59        -
RGS1         7368         100           96              39      53        4
RGS18        22592        100           49              10      39        -
Total                                   296             80      208       8

(- : no value given on the slide)
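The per-gene figures in the table can be turned into polymorphism densities. The sketch below is an illustration only: it assumes that the coverage column is the fraction of each gene's length that was sequenced, and the function names are invented for this example.

```python
# Rows copied from the table: (symbol, length in bp, coverage in %, total variants).
GENES = [
    ("BC027756", 93399, 10, 6),
    ("B3Galt2", 1300, 100, 0),
    ("Glrx2", 7150, 50, 8),
    ("SSA2", 21407, 50, 20),
    ("UCHL5", 29325, 35, 11),
    ("B830045N13", 387717, 3, 7),
    ("RGS2", 7145, 100, 22),
    ("RGS13", 45502, 100, 77),
    ("RGS1", 7368, 100, 96),
    ("RGS18", 22592, 100, 49),
]


def sequenced_bp(length_bp, coverage_pct):
    """Approximate sequenced bases, ASSUMING coverage is a fraction of gene length."""
    return length_bp * coverage_pct / 100


def bp_per_variant(length_bp, coverage_pct, variants):
    """Average spacing between variants, or None when no variant was found."""
    if variants == 0:
        return None
    return sequenced_bp(length_bp, coverage_pct) / variants


if __name__ == "__main__":
    for symbol, length, coverage, variants in GENES:
        spacing = bp_per_variant(length, coverage, variants)
        text = "no variants" if spacing is None else f"1 per {spacing:.0f} bp"
        print(f"{symbol:<12}{text}")
    print("Total variants:", sum(row[3] for row in GENES))
```

Printing the spacing per gene makes the contrast between densely and sparsely polymorphic genes easy to see at a glance.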

Page 16

Does the variant alter protein function?

Gene   Exon  Variant  PolyPhen  SIFT
Glrx2  2     I20I     Silent    Silent
SSA2   2     V167A    Benign    Tolerant
SSA2   8     A461T    Benign    Tolerant
SSA2   8     V465I    Benign    Tolerant
RGS1   1     F6F      Silent    Silent
RGS1   3     I60M     Benign    Tolerant
RGS1   4     R88K     Benign    Tolerant
RGS1   5     K186K    Silent    Silent

Page 17

Summary

● The 0.8 cM interval contains 4.8 Mb of DNA.

● 10 genes were identified in 4.8 Mb.

● 3 genes have coding variants, none of which are predicted to alter the gene’s function.

● We cannot find any mutations that disrupt gene function.

Page 18

How can we identify functionally important non-coding variants?

Page 19

Mouse-human comparison

Page 20

Sequencing conserved non-coding regions

● We found over 600 conserved non-coding regions using 70% identity over 100 bp regions.

● We sequenced 20% of the conserved non-coding regions, representing 120 Kb of sequencing in each of the HS founder strains.

● Extrapolating, we predicted that there are over 1000 polymorphisms in the 4.8 Mb region.
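The criterion of 70% identity over 100 bp can be illustrated as a sliding-window scan over a pairwise mouse-human alignment. The sketch below is a minimal illustration, not the tool used for the analysis: it assumes the two sequences are already aligned, gapped with "-" and of equal length, and the function names are hypothetical.

```python
def window_identity(seq_a, seq_b, start, size):
    """Fraction of identical, non-gap columns in one alignment window."""
    matches = sum(
        1
        for a, b in zip(seq_a[start:start + size], seq_b[start:start + size])
        if a == b and a != "-"
    )
    return matches / size


def conserved_windows(seq_a, seq_b, size=100, min_identity=0.70):
    """Start positions of all windows that meet the identity threshold."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return [
        start
        for start in range(len(seq_a) - size + 1)
        if window_identity(seq_a, seq_b, start, size) >= min_identity
    ]


if __name__ == "__main__":
    # Toy alignment: 80 identical columns followed by 40 mismatches.
    mouse = "A" * 120
    human = "A" * 80 + "C" * 40
    print(conserved_windows(mouse, human))
```

Overlapping windows that pass the threshold would normally be merged into a single conserved region, and windows overlapping known exons would be dropped to keep only the non-coding ones; both steps are left out here for brevity.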

Page 21

What is the arrangement of polymorphisms across the genomes of the 8 HS founders?

Page 22

Polymorphisms found in the HS founders

● Primers spaced on average every 5-10 Kb.

● All polymorphisms detected by sequencing.

● 1219 polymorphisms found, including 76% SNPs, 14% del/ins and 10% repeat polymorphisms.

● Average polymorphism density is 1 per 5 Kb.

Page 23

Examples of pairwise comparison of inbred strains

[Figure: three panels (AJ/C57, BALB/C57 and I/RIII), each plotting number of variants per 100 Kb against physical distance (0 to 50).]

Page 24

Summary of variants found

● 8 in coding regions.

● 80 in 5’ UTR.

● 208 in introns.

● 1000 in conserved non-coding regions.

● 713 in non-conserved regions.

Page 25

What is the probability that a variant influences the phenotype?

Page 26

Assigning probabilities to variants

● We originally identified the QTL by testing for differences between the 8 HS founder strains, allowing each strain to have a different trait value.

● But a SNP merges the founder strains into two groups.

● If the SNP is the QTN, then forcing the strains within a group to have the same trait value in the statistical test will fit as well.

● If the test is non-significant, then we can exclude that SNP as a candidate.
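The merging idea can be sketched with a one-way ANOVA. The example below is a simplified illustration under stated assumptions, not the analysis behind the slides: it assumes each animal's founder strain at the locus is known with certainty, it reports only the F statistic (turning it into a P value needs the F distribution, which is not shown), and the trait values in `TOY_TRAITS` are invented.

```python
STRAINS = ("A/J", "AKR", "BALB", "C3H", "C57", "DBA", "I", "RIII")

# HYPOTHETICAL trait values, two animals per strain, invented for illustration.
TOY_TRAITS = {
    "A/J": [0.9, 1.1], "AKR": [0.9, 1.1], "C3H": [0.9, 1.1],
    "BALB": [-0.1, 0.1], "C57": [-0.1, 0.1], "DBA": [-0.1, 0.1],
    "I": [-0.1, 0.1], "RIII": [-0.1, 0.1],
}


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of trait values."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def strain_groups(traits):
    """Full model: every founder strain keeps its own trait mean."""
    return [traits[s] for s in STRAINS]


def merged_groups(traits, alleles):
    """Merged model: strains sharing an allele are pooled into one group."""
    pooled = {}
    for strain, allele in zip(STRAINS, alleles):
        pooled.setdefault(allele, []).extend(traits[strain])
    return list(pooled.values())


if __name__ == "__main__":
    print("8-strain F:", round(anova_f(strain_groups(TOY_TRAITS)), 1))
    # Allele pattern G G A G A A A A, in STRAINS order.
    print("merged F:  ", round(anova_f(merged_groups(TOY_TRAITS, "GGAGAAAA")), 1))
    # A pattern that cuts across the trait difference gives a weak test.
    print("mismatch F:", round(anova_f(merged_groups(TOY_TRAITS, "GAGAGAGA")), 1))
```

In the toy data the strains carrying G all share a high trait value, so pooling by that allele keeps the signal, while a variant whose alleles do not track the trait gives a weak merged test and would be excluded.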

Page 27

Mouse RGS2 sequencing: alleles in the inbred strains and HAPPY P-values

Region     Position  Type      A/J  AKR  BALB  C3H  C57  DBA  I    RIII  P-VAL
5' UTR     -3045     (G)n      x8   x8   x11   x8   x11  x11  x11  x11   6.46E-10
5' UTR     -3035     SNP       G    G    A     G    A    A    A    A     6.46E-10
5' UTR     -2986     (T)n      x8   x8   x9    x8   x9   x9   x9   x9    6.46E-10
5' UTR     -2951     SNP       G    G    A     G    A    A    A    A     6.46E-10
5' UTR     -2854     (GTTTT)n  x5   x5   x6    x5   x6   x6   x6   x6    6.46E-10
5' UTR     -2545     SNP       G    G    C     G    C    C    C    C     6.46E-10
5' UTR     -2359     (T)+      yes  yes  no    yes  no   no   no   no    6.46E-10
5' UTR     -2347     (CG)+     yes  no   no    yes  no   no   no   no    1.83E-06
5' UTR     -2117     SNP       G    G    A     G    A    A    A    A     6.46E-10
5' UTR     -1973     SNP       G    G    T     G    T    T    T    T     6.46E-10
5' UTR     -1888     SNP       A    A    C     A    C    C    C    C     6.46E-10
5' UTR     -1673     (A)n      x14  x14  x20   x14  x20  x20  x20  x20   6.46E-10
5' UTR     -1916     SNP       G    G    A     G    A    A    A    A     6.46E-10
Intron1-2  192       SNP       T    T    C     C    C    C    C    C     5.37E-03
Intron1-2  241       SNP       T    T    C     T    C    C    C    C     6.46E-10
Intron1-2  267       SNP       C    C    T     C    T    T    T    T     6.46E-10
Intron1-2  653       (CA)n     x11  x11  x24   x11  x24  x24  x24  x24   6.46E-10
Intron1-2  1058      (T)n      x9   x9   x10   x9   x10  x10  x10  x10   6.46E-10
Intron2-3  1266      SNP       G    G    T     G    T    T    T    T     6.46E-10
Intron3-4  1711      (T)n      x4   x4   x5    x4   x5   x5   x5   x5    6.46E-10
Intron3-4  1750      SNP       T    T    C     T    C    C    C    C     6.46E-10
Intron4-5  2159      SNP       A    A    G     A    G    G    G    G     6.46E-10
3' UTR     3297      SNP       A    A    G     A    G    G    G    G     6.46E-10

Mouse RGS13 sequencing: alleles in the inbred strains and HAPPY P-values

Region     Position  Type      A/J  AKR  BALB  C3H  C57  DBA  I    RIII  P-VAL
5' UTR     -4922     (A)n      x8   x9   x10   x8   x10  x10  x10  x8    9.03E-01
5' UTR     -4697     SNP       T    C    T     T    T    T    T    T     2.95E-01
5' UTR     -4062     (A)n      x4   x8   x6    x4   x6   x6   x6   x4    9.03E-01
5' UTR     -4042     (CAAA)n   x5   x4   x5    x5   x5   x5   x5   x7    4.62E-03
5' UTR     -4027     (A)n      x13  x8   x13   x13  x13  x13  x13  x5    4.62E-03
5' UTR     -4026     (C)-      no   yes  no    no   no   no   no   no    2.95E-01
5' UTR     -3820     SNP       C    T    C     C    C    C    C    C     2.95E-01
5' UTR     -3725     SNP       C    G    C     C    C    C    C    C     2.95E-01
5' UTR     -3566     SNP       A    G    A     A    A    A    A    A     2.95E-01
5' UTR     -3374     SNP       G    A    G     G    G    G    G    G     2.95E-01
5' UTR     -3284     SNP       G    A    G     G    G    G    G    G     2.95E-01
5' UTR     -2778     SNP       T    T    G     T    G    G    G    T     3.40E-01
5' UTR     -2754     SNP       G    A    G     G    G    G    G    G     2.95E-01
5' UTR     -2665     (TAGA)n   x7   x4   x4    x7   x4   x4   x4   x7    7.80E-01
5' UTR     -2524     SNP       T    T    C     T    C    C    C    T     3.40E-01
5' UTR     -2181     SNP       A    A    T     A    T    T    T    A     3.40E-01
5' UTR     -1947     SNP       T    C    C     T    C    C    C    T     7.80E-01
5' UTR     -1655     SNP       C    T    C     C    C    C    C    C     2.95E-01
5' UTR     -613      (CA)n     x7   x8   x7    x7   x7   x7   x7   x7    2.95E-01
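Variants that split the eight founders in the same way share the same P value in these tables, which is what the merging argument predicts. The sketch below groups a few RGS2 rows by their strain distribution pattern to make that visible. The helper names are hypothetical, and the P values are read as scientific notation (6.46-10 taken to mean 6.46E-10), which is an assumption about the slide's formatting.

```python
from collections import defaultdict

STRAINS = ("A/J", "AKR", "BALB", "C3H", "C57", "DBA", "I", "RIII")

# A few rows copied from the RGS2 table: (position, alleles in STRAINS order, P value).
RGS2_ROWS = [
    (-3035, ("G", "G", "A", "G", "A", "A", "A", "A"), 6.46e-10),
    (-2347, ("yes", "no", "no", "yes", "no", "no", "no", "no"), 1.83e-06),
    (-2117, ("G", "G", "A", "G", "A", "A", "A", "A"), 6.46e-10),
    (192, ("T", "T", "C", "C", "C", "C", "C", "C"), 5.37e-03),
    (2159, ("A", "A", "G", "A", "G", "G", "G", "G"), 6.46e-10),
]


def distribution_pattern(alleles):
    """Relabel alleles by order of first appearance, e.g. G G A G -> 0 0 1 0."""
    labels = {}
    return tuple(labels.setdefault(a, len(labels)) for a in alleles)


def group_by_pattern(rows):
    """Map each strain distribution pattern to its (position, P value) pairs."""
    groups = defaultdict(list)
    for position, alleles, p_value in rows:
        groups[distribution_pattern(alleles)].append((position, p_value))
    return dict(groups)


if __name__ == "__main__":
    for pattern, members in group_by_pattern(RGS2_ROWS).items():
        carriers = [s for s, label in zip(STRAINS, pattern) if label == 0]
        print(carriers, members)
```

Because the pattern ignores which nucleotide or repeat length is involved, SNPs, repeats and insertions with the same founder split fall into one group and need only one statistical test.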

Page 28

HAPPY results across our whole region

[Figure: logP (0 to 25) against physical distance (142.5 to 147.5 Mb).]

Page 29

HAPPY results across our whole region

[Figure: logP (0 to 25) against physical distance (142.5 to 147.5 Mb), with gene positions labelled SSA2, B8, B3, B3Galt2, Glrx2, UCHL5, RGS2, RGS1, RGS18 and RGS13.]

Page 30

Most significant SNPs lie within a conserved non-coding region

Page 31

How many variants could we exclude?

● We can exclude the 77% of identified SNPs that are not significant.

● Among coding variants, none is significant.

● In the 5’ UTR regions, 17 are significant.

● We can further exclude another 13% which lie in non-conserved regions.

● This identifies 120 SNPs as significant.
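The figure of about 120 remaining SNPs can be checked with simple arithmetic. The sketch below assumes that both exclusion percentages apply to the 1219 polymorphisms reported for the HS founders; the slides do not state the denominator, so this is an assumption, and the function name is hypothetical.

```python
TOTAL_VARIANTS = 1219  # polymorphisms found in the HS founders (from the slides)

# Percentages excluded at each step (from the slides).
EXCLUDED_PCT = {
    "not significant": 77,
    "in non-conserved regions": 13,
}


def remaining_variants(total, excluded_pct):
    """Variants left after removing the given percentages of the total.

    ASSUMPTION: every percentage is a share of the same total.
    """
    remaining_pct = 100 - sum(excluded_pct.values())
    return total * remaining_pct / 100


if __name__ == "__main__":
    left = remaining_variants(TOTAL_VARIANTS, EXCLUDED_PCT)
    print(f"Remaining candidates: about {left:.0f}")
```

Under that assumption roughly a tenth of the variants remain, which is in line with the 120 quoted on the slide.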

Page 32

Conclusions

● There are no obvious coding variants that are the QTN.

● Haplotype analysis can help limit the search but involves immense amounts of sequencing.

● There may not be a single responsible variant.

● One region, 5’ of the RGS18 gene, contains the most significant SNPs, within a conserved non-coding region.

Page 33

Acknowledgements

● Jonathan Flint

● Richard Mott

● Jan Fullerton

● Sue Miller

● Andrew Morris

● Richard Copley

● John Broxholme

Page 34

Acknowledgements