Nick Heppenstall (biology) Michal Dvir (Mathematics/CS) Andrew Dittmore (physics) Under guidance of Dr. Yung-Pin Chen (Mathematics) DNA Sequence Comparison and Alignment
Date posted: 22-Dec-2015

Page 1:

Nick Heppenstall (biology)

Michal Dvir (Mathematics/CS)

Andrew Dittmore (physics)

Under guidance of Dr. Yung-Pin Chen (Mathematics)

DNA Sequence Comparison and Alignment

Page 2:

Outline

• Pi in base 4

• DNA Overview

• Markov Chains & Models

• Sequence Alignment

• Future Plans

Page 3:

π in base 10: 3.14159…

π in base 4: 3.021003331…

Page 4:

The Normality of Pi

Looking at π in base 4, and assuming its digits behave like those of a normal number (conjectured for π, not proven), the chance of seeing

• 2 is 1:4

• 22 is 1:16

• 222 is 1:64

• 2222 is 1:256
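
The block odds on this slide can be compared with observed frequencies in the actual digits. The following is a minimal sketch, assuming Machin's arctangent formula as the way to obtain the digits (the slides do not say how theirs were computed); `pi_base4_digits` and `block_frequency` are names chosen here for illustration.

```python
def pi_base4_digits(n, guard=16):
    """First n base-4 digits of pi after the point (Machin's formula, integers only)."""
    scale = 4 ** (n + guard)  # fixed-point scale; guard digits absorb rounding error

    def arctan_inv(x):
        # scale * arctan(1/x) via the alternating Taylor series
        power = scale // x
        total = power
        k = 1
        sign = 1
        while power:
            power //= x * x
            k += 2
            sign = -sign
            total += sign * (power // k)
        return total

    pi_scaled = 4 * (4 * arctan_inv(5) - arctan_inv(239))
    frac = pi_scaled % scale  # drop the integer part (3)
    digits = []
    for _ in range(n):
        frac *= 4
        digits.append(frac // scale)
        frac %= scale
    return digits


def block_frequency(digits, block):
    """Fraction of overlapping windows of len(block) that equal block."""
    m = len(block)
    windows = len(digits) - m + 1
    hits = sum(1 for i in range(windows) if digits[i:i + m] == block)
    return hits / windows


digits = pi_base4_digits(5000)
for block in ([2], [2, 2], [2, 2, 2], [2, 2, 2, 2]):
    observed = block_frequency(digits, block)
    print(block, round(observed, 4), "vs", 4 ** -len(block))
```

With only 5000 digits the longer blocks are rare, so their observed frequencies can sit noticeably away from 1:64 and 1:256 without contradicting normality.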

Page 5:

3, 0, 2, 1, 0, 0, 3, 3, 3, 1, 2, 2, 2, 2, 0, 2, 0, 2, 0, 1, 1, 2, 2, 0, 3, 0, 0, 2, 0, 3, 1, 0, 3, 0, 1, 0, 3, 0, 1, 2, 1, 2, 0, 2, 2, 0, 2, 3, 2, 0, 0, 0, 3, 1, 3, 0, 0, 1, 3, 0, 3, 1, 0, 1, 0, 2, 2, 1, 0, 0, 0, 2, 1, 0, 3, 2, 0, 0, 2, 0, 2, 0, 2, 2, 1, 2, 1, 3, 3, 0, 3, 0, 1, 3, 1, 0, 0, 0, 0, 2,0, 0, 2, 3, 2, 3, 3, 2, 2, 2, 1, 2, 0, 3, 2, 3, 0, 1, 0, 3, 2, 1, 2, 3, 0, 2, 0, 2, 1, 1, 0, 1, 1, 0, 2, 2, 0, 0, 2, 0, 1, 3, 2, 1, 2, 0, 3, 2, 0, 3, 1, 0, 0, 0, 1, 0, 3, 1, 3, 1, 3, 2, 3, 3, 2, 1, 1, 1,0, 1, 2, 1, 2, 3, 0, 3, 3, 0, 3, 1, 0, 3, 2, 2, 1, 0, 0, 3, 0, 1, 2, 3, 0, 3, 0, 0, 0, 2, 2, 3, 0, 0, 2, 2, 1, 2, 3, 1, 3, 3, 0, 2, 1, 1, 3, 3, 0, 1, 1, 0, 0, 3, 1, 3, 1, 0, 3, 3, 3, 2, 0, 1, 0, 3, 1, 1, 1, 2, 3, 1, 1, 2, 3, 1, 1, 1, 0, 1, 3, 0, 0, 2, 1, 0, 1, 1, 3, 2, 1, 0, 2, 0, 1, 1, 2, 3, 1, 1, 1, 3, 1, 2, 1, 2, 0, 2, 1, 1, 3, 2, 1, 3, 3, 2, 3, 0, 1, 2, 3, 3, 1, 0, 1, 0, 3, 0, 1, 0, 0, 2, 3, 2, 2, 1, 2, 2, 1, 2, 0, 3, 1, 3, 3, 2, 3, 1, 1, 2, 2, 3, 0, 0, 2, 3, 3, 3, 3, 3, 1, 1, 3, 0, 2, 3, 1, 2, 3, 3, 1, 0, 0, 0, 1, 2, 2, 3, 1, 3, 3, 2, 3, 1, 3, 2, 3, 2, 0, 3, 2, 0, 1, 2, 2, 3, 3, 3, 2, 3, 1, 1, 2, 2, 2, 0, 2, 1, 2, 1, 3, 3, 2, 2, 1, 1, 2, 2, 3, 2, 2, 1, 3, 3, 0, 2, 1, 0, 0, 1, 0, 1, 1, 3, 3, 0, 1, 0, 2, 3, 0, 1, 3, 3, 3, 2, 1, 2, 1, 0, 2, 1, 0, 2, 2, 0, 1, 2, 1, 2, 1, 1, 0, 1, 3, 2, 3, 0, 3, 2, 1, 0, 1, 1, 2, 3, 0, 3, 3, 1, 3, 0, 0, 2, 0, 0, 0, 0, 1, 3, 3, 0, 2, 3, 2, 0, 2, 2, 0, 1, 1, 2, 0, 3, 2, 3, 3, 3, 0, 0, 1, 1, 2, 1, 2, 0, 3, 1, 2, 2, 1, 0, 2, 0, 0, 3, 1, 2, 0, 1, 3, 0 . . .

Digits of π in base 4:

Page 6:

First 5000 digits of π in base 4.

Page 7:

t,a,g,t,a,a,a,a,t,t,a,a,a,t,t,a,a,t,t,a,t,a,a,a,a,t,t,a,t,a,t,a,t,a,t,a,a,t,t,t,a,c,t,a,a,c,t,t,t,a,g,t,t,a,g,a,t,a,a,a,t,t,a,a,t,a,a,t,a,t,a,t,a,a,g,t,t,t,t,a,g,t,a,c,a,t,t,a,a,t,a,t,t,a,t,a,t,t,t,t,a,a,a,t,a,t,t,t,t,a,t,t,t,a,g,t,g,t,c,t,a,g,a,a,a,a,a,a,a,t,g,t,g,t,a,a,c,c,c,a,t,g,a,c,t,g,t,a,g,g,a,a,a,c,t,c,t,a,ga,g,g,g,t,a,a,g,a,a,a,g,a,t,c,g,a,t,c,g,c,t,t,t,a,t,a,g,a,g,a,c,c,a,t,c,a,g,a,a,a,g,a,g,g,t,t,t,a,a,t,a,t,t,t,t,t,g,t,g,a,g,a,c,c,a,t,t,g,a,a,g,a,g,a,g,a,a,a,g,a,g,a,a,a,g,a,g,a,a,t,a,a,a,a,a,t,a,t,t,t,t,a,g,t,g,a,c,t,c,c,a,tc,a,g,a,a,a,g,a,g,g,t,t,t,a,a,t,a,t,t,t,t,t,g,t,g,a,g,a,c,c,a,t,t,g,a,a,g,a,g,a,g,a,a,a,g,a,g,a,a,a,g,a,g,a,a,t,a,a,a,a,a,t,a,t,t,t,t,a,g,t,g,a,c,t,c,ca,t,c,a,g,a,a,a,g,a,g,g,t,t,t,a,a,t,a,t,t,t,t,t,g,t,g,a,g,a,c,c,a,t,c,g,a,a,g,a,g,a,g,a,a,a,g,a,g,a,a,t,a,a,a,a,a,t,a,t,t,t,t,t,g,t,a,a,a,a,c,t,t,t,t,t,t,a,t,g,a,g,a,c,c,a,t,t,g,a,a,g,a,g,a,g,a,a,a,g,a,g,a,a,t,a,a,a,a,a,t,a,t,t,tt,t,g,t,a,a,a,a,c,t,t,t,t,t,t,a,t,g,a,g,a,c,c,a,t,t,g,a,a,g,a,g,a,g,a,a,a…

Bases of the cowpox genome:

Page 8:

First 5000 bases of the cowpox genome.

Page 9:

Three-Dimensional Trajectories*

[Figure: two trajectory panels, labeled "Pi (random)" and "DNA"]

*H.T. Chang, N. Lo, W. Lu, C.J. Kuo, “Visualization and Comparison of DNA Sequences by Use of Three-Dimensional Trajectories.”

Page 10:

DNA

• Deoxyribonucleic Acid

• Double helix

• Chain of nucleotide subunits

• Four bases in DNA (A,T,C,G)

• Hold information for maintaining life

• Passed from parent(s) to offspring

Page 11:

Mutations

• Single base substitutions

• Insertions/Deletions

• Duplications

• Translocations

• Inversions

Single base substitution examples (original → mutated):

…ACT CCT GAG GAG… → …ACT CCT GTG GAG…
Thr Pro Glu Glu → Thr Pro Val Glu (amino acid changed)

…ACT CCT GAG GAG… → …ACT CCT GAG TAG
Thr Pro Glu Glu → Thr Pro Glu STOP (premature stop)

…ACT CCT GAG GAG… → …ACT CCT GAG GAA…
Thr Pro Glu Glu → Thr Pro Glu Glu (no change in the protein)

Causes:

• Environmental factors

• Copying errors
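
The three substitution outcomes on this slide can be told apart mechanically by translating both versions and comparing. This is a small sketch with a deliberately partial codon table covering only the codons shown on the slide; `classify_substitution` is an illustrative name, not something from the talk.

```python
# Partial codon table: only the codons that appear on the slide.
CODON_TABLE = {
    "ACT": "Thr", "CCT": "Pro", "GAG": "Glu",
    "GTG": "Val", "GAA": "Glu", "TAG": "STOP",
}


def translate(dna):
    """Translate a DNA string (length a multiple of 3) codon by codon."""
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def classify_substitution(original, mutated):
    """Label a same-length substitution as silent, missense or nonsense."""
    before, after = translate(original), translate(mutated)
    if before == after:
        return "silent"      # protein unchanged
    if "STOP" in after and "STOP" not in before:
        return "nonsense"    # premature stop codon
    return "missense"        # one amino acid replaced


original = "ACTCCTGAGGAG"
for mutated in ("ACTCCTGTGGAG", "ACTCCTGAGTAG", "ACTCCTGAGGAA"):
    print(mutated, classify_substitution(original, mutated))
```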

Page 12:

DNA sequence comparison

• Homologous genes

• Conserved sequences

• Identify mutations

• Forensics

• Evolution

QUANTITATIVE!

Page 13:

Markov Chain

Definition: A collection of random variables having the property that, given the present, the future is conditionally independent of the past.

Example: Annual percentage migration between city and country

[Diagram: two states, City and Country, with self-transition probabilities 0.95 and 0.97 and cross-transition probabilities 0.05 and 0.03]
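
To make the migration example concrete, the chain can be stepped forward year by year. The sketch below assumes, purely for illustration, that City is the state that keeps 95% of its residents and Country the one that keeps 97%; the extracted slide does not say which state carries which number, and the starting split of 60/40 is invented.

```python
STATES = ("City", "Country")

# Assumption: City keeps 95% per year, Country keeps 97% per year.
P = {
    "City": {"City": 0.95, "Country": 0.05},
    "Country": {"City": 0.03, "Country": 0.97},
}


def step(dist):
    """Advance a population distribution by one year."""
    return {t: sum(dist[s] * P[s][t] for s in STATES) for t in STATES}


def long_run(dist, years=1000):
    """Apply step repeatedly; the result approaches the stationary distribution."""
    for _ in range(years):
        dist = step(dist)
    return dist


start = {"City": 0.6, "Country": 0.4}  # hypothetical starting split
print(step(start))
print(long_run(start))
```

Because next year's split depends only on this year's, the long-run split does not depend on the starting point, which is the Markov property at work.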

Page 14:

Hidden Markov Model

A Hidden Markov Model is a Markov chain in which each state (City/Country) generates an observation, or emission (Pet). The state itself is not observed; it is inferred from the emissions.

Example: Annual percentage migration between city and country

[Diagram: states City and Country, with self-transition probabilities 0.95 and 0.97, cross-transition probabilities 0.05 and 0.03, and one emission table attached to each state]

Emission table 1: Cow 0.5, Dog 0.3, Cat 0.1, None 0.1

Emission table 2: Cow 0.0, Dog 0.1, Cat 0.4, None 0.5
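
Predicting the hidden state from observed pets is usually done with the Viterbi algorithm. The sketch below is one way to do it under three loudly stated assumptions that the extracted slide does not settle: the cow-heavy emission table belongs to Country, City is the state with the 0.95 self-transition, and both states are equally likely at the start.

```python
STATES = ("City", "Country")
START = {"City": 0.5, "Country": 0.5}  # assumption: no prior knowledge
TRANS = {                               # assumption: City stays with 0.95
    "City": {"City": 0.95, "Country": 0.05},
    "Country": {"City": 0.03, "Country": 0.97},
}
EMIT = {                                # assumption: cow-heavy table is Country
    "City": {"Cow": 0.0, "Dog": 0.1, "Cat": 0.4, "None": 0.5},
    "Country": {"Cow": 0.5, "Dog": 0.3, "Cat": 0.1, "None": 0.1},
}


def viterbi(observations):
    """Most probable state sequence for a list of observed pets."""
    best = {s: START[s] * EMIT[s][observations[0]] for s in STATES}
    back = []
    for obs in observations[1:]:
        pointers, nxt = {}, {}
        for s in STATES:
            # best predecessor for ending in state s at this step
            prev = max(STATES, key=lambda p: best[p] * TRANS[p][s])
            pointers[s] = prev
            nxt[s] = best[prev] * TRANS[prev][s] * EMIT[s][obs]
        back.append(pointers)
        best = nxt
    state = max(STATES, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


print(viterbi(["Cow", "Dog", "Cow"]))
print(viterbi(["Cat", "None", "None", "Cat"]))
```

Plain probabilities are fine for short observation lists; for long ones they underflow, and log-probabilities are the usual remedy.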

Page 15:

HMM: State Transitions

[Diagram: the three states Match, Mismatch and InDel, connected by transition arrows]

States: Match, Mismatch and Indel

Page 16:

HMM: Emissions

Match: A, C, G, T

Mismatch: A/C, A/G, A/T, C/G, C/T, G/T

InDel: A/-, C/-, G/-, T/-

Emissions: A, C, G and T

Page 17:

Alignment/Comparison

Types of alignment

• Local

• Global

• Gapped

• Ungapped

Mutations are recorded in DNA

• Allow for comparison/alignment

Page 18:

Scoring matrices

A C G T

A 1 0 0 0

C 0 1 0 0

G 0 0 1 0

T 0 0 0 1

Gap = -1
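
The matrix and gap penalty on this slide translate directly into a lookup. A minimal sketch; the names `SCORES` and `column_score` are illustrative.

```python
BASES = "ACGT"
GAP = "-"
GAP_SCORE = -1

# Identity matrix from the slide: 1 on the diagonal, 0 elsewhere.
SCORES = {a: {b: int(a == b) for b in BASES} for a in BASES}


def column_score(a, b):
    """Score one aligned column; a gap on either side costs GAP_SCORE."""
    if a == GAP or b == GAP:
        return GAP_SCORE
    return SCORES[a][b]


print(column_score("A", "A"), column_score("A", "C"), column_score("A", "-"))
```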

Page 19:

Local alignment

H: TATGGTGGCGAGCAAACGTTGCGTGCGTA

Mouse: GAGCAAA

Page 20:

Local alignment

H: TATGGTGGCGAGCAAACGTTGCGTGCGTA
        |
Mouse: GAGCAAA

Score: 0+1+0+0+0+0+0 = 1

Page 21:

Local alignment

H: TATGGTGGCGAGCAAACGTTGCGTGCGTA
          |
Mouse:  GAGCAAA

Score: 0+0+1+0+0+0+0 = 1

Page 22:

Local alignment

H: TATGGTGGCGAGCAAACGTTGCGTGCGTA
           |
Mouse:   GAGCAAA

Score: 0+0+1+0+0+0+0 = 1

Page 23:

Local alignment

H: TATGGTGGCGAGCAAACGTTGCGTGCGTA
                |||||||
Mouse:          GAGCAAA

Score: 1+1+1+1+1+1+1 = 7
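
The sliding comparison shown across these slides can be written as a short loop: place the short sequence at every offset and count matching positions. This is a sketch of that ungapped scan only; a full local aligner such as Smith-Waterman would also consider gaps and partial overlaps, which the slides do not go into.

```python
def ungapped_scores(long_seq, short_seq):
    """Score short_seq at every offset along long_seq (1 per match, 0 per mismatch)."""
    n = len(short_seq)
    return [
        sum(a == b for a, b in zip(long_seq[off:off + n], short_seq))
        for off in range(len(long_seq) - n + 1)
    ]


human = "TATGGTGGCGAGCAAACGTTGCGTGCGTA"
mouse = "GAGCAAA"
scores = ungapped_scores(human, mouse)
best_offset = max(range(len(scores)), key=scores.__getitem__)
print(scores[:3], best_offset, scores[best_offset])
```

The first three offsets each score 1, matching the slides, and the best placement starts at the tenth human base with a score of 7.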

Page 24:

Global alignment

Human: TATGGTGGCGAGCAAACGTTGCGTGCGTA

Mouse: CATTGTGGTGAGCAAAGCGGTGGGCGGGTA

Page 25:

Global alignment

H: TATGGTGGCGAGCAAACGTTGCGCGTGTA
        || |||| |||||||      |
Mouse: CATTGTGGTGAGCAAAGCGGTGGGCGTGTA

14 matches

16 mismatches (including the unpaired final base of the mouse sequence)

Score: 14(1)+16(0) = 14

Page 26:

Global alignment

H: TATGGTGGCGAGCAAA-CGTTGCGCGTGTA
        || |||| ||||||| || || |||||||
Mouse: CATTGTGGTGAGCAAAGCGGTGGGCGTGTA

24 Matches

5 Mismatches

1 Indel

Score: 24(1)+5(0)+1(-1) = 23
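
The tally on this slide can be reproduced by walking the two aligned rows column by column. A minimal sketch using the slide's scores (match 1, mismatch 0, gap -1); `score_alignment` is an illustrative name.

```python
def score_alignment(top, bottom, match=1, mismatch=0, gap=-1):
    """Count matches, mismatches and indels column by column and total the score."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    matches = mismatches = indels = 0
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            indels += 1
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    total = matches * match + mismatches * mismatch + indels * gap
    return matches, mismatches, indels, total


human = "TATGGTGGCGAGCAAA-CGTTGCGCGTGTA"
mouse = "CATTGTGGTGAGCAAAGCGGTGGGCGTGTA"
print(score_alignment(human, mouse))
```

Moving the single gap from the end of the human row to position 17 is what lifts the count from 14 matches to 24.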

Page 27:

The scoring problem

Alignment

Scoring matrix

Page 28:

Our Solution

What if we align the DNA sequence to a model, instead of another sequence?

Page 29:

Why is this a solution?

Start with an initial model in which all probabilities are equally likely. Then modify the model recursively using one or more parent sequences, so that the uniform starting probabilities are replaced by values learned from those sequences.

Initial model:

1/3 1/3 1/3
1/3 1/3 1/3
1/3 1/3 1/3

Recursive modification gives:

0.92 0.03 0.05
0.18 0.69 0.13
0.14 0.19 0.67
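
The slides do not spell out the update rule, so the following is only one plausible reading: start every transition at an equal pseudocount and add observed transition counts from labelled state paths, renormalising each row. Standard HMM training (for example Baum-Welch) is more involved; `estimate_transitions` and the sample path are hypothetical.

```python
STATES = ("Match", "Mismatch", "Indel")


def estimate_transitions(paths, pseudocount=1.0):
    """Row-normalised transition counts; with no data every entry is 1/3."""
    counts = {s: {t: pseudocount for t in STATES} for s in STATES}
    for path in paths:
        for prev, cur in zip(path, path[1:]):
            counts[prev][cur] += 1
    return {
        s: {t: counts[s][t] / sum(counts[s].values()) for t in STATES}
        for s in STATES
    }


uniform = estimate_transitions([])
updated = estimate_transitions([["Match", "Match", "Match", "Mismatch"]])
print(uniform["Match"])
print(updated["Match"])
```

Each additional parent sequence shifts the rows further from 1/3 toward the observed transition frequencies, which is the qualitative behaviour the two matrices on the slide show.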

Page 30:

How does it score?

ACTGTGTAG

The Model:

1. Modification number
2. Length of original sequence
3. Transition matrix
4. Each emission matrix

Without knowing the initial state, the algorithm checks all possible state transitions and emissions for a best fit to the model.

1. Match/Match
2. Match/Mismatch
3. Match/Indel
4. Mismatch/Match
…

Page 31:

How does it score?

ACTGTGTAG

The Model:

1. Modification number
2. Length of original sequence
3. Transition matrix
4. Each emission matrix

Now the previous state is defined, so we have only 3 possible transitions to consider.

1. Match/Match
2. Match/Mismatch
3. Match/Indel

Page 32:

How does it score?

ACTGTGTAG

The Model:

1. Modification number
2. Length of original sequence
3. Transition matrix
4. Each emission matrix

This process will continue through the sequence, calculating the score and remembering the best fit to the model.

1. Mismatch/Match
2. Mismatch/Mismatch
3. Mismatch/Indel
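
One way to picture "calculating the score" is to turn a candidate alignment into its state path and add up the log transition probabilities along it; comparing candidates and keeping the highest total is then the "remembering the best fit" step. This sketch ignores emission probabilities (the slides give none) and assumes the modified matrix shown earlier has rows and columns ordered Match, Mismatch, Indel, which the source does not confirm.

```python
import math

# Assumption: row/column order of the modified matrix is Match, Mismatch, Indel.
TRANS = {
    "Match":    {"Match": 0.92, "Mismatch": 0.03, "Indel": 0.05},
    "Mismatch": {"Match": 0.18, "Mismatch": 0.69, "Indel": 0.13},
    "Indel":    {"Match": 0.14, "Mismatch": 0.19, "Indel": 0.67},
}


def column_state(a, b):
    """State implied by one aligned column."""
    if a == "-" or b == "-":
        return "Indel"
    return "Match" if a == b else "Mismatch"


def path_log_score(top, bottom):
    """Sum of log transition probabilities along the alignment's state path."""
    states = [column_state(a, b) for a, b in zip(top, bottom)]
    return sum(math.log(TRANS[p][c]) for p, c in zip(states, states[1:]))


candidates = [("ACTG-TG", "ACTGATG"), ("ACTGTG-", "ACTGATG")]
best = max(candidates, key=lambda pair: path_log_score(*pair))
print(best, path_log_score(*best))
```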

Page 33:

Future Plans

Create working Hidden Markov Model.

Find convergence as the Model is modified.

Apply similar model to codon analysis.

Develop DNA trajectories as an alternative approach to sequence comparison.

Page 34:

Modeling DNA with a Tetrahedron

Page 35:

Directional Vectors

[Figure: tetrahedron with one direction vector per base: A, C, G, T]

Pages 36-48:

[Animation: the sequence AGTTCG traced base by base as a trajectory along the direction vectors A, C, G, T]

Page 49:

Page 50:

Change Points

Page 51:

Approximate Vectors Between Change Points

Page 52:

Quantify Regions Between Change Points

• Trajectory Length
  – Tells the base count

• Vector Direction
  – Tells the relative frequencies of each base

• Vector Length vs. Trajectory Length
  – Tells how much the trajectory deviates from a straight line
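
The three quantities above can be computed once each base is given a direction. The sketch below assumes unit vectors pointing to the vertices of a regular tetrahedron centred at the origin; the particular base-to-vertex assignment is illustrative only, since the slides show the tetrahedron but not its coordinates.

```python
import math

# Assumption: this base-to-vertex assignment is illustrative, not from the slides.
_S = 1 / math.sqrt(3)
DIRECTION = {
    "A": (_S, _S, _S),
    "C": (_S, -_S, -_S),
    "G": (-_S, _S, -_S),
    "T": (-_S, -_S, _S),
}


def trajectory(seq):
    """Cumulative 3D positions visited while reading seq, starting at the origin."""
    points = [(0.0, 0.0, 0.0)]
    for base in seq:
        x, y, z = points[-1]
        dx, dy, dz = DIRECTION[base]
        points.append((x + dx, y + dy, z + dz))
    return points


def region_summary(seq):
    """Trajectory length, net vector, and straightness (vector length / trajectory length)."""
    end = trajectory(seq)[-1]
    vector_length = math.sqrt(sum(c * c for c in end))
    trajectory_length = float(len(seq))  # unit steps, so length equals base count
    straightness = vector_length / trajectory_length if seq else 0.0
    return trajectory_length, end, straightness


print(region_summary("AGTTCG"))
```

A region made of a single repeated base has straightness 1, while a region with equal counts of all four bases returns to the origin and has straightness 0.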

Page 53:

DNA trajectories can be used to

• Match patterns by grouping similar vectors

• Find conserved regions (vectors that do not change from sequence to sequence)

• Perform many local alignments to assemble global alignments

Page 54:

Thanks!

• Kellar Autumn

• Jeff Ely

• Amanda Gassett

• Deborah Lycan

• Harvey Schmidt

• Collin Trail

• Greg Hermann

• Matt Wilkinson

Page 55:

Work supported by

John S. Rogers Science Research Program