• date post

13-Jan-2016
• Category

## Documents

• view

36

1

Embed Size (px)

description

An Improved Search Algorithm for Optimal Multiple-Sequence Alignment. Paper by: Stefan Schroedl Presentation by: Bryan Franklin. Outline. Multiple-Sequence Alignment (MSA) Graph Representation Computing Path Costs Heuristics Experimental Results Other Optimizations. - PowerPoint PPT Presentation

### Transcript of An Improved Search Algorithm for Optimal Multiple-Sequence Alignment

• An Improved Search Algorithm for Optimal Multiple-Sequence AlignmentPaper by: Stefan SchroedlPresentation by: Bryan Franklin

• OutlineMultiple-Sequence Alignment (MSA)Graph RepresentationComputing Path CostsHeuristicsExperimental ResultsOther Optimizations

• Multiple-Sequence-AlignmentSequence:DNA: String over alphabet {A,C,G,T}Protein: String with ||=20 (one symbol for each amino acid)Alignment:Insert gaps (_) into sequences to line up matching characters.

• Multiple-Sequence-AlignmentSequences: ABCB, BCD, DB

• Multiple-Sequence-AlignmentIndel: Insertion, Deletion, Point mutation (single symbol replacement)Find a minimum set of indels between two or more sequences.NP-Hard for an arbitrary number of sequences

• Multiple-Sequence-AlignmentApplicationsCommon ancestry between speciesLocating useful portions of DNAPredicting structure of folded proteins

• Graph Representation

• Computing G(n)ConsiderationsBiological MeaningCost of computationSum-of-pairsSubstitution matrix ((||+1)2)Sum alignment costs for all pairsCosts can depend on neighbors

• Computing G(n)Sequences: ABCB, BCD, DB

• Computing G(n)67877

• Computing G(n)

• Heuristics

Methods Examined2-fold (hpair)divide-conquer3-fold (h3,all)4-fold (h4,all)

• Heuristic Comparison

• Resource Usage

• Optimizations

Sparse Path RepresentationCurve fitting for predicting threshold valuesSub-optimal paths periodically deleted

• The EndAny Questions?