An Improved Search Algorithm for Optimal Multiple-Sequence Alignment
date post
13-Jan-2016Category
Documents
view
36download
1
Embed Size (px)
description
Transcript of An Improved Search Algorithm for Optimal Multiple-Sequence Alignment
An Improved Search Algorithm for Optimal Multiple-Sequence AlignmentPaper by: Stefan SchroedlPresentation by: Bryan Franklin
OutlineMultiple-Sequence Alignment (MSA)Graph RepresentationComputing Path CostsHeuristicsExperimental ResultsOther Optimizations
Multiple-Sequence-AlignmentSequence:DNA: String over alphabet {A,C,G,T}Protein: String with ||=20 (one symbol for each amino acid)Alignment:Insert gaps (_) into sequences to line up matching characters.
Multiple-Sequence-AlignmentSequences: ABCB, BCD, DB
Multiple-Sequence-AlignmentIndel: Insertion, Deletion, Point mutation (single symbol replacement)Find a minimum set of indels between two or more sequences.NP-Hard for an arbitrary number of sequences
Multiple-Sequence-AlignmentApplicationsCommon ancestry between speciesLocating useful portions of DNAPredicting structure of folded proteins
Graph Representation
Computing G(n)ConsiderationsBiological MeaningCost of computationSum-of-pairsSubstitution matrix ((||+1)2)Sum alignment costs for all pairsCosts can depend on neighbors
Computing G(n)Sequences: ABCB, BCD, DB
Computing G(n)67877
Computing G(n)
Heuristics
Methods Examined2-fold (hpair)divide-conquer3-fold (h3,all)4-fold (h4,all)
Heuristic Comparison
Resource Usage
Optimizations
Sparse Path RepresentationCurve fitting for predicting threshold valuesSub-optimal paths periodically deleted
The EndAny Questions?