Towards Identifying Lateral Gene Transfer Events

Towards Identifying Lateral Gene Transfer Events. L. Addario-Berry, M. Hallett, J. Lagergren. Presented By: Jeff Mathew


Transcript of Towards Identifying Lateral Gene Transfer Events

Page 1: Towards Identifying Lateral Gene Transfer Events

Towards Identifying Lateral Gene Transfer Events

L. Addario-Berry, M. Hallett, J. Lagergren

Presented By: Jeff Mathew

Page 2: Towards Identifying Lateral Gene Transfer Events

Roadmap

Key terms

τ-transfer problem

H-moves and I-moves algorithm

Tree generation for simulation

Experimental results

Conclusions and future work

Page 3: Towards Identifying Lateral Gene Transfer Events

Lateral transfer scenario

LGT = HGT

Root of scenario tree must correspond to root of gene tree

The scenario tree is connected and respects the direction of evolution implied by the arcs of T and S.

Page 4: Towards Identifying Lateral Gene Transfer Events

α-activity

An α-active scenario for a gene tree and species tree allows at most α copies of a gene to simultaneously exist in the genome of an ancestral taxon.

The authors focus on 1-active scenarios, although intractability results have been proved earlier for α ≥ 1.

Page 5: Towards Identifying Lateral Gene Transfer Events

τ-transfer problem

Input: Species tree S, gene tree T, integer τ

Output: A lateral transfer scenario for S and T with τ* transfers, where τ* ≤ τ

Intractability result

◦ The decision version of the α-Active, τ-Transfer Problem (does there exist an α-active scenario with cost ≤ τ?) is NP-complete.

τ is the number of lateral transfer events needed to explain the difference between S and T

Page 6: Towards Identifying Lateral Gene Transfer Events

Algorithm

2 Phase approach

Phase 1

◦ While H-fat or I-fat vertices remain, perform an H-fat move or an I-fat move

At the end of phase 1, we are guaranteed that the scenario is 1-active. What about cycles?

Phase 2

◦ Remove the minimum number of LGT events from each candidate to make it acyclic.

Running time: $O(2^{4\tau} n^2)$
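The slides do not say how Phase 2 chooses which LGT events to remove. As a minimal sketch only, assume a candidate scenario can be modelled as a directed graph whose arcs are either tree arcs (never removed) or transfer arcs (removable). The function names and this graph encoding are hypothetical, and the brute-force search is an illustration, not the authors' procedure.

```python
from itertools import combinations

def is_acyclic(nodes, edges):
    """Kahn's algorithm: True if the directed graph has no cycle."""
    indeg = {v: 0 for v in nodes}
    for _, v in edges:
        indeg[v] += 1
    ready = [v for v in nodes if indeg[v] == 0]
    seen = 0
    while ready:
        u = ready.pop()
        seen += 1
        for a, b in edges:
            if a == u:
                indeg[b] -= 1
                if indeg[b] == 0:
                    ready.append(b)
    return seen == len(nodes)

def min_transfers_to_remove(nodes, tree_edges, transfer_edges):
    """Smallest set of transfer arcs whose removal leaves an acyclic graph.

    Tries removal sets in order of increasing size, so the first hit is
    minimum. Exponential in the number of transfers (assumed small).
    Returns None if the tree arcs alone already contain a cycle.
    """
    for k in range(len(transfer_edges) + 1):
        for removed in combinations(transfer_edges, k):
            kept = [e for e in transfer_edges if e not in removed]
            if is_acyclic(nodes, tree_edges + kept):
                return list(removed)
    return None
```

For example, with tree arcs a→b→c and transfer arcs c→a and a→d, only c→a closes a cycle, so it is the single arc returned.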

Page 7: Towards Identifying Lateral Gene Transfer Events

Simulating species trees

Create random species tree S on n leaves. Θ(log n) expected depth

S is supposed to reflect the actual evolutionary relationships between taxa

◦ S is ultrametric. Therefore, edge weights correspond to time.

◦ Randomly assign weights to every edge such that every root-to-leaf path has weighted sum 1.
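One way to realise the construction above is sketched below. The slides do not specify the random tree model or how the weights are drawn, so both choices here are assumptions: the tree grows by repeatedly splitting a uniformly chosen leaf, and each vertex gets a time in [0, 1] with the root at 0 and every leaf at 1. An edge weight is the time difference between its endpoints, so every root-to-leaf path sums to 1. All function names are hypothetical.

```python
import random

def random_species_tree(n, rng):
    """Grow a binary tree on n >= 2 leaves by splitting a random leaf each step.

    Returns a dict mapping node id -> list of child ids (empty for leaves).
    """
    children = {0: []}
    leaves = [0]
    next_id = 1
    while len(leaves) < n:
        v = leaves.pop(rng.randrange(len(leaves)))
        children[v] = [next_id, next_id + 1]
        for c in children[v]:
            children[c] = []
            leaves.append(c)
        next_id += 2
    return children

def assign_times(children, rng, root=0):
    """Give the root time 0, every leaf time 1, and internal nodes times in between."""
    time = {root: 0.0}
    stack = [root]
    while stack:
        v = stack.pop()
        for c in children[v]:
            if children[c]:
                # Internal child: place it strictly between its parent and 1.
                time[c] = time[v] + (1.0 - time[v]) * rng.uniform(0.05, 0.95)
            else:
                time[c] = 1.0
            stack.append(c)
    return time

def edge_weights(children, time):
    """Edge weight = elapsed time along the edge, so paths to leaves sum to 1."""
    return {(v, c): time[c] - time[v] for v in children for c in children[v]}
```

Storing a time per vertex rather than a weight per edge makes the ultrametric property hold by construction, since the weights along any root-to-leaf path telescope to 1 - 0.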

Page 8: Towards Identifying Lateral Gene Transfer Events

Simulating gene trees

Begin with generated ultrametric species tree

Lateral transfer events occur according to a Poisson process with mean rate λ

Moving from root to leaves, for each vertex x0 with children x1 and x2, examine both edges

◦ If the Poisson process provides us with a lateral transfer event along (x0, x1), we add it and point it to a randomly chosen edge alive at that point in time.

◦ Else add a speciation event for x1

◦ Repeat the analysis for (x0, x2)
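The transfer-placement step can be sketched as follows. It assumes the species tree is given as a list of edges plus a time for every vertex (root at 0, leaves at 1), and that the gaps between events of a rate-λ Poisson process are exponentially distributed. The function names are hypothetical. As a simplification that the slides do not confirm, the sketch records every event that falls on an edge rather than at most one.

```python
import random

def transfer_times(t_start, t_end, lam, rng):
    """Event times of a rate-lam Poisson process (lam > 0) within [t_start, t_end)."""
    times = []
    t = t_start + rng.expovariate(lam)
    while t < t_end:
        times.append(t)
        t += rng.expovariate(lam)
    return times

def edges_alive(time, edges, t, exclude):
    """Edges other than `exclude` that span time t."""
    return [(u, v) for (u, v) in edges
            if (u, v) != exclude and time[u] <= t < time[v]]

def simulate_transfers(time, edges, lam, rng):
    """Return sorted (t, source_edge, target_edge) transfer events."""
    events = []
    for (u, v) in edges:
        for t in transfer_times(time[u], time[v], lam, rng):
            targets = edges_alive(time, edges, t, exclude=(u, v))
            if targets:
                # Point the transfer at a uniformly chosen contemporaneous edge.
                events.append((t, (u, v), rng.choice(targets)))
    return sorted(events)
```

An edge with no recorded event corresponds to the "else" branch on the slide, where the child vertex is kept as an ordinary speciation.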

Page 9: Towards Identifying Lateral Gene Transfer Events

Degenerate Cases

Simulation can result in plausible biological events that are not detectable by the algorithm.

Useless transfers: LGTs that don't change the gene tree

Transfer-loss events: One child of a node is an LGT event. Another child is a loss event.

Page 10: Towards Identifying Lateral Gene Transfer Events

Results

Ω = number of repetitions

τ = true number of LGT events

τ' = cost of the minimum-cost LGT scenario found by the algorithm

λ = mean rate of LGTs from Poisson process

Page 11: Towards Identifying Lateral Gene Transfer Events

Finding the saturation point

The point when the average τ' stops increasing.

Random trees from a large pool were chosen as gene trees and species trees

◦ Trials suggest that the saturation point is slightly above n/2, i.e., when τ > n/2, the algorithm stops detecting new LGT events

Thus, if τ' > n/2, the correspondence between T and S via LGT events is not very meaningful.
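Given a series of average τ' values measured at increasing true τ, a saturation point in the sense above could be located with a simple plateau test. This is a hedged sketch: the function name, the tolerance parameter, and the rule "first index after which no step rises by more than the tolerance" are assumptions, not the procedure used in the experiments.

```python
def saturation_point(avg_tau_prime, tol=0.0):
    """Index of the first value after which the series never rises by more than tol.

    Returns None if the series is still increasing at its end, i.e. no
    plateau is observed within the measured range.
    """
    for i in range(len(avg_tau_prime) - 1):
        steps = zip(avg_tau_prime[i:], avg_tau_prime[i + 1:])
        if all(b - a <= tol for a, b in steps):
            return i
    return None
```

With the synthetic, purely illustrative series `[1.0, 2.0, 3.0, 3.0, 3.0]` the function returns index 2, the first entry of the flat tail. A nonzero `tol` is useful when the averages are noisy.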

Page 12: Towards Identifying Lateral Gene Transfer Events

Results

Page 13: Towards Identifying Lateral Gene Transfer Events

Results

Page 14: Towards Identifying Lateral Gene Transfer Events

Results

Page 15: Towards Identifying Lateral Gene Transfer Events

Conclusions

Empirically verified feasibility of the τ-transfer algorithm

Degenerate events such as transfer-loss events that result in over-estimates of transfers occur with low probability

Achieved near-optimal scenarios when λ is low enough not to cause saturation

The cycle elimination phase of the algorithm is extremely rarely needed in practice, implying an $O(2^{2\tau} n^2)$ running time.
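As a rough worked comparison, assume the two running times quoted in these slides are read as $O(2^{4\tau} n^2)$ for the full two-phase algorithm and $O(2^{2\tau} n^2)$ when cycle elimination is not needed. The ratio between the two bounds then depends only on τ. The value τ = 5 below is just an illustrative choice:

```latex
\[
\frac{2^{4\tau}\, n^2}{2^{2\tau}\, n^2} = 2^{2\tau},
\qquad
\tau = 5 \;\Rightarrow\; 2^{10} = 1024 .
\]
```

So under this reading, skipping cycle elimination tightens the bound by a factor that itself grows exponentially in τ.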

Page 16: Towards Identifying Lateral Gene Transfer Events

Future work and open problems

Use weighted gene trees and species trees

◦ Species trees are nearly ultrametric while gene trees are not

Do fast algorithms exist when the input is a set of gene trees with no species tree?

Tractability on larger phylogenies

Can we consider gene duplication, lateral gene transfers, and other events simultaneously?

Can we use probabilistic models that assign likelihoods to various events and optimize over such models in a tractable manner?