Class 8: Pair HMMs
FSA to HMMs: Why?
Advantages:
Obtain reliability of alignment
Explore alternative (sub-optimal) alignments
Score similarity of sequences independent of any specific alignment
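FSA to HMMs
[Figure: finite-state automaton for affine gap alignment. States A (+1,+1), B (+1,+0) and C (+0,+1). Every transition into A scores s(s_i,t_j); A to B and A to C score Wg+Ws; B to B and C to C score Ws.]
[Figure: the corresponding pair HMM. State A emits the pair (s_i,t_j) with probability p_{s_i t_j}, B emits s_i with probability q_{s_i}, C emits t_j with probability q_{t_j}. Transitions: A to A 1-2δ; A to B and A to C δ; B to B and C to C ε; B to A and C to A 1-ε.]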
Affine gap alignment: the full probabilistic model
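[Figure: the full pair HMM with Begin and End states. State A emits p_{s_i t_j}, B emits q_{s_i}, C emits q_{t_j}. Transitions: Begin to A and A to A 1-2δ-τ; Begin or A to B, and Begin or A to C, δ; B to B and C to C ε; B to A and C to A 1-ε-τ; Begin, A, B and C each move to End with probability τ.]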
Affine Weight Model – DP
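$$A(i,j) = \max \begin{cases} A(i-1,j-1) + s(s_i,t_j) \\ B(i-1,j-1) + s(s_i,t_j) \\ C(i-1,j-1) + s(s_i,t_j) \end{cases}$$
$$B(i,j) = \max \begin{cases} A(i-1,j) + W_g + W_s \\ B(i-1,j) + W_s \end{cases}$$
$$C(i,j) = \max \begin{cases} A(i,j-1) + W_g + W_s \\ C(i,j-1) + W_s \end{cases}$$
The indices follow the state labels: B(+1,+0) consumes s_i only, so it is reached from row i-1; C(+0,+1) consumes t_j only, so it is reached from column j-1.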
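The three recurrences can be sketched in Python as follows. This is only a minimal illustration under assumptions that are not in the slides: the names `affine_align_score` and `toy_score`, the +1/-1 substitution scores, and the example weights `w_g = -2`, `w_s = -1` are hypothetical, and the boundary is taken as A(0,0) = 0 with every other cell starting at minus infinity. The indices follow the state labels B(+1,+0) and C(+0,+1).

```python
NEG_INF = float("-inf")


def toy_score(a, b):
    """Hypothetical substitution score s(a, b): +1 for a match, -1 otherwise."""
    return 1 if a == b else -1


def affine_align_score(s, t, score, w_g, w_s):
    """Best global alignment score under the three-state affine-gap recursion.

    A: s_i aligned to t_j (+1,+1); B: s_i against a gap (+1,+0);
    C: t_j against a gap (+0,+1). w_g and w_s are added as in the FSA,
    so penalties are passed as negative numbers.
    """
    n, m = len(s), len(t)
    A = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    B = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    C = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    A[0][0] = 0.0  # assumed boundary: the empty alignment scores 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                prev = max(A[i - 1][j - 1], B[i - 1][j - 1], C[i - 1][j - 1])
                A[i][j] = prev + score(s[i - 1], t[j - 1])
            if i > 0:  # open a gap from A, or extend a gap in B
                B[i][j] = max(A[i - 1][j] + w_g + w_s, B[i - 1][j] + w_s)
            if j > 0:  # open a gap from A, or extend a gap in C
                C[i][j] = max(A[i][j - 1] + w_g + w_s, C[i][j - 1] + w_s)
    return max(A[n][m], B[n][m], C[n][m])


if __name__ == "__main__":
    print(affine_align_score("ACGT", "AGT", toy_score, w_g=-2, w_s=-1))
```

As in the FSA, there is no direct transition between B and C, so a gap in one sequence cannot be followed immediately by a gap in the other.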
Viterbi in Pair-HMM
Finding the most probable sequence of hidden states is exactly the problem of finding the optimal global alignment.
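$$V_{i,j}(A) = p_{s_i t_j} \max \begin{cases} (1-2\delta-\tau)\, V_{i-1,j-1}(A) \\ (1-\varepsilon-\tau)\, V_{i-1,j-1}(B) \\ (1-\varepsilon-\tau)\, V_{i-1,j-1}(C) \end{cases}$$
$$V_{i,j}(B) = q_{s_i} \max \begin{cases} \delta\, V_{i-1,j}(A) \\ \varepsilon\, V_{i-1,j}(B) \end{cases}$$
$$V_{i,j}(C) = q_{t_j} \max \begin{cases} \delta\, V_{i,j-1}(A) \\ \varepsilon\, V_{i,j-1}(C) \end{cases}$$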
Viterbi in Pair-HMM
Initial condition:
$$V_{0,0}(A) = 1, \qquad \text{all other } V_{i,0}(*),\ V_{0,j}(*) \text{ set to } 0$$
Optimal alignment:
$$V(\mathrm{End}) = \tau \max \begin{cases} V_{n,m}(A) \\ V_{n,m}(B) \\ V_{n,m}(C) \end{cases}$$
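A possible Python sketch of the Viterbi recursion, termination and traceback is given below. Several things here are assumptions rather than part of the slides: the function name `viterbi_pair_hmm`, the toy DNA emission functions `p` and `q`, and the parameter values used in the example. The sketch also lets the recursion fill the border cells of B and C (instead of forcing them to 0), so that an alignment may start with a gap; set those cells to 0 to follow the stated initial condition literally. It works with raw probabilities, which is only safe for short sequences; log space would be needed otherwise.

```python
def p(a, b):
    """Hypothetical joint emission p_ab over ACGT (sums to 1 over all pairs)."""
    return 0.8 / 4 if a == b else 0.2 / 12


def q(a):
    """Hypothetical background emission q_a (uniform over ACGT)."""
    return 0.25


def viterbi_pair_hmm(s, t, delta, eps, tau):
    """Return (V(End), aligned s, aligned t) for the most probable state path."""
    n, m = len(s), len(t)
    v = {k: [[0.0] * (m + 1) for _ in range(n + 1)] for k in "ABC"}
    ptr = {k: [[None] * (m + 1) for _ in range(n + 1)] for k in "ABC"}
    v["A"][0][0] = 1.0  # Begin has the same outgoing transitions as A
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                best, ptr["A"][i][j] = max(
                    ((1 - 2 * delta - tau) * v["A"][i - 1][j - 1], "A"),
                    ((1 - eps - tau) * v["B"][i - 1][j - 1], "B"),
                    ((1 - eps - tau) * v["C"][i - 1][j - 1], "C"),
                )
                v["A"][i][j] = p(s[i - 1], t[j - 1]) * best
            if i > 0:
                best, ptr["B"][i][j] = max(
                    (delta * v["A"][i - 1][j], "A"),
                    (eps * v["B"][i - 1][j], "B"),
                )
                v["B"][i][j] = q(s[i - 1]) * best
            if j > 0:
                best, ptr["C"][i][j] = max(
                    (delta * v["A"][i][j - 1], "A"),
                    (eps * v["C"][i][j - 1], "C"),
                )
                v["C"][i][j] = q(t[j - 1]) * best
    # Termination: V(End) = tau * max over the three states at (n, m).
    v_end, state = max((tau * v[k][n][m], k) for k in "ABC")
    # Traceback: follow the stored pointers back to (0, 0).
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        prev = ptr[state][i][j]
        if state == "A":
            top.append(s[i - 1])
            bottom.append(t[j - 1])
            i, j = i - 1, j - 1
        elif state == "B":
            top.append(s[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(t[j - 1])
            j -= 1
        state = prev
    return v_end, "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    prob, top, bottom = viterbi_pair_hmm("ACGT", "AGT", delta=0.1, eps=0.3, tau=0.05)
    print(top)
    print(bottom)
    print(prob)
```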
Pair-HMM for random model
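[Figure: pair HMM for the random model R. Between Begin and End, one state emits all of s (s_i with probability q_{s_i}) and a second state then emits all of t (t_j with probability q_{t_j}). Each emission step is taken with probability 1-η; each of the two emitting stretches is ended with probability η.]
$$P(s,t \mid R) = \eta (1-\eta)^{n} \prod_{i=1}^{n} q_{s_i} \cdot \eta (1-\eta)^{m} \prod_{j=1}^{m} q_{t_j} = \eta^{2} (1-\eta)^{n+m} \prod_{i=1}^{n} q_{s_i} \prod_{j=1}^{m} q_{t_j}$$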
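The random-model probability is a closed-form product, so it can be evaluated directly. The snippet below is a small sketch in log space; the function name `log_prob_random`, the uniform background `q` over a four-letter alphabet and the value of `eta` are illustrative assumptions, not values from the slides.

```python
import math


def q(a):
    """Hypothetical uniform background emission q_a over ACGT."""
    return 0.25


def log_prob_random(s, t, eta):
    """log P(s, t | R) = 2 log(eta) + (n + m) log(1 - eta) + sum of log q terms."""
    n, m = len(s), len(t)
    log_p = 2 * math.log(eta) + (n + m) * math.log(1 - eta)
    log_p += sum(math.log(q(a)) for a in s)
    log_p += sum(math.log(q(b)) for b in t)
    return log_p


if __name__ == "__main__":
    print(math.exp(log_prob_random("AC", "G", eta=0.1)))
```

Working with logarithms avoids underflow when n + m is large, since the product shrinks geometrically with sequence length.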
Pair-HMM for local alignment
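[Figure: pair HMM for local alignment. Begin leads into two random-model states, Rs1 (emits s_i with probability q_{s_i}) and Rt1 (emits t_j with probability q_{t_j}), which generate the unaligned prefixes. These feed the global-alignment core: states A (p_{s_i t_j}), B (q_{s_i}) and C (q_{t_j}) with the δ, ε, 1-2δ-τ and 1-ε-τ transitions of the full model. The τ exits of the core lead into a second pair of random-model states, Rs2 and Rt2, which generate the unaligned suffixes before End. Each random-model state loops with probability 1-η and is left with probability η.]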
The full probability: P(s,t)
$$P(s,t) = \sum_{\text{alignments } \pi} P(s,t,\pi)$$
Use the “forward” algorithm:
$$P(s,t) = f_{n,m}(\mathrm{End})$$
The posterior probability:
$$P(\pi \mid s,t) = \frac{P(s,t,\pi)}{P(s,t)}$$
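The forward algorithm has the same shape as Viterbi with each max replaced by a sum. Below is a minimal Python sketch of it, assuming the full model's transitions (δ, ε, τ) and the termination step P(s,t) = τ [f_{n,m}(A) + f_{n,m}(B) + f_{n,m}(C)]. The function name `forward_prob`, the toy emissions `p` and `q`, and the parameter values are hypothetical. Border cells of B and C are filled by the recursion so that alignments starting with a gap are counted, which is an assumption of this sketch.

```python
def p(a, b):
    """Hypothetical joint emission p_ab over ACGT (sums to 1 over all pairs)."""
    return 0.8 / 4 if a == b else 0.2 / 12


def q(a):
    """Hypothetical background emission q_a (uniform over ACGT)."""
    return 0.25


def forward_prob(s, t, delta, eps, tau):
    """Return P(s, t), summed over all alignments, via the forward algorithm."""
    n, m = len(s), len(t)
    f = {k: [[0.0] * (m + 1) for _ in range(n + 1)] for k in "ABC"}
    f["A"][0][0] = 1.0  # Begin has the same outgoing transitions as A
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                f["A"][i][j] = p(s[i - 1], t[j - 1]) * (
                    (1 - 2 * delta - tau) * f["A"][i - 1][j - 1]
                    + (1 - eps - tau) * (f["B"][i - 1][j - 1] + f["C"][i - 1][j - 1])
                )
            if i > 0:
                f["B"][i][j] = q(s[i - 1]) * (
                    delta * f["A"][i - 1][j] + eps * f["B"][i - 1][j]
                )
            if j > 0:
                f["C"][i][j] = q(t[j - 1]) * (
                    delta * f["A"][i][j - 1] + eps * f["C"][i][j - 1]
                )
    # Termination: every state moves to End with probability tau.
    return tau * (f["A"][n][m] + f["B"][n][m] + f["C"][n][m])


if __name__ == "__main__":
    print(forward_prob("ACGT", "AGT", delta=0.1, eps=0.3, tau=0.05))
```

For "AC" against "A" only two alignments are possible in this model (match then gap, or gap then mismatch), so the result can be checked by hand against the sum of those two path probabilities.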
Suboptimal alignments
Suboptimal alignments: alignments with nearly the same score as the best alignment. They fall into two groups:
Only slightly different from the optimal alignment
Substantially or completely different from it
Probabilistic sampling
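From the forward algorithm:
$$f_{i,j}(A) = p_{s_i t_j} \left[ (1-2\delta-\tau)\, f_{i-1,j-1}(A) + (1-\varepsilon-\tau) \big( f_{i-1,j-1}(B) + f_{i-1,j-1}(C) \big) \right]$$
Choose the next step to be:
$$\begin{cases}
A(i-1,j-1) & \text{with prob. } \dfrac{p_{s_i t_j}\,(1-2\delta-\tau)\, f_{i-1,j-1}(A)}{f_{i,j}(A)} \\[1ex]
B(i-1,j-1) & \text{with prob. } \dfrac{p_{s_i t_j}\,(1-\varepsilon-\tau)\, f_{i-1,j-1}(B)}{f_{i,j}(A)} \\[1ex]
C(i-1,j-1) & \text{with prob. } \dfrac{p_{s_i t_j}\,(1-\varepsilon-\tau)\, f_{i-1,j-1}(C)}{f_{i,j}(A)}
\end{cases}$$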
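The sampling traceback can be sketched in Python as below. The slide gives the step out of state A only; the steps out of B and C and the choice of the final state (proportional to f_{n,m}(k)) are filled in here by analogy and should be read as assumptions of this sketch. The names `forward`, `sample_alignment`, the toy emissions `p` and `q`, and the parameter values are hypothetical. The emission factor is common to every option at a given step, so it cancels when the weights are normalised and is left out of them.

```python
import random


def p(a, b):
    """Hypothetical joint emission p_ab over ACGT (sums to 1 over all pairs)."""
    return 0.8 / 4 if a == b else 0.2 / 12


def q(a):
    """Hypothetical background emission q_a (uniform over ACGT)."""
    return 0.25


def forward(s, t, delta, eps, tau):
    """Forward tables f[k][i][j] for states k in A, B, C."""
    n, m = len(s), len(t)
    f = {k: [[0.0] * (m + 1) for _ in range(n + 1)] for k in "ABC"}
    f["A"][0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                f["A"][i][j] = p(s[i - 1], t[j - 1]) * (
                    (1 - 2 * delta - tau) * f["A"][i - 1][j - 1]
                    + (1 - eps - tau) * (f["B"][i - 1][j - 1] + f["C"][i - 1][j - 1])
                )
            if i > 0:
                f["B"][i][j] = q(s[i - 1]) * (delta * f["A"][i - 1][j] + eps * f["B"][i - 1][j])
            if j > 0:
                f["C"][i][j] = q(t[j - 1]) * (delta * f["A"][i][j - 1] + eps * f["C"][i][j - 1])
    return f


def sample_alignment(s, t, delta, eps, tau, rng):
    """Draw one alignment with probability P(alignment | s, t)."""
    f = forward(s, t, delta, eps, tau)
    states = ["A", "B", "C"]
    i, j = len(s), len(t)
    # Last state before End: proportional to f_{n,m}(k); tau cancels.
    k = rng.choices(states, weights=[f[x][i][j] for x in states])[0]
    top, bottom = [], []
    while i > 0 or j > 0:
        if k == "A":
            top.append(s[i - 1])
            bottom.append(t[j - 1])
            i, j = i - 1, j - 1
            w = [(1 - 2 * delta - tau) * f["A"][i][j],
                 (1 - eps - tau) * f["B"][i][j],
                 (1 - eps - tau) * f["C"][i][j]]
        elif k == "B":
            top.append(s[i - 1])
            bottom.append("-")
            i -= 1
            w = [delta * f["A"][i][j], eps * f["B"][i][j], 0.0]
        else:
            top.append("-")
            bottom.append(t[j - 1])
            j -= 1
            w = [delta * f["A"][i][j], 0.0, eps * f["C"][i][j]]
        k = rng.choices(states, weights=w)[0]  # predecessor state
    return "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        top, bottom = sample_alignment("ACGTAC", "AGTC", 0.1, 0.3, 0.05, rng)
        print(top)
        print(bottom)
        print()
```

Repeated calls give different alignments, each drawn in proportion to its posterior probability, so frequently sampled columns indicate the reliable parts of the alignment.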
Probabilistic sampling – example
s = HEAGAWGHEE
t = PAWHEAE
Distinct suboptimal alignments
Waterman and Eggert [1987]
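The posterior probability $P(s_i \diamond t_j \mid s,t)$
Here $s_i \diamond t_j$ denotes the event that $s_i$ is aligned to $t_j$.
$$\begin{aligned}
P(s_i \diamond t_j \mid s,t) &= \frac{P(s,t,s_i \diamond t_j)}{P(s,t)} \\
P(s,t,s_i \diamond t_j) &= P(s_{1\dots i}, t_{1\dots j}, s_i \diamond t_j)\, P(s_{i+1\dots n}, t_{j+1\dots m} \mid s_{1\dots i}, t_{1\dots j}, s_i \diamond t_j) \\
&= P(s_{1\dots i}, t_{1\dots j}, s_i \diamond t_j)\, P(s_{i+1\dots n}, t_{j+1\dots m} \mid s_i \diamond t_j) \\
&= f_{i,j}(A)\, b_{i,j}(A)
\end{aligned}$$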
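The product f_{i,j}(A) b_{i,j}(A) needs the backward values as well as the forward ones. The backward recursion is not written out in the slides; the sketch below uses the standard form that mirrors the forward recursion, with b_{n,m}(k) = τ for every state, and should be read as an assumption. The names `forward`, `backward`, `aligned_pair_posteriors`, the toy emissions and the parameter values are all hypothetical.

```python
def p(a, b):
    """Hypothetical joint emission p_ab over ACGT (sums to 1 over all pairs)."""
    return 0.8 / 4 if a == b else 0.2 / 12


def q(a):
    """Hypothetical background emission q_a (uniform over ACGT)."""
    return 0.25


def forward(s, t, delta, eps, tau):
    """Forward tables f[k][i][j] and the total probability P(s, t)."""
    n, m = len(s), len(t)
    f = {k: [[0.0] * (m + 1) for _ in range(n + 1)] for k in "ABC"}
    f["A"][0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                f["A"][i][j] = p(s[i - 1], t[j - 1]) * (
                    (1 - 2 * delta - tau) * f["A"][i - 1][j - 1]
                    + (1 - eps - tau) * (f["B"][i - 1][j - 1] + f["C"][i - 1][j - 1])
                )
            if i > 0:
                f["B"][i][j] = q(s[i - 1]) * (delta * f["A"][i - 1][j] + eps * f["B"][i - 1][j])
            if j > 0:
                f["C"][i][j] = q(t[j - 1]) * (delta * f["A"][i][j - 1] + eps * f["C"][i][j - 1])
    total = tau * sum(f[k][n][m] for k in "ABC")
    return f, total


def backward(s, t, delta, eps, tau):
    """Backward tables b[k][i][j]: probability of the rest, given state k at (i, j)."""
    n, m = len(s), len(t)
    b = {k: [[0.0] * (m + 1) for _ in range(n + 1)] for k in "ABC"}
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                for k in "ABC":
                    b[k][i][j] = tau  # only the move to End remains
                continue
            # Emission times backward value of each possible next state.
            to_a = p(s[i], t[j]) * b["A"][i + 1][j + 1] if i < n and j < m else 0.0
            to_b = q(s[i]) * b["B"][i + 1][j] if i < n else 0.0
            to_c = q(t[j]) * b["C"][i][j + 1] if j < m else 0.0
            b["A"][i][j] = (1 - 2 * delta - tau) * to_a + delta * (to_b + to_c)
            b["B"][i][j] = (1 - eps - tau) * to_a + eps * to_b
            b["C"][i][j] = (1 - eps - tau) * to_a + eps * to_c
    return b


def aligned_pair_posteriors(s, t, delta, eps, tau):
    """post[i][j] = f_{i,j}(A) * b_{i,j}(A) / P(s, t) for 1 <= i <= n, 1 <= j <= m."""
    f, total = forward(s, t, delta, eps, tau)
    b = backward(s, t, delta, eps, tau)
    n, m = len(s), len(t)
    return [[f["A"][i][j] * b["A"][i][j] / total if i and j else 0.0
             for j in range(m + 1)] for i in range(n + 1)]


if __name__ == "__main__":
    post = aligned_pair_posteriors("ACGT", "AGT", 0.1, 0.3, 0.05)
    for row in post[1:]:
        print(" ".join(f"{x:.3f}" for x in row[1:]))
```

A useful consistency check is that the backward value of A at (0,0) equals the forward total P(s,t), because Begin has the same outgoing transitions as A.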