際際滷

際際滷Share a Scribd company logo
Paula Tataru
Conditional expectations
of sufficient statistics
for continuous time Markov chains
Joint work with Asger Hobolth
November 28, 2012
Motivation
Methods
Results
Paula
Tataru
2/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Continuous-time Markov chains
A stochastic process {X(t) | t  0}
that fulfills the Markov property
Motivation
Methods
Results
Paula
Tataru
3/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Continuous-time Markov chains
 Credit risk
 States: different types of ratings
 Chemical reactions
 States: different states of a molecule
 DNA data
 States:
 Nucleotides
 Amino acids
 Codons
Motivation
Methods
Results
Paula
Tataru
4/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Modeling DNA data
1st 2nd base 3rd
base T C A G base
T
TTT
Phe
TCT
Ser
TAT
Tyr
TGT
Cys
T
TTC TCC TAC TGC C
TTA
Leu
TCA TAA Stop TGA Stop A
TTG TCG TAG Stop TGG Trp G
C
CTT CCT
Pro
CAT
His
CGT
Arg
T
CTC CCC CAC CGC C
CTA CCA CAA
Gln
CGA A
CTG CCG CAG CGG G
A
ATT
Ile
ACT
Thr
AAT
Asn
AGT
Ser
T
ATC ACC AAC AGC C
ATA ACA AAA
Lys
AGA
Arg
A
ATG Met ACG AAG AGG G
G
GTT
Val
GCT
Ala
GAT
Asp
GGT
Gly
T
GTC GCC GAC GGC C
GTA GCA GAA
Glu
GGA A
GTG GCG GAG GGG G
Motivation
Methods
Results
Paula
Tataru
5/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Modeling DNA data
 Nucleotides (n = 4)
 Amino acids (n = 20)
 Codons (n = 64)
Motivation
Methods
Results
Paula
Tataru
6/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Continuous-time Markov chains
 Characterized by a rate matrix
with properties
Motivation
Methods
Results
Paula
Tataru
7/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Modeling DNA data
 Nucleotides (n = 4)
 Jukes Cantor (JC)
Motivation
Methods
Results
Paula
Tataru
8/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Continuous-time Markov chains
 Waiting time is exponentially distributed
 Jumps are discretely distributed
Motivation
Methods
Results
Paula
Tataru
9/30
Conditional
expectations
of sufficient
statistics
for CTMCs
General EM
Motivation
Methods
Results
Paula
Tataru
10/30
Conditional
expectations
of sufficient
statistics
for CTMCs
General EM
 Full log likelihood
 E-step
 M-step
Motivation
Methods
Results
Paula
Tataru
11/30
Conditional
expectations
of sufficient
statistics
for CTMCs
EM for Jukes Cantor
 Full log likelihood
 M-step
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
12/30
Calculating expectations
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
13/30
Linear combinations
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
14/30
Linear combinations
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
15/30
Linear combinations
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
16/30
Methods
 In Hobolth&Jensen (2010), three methods are
presented to evaluate the necessary statistics
 Eigenvalue decomposition (EVD)
 Uniformization (UNI)
 Exponentiation (EXPM)
 Extend efficiently to linear combinations
 Which method is more accurate?
 Which method is faster?
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
17/30
Eigenvalue decomposition
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
18/30
Uniformization
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
19/30
Uniformization
 Choose truncation point s(亮t)
 Bound error using the tail of Pois(m+1; 亮t)
 s(亮t) = 4 + 6 亮t + 亮t
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
20/30
Exponentiation
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
21/30
Algorithms
EVD UNI EXPM*
1 A
Order
2 J(t)
Order
3 (C; t)裡 (C; t)裡 (C; t)裡
Order
Influenced by Q
, U, U -1
亮, s(亮t), R
O(n3
) O(n2
) O(n2
)
Rm
eAt
O(n2
) O(s(亮t)n3
) O(n3
)
O(n3
) O(s(亮t)n2
) O(n2
)
亮t 
* expm package in R
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
22/30
Accuracy
 Use two models for which (C; t) is available in
analytical form
 JC, varying n
 HKY, varying t
 Measure accuracy as the normalized difference:
approximation / true - 1
 as a function of n
 as a function of t
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
23/30
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
24/30
Speed
EVD UNI EXPM*
1 A
Order
2 J(t)
Order
3 (C; t)裡 (C; t)裡 (C; t)裡
Order
Influenced by Q
, U, U -1
亮, s(亮t), R
O(n3
) O(n2
) O(n2
)
Rm
eAt
O(n2
) O(s(亮t)n3
) O(n3
)
O(n3
) O(s(亮t)n2
) O(n2
)
亮t
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
25/30
Speed
 Partition of computation
 Precomputation
 EVD
 UNI
 Main computation
 Use three models and different time points to
asses the speed
 GY, n = 61
 GTR, n = 4
 UNR, n = 4
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
26/30
Speed
 10 equidistant time points
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
27/30
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
28/30
Results
 Accuracy
 Similar accuracy
 EXPM is the most accurate one
 Speed
 EXPM is the slowest
 EVD vs UNI
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
29/30
Results
 UNI has potential
 Better cut off point s(亮t)
 Ameliorate effect of 亮t by
 Adaptive uniformization
 On-the-fly variant of adaptive uniformization
Thank you!
Comparison of methods for calculating
conditional expectations of sufficient statistics
for continuous time Markov chains
BMC Bioinformatics 12(1):465, 2011

More Related Content

PaulaTataru