際際滷

際際滷Share a Scribd company logo
Paula Tataru
Conditional expectations
of sufficient statistics
for continuous time Markov chains
Joint work with Asger Hobolth
November 28, 2012
Motivation
Methods
Results
Paula
Tataru
2/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Continuous-time Markov chains
A stochastic process {X(t) | t  0}
that fulfills the Markov property
Motivation
Methods
Results
Paula
Tataru
3/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Continuous-time Markov chains
 Credit risk
 States: different types of ratings
 Chemical reactions
 States: different states of a molecule
 DNA data
 States:
 Nucleotides
 Amino acids
 Codons
Motivation
Methods
Results
Paula
Tataru
4/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Modeling DNA data
1st 2nd base 3rd
base T C A G base
T
TTT
Phe
TCT
Ser
TAT
Tyr
TGT
Cys
T
TTC TCC TAC TGC C
TTA
Leu
TCA TAA Stop TGA Stop A
TTG TCG TAG Stop TGG Trp G
C
CTT CCT
Pro
CAT
His
CGT
Arg
T
CTC CCC CAC CGC C
CTA CCA CAA
Gln
CGA A
CTG CCG CAG CGG G
A
ATT
Ile
ACT
Thr
AAT
Asn
AGT
Ser
T
ATC ACC AAC AGC C
ATA ACA AAA
Lys
AGA
Arg
A
ATG Met ACG AAG AGG G
G
GTT
Val
GCT
Ala
GAT
Asp
GGT
Gly
T
GTC GCC GAC GGC C
GTA GCA GAA
Glu
GGA A
GTG GCG GAG GGG G
Motivation
Methods
Results
Paula
Tataru
5/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Modeling DNA data
 Nucleotides (n = 4)
 Amino acids (n = 20)
 Codons (n = 64)
Motivation
Methods
Results
Paula
Tataru
6/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Continuous-time Markov chains
 Characterized by a rate matrix
with properties
Motivation
Methods
Results
Paula
Tataru
7/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Modeling DNA data
 Nucleotides (n = 4)
 Jukes Cantor (JC)
Motivation
Methods
Results
Paula
Tataru
8/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Continuous-time Markov chains
 Waiting time is exponentially distributed
 Jumps are discretely distributed
Motivation
Methods
Results
Paula
Tataru
9/30
Conditional
expectations
of sufficient
statistics
for CTMCs
General EM
Motivation
Methods
Results
Paula
Tataru
10/30
Conditional
expectations
of sufficient
statistics
for CTMCs
General EM
 Full log likelihood
 E-step
 M-step
Motivation
Methods
Results
Paula
Tataru
11/30
Conditional
expectations
of sufficient
statistics
for CTMCs
EM for Jukes Cantor
 Full log likelihood
 M-step
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
12/30
Calculating expectations
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
13/30
Linear combinations
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
14/30
Linear combinations
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
15/30
Linear combinations
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
16/30
Methods
 In Hobolth&Jensen (2010), three methods are
presented to evaluate the necessary statistics
 Eigenvalue decomposition (EVD)
 Uniformization (UNI)
 Exponentiation (EXPM)
 Extend efficiently to linear combinations
 Which method is more accurate?
 Which method is faster?
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
17/30
Eigenvalue decomposition
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
18/30
Uniformization
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
19/30
Uniformization
 Choose truncation point s(亮t)
 Bound error using the tail of Pois(m+1; 亮t)
 s(亮t) = 4 + 6 亮t + 亮t
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
20/30
Exponentiation
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
21/30
Algorithms
EVD UNI EXPM*
1 A
Order
2 J(t)
Order
3 (C; t)裡 (C; t)裡 (C; t)裡
Order
Influenced by Q
, U, U -1
亮, s(亮t), R
O(n3
) O(n2
) O(n2
)
Rm
eAt
O(n2
) O(s(亮t)n3
) O(n3
)
O(n3
) O(s(亮t)n2
) O(n2
)
亮t 
* expm package in R
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
22/30
Accuracy
 Use two models for which (C; t) is available in
analytical form
 JC, varying n
 HKY, varying t
 Measure accuracy as the normalized difference:
approximation / true - 1
 as a function of n
 as a function of t
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
23/30
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
24/30
Speed
EVD UNI EXPM*
1 A
Order
2 J(t)
Order
3 (C; t)裡 (C; t)裡 (C; t)裡
Order
Influenced by Q
, U, U -1
亮, s(亮t), R
O(n3
) O(n2
) O(n2
)
Rm
eAt
O(n2
) O(s(亮t)n3
) O(n3
)
O(n3
) O(s(亮t)n2
) O(n2
)
亮t
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
25/30
Speed
 Partition of computation
 Precomputation
 EVD
 UNI
 Main computation
 Use three models and different time points to
asses the speed
 GY, n = 61
 GTR, n = 4
 UNR, n = 4
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
26/30
Speed
 10 equidistant time points
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
27/30
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
28/30
Results
 Accuracy
 Similar accuracy
 EXPM is the most accurate one
 Speed
 EXPM is the slowest
 EVD vs UNI
Motivation
Methods
Results
Paula
Tataru
Conditional
expectations
of sufficient
statistics
for CTMCs
29/30
Results
 UNI has potential
 Better cut off point s(亮t)
 Ameliorate effect of 亮t by
 Adaptive uniformization
 On-the-fly variant of adaptive uniformization
Thank you!
Comparison of methods for calculating
conditional expectations of sufficient statistics
for continuous time Markov chains
BMC Bioinformatics 12(1):465, 2011

More Related Content

Similar to PaulaTataru (8)

皚皚皚皚皚 Finding a Dynamical Model of a Social Norm Physical Activity Intervention
皚皚皚皚皚 Finding a Dynamical Model of a Social Norm Physical Activity Intervention皚皚皚皚皚 Finding a Dynamical Model of a Social Norm Physical Activity Intervention
皚皚皚皚皚 Finding a Dynamical Model of a Social Norm Physical Activity Intervention
Victor Asanza
Computational Intelligence for Time Series Prediction
Computational Intelligence for Time Series PredictionComputational Intelligence for Time Series Prediction
Computational Intelligence for Time Series Prediction
Gianluca Bontempi
Signal fundamentals
Signal fundamentalsSignal fundamentals
Signal fundamentals
Lalit Kanoje
rbs - presentation about applications of machine learning.
rbs - presentation about applications of machine learning.rbs - presentation about applications of machine learning.
rbs - presentation about applications of machine learning.
ChellamuthuMech
Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series DataToeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
Daiki Tanaka
LSSC2011 Optimization of intermolecular interaction potential energy paramete...
LSSC2011 Optimization of intermolecular interaction potential energy paramete...LSSC2011 Optimization of intermolecular interaction potential energy paramete...
LSSC2011 Optimization of intermolecular interaction potential energy paramete...
Dragan Sahpaski
presentation_btp
presentation_btppresentation_btp
presentation_btp
Anshul Goyal, EIT
Continuous Systems To Discrete Event Systems
Continuous Systems To Discrete Event SystemsContinuous Systems To Discrete Event Systems
Continuous Systems To Discrete Event Systems
ahmad bassiouny
皚皚皚皚皚 Finding a Dynamical Model of a Social Norm Physical Activity Intervention
皚皚皚皚皚 Finding a Dynamical Model of a Social Norm Physical Activity Intervention皚皚皚皚皚 Finding a Dynamical Model of a Social Norm Physical Activity Intervention
皚皚皚皚皚 Finding a Dynamical Model of a Social Norm Physical Activity Intervention
Victor Asanza
Computational Intelligence for Time Series Prediction
Computational Intelligence for Time Series PredictionComputational Intelligence for Time Series Prediction
Computational Intelligence for Time Series Prediction
Gianluca Bontempi
Signal fundamentals
Signal fundamentalsSignal fundamentals
Signal fundamentals
Lalit Kanoje
rbs - presentation about applications of machine learning.
rbs - presentation about applications of machine learning.rbs - presentation about applications of machine learning.
rbs - presentation about applications of machine learning.
ChellamuthuMech
Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series DataToeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
Daiki Tanaka
LSSC2011 Optimization of intermolecular interaction potential energy paramete...
LSSC2011 Optimization of intermolecular interaction potential energy paramete...LSSC2011 Optimization of intermolecular interaction potential energy paramete...
LSSC2011 Optimization of intermolecular interaction potential energy paramete...
Dragan Sahpaski
Continuous Systems To Discrete Event Systems
Continuous Systems To Discrete Event SystemsContinuous Systems To Discrete Event Systems
Continuous Systems To Discrete Event Systems
ahmad bassiouny

More from Paula Tataru (18)

PhDretreat2014
PhDretreat2014PhDretreat2014
PhDretreat2014
Paula Tataru
PaulaTataru_PhD_defense
PaulaTataru_PhD_defensePaulaTataru_PhD_defense
PaulaTataru_PhD_defense
Paula Tataru
birc-csd2012
birc-csd2012birc-csd2012
birc-csd2012
Paula Tataru
TreeOfLife-jeopardy-2014
TreeOfLife-jeopardy-2014TreeOfLife-jeopardy-2014
TreeOfLife-jeopardy-2014
Paula Tataru
AB-RNA-Mfold&SCFGs-2011
AB-RNA-Mfold&SCFGs-2011AB-RNA-Mfold&SCFGs-2011
AB-RNA-Mfold&SCFGs-2011
Paula Tataru
AB-RNA-comparison-2011
AB-RNA-comparison-2011AB-RNA-comparison-2011
AB-RNA-comparison-2011
Paula Tataru
AB-RNA-alignments-2011
AB-RNA-alignments-2011AB-RNA-alignments-2011
AB-RNA-alignments-2011
Paula Tataru
AB-RNA-Nussinov-2011
AB-RNA-Nussinov-2011AB-RNA-Nussinov-2011
AB-RNA-Nussinov-2011
Paula Tataru
AB-RNA-SCFGdesign=2010
AB-RNA-SCFGdesign=2010AB-RNA-SCFGdesign=2010
AB-RNA-SCFGdesign=2010
Paula Tataru
AB-RNA-SCFG-2010
AB-RNA-SCFG-2010AB-RNA-SCFG-2010
AB-RNA-SCFG-2010
Paula Tataru
AB-RNA-alignments-2010
AB-RNA-alignments-2010AB-RNA-alignments-2010
AB-RNA-alignments-2010
Paula Tataru
AB-RNA-Nus-2010
AB-RNA-Nus-2010AB-RNA-Nus-2010
AB-RNA-Nus-2010
Paula Tataru
PaulaTataruVienna
PaulaTataruViennaPaulaTataruVienna
PaulaTataruVienna
Paula Tataru
PaulaTataruCSHL
PaulaTataruCSHLPaulaTataruCSHL
PaulaTataruCSHL
Paula Tataru
PaulaTataruAarhus
PaulaTataruAarhusPaulaTataruAarhus
PaulaTataruAarhus
Paula Tataru
mgsa_poster
mgsa_postermgsa_poster
mgsa_poster
Paula Tataru
PaulaTataruOxford
PaulaTataruOxfordPaulaTataruOxford
PaulaTataruOxford
Paula Tataru
Mols_August2013
Mols_August2013Mols_August2013
Mols_August2013
Paula Tataru
PaulaTataru_PhD_defense
PaulaTataru_PhD_defensePaulaTataru_PhD_defense
PaulaTataru_PhD_defense
Paula Tataru
TreeOfLife-jeopardy-2014
TreeOfLife-jeopardy-2014TreeOfLife-jeopardy-2014
TreeOfLife-jeopardy-2014
Paula Tataru
AB-RNA-Mfold&SCFGs-2011
AB-RNA-Mfold&SCFGs-2011AB-RNA-Mfold&SCFGs-2011
AB-RNA-Mfold&SCFGs-2011
Paula Tataru
AB-RNA-comparison-2011
AB-RNA-comparison-2011AB-RNA-comparison-2011
AB-RNA-comparison-2011
Paula Tataru
AB-RNA-alignments-2011
AB-RNA-alignments-2011AB-RNA-alignments-2011
AB-RNA-alignments-2011
Paula Tataru
AB-RNA-Nussinov-2011
AB-RNA-Nussinov-2011AB-RNA-Nussinov-2011
AB-RNA-Nussinov-2011
Paula Tataru
AB-RNA-SCFGdesign=2010
AB-RNA-SCFGdesign=2010AB-RNA-SCFGdesign=2010
AB-RNA-SCFGdesign=2010
Paula Tataru
AB-RNA-SCFG-2010
AB-RNA-SCFG-2010AB-RNA-SCFG-2010
AB-RNA-SCFG-2010
Paula Tataru
AB-RNA-alignments-2010
AB-RNA-alignments-2010AB-RNA-alignments-2010
AB-RNA-alignments-2010
Paula Tataru
AB-RNA-Nus-2010
AB-RNA-Nus-2010AB-RNA-Nus-2010
AB-RNA-Nus-2010
Paula Tataru
PaulaTataruVienna
PaulaTataruViennaPaulaTataruVienna
PaulaTataruVienna
Paula Tataru
PaulaTataruCSHL
PaulaTataruCSHLPaulaTataruCSHL
PaulaTataruCSHL
Paula Tataru
PaulaTataruAarhus
PaulaTataruAarhusPaulaTataruAarhus
PaulaTataruAarhus
Paula Tataru
PaulaTataruOxford
PaulaTataruOxfordPaulaTataruOxford
PaulaTataruOxford
Paula Tataru
Mols_August2013
Mols_August2013Mols_August2013
Mols_August2013
Paula Tataru

PaulaTataru