This document discusses methods for calculating conditional expectations of sufficient statistics for continuous time Markov chains. It presents three methods - eigenvalue decomposition, uniformization, and exponentiation - and evaluates their accuracy and speed for different models and time points. For accuracy, it finds that exponentiation is most accurate. For speed, exponentiation is slowest while eigenvalue decomposition and uniformization trade off accuracy and speed depending on the model and time points. Uniformization has potential to be improved through better cutoff points or adaptive uniformization techniques.
4. Motivation
Methods
Results
Paula
Tataru
4/30
Conditional
expectations
of sufficient
statistics
for CTMCs
Modeling DNA data
1st 2nd base 3rd
base T C A G base
T
TTT
Phe
TCT
Ser
TAT
Tyr
TGT
Cys
T
TTC TCC TAC TGC C
TTA
Leu
TCA TAA Stop TGA Stop A
TTG TCG TAG Stop TGG Trp G
C
CTT CCT
Pro
CAT
His
CGT
Arg
T
CTC CCC CAC CGC C
CTA CCA CAA
Gln
CGA A
CTG CCG CAG CGG G
A
ATT
Ile
ACT
Thr
AAT
Asn
AGT
Ser
T
ATC ACC AAC AGC C
ATA ACA AAA
Lys
AGA
Arg
A
ATG Met ACG AAG AGG G
G
GTT
Val
GCT
Ala
GAT
Asp
GGT
Gly
T
GTC GCC GAC GGC C
GTA GCA GAA
Glu
GGA A
GTG GCG GAG GGG G
30. Thank you!
Comparison of methods for calculating
conditional expectations of sufficient statistics
for continuous time Markov chains
BMC Bioinformatics 12(1):465, 2011