This document provides an overview of hidden Markov models including:
- The key elements of HMMs such as states, observations, transition probabilities, and emission probabilities.
- The three basic problems of HMMs including computing the probability of an observation sequence, finding the optimal state sequence, and estimating model parameters.
- Algorithms for solving the first problem, including the forward algorithm, which computes the probability of an observation sequence in time linear in the sequence length.
1. Hidden Markov Models
HMM Tutorial, (c) Tapas Kanungo
Center for Automation Research
University of Maryland
Web: www.cfar.umd.edu/~kanungo
Email: kanungo@cfar.umd.edu
3. Markov Models
Observable states: 1, 2, ..., N
Observed sequence: q_1, q_2, ..., q_t, ..., q_T
First-order Markov assumption:

  P(q_t = j | q_{t-1} = i, q_{t-2} = k, ...) = P(q_t = j | q_{t-1} = i)

Stationarity:

  P(q_t = j | q_{t-1} = i) = P(q_{t+l} = j | q_{t+l-1} = i)
4. Markov Models
State transition matrix A:

        | a_11  a_12  ...  a_1j  ...  a_1N |
        | a_21  a_22  ...  a_2j  ...  a_2N |
        |  :     :         :          :    |
  A =   | a_i1  a_i2  ...  a_ij  ...  a_iN |
        |  :     :         :          :    |
        | a_N1  a_N2  ...  a_Nj  ...  a_NN |

where

  a_ij = P(q_t = j | q_{t-1} = i),  1 <= i, j <= N

Constraints on a_ij:

  a_ij >= 0  for all i, j
  sum_{j=1}^{N} a_ij = 1  for all i
5. Markov Models: Example
States:
1. Rainy (R)
2. Cloudy (C)
3. Sunny (S)
State transition probability matrix:

        | 0.4  0.3  0.3 |
  A =   | 0.2  0.6  0.2 |
        | 0.1  0.1  0.8 |

Compute the probability of observing SSRRSCS given that today is S.
6. Markov Models: Example
Basic conditional probability rule:

  P(A, B) = P(A|B) P(B)

The Markov chain rule:

  P(q_1, q_2, ..., q_T)
    = P(q_T | q_1, q_2, ..., q_{T-1}) P(q_1, q_2, ..., q_{T-1})
    = P(q_T | q_{T-1}) P(q_1, q_2, ..., q_{T-1})
    = P(q_T | q_{T-1}) P(q_{T-1} | q_{T-2}) P(q_1, q_2, ..., q_{T-2})
    ...
    = P(q_T | q_{T-1}) P(q_{T-1} | q_{T-2}) ... P(q_2 | q_1) P(q_1)
7. Markov Models: Example
Observation sequence O:

  O = (S, S, S, R, R, S, C, S)

Using the chain rule we get:

  P(O | model)
    = P(S) P(S|S) P(S|S) P(R|S) P(R|R) P(S|R) P(C|S) P(S|C)
    = pi_3 a_33 a_33 a_31 a_11 a_13 a_32 a_23
    = (1)(0.8)^2 (0.1)(0.4)(0.3)(0.1)(0.2)
    = 1.536 x 10^-4

The prior probability pi_i = P(q_1 = i).
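The slide-7 product can be checked in a few lines of Python. This is a sketch; the state indexing 0 = R, 1 = C, 2 = S is an assumption made for illustration.

```python
A = [[0.4, 0.3, 0.3],   # from Rainy:  to R, C, S
     [0.2, 0.6, 0.2],   # from Cloudy
     [0.1, 0.1, 0.8]]   # from Sunny

R, C, S = 0, 1, 2       # state indexing is an assumption for illustration
O = [S, S, S, R, R, S, C, S]

# P(O) = pi_{q_1} * prod_t a_{q_{t-1} q_t}; the prior is 1 here
# because we condition on today being Sunny.
p = 1.0
for prev, cur in zip(O, O[1:]):
    p *= A[prev][cur]

print(p)   # about 1.536e-4
```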
8. Markov Models: Example
What is the probability that the sequence remains in state i for exactly d time units?

  p_i(d) = P(q_1 = i, q_2 = i, ..., q_d = i, q_{d+1} != i, ...)
         = (a_ii)^{d-1} (1 - a_ii)

Exponential Markov chain duration density.
What is the expected value of the duration d in state i?

  E[d_i] = sum_{d=1}^{inf} d p_i(d)
         = sum_{d=1}^{inf} d (a_ii)^{d-1} (1 - a_ii)
         = (1 - a_ii) sum_{d=1}^{inf} d (a_ii)^{d-1}
         = (1 - a_ii) (d/d a_ii) [ sum_{d=1}^{inf} (a_ii)^d ]
         = (1 - a_ii) (d/d a_ii) [ a_ii / (1 - a_ii) ]
         = 1 / (1 - a_ii)
9. Markov Models: Example
Avg. number of consecutive sunny days  = 1/(1 - a_33) = 1/(1 - 0.8) = 5
Avg. number of consecutive cloudy days = 1/(1 - a_22) = 1/(1 - 0.6) = 2.5
Avg. number of consecutive rainy days  = 1/(1 - a_11) = 1/(1 - 0.4) = 1.67
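The closed form E[d] = 1/(1 - a_ii) can be sanity-checked by simulation. This Monte Carlo sketch is not part of the tutorial; it simulates runs of the sunny state, whose self-transition probability is a_33 = 0.8.

```python
import random

def sample_duration(a_ii, rng):
    """Sample the duration of one stay in state i: p(d) = (a_ii)^(d-1) (1 - a_ii)."""
    d = 1
    while rng.random() < a_ii:   # stay in the state with probability a_ii
        d += 1
    return d

rng = random.Random(0)
durations = [sample_duration(0.8, rng) for _ in range(100_000)]
print(sum(durations) / len(durations))   # close to 1 / (1 - 0.8) = 5
```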
10. Hidden Markov Models
States are not observable.
Observations are probabilistic functions of state.
State transitions are still probabilistic.
11. Urn and Ball Model
N urns containing colored balls
M distinct colors of balls
Each urn has a possibly different distribution of colors
Sequence generation algorithm:
1. Pick an initial urn according to some random process.
2. Randomly pick a ball from the urn and then replace it.
3. Select another urn according to a random selection process associated with the urn.
4. Repeat steps 2 and 3.
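The four steps above can be sketched in Python. The urn count, color distributions, and urn-selection probabilities below are made-up values for illustration, not taken from the slides.

```python
import random

A  = [[0.6, 0.4], [0.3, 0.7]]            # urn-to-urn selection probabilities (assumed)
B  = [[0.8, 0.1, 0.1], [0.2, 0.3, 0.5]]  # per-urn color distributions, M = 3 (assumed)
pi = [0.5, 0.5]                          # initial urn distribution (assumed)

def generate(T, rng):
    urn = rng.choices(range(len(pi)), weights=pi)[0]             # step 1
    colors = []
    for _ in range(T):
        colors.append(rng.choices(range(3), weights=B[urn])[0])  # step 2
        urn = rng.choices(range(len(A)), weights=A[urn])[0]      # step 3
    return colors   # only the colors are observed; the urn sequence stays hidden

print(generate(10, random.Random(0)))
```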
12. The Trellis
[Figure: trellis diagram. States 1, ..., N on the vertical axis, time 1, 2, ..., t-1, t, t+1, t+2, ..., T-1, T on the horizontal axis, with the observations o_1, o_2, ..., o_T written below the time axis.]
13. Elements of Hidden Markov Models
N: the number of hidden states
Q: set of states, Q = {1, 2, ..., N}
M: the number of symbols
V: set of symbols, V = {1, 2, ..., M}
A: the state-transition probability matrix:

  a_ij = P(q_{t+1} = j | q_t = i),  1 <= i, j <= N

B: observation probability distribution:

  b_j(k) = P(o_t = k | q_t = j),  1 <= k <= M

pi: the initial state distribution:

  pi_i = P(q_1 = i),  1 <= i <= N

The entire model: lambda = (A, B, pi)
14. Three Basic Problems
1. Given observation O = (o_1, o_2, ..., o_T) and model lambda = (A, B, pi), efficiently compute P(O | lambda).
   - Hidden states complicate the evaluation.
   - Given two models lambda_1 and lambda_2, this can be used to choose the better one.
2. Given observation O = (o_1, o_2, ..., o_T) and model lambda, find the optimal state sequence q = (q_1, q_2, ..., q_T).
   - The optimality criterion has to be decided (e.g. maximum likelihood).
   - Explanation for the data.
3. Given O = (o_1, o_2, ..., o_T), estimate model parameters lambda = (A, B, pi) that maximize P(O | lambda).
15. Solution to Problem 1
Problem: compute P(o_1, o_2, ..., o_T | lambda).
Algorithm:
Let q = (q_1, q_2, ..., q_T) be a state sequence.
Assume the observations are independent given the states:

  P(O | q, lambda) = prod_{t=1}^{T} P(o_t | q_t, lambda)
                   = b_{q_1}(o_1) b_{q_2}(o_2) ... b_{q_T}(o_T)

The probability of a particular state sequence is:

  P(q | lambda) = pi_{q_1} a_{q_1 q_2} a_{q_2 q_3} ... a_{q_{T-1} q_T}

Also, P(O, q | lambda) = P(O | q, lambda) P(q | lambda).
Enumerate paths and sum probabilities:

  P(O | lambda) = sum_q P(O | q, lambda) P(q | lambda)

N^T state sequences, each requiring O(T) calculations.
Complexity: O(T N^T) calculations.
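The brute-force enumeration above can be written directly. The two-state model and observation sequence below are toy values chosen for illustration, not from the slides.

```python
from itertools import product

A  = [[0.5, 0.5], [0.4, 0.6]]        # toy transition matrix (assumed)
B  = [[0.9, 0.1], [0.2, 0.8]]        # B[j][k] = P(o_t = k | q_t = j) (assumed)
pi = [0.6, 0.4]
O  = [0, 1, 1, 0]

total = 0.0
for q in product(range(len(pi)), repeat=len(O)):   # all N^T state sequences
    p = pi[q[0]] * B[q[0]][O[0]]                   # pi_{q1} b_{q1}(o_1)
    for t in range(1, len(O)):
        p *= A[q[t-1]][q[t]] * B[q[t]][O[t]]       # a_{q_{t-1} q_t} b_{q_t}(o_t)
    total += p

print(total)   # P(O | lambda); should match the forward algorithm
```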
16. Forward Procedure: Intuition
[Figure: every state 1, ..., N at time t feeds into state k at time t+1 through the transitions a_1k, ..., a_Nk, so all partial paths are merged at each step.]
17. Forward Algorithm
Define the forward variable alpha_t(i) as:

  alpha_t(i) = P(o_1, o_2, ..., o_t, q_t = i | lambda)

alpha_t(i) is the probability of observing the partial sequence o_1, o_2, ..., o_t such that the state q_t is i.
Induction:
1. Initialization: alpha_1(i) = pi_i b_i(o_1)
2. Induction:

  alpha_{t+1}(j) = [ sum_{i=1}^{N} alpha_t(i) a_ij ] b_j(o_{t+1})

3. Termination:

  P(O | lambda) = sum_{i=1}^{N} alpha_T(i)

Complexity: O(N^2 T).
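A minimal sketch of the three steps (initialization, induction, termination); the toy model numbers are assumptions for illustration.

```python
def forward(A, B, pi, O):
    """Return P(O | lambda) via the alpha recursion; O(N^2 T) work."""
    N = len(pi)
    alpha = [pi[i] * B[i][O[0]] for i in range(N)]       # 1. Initialization
    for o in O[1:]:                                      # 2. Induction
        alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][o]
                 for j in range(N)]
    return sum(alpha)                                    # 3. Termination

A  = [[0.5, 0.5], [0.4, 0.6]]        # toy transition matrix (assumed)
B  = [[0.9, 0.1], [0.2, 0.8]]        # b_j(k): emission probabilities (assumed)
pi = [0.6, 0.4]
O  = [0, 1, 1, 0]
print(forward(A, B, pi, O))          # about 0.0708
```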
18. Example
Consider the following coin-tossing experiment:

          State 1   State 2   State 3
  P(H)      0.5       0.75      0.25
  P(T)      0.5       0.25      0.75

State-transition probabilities equal to 1/3.
Initial state probabilities equal to 1/3.
1. You observe O = (H, H, H, H, T, H, T, T, T, T). What state sequence q is most likely? What is the joint probability, P(O, q | lambda), of the observation sequence and the state sequence?
2. What is the probability that the observation sequence came entirely from state 1?
19. Example (continued)
3. Consider the observation sequence

     O~ = (H, T, T, H, T, H, H, T, T, H).

   How would your answers to parts 1 and 2 change?
4. If the state transition probabilities were

          | 0.9   0.05  0.05 |
   A' =   | 0.45  0.1   0.45 |
          | 0.45  0.45  0.1  |

   how would the new model lambda' change your answers to parts 1-3?
21. Backward Algorithm
Define the backward variable beta_t(i) as:

  beta_t(i) = P(o_{t+1}, o_{t+2}, ..., o_T | q_t = i, lambda)

beta_t(i) is the probability of observing the partial sequence o_{t+1}, o_{t+2}, ..., o_T given that the state q_t is i.
Induction:
1. Initialization: beta_T(i) = 1
2. Induction:

  beta_t(i) = sum_{j=1}^{N} a_ij b_j(o_{t+1}) beta_{t+1}(j),
  1 <= i <= N,  t = T-1, ..., 1
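A matching sketch of the backward recursion (toy model assumed). Note that sum_i pi_i b_i(o_1) beta_1(i) recovers the same likelihood as the forward algorithm, which makes a convenient cross-check.

```python
def backward(A, B, pi, O):
    """Return P(O | lambda) via the beta recursion."""
    N, T = len(pi), len(O)
    beta = [1.0] * N                                   # 1. Initialization: beta_T(i) = 1
    for t in range(T - 2, -1, -1):                     # 2. Induction: t = T-1, ..., 1
        beta = [sum(A[i][j] * B[j][O[t + 1]] * beta[j] for j in range(N))
                for i in range(N)]
    # Likelihood check: P(O | lambda) = sum_i pi_i b_i(o_1) beta_1(i)
    return sum(pi[i] * B[i][O[0]] * beta[i] for i in range(N))

A  = [[0.5, 0.5], [0.4, 0.6]]        # toy model (assumed)
B  = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.6, 0.4]
O  = [0, 1, 1, 0]
print(backward(A, B, pi, O))         # equals the forward-algorithm value
```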
22. Solution to Problem 2
Choose the most likely path.
Find the path (q_1, q_2, ..., q_T) that maximizes the likelihood:

  P(q_1, q_2, ..., q_T | O, lambda)

Solution by dynamic programming.
Define:

  delta_t(i) = max_{q_1, q_2, ..., q_{t-1}} P(q_1, q_2, ..., q_t = i, o_1, o_2, ..., o_t | lambda)

delta_t(i) is the probability of the highest-probability path ending in state i.
By induction we have:

  delta_{t+1}(j) = [ max_i delta_t(i) a_ij ] b_j(o_{t+1})
23. Viterbi Algorithm
[Figure: shaded trellis. States 1, ..., N on the vertical axis, time 1, 2, ..., t-1, t, t+1, t+2, ..., T-1, T on the horizontal axis, with transitions a_1k, ..., a_Nk into state k and the observations o_1, o_2, ..., o_T below the time axis.]
24. Viterbi Algorithm
Initialization:

  delta_1(i) = pi_i b_i(o_1),  1 <= i <= N
  psi_1(i) = 0

Recursion:

  delta_t(j) = max_{1<=i<=N} [ delta_{t-1}(i) a_ij ] b_j(o_t)
  psi_t(j) = arg max_{1<=i<=N} [ delta_{t-1}(i) a_ij ]
  2 <= t <= T,  1 <= j <= N

Termination:

  P* = max_{1<=i<=N} delta_T(i)
  q*_T = arg max_{1<=i<=N} delta_T(i)

Path (state sequence) backtracking:

  q*_t = psi_{t+1}(q*_{t+1}),  t = T-1, T-2, ..., 1
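The initialization, recursion, termination, and backtracking steps map directly onto code; the toy model is again an assumption for illustration.

```python
def viterbi(A, B, pi, O):
    """Return (most likely state path q*, its probability P*)."""
    N, T = len(pi), len(O)
    delta = [pi[i] * B[i][O[0]] for i in range(N)]      # delta_1(i) = pi_i b_i(o_1)
    psi = []                                            # backpointer tables psi_t
    for t in range(1, T):
        step, args = [], []
        for j in range(N):
            best = max(range(N), key=lambda i: delta[i] * A[i][j])
            args.append(best)                           # psi_t(j)
            step.append(delta[best] * A[best][j] * B[j][O[t]])
        psi.append(args)
        delta = step
    q = [max(range(N), key=lambda i: delta[i])]         # q*_T
    for args in reversed(psi):                          # backtracking
        q.append(args[q[-1]])
    return list(reversed(q)), max(delta)

A  = [[0.5, 0.5], [0.4, 0.6]]        # toy model (assumed)
B  = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.6, 0.4]
path, p_star = viterbi(A, B, pi, [0, 1, 1, 0])
print(path, p_star)
```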
25. Solution to Problem 3
Estimate lambda = (A, B, pi) to maximize P(O | lambda).
No analytic method because of complexity; use an iterative solution.
Baum-Welch Algorithm:
1. Let the initial model be lambda_0.
2. Compute a new lambda based on lambda_0 and observation O.
3. If log P(O | lambda) - log P(O | lambda_0) < DELTA, stop.
4. Else set lambda_0 <- lambda and go to step 2.
26. Baum-Welch: Preliminaries
Define xi_t(i, j) as the probability of being in state i at time t and in state j at time t+1:

  xi_t(i, j) = alpha_t(i) a_ij b_j(o_{t+1}) beta_{t+1}(j) / P(O | lambda)
             = alpha_t(i) a_ij b_j(o_{t+1}) beta_{t+1}(j)
               / sum_{i=1}^{N} sum_{j=1}^{N} alpha_t(i) a_ij b_j(o_{t+1}) beta_{t+1}(j)

Define gamma_t(i) as the probability of being in state i at time t, given the observation sequence:

  gamma_t(i) = sum_{j=1}^{N} xi_t(i, j)

sum_{t=1}^{T} gamma_t(i) is the expected number of times state i is visited.
sum_{t=1}^{T-1} xi_t(i, j) is the expected number of transitions from state i to state j.
27. Baum-Welch: Update Rules
pi_i = expected frequency in state i at time t = 1
     = gamma_1(i)
a_ij = expected number of transitions from state i to state j
       / expected number of transitions from state i:

  a_ij = sum_t xi_t(i, j) / sum_t gamma_t(i)

b_j(k) = expected number of times in state j observing symbol k
         / expected number of times in state j:

  b_j(k) = sum_{t : o_t = k} gamma_t(j) / sum_t gamma_t(j)
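One re-estimation pass, written straight from the xi/gamma definitions and the update rules above. This sketch uses unscaled alpha/beta tables, so it is suitable only for short sequences; the toy model numbers are assumptions.

```python
def baum_welch_step(A, B, pi, O):
    """One Baum-Welch re-estimation pass; returns (new_A, new_B, new_pi)."""
    N, M, T = len(pi), len(B[0]), len(O)
    # Forward and backward tables (unscaled; fine for short O)
    alpha = [[0.0] * N for _ in range(T)]
    beta  = [[1.0] * N for _ in range(T)]
    for i in range(N):
        alpha[0][i] = pi[i] * B[i][O[0]]
    for t in range(1, T):
        for j in range(N):
            alpha[t][j] = sum(alpha[t-1][i] * A[i][j] for i in range(N)) * B[j][O[t]]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(A[i][j] * B[j][O[t+1]] * beta[t+1][j] for j in range(N))
    PO = sum(alpha[T-1])                       # P(O | lambda)
    # xi_t(i, j); gamma_t(i) = alpha_t(i) beta_t(i) / P(O | lambda)
    xi = [[[alpha[t][i] * A[i][j] * B[j][O[t+1]] * beta[t+1][j] / PO
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    gamma = [[alpha[t][i] * beta[t][i] / PO for i in range(N)] for t in range(T)]
    # Update rules from the slide
    new_pi = gamma[0]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(N)] for i in range(N)]
    new_B = [[sum(gamma[t][j] for t in range(T) if O[t] == k) /
              sum(gamma[t][j] for t in range(T))
              for k in range(M)] for j in range(N)]
    return new_A, new_B, new_pi

A  = [[0.5, 0.5], [0.4, 0.6]]        # toy model (assumed)
B  = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.6, 0.4]
A, B, pi = baum_welch_step(A, B, pi, [0, 1, 1, 0])
print(pi)   # re-estimated initial state distribution
```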
28. Properties
Covariance of the estimated parameters
Convergence rates
29. Types of HMM
Continuous density
Ergodic
State duration
31. Comparison of HMMs
What is a natural distance function?
If D(lambda_1, lambda_2) is large, does it mean that the models are really different?