HMM Tutorial (c) Tapas Kanungo
      Hidden Markov Models

         Tapas Kanungo
     Center for Automation Research
         University of Maryland
    Web: www.cfar.umd.edu ~kanungo
      Email: kanungo@cfar.umd.edu




                   Outline

1. Markov models
2. Hidden Markov models
3. Forward Backward algorithm
4. Viterbi algorithm
5. Baum-Welch estimation algorithm
                      Markov Models

    Observable states:
        1, 2, ..., N

    Observed sequence:
        q_1, q_2, ..., q_t, ..., q_T




    First order Markov assumption:
        P(q_t = j | q_{t-1} = i, q_{t-2} = k, ...) = P(q_t = j | q_{t-1} = i)




    Stationarity:
        P(q_t = j | q_{t-1} = i) = P(q_{t+l} = j | q_{t+l-1} = i)
                 Markov Models

    State transition matrix A:

             | a_11  a_12  ...  a_1j  ...  a_1N |
             | a_21  a_22  ...  a_2j  ...  a_2N |
             |  :     :          :          :   |
        A =  | a_i1  a_i2  ...  a_ij  ...  a_iN |
             |  :     :          :          :   |
             | a_N1  a_N2  ...  a_Nj  ...  a_NN |



    where
        a_ij = P(q_t = j | q_{t-1} = i),   1 <= i, j <= N

    Constraints on a_ij:




        a_ij >= 0,   for all i, j

        sum_{j=1}^{N} a_ij = 1,   for all i
       Markov Models: Example

    States:
    1. Rainy (R)
    2. Cloudy (C)
    3. Sunny (S)
    State transition probability matrix:
            | 0.4  0.3  0.3 |
        A = | 0.2  0.6  0.2 |
            | 0.1  0.1  0.8 |
    Compute the probability of
    observing SSRRSCS given that today is S.
       Markov Models: Example

Basic conditional probability rule:
        P(A, B) = P(A|B) P(B)

The Markov chain rule:
    P(q_1, q_2, ..., q_T)
        = P(q_T | q_1, q_2, ..., q_{T-1}) P(q_1, q_2, ..., q_{T-1})
        = P(q_T | q_{T-1}) P(q_1, q_2, ..., q_{T-1})
        = P(q_T | q_{T-1}) P(q_{T-1} | q_{T-2}) P(q_1, q_2, ..., q_{T-2})
        = P(q_T | q_{T-1}) P(q_{T-1} | q_{T-2}) ... P(q_2 | q_1) P(q_1)
       Markov Models: Example

    Observation sequence O:
        O = (S, S, S, R, R, S, C, S)
    Using the chain rule we get:
        P(O | model)
         = P(S, S, S, R, R, S, C, S | model)
         = P(S) P(S|S) P(S|S) P(R|S) P(R|R) P(S|R) P(C|S) P(S|C)
         = pi_3 a_33 a_33 a_31 a_11 a_13 a_32 a_23
         = 1 (0.8)^2 (0.1)(0.4)(0.3)(0.1)(0.2)
         = 1.536 x 10^{-4}

    The prior probability pi_i = P(q_1 = i).
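The chain-rule computation above can be checked with a short script. This is a sketch: the matrix A and the zero-based state indices are taken from the example, with R, C, S as states 1, 2, 3.

```python
# Transition matrix from the example; rows/columns are R, C, S (states 1, 2, 3).
A = [[0.4, 0.3, 0.3],
     [0.2, 0.6, 0.2],
     [0.1, 0.1, 0.8]]

R, C, S = 0, 1, 2  # zero-based indices for Rainy, Cloudy, Sunny

# O = (S, S, S, R, R, S, C, S); today is S, so P(S) = pi_3 = 1.
O = [S, S, S, R, R, S, C, S]

p = 1.0
for prev, cur in zip(O, O[1:]):
    p *= A[prev][cur]  # multiply successive transition probabilities

print(p)  # approximately 1.536e-4
```

The loop multiplies the seven transition probabilities a_33 a_33 a_31 a_11 a_13 a_32 a_23, reproducing the value on the slide.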
        Markov Models: Example

    What is the probability that the sequence remains
    in state i for exactly d time units?

        p_i(d) = P(q_1 = i, q_2 = i, ..., q_d = i, q_{d+1} != i, ...)
               = (a_ii)^{d-1} (1 - a_ii)
    Exponential Markov chain duration density.
    What is the expected value of the duration d in
    state i?
        d_bar_i = sum_{d=1}^{inf} d p_i(d)

                = sum_{d=1}^{inf} d (a_ii)^{d-1} (1 - a_ii)

                = (1 - a_ii) sum_{d=1}^{inf} d (a_ii)^{d-1}

                = (1 - a_ii) (d/da_ii) [ sum_{d=1}^{inf} (a_ii)^d ]

                = (1 - a_ii) (d/da_ii) [ a_ii / (1 - a_ii) ]

                = 1 / (1 - a_ii)
       Markov Models: Example

    Avg. number of consecutive sunny days =
        1 / (1 - a_33) = 1 / (1 - 0.8) = 5
    Avg. number of consecutive cloudy days = 2.5
    Avg. number of consecutive rainy days = 1.67
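The closed form 1/(1 - a_ii) can be sanity-checked numerically by truncating the infinite sum at a large cutoff (a quick sketch, not part of the original slides):

```python
# Truncate sum_{d=1}^inf d * a^(d-1) * (1-a) at a large D; the geometric tail
# beyond D = 2000 is negligible for a = 0.8.
a33 = 0.8  # self-transition probability of Sunny
d_bar = sum(d * a33 ** (d - 1) * (1 - a33) for d in range(1, 2000))
print(d_bar)  # approximately 5.0 = 1/(1 - 0.8)
```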
         Hidden Markov Models

    States are not observable
    Observations are probabilistic functions of state
    State transitions are still probabilistic
            Urn and Ball Model

    N urns containing colored balls
    M distinct colors of balls
    Each urn has a possibly different distribution
    of colors
    Sequence generation algorithm:
    1. Pick initial urn according to some random
       process.
    2. Randomly pick a ball from the urn and then
       replace it
    3. Select another urn according to a random selection
       process associated with the urn
    4. Repeat steps 2 and 3
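The four steps above translate directly into code. This is a sketch using only the standard library; the function name and argument layout are my own, with pi, A, B as the initial, transition, and color distributions:

```python
import random

def sample_hmm(pi, A, B, T, rng=random):
    """Generate T observations from an HMM, urn-and-ball style:
    pick an initial urn from pi, draw a ball color from that urn's
    color distribution B[state], then move to the next urn via A[state]."""
    states, obs = [], []
    state = rng.choices(range(len(pi)), weights=pi)[0]                      # step 1
    for _ in range(T):
        states.append(state)
        obs.append(rng.choices(range(len(B[state])), weights=B[state])[0])  # step 2
        state = rng.choices(range(len(A[state])), weights=A[state])[0]      # step 3
    return states, obs                                                      # step 4: loop
```

For example, sample_hmm([1/3]*3, [[1/3]*3]*3, [[0.5, 0.5], [0.75, 0.25], [0.25, 0.75]], 10) draws a hidden urn path and an observation sequence of length 10 from the coin-tossing model used later in the tutorial.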
                           The Trellis


    [Figure: a trellis with states 1...N on the vertical axis, time steps
    1, 2, ..., t-1, t, t+1, t+2, ..., T-1, T on the horizontal axis, and the
    observations o_1, o_2, ..., o_T listed below the corresponding time steps.]
        Elements of Hidden Markov
                  Models

    N, the number of hidden states
    Q, the set of states, Q = {1, 2, ..., N}
    M, the number of symbols
    V, the set of symbols, V = {1, 2, ..., M}
    A, the state-transition probability matrix:
        a_ij = P(q_{t+1} = j | q_t = i),   1 <= i, j <= N
    B, the observation probability distribution:
        b_j(k) = P(o_t = k | q_t = j),   1 <= k <= M
    pi, the initial state distribution:
        pi_i = P(q_1 = i),   1 <= i <= N
    The entire model: lambda = (A, B, pi)
          Three Basic Problems

1. Given observation O = (o_1, o_2, ..., o_T) and model
   lambda = (A, B, pi), efficiently compute P(O | lambda).
     Hidden states complicate the evaluation.
     Given two models lambda_1 and lambda_2, this can be used
     to choose the better one.
2. Given observation O = (o_1, o_2, ..., o_T) and model lambda,
   find the optimal state sequence q = (q_1, q_2, ..., q_T).
     The optimality criterion has to be decided, e.g.
     maximum likelihood.
     Explanation for the data.
3. Given O = (o_1, o_2, ..., o_T), estimate model parameters
   lambda = (A, B, pi) that maximize P(O | lambda).
             Solution to Problem 1

    Problem: Compute P(o_1, o_2, ..., o_T | lambda).

    Algorithm:
        Let q = (q_1, q_2, ..., q_T) be a state sequence.
        Assume the observations are independent:
            P(O | q, lambda) = prod_{t=1}^{T} P(o_t | q_t, lambda)
                             = b_{q_1}(o_1) b_{q_2}(o_2) ... b_{q_T}(o_T)
        The probability of a particular state sequence is:
            P(q | lambda) = pi_{q_1} a_{q_1 q_2} a_{q_2 q_3} ... a_{q_{T-1} q_T}
        Also, P(O, q | lambda) = P(O | q, lambda) P(q | lambda).
        Enumerate paths and sum probabilities:
            P(O | lambda) = sum_q P(O | q, lambda) P(q | lambda)

    There are N^T state sequences, each requiring O(T) calculations.
    Complexity: O(T N^T) calculations.
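The exponential enumeration can be written down directly. This sketch (function name is mine) is only usable for very small N and T, which is exactly the point of the complexity count above:

```python
from itertools import product

def brute_force_likelihood(O, pi, A, B):
    """P(O|lambda) = sum over all N^T state sequences q of P(O|q)P(q).
    O(T * N^T) work, as counted on the slide."""
    N, T = len(pi), len(O)
    total = 0.0
    for q in product(range(N), repeat=T):      # all N^T state sequences
        p = pi[q[0]] * B[q[0]][O[0]]           # pi_{q1} * b_{q1}(o_1)
        for t in range(1, T):
            p *= A[q[t-1]][q[t]] * B[q[t]][O[t]]  # a_{q(t-1) qt} * b_{qt}(o_t)
        total += p
    return total
```

With the uniform coin-tossing model that appears later (pi and all rows of A equal to 1/3, P(H) of 0.5, 0.75, 0.25 in the three states), the states are independent and uniform, so every length-T head/tail sequence has probability 0.5^T.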
    Forward Procedure: Intuition


    [Figure: each state k at time t+1 collects the contributions
    alpha_t(i) a_ik from every state i = 1, ..., N at time t
    (transitions a_1k, a_3k, ..., a_Nk shown).]
               Forward Algorithm

    Define the forward variable alpha_t(i) as:

        alpha_t(i) = P(o_1, o_2, ..., o_t, q_t = i | lambda)

    alpha_t(i) is the probability of observing the partial
    sequence o_1, o_2, ..., o_t such that the state at time t is i.

    Induction:
    1. Initialization: alpha_1(i) = pi_i b_i(o_1)

    2. Induction:
           alpha_{t+1}(j) = [ sum_{i=1}^{N} alpha_t(i) a_ij ] b_j(o_{t+1})

    3. Termination:
           P(O | lambda) = sum_{i=1}^{N} alpha_T(i)

    Complexity: O(N^2 T).
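The three steps map onto a few lines of NumPy. A sketch, assuming O is a list of zero-based symbol indices and pi, A, B are array-like in the conventions of the previous slides:

```python
import numpy as np

def forward(O, pi, A, B):
    """Forward algorithm. Returns alpha (T x N) and P(O|lambda).
    O(N^2 T) work instead of the O(T N^T) direct enumeration."""
    A, B, pi = np.asarray(A), np.asarray(B), np.asarray(pi)
    T, N = len(O), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, O[0]]                    # 1. initialization
    for t in range(1, T):
        alpha[t] = (alpha[t-1] @ A) * B[:, O[t]]  # 2. induction
    return alpha, float(alpha[-1].sum())          # 3. termination
```

On the uniform coin-tossing model of the next slide, the result agrees with the brute-force sum (0.5^T for any head/tail sequence of length T).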
                    Example

Consider the following coin-tossing experiment:

                State 1   State 2   State 3
        P(H)    0.5       0.75      0.25
        P(T)    0.5       0.25      0.75

    state-transition probabilities equal to 1/3
    initial state probabilities equal to 1/3

1. You observe O = (H, H, H, H, T, H, T, T, T, T). What
   state sequence, q, is most likely? What is the
   joint probability, P(O, q | lambda), of the observation
   sequence and the state sequence?
2. What is the probability that the observation se-
   quence came entirely from state 1?
3. Consider the observation sequence
       O~ = (H, T, T, H, T, H, H, T, T, H).
   How would your answers to parts 1 and 2 change?
4. If the state-transition probabilities were:

            | 0.9    0.45   0.45 |
       A' = | 0.05   0.1    0.45 |
            | 0.05   0.45   0.1  |

   how would the new model lambda' change your answers
   to parts 1-3?
                      Backward Algorithm


    [Figure: trellis over states 1...N and times 1...T; the backward
    variable at state i, time t, is computed from the states at time t+1
    through the transition probabilities a_i1, ..., a_iN. Observations
    o_1, ..., o_T appear below the time axis.]
               Backward Algorithm

    Define the backward variable beta_t(i) as:

        beta_t(i) = P(o_{t+1}, o_{t+2}, ..., o_T | q_t = i, lambda)

    beta_t(i) is the probability of observing the partial
    sequence o_{t+1}, o_{t+2}, ..., o_T given that the state
    at time t is i.

    Induction:
    1. Initialization: beta_T(i) = 1

    2. Induction:
           beta_t(i) = sum_{j=1}^{N} a_ij b_j(o_{t+1}) beta_{t+1}(j),
                       1 <= i <= N,  t = T-1, ..., 1
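The backward recursion admits the same kind of NumPy sketch as the forward one (same assumptions: O is a list of zero-based symbol indices):

```python
import numpy as np

def backward(O, pi, A, B):
    """Backward algorithm. Returns beta (T x N); combining with the
    initial distribution gives P(O|lambda) = sum_i pi_i b_i(o_1) beta_1(i)."""
    A, B, pi = np.asarray(A), np.asarray(B), np.asarray(pi)
    T, N = len(O), len(pi)
    beta = np.ones((T, N))                        # 1. beta_T(i) = 1
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, O[t+1]] * beta[t+1])  # 2. induction
    return beta, float((pi * B[:, O[0]] * beta[0]).sum())
```

The returned probability should match the forward algorithm's termination value exactly, which is a useful consistency check when implementing both.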
                 Solution to Problem 2

    Choose the most likely path.
    Find the path (q_1, q_2, ..., q_T) that maximizes the
    likelihood:
        P(q_1, q_2, ..., q_T | O, lambda)

    Solution by dynamic programming.
    Define:

        delta_t(i) = max_{q_1, q_2, ..., q_{t-1}}
                     P(q_1, q_2, ..., q_t = i, o_1, o_2, ..., o_t | lambda)

    delta_t(i) is the highest probability path ending in state i.
    By induction we have:

        delta_{t+1}(j) = [ max_i delta_t(i) a_ij ] b_j(o_{t+1})
                      Viterbi Algorithm


    [Figure: trellis over states 1...N and times 1...T; state k at time
    t+1 is reached by maximizing delta_t(i) a_ik over predecessor states
    i = 1, ..., N (transitions a_1k, ..., a_Nk shown). Observations
    o_1, ..., o_T appear below the time axis.]
                Viterbi Algorithm

    Initialization:
        delta_1(i) = pi_i b_i(o_1),   1 <= i <= N
        psi_1(i) = 0
    Recursion:
        delta_t(j) = max_{1 <= i <= N} [ delta_{t-1}(i) a_ij ] b_j(o_t)
        psi_t(j) = arg max_{1 <= i <= N} delta_{t-1}(i) a_ij
                   2 <= t <= T,  1 <= j <= N
    Termination:
        P* = max_{1 <= i <= N} delta_T(i)
        q*_T = arg max_{1 <= i <= N} delta_T(i)

    Path (state sequence) backtracking:
        q*_t = psi_{t+1}(q*_{t+1}),   t = T-1, T-2, ..., 1
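The full algorithm, including the backtracking pass, can be sketched in NumPy (same conventions as the forward/backward sketches; no log-space scaling, so long sequences will underflow):

```python
import numpy as np

def viterbi(O, pi, A, B):
    """Viterbi algorithm: most likely state path and its joint
    probability P(O, q* | lambda)."""
    A, B, pi = np.asarray(A), np.asarray(B), np.asarray(pi)
    T, N = len(O), len(pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, O[0]]                 # initialization
    for t in range(1, T):
        scores = delta[t-1][:, None] * A       # scores[i, j] = delta_{t-1}(i) a_ij
        psi[t] = scores.argmax(axis=0)         # best predecessor of each j
        delta[t] = scores.max(axis=0) * B[:, O[t]]
    # termination and backtracking
    q = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        q.append(int(psi[t][q[-1]]))
    return q[::-1], float(delta[-1].max())
```

On the coin-tossing model, an all-heads observation is best explained by staying in state 2 (the coin with P(H) = 0.75) throughout.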
            Solution to Problem 3

    Estimate lambda = (A, B, pi) to maximize P(O | lambda).
    There is no analytic method because of the complexity;
    use an iterative solution.
    Baum-Welch Algorithm:
    1. Let the initial model be lambda_0.
    2. Compute a new lambda based on lambda_0 and observation O.
    3. If log P(O | lambda) - log P(O | lambda_0) < DELTA, stop.
    4. Else set lambda_0 <- lambda and go to step 2.
         Baum-Welch: Preliminaries

    Define ξ_t(i, j) as the probability of being in state
    i at time t and in state j at time t + 1:

        ξ_t(i, j) = α_t(i) a_ij b_j(o_{t+1}) β_{t+1}(j) / P(O|λ)

                  = α_t(i) a_ij b_j(o_{t+1}) β_{t+1}(j)
                    / [ Σ_{i=1}^{N} Σ_{j=1}^{N} α_t(i) a_ij b_j(o_{t+1}) β_{t+1}(j) ]

    Define γ_t(i) as the probability of being in state i at
    time t, given the observation sequence:

        γ_t(i) = Σ_{j=1}^{N} ξ_t(i, j)

    Σ_{t=1}^{T} γ_t(i) is the expected number of times state i
    is visited.
    Σ_{t=1}^{T−1} ξ_t(i, j) is the expected number of transitions
    from state i to state j.
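With the forward and backward variables already computed (as on the forward and backward algorithm slides), ξ and γ translate directly into code; a sketch, assuming α and β are nested lists indexed [t][i], A and B are indexed [i][j], and O is a list of symbol indices (all of these names are assumptions):

```python
def xi_gamma(alpha, beta, A, B, O):
    """xi[t][i][j] = P(q_t=i, q_{t+1}=j | O), gamma[t][i] = P(q_t=i | O), t = 1..T-1."""
    T, N = len(O), len(A)
    xi = []
    for t in range(T - 1):
        # numerator: alpha_t(i) * a_ij * b_j(o_{t+1}) * beta_{t+1}(j)
        num = [[alpha[t][i] * A[i][j] * B[j][O[t + 1]] * beta[t + 1][j]
                for j in range(N)] for i in range(N)]
        denom = sum(sum(row) for row in num)   # = P(O | lambda)
        xi.append([[v / denom for v in row] for row in num])
    # gamma_t(i) = sum_j xi_t(i, j)
    gamma = [[sum(xi[t][i]) for i in range(N)] for t in range(T - 1)]
    return xi, gamma
```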
        Baum-Welch: Update Rules

    π̄_i = expected frequency in state i at time t = 1
        = γ_1(i)

    ā_ij = expected number of transitions from state i to
    state j / expected number of transitions from state i:

        ā_ij = Σ_t ξ_t(i, j) / Σ_t γ_t(i)

    b̄_j(k) = expected number of times in state j and
    observing symbol k / expected number of times
    in state j:

        b̄_j(k) = Σ_{t: o_t = k} γ_t(j) / Σ_t γ_t(j)
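Given ξ and γ from the previous slide, the update rules become one pass of re-estimation (a sketch; γ here runs over t = 1, …, T−1 as defined on the preliminaries slide, M is the number of symbols, and all names are assumptions):

```python
def reestimate(xi, gamma, O, M):
    """Baum-Welch update rules: pi_i, a_ij, b_j(k) from expected counts."""
    N = len(gamma[0])
    T1 = len(gamma)                      # number of time steps with a transition (T - 1)
    pi = list(gamma[0])                  # expected frequency in state i at t = 1
    # a_ij = expected transitions i -> j / expected transitions out of i
    a = [[sum(xi[t][i][j] for t in range(T1)) / sum(gamma[t][i] for t in range(T1))
          for j in range(N)] for i in range(N)]
    # b_j(k) = expected times in j observing symbol k / expected times in j
    b = [[sum(g[j] for t, g in enumerate(gamma) if O[t] == k) /
          sum(g[j] for g in gamma)
          for k in range(M)] for j in range(N)]
    return pi, a, b
```

Each row of the re-estimated A sums to one by construction, since Σ_j ξ_t(i, j) = γ_t(i).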
                  Properties

    Covariance of the estimated parameters
    Convergence rates
              Types of HMM

    Continuous density
    Ergodic
    State duration
           Implementation Issues

    Scaling
    Initial parameters
    Multiple observation sequences
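Of these, scaling matters most in practice: the raw α_t(i) underflow for long sequences. A minimal sketch of a scaled forward pass, using the common scheme c_t = 1 / Σ_i α_t(i) (the slides do not specify a scheme, so this choice and the names are assumptions):

```python
import math

def forward_scaled(pi, A, B, O):
    """Scaled forward pass: returns log P(O | lambda), avoiding underflow for long O."""
    N = len(pi)
    alpha = [pi[i] * B[i][O[0]] for i in range(N)]
    log_prob = 0.0
    for t in range(1, len(O) + 1):
        c = 1.0 / sum(alpha)              # scaling coefficient c_t
        alpha = [a * c for a in alpha]    # scaled alpha_t now sums to 1
        log_prob -= math.log(c)           # log P(O) = -sum_t log c_t
        if t < len(O):
            # induction on the scaled values, as in the unscaled forward algorithm
            alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][O[t]]
                     for j in range(N)]
    return log_prob
```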
          Comparison of HMMs

    What is a natural distance function?
    If D(λ_1, λ_2) is large, does it mean that the models
    are really different?
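One widely used answer, due to Rabiner's tutorial rather than these slides, is a likelihood-based distance: generate an observation sequence O⁽²⁾ from model λ_2 and compare how well each model explains it, symmetrizing since the measure is not symmetric:

```latex
D(\lambda_1, \lambda_2) = \frac{1}{T}\Big[\log P(O^{(2)} \mid \lambda_1) - \log P(O^{(2)} \mid \lambda_2)\Big],
\qquad
D_s(\lambda_1, \lambda_2) = \frac{D(\lambda_1, \lambda_2) + D(\lambda_2, \lambda_1)}{2}
```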
