This document proposes combining eye-motion and ego-motion features for first-person activity recognition using a wearable camera system. It extracts eye-motion primitives like saccades from gaze coordinates and ego-motion primitives like optical flow from video frames. Wordbooks are generated from the motion primitive sequences and used as features, including n-gram counts and statistical values. These eye-motion and ego-motion wordbook features are combined and classified with an SVM to recognize activities like reading, video watching, writing and copying text.
Coupling Eye-Motion and Ego-Motion Features for First-Person Activity Recognition
2. First-Person Activity Recognition
• Recognize tasks from wearable sensors [Sun 09], [Fathi 11]
  - Cameras, accelerometers, and IMUs
  - Previous: kinematic motions
• Changes in attention could provide another source of information
3. Previous Work
Activity recognition using changes in attention
1. Eye-motion: first-person activity recognition using eye-motion [Bulling 11]
  - Independent use of eye-motion cannot represent broader changes of attention.
[Figure labels: reading a book; electrooculography]
4. Previous Work
Activity recognition using changes in attention
1. Eye-motion: reading a book
2. Ego-motion: copying text from one display to another
5. Eye-motion Characteristics [Bulling 11]
• Micro level: intermittent jumps (saccades)
• Macro level: activities involve characteristic saccade sequences
  - Histogram of motion sequences (wordbook)
[Figure: gaze-x plotted over time]
6. Eye-motion and Ego-motion
• Eye-motion: reading a book
• Ego-motion: copying text from one display to another
[Figure: left/right motion traces for each example]
Both motions have intermittent motion primitives and characteristic sequences.
7. Our Approach
Our method: an inside-out camera system provides eye-motion and ego-motion at once.
[Figure labels: eye-motion; ego-motion]
10. Eye-motion Primitives [Bulling 11]
Inside camera: gaze coordinates → saccade detection → eye-motion primitive → threshold & quantization → eye-motion characters
[Figure: primitive trace over time and the direction-to-character map; the map's letters are not recoverable from the extraction]
11. Eye-motion Primitives [Bulling 11]
Inside camera: gaze coordinates → saccade detection → eye-motion primitive (saccade) → threshold & quantization → eye-motion characters
[Figure: primitive trace over time and the direction-to-character map; the map's letters are not recoverable from the extraction]
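The step from gaze coordinates to eye-motion characters can be sketched as follows. This is only an illustration, not the encoding from [Bulling 11]: it treats each sample-to-sample displacement as a candidate saccade, and the thresholds (`min_jump`, `large_jump`), the four-direction alphabet, and the lower/upper-case convention for small/large jumps are all assumptions chosen for brevity.

```python
import math

def encode_saccades(gaze, min_jump=20.0, large_jump=100.0):
    """Turn a gaze trace [(x, y), ...] into a string of eye-motion characters.

    Assumed alphabet: r/l/u/d for small jumps, R/L/U/D for large ones.
    """
    chars = []
    for (x0, y0), (x1, y1) in zip(gaze, gaze[1:]):
        dx, dy = x1 - x0, y1 - y0
        amplitude = math.hypot(dx, dy)
        if amplitude < min_jump:
            continue  # below threshold: treated as fixation or noise
        # Quantize direction into four bins (image coordinates: +y is down).
        if abs(dx) >= abs(dy):
            c = "r" if dx > 0 else "l"
        else:
            c = "d" if dy > 0 else "u"
        chars.append(c.upper() if amplitude >= large_jump else c)
    return "".join(chars)

# Two small rightward jumps, a pause, a large return sweep, a small downward jump.
gaze = [(0, 0), (30, 0), (60, 2), (62, 3), (-90, 5), (-88, 40)]
print(encode_saccades(gaze))  # prints "rrLd"
```

A real detector would more likely work on gaze velocity over several samples; the point here is only the threshold-then-quantize structure.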
13. Ego-motion Primitives
Outside camera: frames → global optical flow → ego-motion primitive → threshold & quantization → ego-motion characters
[Figure: primitive trace over time and the direction-to-character map; the map's letters are not recoverable from the extraction]
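A rough sketch of the ego-motion side, assuming the per-frame optical flow field has already been computed elsewhere and is given as a list of `(dx, dy)` vectors. Using the per-axis median as the "global" flow, the thresholds `min_flow` and `large_flow`, and the four-direction alphabet are assumptions for illustration; the slides do not specify them.

```python
from statistics import median

def global_flow(flow_vectors):
    """Summarize one frame's flow field [(dx, dy), ...] by its per-axis median."""
    return (median(v[0] for v in flow_vectors),
            median(v[1] for v in flow_vectors))

def encode_ego_motion(frames, min_flow=1.0, large_flow=5.0):
    """Map a sequence of flow fields to ego-motion characters (assumed alphabet)."""
    chars = []
    for flow_vectors in frames:
        dx, dy = global_flow(flow_vectors)
        magnitude = max(abs(dx), abs(dy))
        if magnitude < min_flow:
            continue  # camera is treated as still for this frame
        if abs(dx) >= abs(dy):
            c = "r" if dx > 0 else "l"
        else:
            c = "d" if dy > 0 else "u"
        chars.append(c.upper() if magnitude >= large_flow else c)
    return "".join(chars)

frames = [
    [(0.1, 0.0), (0.2, -0.1), (0.0, 0.1)],   # camera nearly still
    [(6.0, 0.2), (5.5, -0.1), (6.2, 0.0)],   # strong flow to the right
    [(-2.0, 0.1), (-1.8, 0.0), (9.0, 9.0)],  # mild leftward flow plus one outlier
]
print(encode_ego_motion(frames))  # prints "Rl"
```

The median is used here so that a few outlier vectors (for example, a moving hand) do not dominate the global estimate; a mean or a fitted motion model would be equally plausible choices.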
15. Generating Wordbook [Bulling 11]
Motion primitives (example sequence over time): U r b r A
Wordbook features at the current frame: retrieve 1-4 gram wordbooks

1-gram
Word  Count
r     100
L     50
H     1
16. Generating Wordbook [Bulling 11]
Motion primitives (example sequence over time): U r b r A
Wordbook features at the current frame: retrieve 1-4 gram wordbooks

1-gram          4-gram
Word  Count     Word  Count
r     100       rrrr  5
L     50        Lrrr  2
H     1         lrlr  1
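Counting 1-gram to 4-gram words over a string of motion characters can be sketched with the standard library. The function name `build_wordbooks` and the sample window are hypothetical; only the idea of counting n-grams for n = 1..4 comes from the slides.

```python
from collections import Counter

def build_wordbooks(chars, max_n=4):
    """Return {n: Counter of n-grams} for n = 1..max_n over a character window."""
    return {
        n: Counter(chars[i:i + n] for i in range(len(chars) - n + 1))
        for n in range(1, max_n + 1)
    }

window = "rrrrLrrrr"  # made-up sequence of motion characters
books = build_wordbooks(window)
print(books[1])  # 1-gram counts: r occurs 8 times, L once
print(books[4])  # 4-gram counts: rrrr twice, the other four words once each
```

Each `Counter` plays the role of one wordbook: its keys are the words and its values are the counts shown in the tables.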
17. Generating Wordbook [Bulling 11]
Motion primitives (example sequence over time): U r b r A
Wordbook features at the current frame: retrieve 1-4 gram wordbooks

1-gram          4-gram          Statistical features
Word  Count     Word  Count     max, size, range, var, mean
r     100       rrrr  5
L     50        Lrrr  2
H     1         lrlr  1
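The statistical features listed on this slide (max, size, range, var, mean) can be computed from a wordbook's counts roughly as below. Two details are assumptions: "size" is read as the number of distinct words, and the variance is taken as the population variance; the slide does not say which definitions are used. The example reuses the 4-gram counts shown on the slide.

```python
from collections import Counter
from statistics import mean, pvariance

def wordbook_stats(wordbook):
    """Summary statistics over the word counts of one wordbook."""
    counts = list(wordbook.values())
    if not counts:
        return {"size": 0, "max": 0, "range": 0, "var": 0.0, "mean": 0.0}
    return {
        "size": len(counts),                  # number of distinct words (assumed meaning)
        "max": max(counts),                   # count of the most frequent word
        "range": max(counts) - min(counts),   # spread between most and least frequent
        "var": pvariance(counts),             # population variance (assumed)
        "mean": mean(counts),
    }

four_grams = Counter({"rrrr": 5, "Lrrr": 2, "lrlr": 1})
stats = wordbook_stats(four_grams)
print(stats["size"], stats["max"], stats["range"])  # prints "3 5 4"
```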
18. Generating Wordbook [Bulling 11]
Motion primitives (example sequence over time): U r b r A
Wordbook features at the current frame: retrieve 1-4 gram wordbooks

1-gram          4-gram          Statistical features
Word  Count     Word  Count     max, size, range, var, mean
r     99        rrrr  5
L     50        Lrrr  2
H     1         lrlr  1
19. Generating Wordbook [Bulling 11]
Motion primitives (example sequence over time): U r b r A
Wordbook features at the current frame: retrieve 1-4 gram wordbooks

1-gram          4-gram          Statistical features
Word  Count     Word  Count     max, size, range, var, mean
r     100       rrrr  5
L     49        Lrrr  3
H     1         lrlr  1
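Across the wordbook slides the counts shift slightly from one slide to the next (r: 100, 99, 100; L: 50, 50, 49; Lrrr: 2, 2, 3). One reading, offered here as an interpretation rather than something the slides state, is that the wordbook is maintained over a window ending at the current frame. Under that assumption, counts can be updated incrementally instead of recounted; the class name `SlidingWordbook` and the window length are hypothetical.

```python
from collections import Counter, deque

class SlidingWordbook:
    """Keep n-gram counts for the last `window` primitives of a stream."""

    def __init__(self, n, window):
        self.n = n
        self.buf = deque(maxlen=window)
        self.counts = Counter()

    def push(self, char):
        if len(self.buf) == self.buf.maxlen and len(self.buf) >= self.n:
            # The oldest n-gram is about to fall out of the window.
            oldest = "".join(list(self.buf)[:self.n])
            self.counts[oldest] -= 1
            if self.counts[oldest] == 0:
                del self.counts[oldest]
        self.buf.append(char)  # a full deque drops its oldest character here
        if len(self.buf) >= self.n:
            # The n-gram ending at the new character enters the window.
            newest = "".join(list(self.buf)[-self.n:])
            self.counts[newest] += 1

book = SlidingWordbook(n=2, window=4)
for c in "rrLrr":
    book.push(c)
print(sorted(book.counts.items()))  # 2-grams of the last four characters, "rLrr"
```

Each push touches at most two words, so the cost per frame does not grow with the window length.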
23. Performance Over Tasks
• The MOWORD feature improves performance
• The joint feature improves performance in a complementary way
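One simple way to form a joint feature is to map each wordbook onto a fixed vocabulary and concatenate the eye-motion and ego-motion vectors. The sketch below assumes this concatenation scheme and uses made-up vocabularies and counts; the slides do not detail how the features are combined beyond using them jointly.

```python
from collections import Counter

def to_vector(wordbook, vocabulary):
    """Counts for each vocabulary word, in a fixed order (0 if unseen)."""
    return [wordbook.get(word, 0) for word in vocabulary]

def joint_feature(eye_book, ego_book, eye_vocab, ego_vocab):
    """Concatenate eye-motion and ego-motion count vectors into one feature."""
    return to_vector(eye_book, eye_vocab) + to_vector(ego_book, ego_vocab)

# Hypothetical vocabularies and wordbooks, for illustration only.
eye_vocab = ["r", "L", "rrrr", "Lrrr"]
ego_vocab = ["R", "l", "Rl"]
eye_book = Counter({"r": 8, "L": 1, "rrrr": 2})
ego_book = Counter({"R": 3, "Rl": 1})

feature = joint_feature(eye_book, ego_book, eye_vocab, ego_vocab)
print(feature)  # prints "[8, 1, 2, 0, 3, 0, 1]"
```

A vector like this, computed per window and optionally extended with the statistical features, is the kind of input an SVM classifier would be trained on.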
24. Summary
Joint use of eye-motion and ego-motion improves classification performance.
Future Work
• Apply to a wider range of tasks
  - The ego-motion feature might contribute more on dynamic tasks
• Combine with object detection
  - Hands, faces, or other objects