1) The document proposes a context-based approach to recognize people in consumer photo collections by incorporating rich contextual cues, unlike traditional classifiers that predict identities independently.
2) It models the problem as a graph-based Markov network where faces are nodes and pairwise potentials encourage spatial smoothness based on face similarities and exclusivity constraints.
3) The approach is improved by incorporating social semantics like frequent co-appearances and unique people constraints, as well as detecting and matching body parts to recognize obscured faces.
CUbRIK research presented at SSMS 2012
CONTEXT-BASED PEOPLE RECOGNITION
in CONSUMER PHOTO COLLECTIONS
Markus Brenner, Ebroul Izquierdo
MMV Research Group, School of Electronic Engineering and Computer Science
Queen Mary University of London, UK
{markus.brenner, ebroul.izquierdo}@eecs.qmul.ac.uk
Aim
Resolve identities of people primarily by their faces
Incorporate rich contextual cues of personal photo collections where few individual people frequently appear together
Perform recognition by considering all contextual information at the same time (unlike traditional approaches that usually train a classifier and then predict identities independently)
Face Detection and Basic Recognition
Initial steps: Image preprocessing, face detection and face normalization
Descriptor-based: Local Binary Pattern (LBP) texture histograms, computed for each block
Similarity metric: Chi-Square Statistics
Basic face recognition: k-Nearest-Neighbor
Graph-based Recognition
Model: pairwise Markov Network (graph nodes represent faces)
Unary Potentials: likelihood of faces belonging to particular people
Pairwise Potentials: encourage spatial smoothness, encode exclusivity constraint and temporal domain
Topology: only the most similar faces are connected with edges
Inference: maximum a posteriori (MAP) solution of Loopy Belief Propagation (LBP)
[Equations: definitions of the unary and pairwise potentials; symbols not recoverable from this copy]
[Figure: graph of face nodes f1, f2, f3 with unary and pairwise potentials]
[Figure: training (Tr) and test (Te) faces. Labels: all samples are independent; unary potential of every node; edges based on face similarities]
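The two stages above can be illustrated with a small sketch. It is not the authors' implementation: the histograms are toy 3-bin vectors standing in for block-wise LBP histograms, the unary potential form `1 / (1 + distance)`, the smoothness factor `smooth`, and all function names are assumptions made for illustration, and the MAP labeling is found by exhaustive search instead of Loopy Belief Propagation, which is only feasible for a handful of faces.

```python
from itertools import product

def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance between two histograms (smaller means more similar)."""
    return sum((a - b) ** 2 / (a + b + eps) for a, b in zip(h1, h2))

def unary_potentials(hist, train, labels):
    """Assumed form: 1 / (1 + distance to the closest training face of each person)."""
    return {
        label: 1.0 / (1.0 + min(chi_square(hist, h) for h, l in train if l == label))
        for label in labels
    }

def similarity_edges(hists, k=1):
    """Topology: connect every face to its k most similar other faces."""
    edges = set()
    for i, hi in enumerate(hists):
        others = sorted((j for j in range(len(hists)) if j != i),
                        key=lambda j: chi_square(hi, hists[j]))
        edges.update((min(i, j), max(i, j)) for j in others[:k])
    return edges

def map_labeling(hists, photo_ids, train, labels, k=1, smooth=2.0):
    """Exhaustive MAP search over all labelings (stand-in for Loopy Belief Propagation)."""
    unary = [unary_potentials(h, train, labels) for h in hists]
    edges = similarity_edges(hists, k)
    best, best_score = None, -1.0
    for assignment in product(labels, repeat=len(hists)):
        score = 1.0
        for i, label in enumerate(assignment):
            score *= unary[i][label]
        for i in range(len(hists)):
            for j in range(i + 1, len(hists)):
                if assignment[i] != assignment[j]:
                    continue
                if photo_ids[i] == photo_ids[j]:
                    score = 0.0        # exclusivity: same person twice in one photo
                elif (i, j) in edges:
                    score *= smooth    # smoothness: similar faces prefer the same label
        if score > best_score:
            best, best_score = list(assignment), score
    return best

# Toy data (hypothetical): one training face per person, three test faces.
labels = ["alice", "bob"]
train = [([0.70, 0.20, 0.10], "alice"), ([0.10, 0.20, 0.70], "bob")]
test_hists = [[0.68, 0.22, 0.10], [0.45, 0.20, 0.35], [0.12, 0.18, 0.70]]
photo_ids = [0, 0, 1]  # the first two faces come from the same photo

# Independent prediction: each face takes its best-scoring person on its own.
independent = []
for h in test_hists:
    u = unary_potentials(h, train, labels)
    independent.append(max(u, key=u.get))

joint = map_labeling(test_hists, photo_ids, train, labels)
print(independent)  # ['alice', 'alice', 'bob']
print(joint)        # ['alice', 'bob', 'bob']
```

In this toy case the independent prediction assigns the same person to both faces of the first photo, while the joint labeling resolves the conflict through the exclusivity term and the edge to the similar face in the second photo.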
Social Semantics
Individual appearance for a more effective graph topology (used to regularize the number of edges)
Unique People Constraint models exclusivity: a person cannot appear more than once in a photo
Pairwise co-appearance: people appearing together bear a higher likelihood of appearing together again
Groups of people: use data mining to discover frequently appearing social patterns
Body Detection and Recognition
When faces are obscured or invisible:
Detect upper and lower body parts
Bipartite matching of faces and bodies
Graph-based fusion of faces and clothing
[Figure: training (Tr) and test (Te) nodes, each with a unary potential, linked by face similarity, upper body similarity and lower body similarity]
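The bipartite matching of faces and bodies mentioned above could look roughly like the following sketch. The poster does not state the matching cost or the algorithm, so everything here is an assumption: each face is represented by its center point, each detected body by the point where its head would be expected, the cost is the Euclidean distance between the two, and `max_dist` is a hypothetical penalty for leaving a face without a body. The search is exhaustive, which is adequate for the few people in a single photo; a dedicated assignment solver would be the usual choice for larger problems.

```python
from itertools import permutations
from math import dist  # Python 3.8+

def match_faces_to_bodies(faces, bodies, max_dist=50.0):
    """Return {face_index: body_index} with minimal total cost.

    A face may stay unmatched at a fixed cost of max_dist, so a pairing
    farther apart than max_dist is never preferred over no pairing.
    """
    # Pad with None so that every face can also remain unmatched.
    slots = list(range(len(bodies))) + [None] * len(faces)
    best, best_cost = {}, float("inf")
    for perm in set(permutations(slots, len(faces))):
        cost = 0.0
        for f, b in enumerate(perm):
            cost += max_dist if b is None else dist(faces[f], bodies[b])
        if cost < best_cost:
            best_cost = cost
            best = {f: b for f, b in enumerate(perm) if b is not None}
    return best

# Hypothetical pixel coordinates: three faces, two detected bodies.
faces = [(100, 80), (300, 90), (500, 60)]
bodies = [(305, 100), (98, 85)]

matches = match_faces_to_bodies(faces, bodies)
print(matches)  # {0: 1, 1: 0}; the third face has no body nearby
```

Matched bodies can then contribute clothing similarity for the corresponding face nodes, while unmatched faces rely on face similarity alone.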
Experiments
Public Gallagher Dataset: ~600 photos, ~800 faces, 32 distinct people
Our dataset: ~3300 photos, ~5000 faces, 106 distinct people
All photos shot with a typical consumer camera
Considering only correctly detected faces (87%)
[Figure: bar chart "Gain @ 3% training", vertical axis from 0% to 25%, bars for + Graph. Model, + Social Semantics, + Body parts]