CONTEXT-BASED PEOPLE RECOGNITION in CONSUMER PHOTO COLLECTIONS
Markus Brenner, Ebroul Izquierdo
MMV Research Group, School of Electronic Engineering and Computer Science
Queen Mary University of London, UK
{markus.brenner, ebroul.izquierdo}@eecs.qmul.ac.uk




Aim

    Resolve identities of people primarily by their faces
    Incorporate rich contextual cues of personal photo collections where few individual people frequently appear together
    Perform recognition by considering all contextual information at the same time (unlike traditional approaches that usually train a classifier and then predict identities independently)




Face Detection and Basic Recognition

    Initial steps: image preprocessing, face detection and face normalization
    Descriptor-based: Local Binary Pattern (LBP) texture histograms, one LBP histogram for each block
    Similarity metric: Chi-Square Statistics
    Basic face recognition: k-Nearest-Neighbor (all samples are independent)
    [Figure: Tr and Te samples compared by face similarity; all samples are independent]

Graph-based Recognition

    Model: pairwise Markov Network (graph nodes represent faces)
    Unary Potentials: likelihood of faces belonging to particular people
    [Equation: unary potential]
    Pairwise Potentials: encourage spatial smoothness, encode exclusivity constraint and temporal domain
    [Equation: pairwise potential, a piecewise definition that includes a zero-valued case]
    Topology: only the most similar faces are connected with edges
    Inference: maximum a posteriori (MAP) solution of Loopy Belief Propagation (LBP)
    [Figure: example graph with face nodes f1, f2 and f3, each carrying a unary potential and joined by pairwise potentials]
    [Figure: unary potential of every node, based on face similarities between Tr and Te samples]
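The effect of the exclusivity constraint in the Graph-based Recognition model can be illustrated on a toy graph. The sketch below is not the poster's inference method: it swaps Loopy Belief Propagation for an exhaustive MAP search, which is only feasible for a handful of faces. The function name `map_assignment`, the `smooth` reward factor and all numbers are assumptions made for illustration.

```python
from itertools import product
import math


def map_assignment(unary, edges, same_photo, smooth=2.0):
    """Exhaustive MAP search over a tiny pairwise Markov network.

    unary[i][p] : assumed likelihood that face i belongs to person p
    edges       : pairs (i, j) of similar faces joined by an edge
    same_photo  : pairs (i, j) of faces that occur in the same photo
    smooth      : hypothetical reward factor when linked faces share a label
    """
    n_faces = len(unary)
    n_people = len(unary[0])
    best_score, best_labels = -math.inf, None
    for labels in product(range(n_people), repeat=n_faces):
        # Exclusivity: a labeling is ruled out (zero potential) if two
        # faces in the same photo receive the same person.
        if any(labels[i] == labels[j] for i, j in same_photo):
            continue
        # Unary terms, accumulated in the log domain.
        score = sum(math.log(unary[i][labels[i]]) for i in range(n_faces))
        # Smoothness: similar faces are encouraged to share a label.
        score += sum(math.log(smooth) for i, j in edges if labels[i] == labels[j])
        if score > best_score:
            best_score, best_labels = score, labels
    return best_labels


unary = [
    [0.90, 0.10],  # f1: clearly person 0
    [0.60, 0.40],  # f2: weakly person 0, but shares a photo with f1
    [0.45, 0.55],  # f3: ambiguous, linked to f1 by face similarity
]
labels = map_assignment(unary, edges=[(0, 2)], same_photo=[(0, 1)])
print(labels)
```

Predicting each face independently from the unary terms alone would give persons (0, 0, 1). With the exclusivity and smoothness terms considered jointly, f2 is pushed to person 1 and f3 is pulled to person 0.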



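The basic pipeline of the Face Detection and Basic Recognition section (LBP histograms, Chi-Square comparison, k-Nearest-Neighbor vote) can be sketched in plain Python. This is a minimal sketch under stated assumptions: it uses the basic 3x3 LBP operator and one common form of the Chi-Square statistic, since the poster does not state which variants were used. All function names and the toy image blocks are hypothetical.

```python
from collections import Counter

# Clockwise 8-neighborhood of the basic 3x3 LBP operator (an assumption).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(block):
    """Normalized 256-bin histogram of LBP codes for one grayscale block."""
    hist = [0.0] * 256
    for r in range(1, len(block) - 1):
        for c in range(1, len(block[0]) - 1):
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                # Set the bit when the neighbor is at least as bright as the center.
                if block[r + dr][c + dc] >= block[r][c]:
                    code |= 1 << bit
            hist[code] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def face_descriptor(blocks):
    """Concatenate the per-block histograms into one descriptor."""
    return [v for block in blocks for v in lbp_histogram(block)]


def chi_square(h1, h2):
    """Chi-Square distance between two histograms (empty bins are skipped)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def knn_predict(query, gallery, k=3):
    """gallery is a list of (descriptor, person_name) pairs."""
    nearest = sorted(gallery, key=lambda item: chi_square(query, item[0]))[:k]
    votes = Counter(name for _, name in nearest)
    return votes.most_common(1)[0][0]


def checkerboard(n):
    return [[255 * ((r + c) % 2) for c in range(n)] for r in range(n)]


def stripes(n):
    return [[255 * (c % 2) for c in range(n)] for r in range(n)]


# Toy textures stand in for normalized face blocks.
gallery = [
    (face_descriptor([checkerboard(4)]), "alice"),
    (face_descriptor([stripes(4)]), "bob"),
]
query = face_descriptor([checkerboard(6)])
prediction = knn_predict(query, gallery, k=1)
print(prediction)
```

Each face is classified on its own here, which is the independent baseline that the graph-based model is meant to improve on.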

Social Semantics

    Individual appearance for a more effective graph topology (used to regularize the number of edges)
    Unique People Constraint models exclusivity: a person cannot appear more than once in a photo
    Pairwise co-appearance: people appearing together bear a higher likelihood of appearing together again
    Groups of people: use data mining to discover frequently appearing social patterns

Body Detection and Recognition

    Applies when faces are obscured or invisible
    Detect upper and lower body parts
    Bipartite matching of faces and bodies
    Graph-based fusion of faces and clothing
    [Figure: unary potential of every node, combining face similarity, upper body similarity and lower body similarity between Tr and Te samples]
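The bipartite matching of faces and bodies can be illustrated as an assignment problem. The sketch below assumes a square cost matrix is already available (for instance from the geometric offset between a face and a detected upper body, which is an assumption, not something the poster specifies) and searches all permutations by brute force. A real system would more likely use a polynomial-time assignment solver. The function name `match_faces_to_bodies` and the costs are hypothetical.

```python
from itertools import permutations


def match_faces_to_bodies(cost):
    """Return, for each face index, the body index of the cheapest matching.

    cost[f][b] is a hypothetical mismatch cost between face f and body b.
    The matrix is assumed square: as many detected bodies as faces.
    """
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(cost[f][perm[f]] for f in range(n)),
    )
    return list(best)


cost = [
    [1.0, 2.0, 9.0],  # face 0 fits body 0 best, body 1 nearly as well
    [1.2, 5.0, 9.0],  # face 1 only fits body 0
    [9.0, 9.0, 1.0],  # face 2 only fits body 2
]
assignment = match_faces_to_bodies(cost)
print(assignment)
```

Matching each face greedily to its cheapest body would give face 0 body 0 and leave face 1 with a poor match. The joint assignment instead gives body 0 to face 1 and body 1 to face 0, for a lower total cost.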


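The pairwise co-appearance cue from the Social Semantics section relies on counts of how often two people were seen together in already labeled photos. A minimal sketch of that counting step follows. How the counts enter the pairwise potentials is not recoverable from the poster, so only the counting is shown. The function name and the sample labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_appearance_counts(photos):
    """Count how often each unordered pair of people shares a photo.

    photos is a list of name lists, one list per labeled photo.
    """
    counts = Counter()
    for people in photos:
        # set() respects the unique people constraint: one entry per person.
        for pair in combinations(sorted(set(people)), 2):
            counts[pair] += 1
    return counts


photos = [
    ["anna", "ben"],
    ["anna", "ben", "cara"],
    ["cara"],
    ["ben", "anna"],
]
counts = co_appearance_counts(photos)
print(counts.most_common(1))
```

Pairs with high counts, such as anna and ben in this toy collection, are the ones that would be treated as more likely to appear together again.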


Experiments

    Public Gallagher Dataset: ~600 photos, ~800 faces, 32 distinct people
    Our dataset: ~3300 photos, ~5000 faces, 106 distinct people
    All photos shot with a typical consumer camera
    Considering only correctly detected faces (87%)
    [Figure: bar chart "Gain @ 3% training", vertical axis from 0% to 25%, bars for + Graph. Model, + Social Semantics, + Body parts]


CUbRIK research presented at SSMS 2012
