Feature Selection Methods for Bag-of-(visual)-Words Approaches
          Schmiedeke, Kelm and Sikora
             Communication Systems Group
              Technische Universität Berlin

                     4 October, 2012
Motivation




                     sports


        Schmiedeke: Feature Selection Methods for BoW Approaches
Lessons from last year




 Features derived from metadata (esp. tags)
 outperform visual and ASR ones
   Metadata:                 Naive Bayes (non-translated)
   Visual feat.:             SVM (avg. pooled histograms)
   ASR transcripts:          kNN (JSD)


 Uploaders mainly contribute to a single category




This year's question




 Does feature selection improve the results achieved
 with the BoW model?




Feature Selection / Transformation




 Mutual information:



 Term Frequency:



 PCA (Eigenvalue decomposition):
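
The slide names the scores without stating how they are computed. A minimal sketch of the two selection scores, assuming the textbook definitions: term frequency as the total count of a term over all documents, and mutual information between "term occurs in a document" and "document belongs to the class" over the 2x2 contingency table. The function names and the toy data are illustrative and not taken from the talk.

```python
import numpy as np

def term_frequency(X):
    """Total count of every term over all documents (rows of X)."""
    return np.asarray(X).sum(axis=0)

def mutual_information(X, y, cls):
    """MI (in bits) between term presence and membership in class `cls`.

    Assumed definition: sum over the four cells of the 2x2 table
    (term present/absent) x (document in class / not in class).
    X: (n_docs, n_terms) count matrix, y: (n_docs,) class labels.
    """
    present = np.asarray(X) > 0
    in_cls = (np.asarray(y) == cls)[:, None]
    n_docs = present.shape[0]
    mi = np.zeros(present.shape[1])
    for t_val in (True, False):
        for c_val in (True, False):
            joint = ((present == t_val) & (in_cls == c_val)).sum(axis=0) / n_docs
            p_term = (present == t_val).mean(axis=0)
            p_cls = (in_cls == c_val).mean()
            nz = joint > 0  # empty cells contribute nothing
            mi[nz] += joint[nz] * np.log2(joint[nz] / (p_term[nz] * p_cls))
    return mi

# Toy corpus: 4 documents, 3 terms, 2 classes (illustrative values only).
X = np.array([[2, 0, 1],
              [1, 0, 1],
              [0, 3, 1],
              [0, 1, 1]])
y = np.array(["a", "a", "b", "b"])
tf_scores = term_frequency(X)
mi_scores = mutual_information(X, y, "a")
```

In the toy corpus the third term occurs in every document, so it carries no class information and its MI is zero even though its term frequency is high.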




Feature Selection




    Concepts for term selection:

Top terms for religion:    Top terms for politics:     Top terms for health:
bibl     (0.0897)          lunch      (0.1200)         jama     (0.0495)
jesu     (0.0797)          obama      (0.1113)         health   (0.0378)
god      (0.0796)          polit      (0.0982)         report   (0.0357)
unleaven (0.0782)          grittv     (0.0881)         harta    (0.0227)
eeli     (0.0782)          flander    (0.0861)         exceric  (0.0211)
davideel (0.0781)          laura      (0.0855)         yoga     (0.0203)
ministri (0.0780)          economi    (0.0747)         study    (0.0192)

...                        ...                         ...

daytripp (0.0)             sonnet     (0.0)            ilsr     (0.0)
adagio   (0.0)             screenplai (0.0)            resystem (0.0)
acustica (0.0)             acustica   (0.0)            acustica (0.0)



Feature Selection




    Top-k-Union:




Feature Selection




    Top-k:




Feature Selection




    Union>Th:

Thresholds (Th) per category, applied to the term lists above:
religion: 0.0002           politics: 0.0002            health: 0.0001

Feature Selection




  Intersection>Th:

Top terms for religion:    Top terms for politics:     Top terms for health:
bibl     (0.0897)          lunch      (0.1200)         jama     (0.0495)
jesu     (0.0797)          obama      (0.1113)         health   (0.0378)
god      (0.0796)          polit      (0.0982)         report   (0.0357)
...                        ...                         ...
web                        appl                        gossip
python                     googl                       interview
xbox                       teen                        iphon
big                        music                       san
expo                       tv                          texa
...                        ...                         ...
daytripp (0.0)             sonnet     (0.0)            ilsr     (0.0)
adagio   (0.0)             screenplai (0.0)            resystem (0.0)
acustica (0.0)             acustica   (0.0)            acustica (0.0)
Thresholds (Th) per category:
         0.0002                     0.0002                      0.0001
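
The selection strategies are only named on the slides, so the following is a sketch of one plausible reading rather than the authors' implementation. It assumes that Top-k-Union takes the k best terms of every category and merges them, that Union>Th keeps a term if its score exceeds the threshold of at least one category, and that Intersection>Th keeps a term only if it exceeds the threshold of every category. Top-k is left out because the slides do not say how scores are merged across categories. The function names and the score for "web" are hypothetical; the other scores and the thresholds are the ones shown above.

```python
def top_k_union(scores, k):
    """Union of the k best-scoring terms of every category."""
    selected = set()
    for per_term in scores.values():
        ranked = sorted(per_term, key=per_term.get, reverse=True)
        selected.update(ranked[:k])
    return selected

def union_above_threshold(scores, thresholds):
    """Terms whose score exceeds the threshold in at least one category."""
    selected = set()
    for cls, per_term in scores.items():
        selected.update(t for t, s in per_term.items() if s > thresholds[cls])
    return selected

def intersection_above_threshold(scores, thresholds):
    """Terms whose score exceeds the threshold in every category."""
    per_class = [
        {t for t, s in per_term.items() if s > thresholds[cls]}
        for cls, per_term in scores.items()
    ]
    return set.intersection(*per_class) if per_class else set()

# scores[category][term]; the value 0.001 for "web" is made up for illustration.
scores = {
    "religion": {"bibl": 0.0897, "web": 0.001, "acustica": 0.0},
    "politics": {"obama": 0.1113, "web": 0.001, "acustica": 0.0},
    "health":   {"jama": 0.0495, "web": 0.001, "acustica": 0.0},
}
thresholds = {"religion": 0.0002, "politics": 0.0002, "health": 0.0001}

vocab_top1 = top_k_union(scores, k=1)
vocab_union = union_above_threshold(scores, thresholds)
vocab_inter = intersection_above_threshold(scores, thresholds)
```

Under this reading the union keeps category-specific terms such as "bibl" or "obama", while the intersection keeps only terms that score above threshold everywhere, which would match the unscored middle rows of the table.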

Official runs




  Bag of clustered SURF features transformed
  using PCA
   Result does not benefit from transformation

                          official run        without FS/FT
      mAP                       0.2301              0.2309
      CA                       41.63 %             41.71 %
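
For reference, a minimal sketch of PCA by eigenvalue decomposition applied to bag-of-words histograms. It assumes the usual recipe (centre the data, decompose the covariance matrix, project onto the leading eigenvectors); the number of components, the function name and the random data are placeholders, not values from the run.

```python
import numpy as np

def pca_eig(X, n_components):
    """Project rows of X onto the leading eigenvectors of the covariance matrix."""
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    cov = np.cov(centred, rowvar=False)           # (n_features, n_features)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]                # columns are principal axes
    return centred @ components, components, eigvals[order]

# Placeholder data: 50 "videos" with 8-bin histograms.
rng = np.random.default_rng(0)
histograms = rng.random((50, 8))
projected, components, variances = pca_eig(histograms, n_components=3)
```

The variance of each projected column equals the corresponding eigenvalue, which is a quick sanity check on the decomposition.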




Official runs




  Bag of filtered ASR transcript terms (Union>Th)
   Result does benefit from selection


                          official run        without FS/FT
      mAP                       0.1035              0.0522
      CA                       32.53 %             26.54 %




Official runs




  Bag of clustered SURF features filtered using MI
  and the Intersection>Th strategy
   Result does slightly benefit from selection

                          official run        without FS/FT
      mAP                       0.2259              0.2221
      CA                       40.80 %             40.78 %




Official runs




  Bag of filtered terms derived from tags, title and
  descriptions (Union>Th)
   Result does benefit from selection

                           official run        without FS/FT
       mAP                       0.5225              0.4146
       CA                       58.18 %             55.70 %




Official runs




  Bag of clustered SURF features transformed
  using PCA and decision fusion using uploader
   Result does benefit from transformation

                          official run        without FS/FT
      mAP                       0.3304              0.2988
      CA                       52.14 %             49.19 %




Conclusion & Future Work




 FS showed potential for improving the results

 The choice between MI and TF is not critical; both
 methods achieve roughly the same results
     Metadata (mAP): MI12004 (0.5277) vs. TF14976 (0.5275)



 Investigation of different scaling schemes (NB)

 Use of a class-independent selection score (MI)


Backup




Extracting visual features




  SURF features are extracted from each key frame
   At keypoints and on a regular grid


  Vocabulary is built using hierarchical clustering
  on SURF features of the development set
   4096/8196 codewords


  Term vector for a single video is obtained by bin-wise
  pooling of the term vectors of its key frames
   avg
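
The pooling step can be sketched as follows. This is an illustration under simplifying assumptions: descriptors are assigned to the nearest codeword by a flat search (the slide's vocabulary is hierarchical), the per-frame term vectors are normalised counts, and the 2-dimensional descriptors and two-word codebook are toy values rather than real SURF data.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Normalised codeword counts for the descriptors of one key frame."""
    descriptors = np.asarray(descriptors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Squared Euclidean distance from every descriptor to every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    hist = np.bincount(nearest, minlength=len(codebook)).astype(float)
    return hist / max(hist.sum(), 1.0)

def pool_video(frame_histograms):
    """Bin-wise average of the key-frame term vectors of one video."""
    return np.mean(np.stack(frame_histograms), axis=0)

# Toy example: two codewords, two key frames.
codebook = [[0.0, 0.0], [10.0, 10.0]]
frame_1 = bow_histogram([[0.0, 1.0], [1.0, 0.0], [9.0, 9.0]], codebook)
frame_2 = bow_histogram([[10.0, 10.0]], codebook)
video_vector = pool_video([frame_1, frame_2])
```

Averaging keeps the video vector on the same scale as a single frame, so videos with different numbers of key frames remain comparable.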


MediaEval 2012: Tagging Task




 Question: What is the video's blip.tv category?
 Blip.tv database (cc): ~ 3300 h
   5288 training videos
   9550 test videos
 Official evaluation measure is Mean
 Average Precision (mAP)
 Workshop will be held 4-5 October 2012 in Pisa,
 Italy
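
As an illustration of the evaluation measure, here is a sketch of non-interpolated average precision per category, averaged over categories. The slides do not spell out the task's evaluation protocol, so the ranking of all videos by per-category score and the function names are assumptions.

```python
import numpy as np

def average_precision(scores, relevant):
    """Non-interpolated AP of one ranked list (higher score ranks first)."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    rel = np.asarray(relevant, dtype=bool)[order]
    if not rel.any():
        return 0.0
    hits = np.cumsum(rel)                    # relevant items seen so far
    ranks = np.arange(1, len(rel) + 1)
    return float((hits[rel] / ranks[rel]).mean())

def mean_average_precision(score_matrix, labels, classes):
    """Mean of the per-category APs; score_matrix is (n_videos, n_classes)."""
    score_matrix = np.asarray(score_matrix, dtype=float)
    labels = np.asarray(labels)
    return float(np.mean([
        average_precision(score_matrix[:, j], labels == c)
        for j, c in enumerate(classes)
    ]))

# Toy example: four videos, two categories.
toy_scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
toy_labels = ["a", "b", "a", "b"]
toy_map = mean_average_precision(toy_scores, toy_labels, ["a", "b"])
```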
