Feature Selection Methods for Bag-of-(visual)-Words Approaches
          Schmiedeke, Kelm and Sikora
             Communication Systems Group
              Technische Universität Berlin

                     4 October, 2012
Motivation                                                           2




                     sports


        Schmiedeke: Feature Selection Methods for BoW Approaches
Lessons from last year                                                 3




 Features derived from metadata (esp. tags)
 outperform visual and ASR ones
   Metadata:                 Naive Bayes (non-translated)
   Visual feat.:             SVM (avg. pooled histograms)
   ASR transcripts:          kNN (JSD)
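
For the ASR run, JSD presumably denotes the Jensen-Shannon divergence used as the kNN distance between term distributions. A minimal sketch under that assumption; the function names, the toy distributions, and the majority vote are illustrative and not taken from the slides:

```python
import math
from collections import Counter

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def knn_predict(query, train_vectors, train_labels, k=3):
    """Label a query by majority vote among its k nearest neighbours under JSD."""
    ranked = sorted(zip(train_vectors, train_labels), key=lambda pair: jsd(query, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy normalized term distributions (hypothetical data).
train = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7], [0.2, 0.1, 0.7]]
labels = ["sports", "sports", "politics", "politics"]
prediction = knn_predict([0.65, 0.25, 0.10], train, labels, k=3)
```

With base-2 logarithms the divergence is bounded between 0 (identical distributions) and 1 (disjoint support), which makes it convenient as a distance-like score for histograms.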


 Uploaders mainly contribute to a single category




This year's question                                                 4




 Does feature selection improve the results achieved
 with the BoW model?




Feature Selection/ Transformation                                     5




 Mutual information:



 Term Frequency:



 PCA (Eigenvalue decomposition):
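
The formulas on this slide did not survive extraction. As a hedged illustration, the sketch below uses the textbook definitions: mutual information between a term's presence and class membership, computed from a 2x2 document-count table, and term frequency as a raw occurrence count within a class. Whether the authors used exactly these forms is an assumption, and all names and toy documents are hypothetical:

```python
import math
from collections import Counter

def mutual_information(n11, n10, n01, n00):
    """MI in bits between term presence and class membership.

    n11: in-class documents containing the term
    n10: out-of-class documents containing the term
    n01: in-class documents without the term
    n00: out-of-class documents without the term
    """
    n = n11 + n10 + n01 + n00
    # Each cell: (joint count, term-side marginal, class-side marginal).
    cells = (
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    )
    # Empty cells contribute zero (limit of x * log x as x -> 0).
    return sum((joint / n) * math.log2(n * joint / (term_total * class_total))
               for joint, term_total, class_total in cells if joint > 0)

def mi_scores(docs, labels, target):
    """Score every vocabulary term against one target class."""
    scores = {}
    for term in set().union(*map(set, docs)):
        n11 = sum(1 for d, y in zip(docs, labels) if term in d and y == target)
        n10 = sum(1 for d, y in zip(docs, labels) if term in d and y != target)
        n01 = sum(1 for d, y in zip(docs, labels) if term not in d and y == target)
        n00 = len(docs) - n11 - n10 - n01
        scores[term] = mutual_information(n11, n10, n01, n00)
    return scores

def term_frequencies(docs, labels, target):
    """Count how often each term occurs in the documents of the target class."""
    return Counter(term for d, y in zip(docs, labels) if y == target for term in d)

docs = [["god", "bibl", "video"], ["bibl", "video"], ["obama", "video"], ["polit", "video"]]
labels = ["religion", "religion", "politics", "politics"]
mi = mi_scores(docs, labels, "religion")
tf = term_frequencies(docs, labels, "religion")
```

In this toy corpus "bibl" separates the classes perfectly and scores highest under MI, while "video" occurs in every document and scores zero, even though its term frequency in the class equals that of "bibl".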




Feature Selection                                                              6




    Concepts for term selection:

Top terms for religion:    Top terms for politics:     Top terms for health:
bibl    (0.0897)           lunch (0.1200)              jama    (0.0495)
jesu    (0.0797)           obama (0.1113)              health (0.0378)
god     (0.0796)           polit (0.0982)              report (0.0357)
unleaven(0.0782)           grittv (0.0881)             harta (0.0227)
eeli    (0.0782)           flander (0.0861)            exceric (0.0211)
davideel(0.0781)           laura (0.0855)              yoga    (0.0203)
ministri(0.0780)           economi(0.0747)             study (0.0192)

                                                     

daytripp (0.0)             sonnet   (0.0)              ilsr     (0.0)
adagio (0.0)               screenplai (0.0)            resystem (0.0)
acustica (0.0)             acustica (0.0)              acustica (0.0)



Feature Selection                                                              7




    Top-k-Union:

Top terms for religion:    Top terms for politics:     Top terms for health:
bibl    (0.0897)           lunch (0.1200)              jama    (0.0495)
jesu    (0.0797)           obama (0.1113)              health (0.0378)
god     (0.0796)           polit (0.0982)              report (0.0357)
unleaven(0.0782)           grittv (0.0881)             harta (0.0227)
eeli    (0.0782)           flander (0.0861)            exceric (0.0211)
davideel(0.0781)           laura (0.0855)              yoga    (0.0203)
ministri(0.0780)           economi(0.0747)             study (0.0192)

                                                     

daytripp (0.0)             sonnet   (0.0)              ilsr     (0.0)
adagio (0.0)               screenplai (0.0)            resystem (0.0)
acustica (0.0)             acustica (0.0)              acustica (0.0)
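
One plausible reading of Top-k-Union, since the slide's highlighting was lost in extraction: keep the k best-scoring terms of each class and take the union over classes as the shared vocabulary. The sketch assumes that reading; the function name and the choice k=2 are illustrative, and the scores are a subset of those on the slide:

```python
def top_k_union(scores_by_class, k):
    """Union over classes of each class's k highest-scoring terms."""
    selected = set()
    for scores in scores_by_class.values():
        ranked = sorted(scores, key=scores.get, reverse=True)
        selected.update(ranked[:k])
    return selected

scores_by_class = {
    "religion": {"bibl": 0.0897, "jesu": 0.0797, "god": 0.0796, "acustica": 0.0},
    "politics": {"lunch": 0.1200, "obama": 0.1113, "polit": 0.0982, "acustica": 0.0},
    "health": {"jama": 0.0495, "health": 0.0378, "report": 0.0357, "acustica": 0.0},
}
vocabulary = top_k_union(scores_by_class, k=2)
```

Under this reading every class contributes the same number of candidates, so the vocabulary size is at most k times the number of classes.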



Feature Selection                                                              8




    Top-k:

Top terms for religion:    Top terms for politics:     Top terms for health:
bibl    (0.0897)           lunch (0.1200)              jama    (0.0495)
jesu    (0.0797)           obama (0.1113)              health (0.0378)
god     (0.0796)           polit (0.0982)              report (0.0357)
unleaven(0.0782)           grittv (0.0881)             harta (0.0227)
eeli    (0.0782)           flander (0.0861)            exceric (0.0211)
davideel(0.0781)           laura (0.0855)              yoga    (0.0203)
ministri(0.0780)           economi(0.0747)             study (0.0192)

                                                     

daytripp (0.0)             sonnet   (0.0)              ilsr     (0.0)
adagio (0.0)               screenplai (0.0)            resystem (0.0)
acustica (0.0)             acustica (0.0)              acustica (0.0)



Feature Selection                                                              9




    Union>Th:

Top terms for religion:    Top terms for politics:     Top terms for health:
bibl    (0.0897)           lunch (0.1200)              jama    (0.0495)
jesu    (0.0797)           obama (0.1113)              health (0.0378)
god     (0.0796)           polit (0.0982)              report (0.0357)
unleaven(0.0782)           grittv (0.0881)             harta (0.0227)
eeli    (0.0782)           flander (0.0861)            exceric (0.0211)
davideel(0.0781)           laura (0.0855)              yoga    (0.0203)
ministri(0.0780)           economi(0.0747)             study (0.0192)

                                                     

daytripp (0.0)             sonnet   (0.0)              ilsr     (0.0)
adagio (0.0)               screenplai (0.0)            resystem (0.0)
acustica (0.0)             acustica (0.0)              acustica (0.0)
         0.0002                     0.0002                      0.0001
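
The union-over-threshold strategy can be read as: keep a term if its score exceeds the threshold of at least one class. The sketch below assumes this reading and treats the three values at the bottom of the slide (0.0002, 0.0002, 0.0001) as per-class thresholds, which the extracted text does not state explicitly; the function name is illustrative:

```python
def union_above_threshold(scores_by_class, thresholds):
    """Terms whose score exceeds the class threshold in at least one class."""
    return {
        term
        for cls, scores in scores_by_class.items()
        for term, score in scores.items()
        if score > thresholds[cls]
    }

# Scores are a subset of those shown on the slide.
scores_by_class = {
    "religion": {"bibl": 0.0897, "god": 0.0796, "daytripp": 0.0, "acustica": 0.0},
    "politics": {"obama": 0.1113, "polit": 0.0982, "sonnet": 0.0, "acustica": 0.0},
    "health": {"jama": 0.0495, "yoga": 0.0203, "ilsr": 0.0, "acustica": 0.0},
}
thresholds = {"religion": 0.0002, "politics": 0.0002, "health": 0.0001}
vocabulary = union_above_threshold(scores_by_class, thresholds)
```

Unlike a fixed k per class, a threshold lets the number of terms contributed by each class vary with how many informative terms that class actually has.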

Feature Selection                                                              10




  Intersection>Th:

Top terms for religion:    Top terms for politics:     Top terms for health:
bibl    (0.0897)           lunch (0.1200)              jama    (0.0495)
jesu    (0.0797)           obama (0.1113)              health (0.0378)
god     (0.0796)           polit (0.0982)              report (0.0357)
                                                     
web                        appl                        gossip
python                     googl                       interview
xbox                       teen                        iphon
big                        music                       san
expo                       tv                          texa
                                                     
daytripp (0.0)             sonnet     (0.0)            ilsr       (0.0)
adagio (0.0)               screenplai (0.0)            resystem (0.0)
acustica (0.0)             acustica (0.0)              acustica (0.0)
         0.0002                     0.0002                      0.0001
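
Intersection>Th can be read as the stricter counterpart: keep a term only if its score exceeds the threshold in every class. This is an assumed reading, though it seems consistent with the unscored general terms (web, appl, gossip, ...) in the middle of the slide. The scores for "web" and "music" below are made up for illustration, since the slide gives none for those rows:

```python
def intersection_above_threshold(scores_by_class, thresholds):
    """Terms whose score exceeds the class threshold in every class."""
    surviving = None
    for cls, scores in scores_by_class.items():
        above = {term for term, score in scores.items() if score > thresholds[cls]}
        surviving = above if surviving is None else surviving & above
    return surviving or set()

scores_by_class = {
    "religion": {"bibl": 0.0897, "web": 0.0010, "music": 0.0008, "acustica": 0.0},
    "politics": {"obama": 0.1113, "web": 0.0012, "music": 0.0001, "acustica": 0.0},
    "health": {"jama": 0.0495, "web": 0.0009, "music": 0.0007, "acustica": 0.0},
}
thresholds = {"religion": 0.0002, "politics": 0.0002, "health": 0.0001}
vocabulary = intersection_above_threshold(scores_by_class, thresholds)
```

Strongly class-specific terms such as "bibl" drop out because they are absent from (or score zero in) the other classes, so only terms that carry some information for all classes remain.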

Official runs                                                           11




  Bag of clustered SURF features transformed
  using PCA
   Result does not benefit from transformation

                          official run        without FS/FT
      mAP                       0.2301              0.2309
      CA                       41.63 %             41.71 %
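
The PCA transformation evaluated in this run projects the bag-of-visual-words vectors onto the leading eigenvectors of their covariance matrix. A dependency-free sketch of the idea follows. It finds only the first principal direction and uses power iteration in place of the full eigenvalue decomposition named on the earlier slide, and the data are toy values, not SURF histograms:

```python
import math

def center(rows):
    """Subtract the per-dimension mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def leading_eigenvector(matrix, iterations=100):
    """Power iteration: repeatedly apply the matrix and renormalize."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # zero matrix: no direction of variance
            break
        v = [x / norm for x in w]
    return v

rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]  # toy data on the line y = 2x
centered = center(rows)
component = leading_eigenvector(covariance(centered))
# One-dimensional representation of each row after the transformation.
projected = [sum(x * c for x, c in zip(row, component)) for row in centered]
```

Because PCA only rotates and truncates the feature space without looking at class labels, it is a transformation rather than a selection, which may be one reason it behaves differently from the MI-based filtering in the other runs.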




Official runs                                                           12




  Bag of filtered ASR transcript terms (Union>Th)
   Result does benefit from selection


                          official run        without FS/FT
      mAP                       0.1035              0.0522
      CA                       32.53 %             26.54 %




Official runs                                                           13




  Bag of clustered SURF features filtered using MI
  and the Intersection>Th strategy
   Result benefits slightly from selection

                          official run        without FS/FT
      mAP                       0.2259              0.2221
      CA                       40.80 %             40.78 %




Official runs                                                            14




  Bag of filtered terms derived from tags, title and
  descriptions (Union>Th)
   Result does benefit from selection

                           official run        without FS/FT
       mAP                       0.5225              0.4146
       CA                       58.18 %             55.70 %




Official runs                                                           15




  Bag of clustered SURF features transformed
  using PCA and decision fusion using uploader
   Result does benefit from transformation

                          official run        without FS/FT
      mAP                       0.3304              0.2988
      CA                       52.14 %             49.19 %
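
The slides do not say how the uploader information is fused with the visual classifier. One simple possibility, shown purely as an assumption, is late fusion: estimate each uploader's category distribution on the training set (motivated by the earlier observation that uploaders mainly contribute to a single category) and multiply it into the classifier's scores. All names, the smoothing constant, and the toy data are hypothetical:

```python
from collections import Counter, defaultdict

def uploader_priors(uploaders, labels, categories, alpha=1.0):
    """Laplace-smoothed category distribution per training uploader."""
    counts = defaultdict(Counter)
    for uploader, label in zip(uploaders, labels):
        counts[uploader][label] += 1
    priors = {}
    for uploader, c in counts.items():
        total = sum(c.values()) + alpha * len(categories)
        priors[uploader] = {cat: (c[cat] + alpha) / total for cat in categories}
    return priors

def fuse(classifier_scores, prior=None):
    """Multiply classifier scores by the uploader prior and renormalize.

    With no prior (unseen uploader) the classifier scores are only renormalized.
    """
    fused = {cat: s * (prior[cat] if prior else 1.0) for cat, s in classifier_scores.items()}
    z = sum(fused.values())
    return {cat: s / z for cat, s in fused.items()}

categories = ["sports", "politics"]
priors = uploader_priors(["a", "a", "a", "b"],
                         ["sports", "sports", "sports", "politics"], categories)
visual_scores = {"sports": 0.4, "politics": 0.6}
fused = fuse(visual_scores, priors.get("a"))      # known uploader
unseen = fuse(visual_scores, priors.get("zzz"))   # uploader not in training data
```

In the toy example the visual classifier alone prefers "politics", but uploader "a" has only posted sports videos in training, so the fused decision flips to "sports".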




Conclusion & Future Work                                              16




 FS showed potential for improving the results

 The choice between MI and TF is not critical; both
 methods achieve roughly the same results
     Metadata (mAP): MI 12004 (0.5277) vs. TF 14976 (0.5275)



 Investigation of different scaling schemes (NB)

 Use of class-independent selection score (MI)


Backup                                                                17




Extracting visual features                                             19




  SURF features are extracted from each key frame
   At keypoints and on a regular grid


  Vocabulary is built using hierarchical clustering
  on SURF features of development set
   4096/8196 codewords


  Term vector for a single video is obtained by bin-wise
  pooling of each key frame's term vector
   avg
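
The bin-wise average pooling described above reduces to a per-bin mean over the key-frame histograms. A minimal sketch with a hypothetical function name and three-bin toy histograms (the actual vocabulary has thousands of codewords):

```python
def average_pool(frame_histograms):
    """Bin-wise mean of equally long key-frame histograms."""
    n = len(frame_histograms)
    return [sum(bins) / n for bins in zip(*frame_histograms)]

# Two key frames, three codeword bins each (toy counts).
video_vector = average_pool([[2, 0, 4], [0, 2, 2]])
```

Averaging keeps the video-level vector on the same scale regardless of how many key frames a video has, which a plain sum would not.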


MediaEval 2012: Tagging Task                                         20




 Question: What is the video's blip.tv category?
 Blip.tv database (cc): ~ 3300 h
   5288 training videos
   9550 test videos
 Official evaluation measure is Mean
 Average Precision (mAP)
 Workshop will be held 4-5 October 2012 in Pisa,
 Italy
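
For reference, mean average precision can be computed as below. This is the common textbook definition and assumes each ranked list contains every relevant video; the task's official scoring tool may differ in details. Function names and the relevance lists are illustrative:

```python
def average_precision(ranked_relevance):
    """AP of one ranked list given as booleans (True = relevant), best rank first."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant position
    return precision_sum / hits if hits else 0.0

def mean_average_precision(rankings):
    """Mean of the per-category average precisions."""
    return sum(average_precision(r) for r in rankings) / len(rankings)

# One ranked list of test videos per category (toy relevance judgments).
rankings = {
    "sports": [True, False, True, False],
    "politics": [False, True],
}
m_ap = mean_average_precision(list(rankings.values()))
```

Because mAP rewards placing relevant videos near the top of each category's ranking, it can move independently of accuracy-style measures, which is visible in the run tables where the two columns do not always change by similar amounts.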
