A Large-scale hierarchical image database
ImageNet
Research fellows: Reza Pourabbas, Saber Qasemian
Outline
 1. Introduction
 2. Properties of ImageNet
 3. Constructing ImageNet
 4. ImageNet Applications
 5. Future Work
1. Introduction
 The digital era has brought with it an enormous explosion of data.
 More robust models and algorithms can be proposed by exploiting the images in this data.
 How can such data be utilized and organized?
 This paper introduces a new image database called ImageNet.
 ImageNet uses the hierarchical structure of WordNet [9].
 Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a synset.
 In ImageNet, we aim to provide on average 500-1000 images to illustrate each synset.
 Images of each concept are quality-controlled.
 ImageNet will therefore offer tens of millions of cleanly sorted images.
2. Properties of ImageNet
2.1 Scale
2.2 Hierarchy
2.3 Accuracy
2.3.1 Clean dataset
2.4 Diversity
2.4.1 Variable appearances, positions, viewpoints
2.1 Scale
3.2 million images
5247 categories (synsets)
On average, over 600 images for each synset
2.2 Hierarchy
Figure 3: Comparison of the cat and cattle subtrees between ESP [25] and ImageNet.
2.3 Accuracy
 80 synsets randomly sampled
 99.7% precision on average
Figure 4: Percent of clean images at different tree depth levels in ImageNet.
2.4 Diversity
Figure 5: ImageNet provides diversified images.
(a) Comparison of the lossless JPG file sizes of average images for four different synsets in ImageNet and Caltech101.
(b) Example images from ImageNet and average images for each synset indicated by (a).
(c) Example images from Caltech101 and average images.
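The measurement in panel (a) can be illustrated with a small sketch. The premise is that averaging a diverse set of images blurs out structure, so the average image compresses to a smaller lossless file. The code is an illustration under stated assumptions, not the authors' procedure: it uses synthetic random pixels instead of real photos, and zlib as a stand-in for lossless JPG. `average_image` and `lossless_size` are hypothetical helper names.

```python
import random
import zlib

SIDE = 32  # images are treated as flat lists of SIDE * SIDE grayscale values (0-255)


def average_image(images):
    """Pixel-wise mean of equally sized grayscale images, rounded to integers."""
    n = len(images)
    return [round(sum(pixels) / n) for pixels in zip(*images)]


def lossless_size(pixels):
    """Compressed size in bytes. Assumption: zlib stands in for lossless JPG."""
    return len(zlib.compress(bytes(pixels), 9))


rng = random.Random(0)  # fixed seed so the synthetic data is reproducible

# Low diversity: every image in the set is the same detailed pattern.
template = [rng.randrange(256) for _ in range(SIDE * SIDE)]
uniform_set = [template[:] for _ in range(100)]

# High diversity: every image is an independent random pattern.
diverse_set = [[rng.randrange(256) for _ in range(SIDE * SIDE)] for _ in range(100)]

uniform_size = lossless_size(average_image(uniform_set))
diverse_size = lossless_size(average_image(diverse_set))
print("uniform set:", uniform_size, "bytes; diverse set:", diverse_size, "bytes")
```

The average of the uniform set is the template itself and keeps all of its detail, while the average of the diverse set collapses toward mid-gray and compresses better.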
Related Datasets
 Small image datasets
A number of well-labeled small datasets exist (Caltech101/256, MSRC, PASCAL, etc.).
They are the datasets most commonly used in today's computer vision research.
ImageNet offers 20x the number of categories and 100x the number of total images compared with these datasets.
 TinyImage
- TinyImage is a dataset of 80 million 32 x 32 low-resolution images.
- Each synset contains an average of 1000 images.
- Of these, 10-25% are possibly clean images.
 LabelMe and Lotus Hill datasets
- Provide 30k and 50k labeled and segmented images, respectively.
- Both have around 200 categories.
- Outlines and locations of objects are provided.
- Images are largely uploaded or provided by users or researchers of the datasets.
ESP dataset
Acquired through an online game.
Labels largely concentrate at the basic level of the semantic hierarchy.
Most of the ESP dataset is not publicly available.
Only 60k images can be accessed.
Figure 6: Comparison of the distribution of mammal labels.
Table 1: Comparison of some of the properties of ImageNet versus other existing datasets.
3. Constructing ImageNet
Collecting Candidate Images
Candidate images are collected from the Internet by querying several image search engines.
For each synset, the queries are the set of its WordNet synonyms.
Search engines typically limit the number of images that can be retrieved.
The query set is therefore expanded by appending to each query the word from the parent synset.
The queries are also translated into other languages, including Chinese, Spanish, Dutch and Italian.
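The query expansion step can be sketched as follows. This is a minimal illustration under stated assumptions: `expand_queries` is a hypothetical helper, the `whippet` and `dog` words are an illustrative synset and parent, and translation into other languages is left out because it would need an external dictionary or service.

```python
def expand_queries(synonyms, parent_words):
    """Return the synonyms plus each synonym followed by each parent-synset word.

    Hypothetical helper: the slides only say that parent words are appended
    to the queries; the exact query format is an assumption.
    """
    queries = list(synonyms)
    for synonym in synonyms:
        for parent in parent_words:
            queries.append(f"{synonym} {parent}")
    return queries


# Illustrative synset "whippet" whose parent synset contributes the word "dog".
queries = expand_queries(["whippet"], ["dog"])
print(queries)
```

Each expanded query is sent to the search engines as a separate request, so a per-query result limit no longer caps the total number of candidates for a synset.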
Cleaning Candidate Images
 Rely on humans to verify each candidate image, using the Amazon Mechanical Turk (AMT) service.
 Ask the users to verify whether each image contains objects of the synset.
 Have multiple users independently label the same image.
 Different categories require different levels of consensus among users.
 A simple algorithm determines the number of agreements needed for different categories of images.
 For each synset, randomly sample an initial subset of images and ask at least 10 users to vote on each of them.
 Obtain a confidence score table from these votes.
 For each of the remaining candidate images, proceed with the AMT user labeling until a confidence score threshold is reached.
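The slides do not say how the confidence score table is computed, so the sketch below fills that in with labeled assumptions: the majority over an image's initial votes is treated as its true label, and the table maps each (yes, no) vote count seen so far to the fraction of initial images with that count that turned out good. The function names, the 0.95 threshold and the cap of 10 votes are all hypothetical.

```python
from collections import defaultdict


def build_confidence_table(initial_votes):
    """Estimate P(good image | yes, no votes so far) from the initial subset.

    initial_votes: one list of booleans per image (True = "contains the object").
    Assumption: the majority over all votes of an image is its true label.
    """
    good = defaultdict(int)
    total = defaultdict(int)
    for votes in initial_votes:
        is_good = sum(votes) * 2 > len(votes)
        yes = no = 0
        for vote in votes:
            yes += vote
            no += not vote
            total[(yes, no)] += 1
            good[(yes, no)] += is_good
    return {counts: good[counts] / total[counts] for counts in total}


def label_until_confident(vote_stream, table, threshold=0.95, max_votes=10):
    """Consume votes until the table is confident either way.

    Returns (accepted, votes_used); accepted is None if no decision was reached.
    """
    yes = no = 0
    for vote in vote_stream:
        yes += vote
        no += not vote
        confidence = table.get((yes, no))
        if confidence is not None:
            if confidence >= threshold:
                return True, yes + no
            if confidence <= 1 - threshold:
                return False, yes + no
        if yes + no >= max_votes:
            break
    return None, yes + no


# An "easy" synset with unanimous initial votes and a "hard" one with mixed votes.
easy = build_confidence_table([[True] * 10, [False] * 10])
hard = build_confidence_table([[True, False] + [True] * 8,
                               [True, False] + [False] * 8])
print(label_until_confident([True], easy))
print(label_until_confident([True, False, True], hard))
```

With unanimous initial votes a single vote is enough, while the synset with mixed initial votes needs more votes before the threshold is crossed, which is one way to read "different levels of consensus".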
4. ImageNet Applications
Object Recognition
- NN-voting + noisy ImageNet
 Use the original candidate images.
 Downsample to 32 x 32.
- NN-voting + clean ImageNet
 Use the clean images.
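A minimal sketch of NN-voting under stated assumptions: images are square grayscale arrays, distance is the sum of squared pixel differences, and the vote is a plain majority over flat labels rather than over a synset tree. `downsample`, `ssd` and `nn_vote` are hypothetical names, and the value of `k` in the example is illustrative.

```python
from collections import Counter


def downsample(image, size):
    """Block-average a square 2D grayscale image to size x size, returned flat.

    Assumes the side length of the image is a multiple of size.
    """
    factor = len(image) // size
    return [
        sum(image[r * factor + i][c * factor + j]
            for i in range(factor) for j in range(factor)) / (factor * factor)
        for r in range(size) for c in range(size)
    ]


def ssd(a, b):
    """Sum of squared differences between two flat pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nn_vote(query, database, k):
    """Label the query by majority vote among its k nearest database images.

    database: list of (flat_pixels, label) pairs.
    """
    neighbors = sorted(database, key=lambda item: ssd(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy database of 2-pixel "images"; real inputs would be 32 x 32 = 1024 pixels.
database = [([0, 0], "cat"), ([1, 1], "cat"), ([10, 10], "dog")]
print(nn_vote([0, 1], database, k=3))
```

The noisy and clean variants differ only in which images populate `database`: the original candidate images or the human-verified ones.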
 NBNN (Naive Bayes Nearest Neighbor)
- SIFT descriptors are used.
- Compute the query-class distance.
 NBNN-100
- Limit the number of images per category to 100.
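The query-class distance can be sketched as below, following the usual NBNN formulation: for each descriptor of the query image, take the squared distance to its nearest descriptor in the class, and sum over the query's descriptors. The toy 2-D vectors stand in for SIFT descriptors (real ones are 128-dimensional), and the function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def query_class_distance(query_descriptors, class_descriptors):
    """Sum, over query descriptors, of the distance to the nearest class descriptor."""
    return sum(
        min(squared_distance(d, c) for c in class_descriptors)
        for d in query_descriptors
    )


def nbnn_classify(query_descriptors, descriptors_by_class):
    """Return the class with the smallest query-class distance."""
    return min(
        descriptors_by_class,
        key=lambda label: query_class_distance(
            query_descriptors, descriptors_by_class[label]
        ),
    )


# Toy descriptors pooled from all training images of each class.
descriptors_by_class = {
    "cat": [[0, 0], [1, 0]],
    "dog": [[5, 5], [6, 5]],
}
query = [[0, 1], [1, 1]]
print(nbnn_classify(query, descriptors_by_class))
```

Under this reading, NBNN-100 would build each class pool from at most 100 images, which bounds the cost of the nearest-descriptor search.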
Tree-Based Image Classification
 A simple object classification method: the tree-max classifier.
 Imagine you have a classifier at each synset node of the tree and you want to decide whether an image contains an object of that synset or not.
 The maximum of all the classifier responses in the subtree rooted at that synset becomes the classification score of the query image.
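The tree-max rule can be written as a short recursion. This is a sketch under stated assumptions: the tree is given as a parent-to-children mapping, the per-node classifier responses are already computed, and the synset names and scores are made up for illustration.

```python
def tree_max_score(synset, children, response):
    """Maximum classifier response over a synset and all of its descendants.

    children: dict mapping a synset to the list of its child synsets.
    response: dict mapping each synset to its classifier response for one image.
    """
    best = response[synset]
    for child in children.get(synset, []):
        best = max(best, tree_max_score(child, children, response))
    return best


# Hypothetical subtree and responses for a single query image.
children = {"dog": ["German shepherd", "English terrier"]}
response = {"dog": 0.2, "German shepherd": 0.9, "English terrier": 0.1}
print(tree_max_score("dog", children, response))
```

Here the generic "dog" classifier responds weakly, but the strong response of a child classifier still yields a high score for "dog".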
Automatic Object Localization
[14] L.-J. Li, G. Wang, and L. Fei-Fei. OPTIMOL: automatic Online Picture
collection via Incremental Model Learning.
We annotated 100 images in 22 different categories of the mammal
and vehicle subtrees with bounding boxes around the objects of that category.
5. Future Work
 Completing ImageNet
- Reach roughly 50 million clean, diverse and full-resolution images spread over approximately 50K synsets.
- Make it publicly available and readily accessible online.
- Extend ImageNet to include more information.
 Foster an ImageNet community and develop an online platform.
