The document summarizes a dataset of over 10,000 fashion and clothing images collected from Flickr and annotated with metadata and labels. Key details include:
- The dataset contains images from 262 fashion categories, with about 123 photos per category on average.
- Images were annotated on Amazon Mechanical Turk with 6 labels covering fashion relevance, number of people, professional models, and formality.
- Over 24,000 assignments were completed; inter-annotator agreement (kappa) ranged from 0.38 to 0.85 across the six labels.
- The dataset aims to support applications in social media analysis, benchmarking tasks, and studying intentional image framing.
Fashion 10000: An Enriched Dataset of Fashion and Clothing
1. Fashion 10000
An Enriched Dataset of
Fashion and Clothing
Presentation: Michael Riegler, Klagenfurt University & TU Delft
Babak Loni, TU Delft
Lei Yen Cheung, TU Delft
Alessandro Bozzon, TU Delft
Luke Gottlieb, ICSI
Martha Larson, TU Delft
2. Table of Contents
- Introduction
- Dataset Collection
- Dataset Annotation
  - Statistics
- Applications of Dataset
- Conclusion
3. The Dataset
- Social images
- At least 10,000 fashion-related images
- Social metadata
- Creative Commons images
- Annotated with different labels
5. Metadata
- Collected in xml and csv format
  - Title, description, owner, tags, location, geo-parameters
- Additional metadata: Info, Geos, Context, Tags, Notes, Favorites, Urls, Comments
6. General Statistics
(Fashion item, photo) pairs: 32,398
Number of distinct fashion categories: 262
Max / avg / min number of photos per fashion item: 200 / 122.95 / 10
Number of photos with geo annotations: 7,933
Total number of comments: 58,578
Max / avg / min number of comments per photo: 575 / 7.35 / 1
Total number of (tag, photo) pairs: 460,907
Total number of distinct tags: 56,275
Max / avg / min number of tags per photo: 136 / 15.15 / 1
Total number of (note, photo) pairs: 5,892
Max / avg / min number of notes per photo: 195 / 5.31 / 1
Total number of favorites: 37,131
Max / avg / min number of favorites per photo: 20 / 3.61 / 1
Total number of contexts: 110,505
Max / avg / min number of contexts per photo: 206 / 3.93 / 1
10. Dataset Annotation
- Some images might not be relevant to fashion and clothing
- The ground truth differentiates relevant from non-relevant
11. Dataset Annotation
- We used AMT to create ground truth for the images
- The fashion category is described with a definition from Wikipedia
- 6 questions to create 6 labels for each of the images
- We also ask about familiarity of workers with the fashion category
14. HIT Questions (Labels)
Question: Possible values
Q1) Fashion / clothing related: yes, no, notsure
Q2) Specialty clothing item (image category): yes, no, notsure
Q3) Number of people: nopeople, onepeople, manypeople
Q4) Professional model or not?: yes, no, notapp (not applicable)
Q5) Person wearing fashion?: yes, no, noperson, notapp (not applicable)
Q6) Formal / informal: formalmen, formalwomen, informalmen, informalwomen, other (cross-dressing or multiple persons), notapp (not applicable)
15. Annotation Statistics
Total number of assignments: 24,457
Rejected assignments: 4%
Total number of unique workers: 1,470
Avg. number of assignments per worker: 17
Avg. completion time: 127 sec
Avg. familiarity of workers with fashion items: 5.8 (range 1-7)
Kappa value per question: Q1 0.66, Q2 0.65, Q3 0.85, Q4 0.51, Q5 0.38, Q6 0.48
16. Dataset Statistics
- Using the generated ground truth, the statistics about the images were calculated
Number of fashion-related images: 18,487
Number of images with many people: 7,417
Number of images with one person: 9,771
Number of images with no person: 13,179
Number of images with intention of showing fashion: 9,096
Number of professional fashion images: 2,814
17. Applications of the Dataset
- Developing social media content analysis
  - Game with a purpose (domino game)
- Basis for the brave new task in MediaEval multimedia benchmarking initiative
- Use case for the proof of intentional framing
18. Conclusion
- Fashion dataset
- Six different labels
- AMT generated ground truth
- Can be used in various research areas
- Evaluated in the MediaEval Benchmark
Social images: from user-generated fashion categories to user-generated images
Should include at least 10,000 fashion-related images
Should include social metadata
Only Creative Commons Attribution images
Images should be annotated with different labels
Collected in xml and csv format
General information about the images is available in xml format
Title, description, owner, ...
Tags
Location, geo-parameters
Additional metadata was collected using different Flickr APIs and converted to csv format
Info, Geos, Comments, Context, Tags, Notes, Favorites, Urls
Some images might not be relevant to fashion and clothing
The ground truth for the images differentiates relevant from non-relevant images
Depending on how the dataset is used, different types of annotations might be necessary
We used AMT to create ground truth for the images
In each HIT we ask workers to annotate 4 images from one category
The fashion category is described to workers with a definition from Wikipedia
In total we ask 6 questions to create 6 labels for each of the images
We also ask about familiarity of workers with the fashion category
Values range from 1 (unfamiliar) to 7 (familiar)
Different options are explained with visual popups
In addition to the csv file generated by AMT, we created another csv to represent the ground truth
Each image is annotated by 3 workers
The ground truth is generated by majority voting
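The majority-voting step can be sketched roughly as follows. This is an illustration only, not the authors' script: the image IDs, the nested-dictionary layout, and the function names are hypothetical, and returning None when no label wins a strict majority (possible for questions with three or more values) is an assumption, since the slides do not say how such cases were handled.

```python
from collections import Counter


def majority_label(labels):
    """Return the label given by more than half of the workers, else None.

    Returning None for "no strict majority" is an assumption; the source
    does not describe tie handling.
    """
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


def build_ground_truth(annotations):
    """Collapse per-worker answers into one label per image and question.

    `annotations` maps a (hypothetical) image ID to a dict of
    question -> list of the three workers' answers.
    """
    return {
        image_id: {q: majority_label(votes) for q, votes in answers.items()}
        for image_id, answers in annotations.items()
    }


# Hypothetical answers for two images, using label values from the HIT.
example = {
    "img_001": {"Q1": ["yes", "yes", "no"],
                "Q3": ["onepeople", "onepeople", "onepeople"]},
    "img_002": {"Q1": ["no", "notsure", "no"],
                "Q3": ["nopeople", "onepeople", "manypeople"]},
}
ground_truth = build_ground_truth(example)
```

In this sketch, Q3 for img_002 has three different answers, so no label is assigned and the entry stays None.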
Kappa statistics were used to calculate the agreement among annotators
For each of the six questions in the HIT, the agreement among the three workers was calculated separately
General statistics about the workers and the HITs
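The slides say only that "Kappa statistics" were used. One common choice when every item is rated by the same number of annotators, who need not be the same people for each item, is Fleiss' kappa. The sketch below assumes that variant; the function name and input layout are hypothetical, and it would be run once per question.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    `ratings` is a list of per-item label lists, e.g. three answers per image.
    Assumes at least two raters per item.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]
    categories = {label for item in ratings for label in item}

    # Observed agreement: mean over items of the share of agreeing rater pairs.
    p_items = [
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement from the overall proportion of each category.
    p_cat = [sum(cnt[cat] for cnt in counts) / (n_items * n_raters)
             for cat in categories]
    p_e = sum(p * p for p in p_cat)

    if p_e == 1.0:  # only one category was ever used; kappa is undefined
        return 1.0  # convention chosen for this sketch
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical Q1 answers for four images, three workers each.
q1_answers = [
    ["yes", "yes", "yes"],
    ["yes", "yes", "no"],
    ["no", "no", "no"],
    ["no", "notsure", "no"],
]
kappa_q1 = fleiss_kappa(q1_answers)
```

Calling it separately on the answers for Q1 through Q6 would give one agreement value per question, in the same shape as the per-question kappa values reported on the annotation statistics slide.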