Tags as Tools for Social Classification. Dr. Isabella Peters, Department of Information Science, Institute for Language and Information, Heinrich-Heine-University Düsseldorf, Germany. 34th Annual Conference of the German Classification Society, July 2010.
Outline. Theoretical assumptions: (I) social classification can be based on folksonomies; (II) Power Tags are the most relevant tags; (III) tag distributions on resource level become stable. Three main research questions: (1) How can social classifications be built (automatically)? (2) Are Power Tags the most relevant tags for a resource? (3) (When do tag distributions become stable?) Results are based on a study with students of the University of Düsseldorf.
Assumption I: Social classification can be based on folksonomies. Folksonomy = sum of all tags of all users of a collaborative information service (e.g. delicious). Distinctions: platform folksonomy vs. resource folksonomy; broad folksonomy (delicious) vs. narrow folksonomy (YouTube). Social classification = collaborative knowledge representation with natural-language terms = “social categorization”.
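The distinction between platform folksonomy and resource folksonomy can be illustrated with a minimal sketch. It assumes that tagging data is available as (user, resource, tag) triples; the sample data and the function names `platform_folksonomy` and `resource_folksonomy` are hypothetical and not taken from the presentation.

```python
from collections import Counter

# Hypothetical tagging data: each tuple is (user, resource, tag).
taggings = [
    ("u1", "r1", "web2.0"),
    ("u2", "r1", "web2.0"),
    ("u2", "r1", "blog"),
    ("u3", "r2", "android"),
    ("u1", "r2", "mobile"),
    ("u3", "r2", "mobile"),
]


def platform_folksonomy(taggings):
    """Count every tag over all users and all resources of the service."""
    return Counter(tag for _user, _resource, tag in taggings)


def resource_folksonomy(taggings, resource):
    """Count only the tags that users assigned to one particular resource."""
    return Counter(tag for _user, res, tag in taggings if res == resource)


platform = platform_folksonomy(taggings)
r1 = resource_folksonomy(taggings, "r1")
```

In a broad folksonomy several users can assign the same tag to the same resource, which is why the count for a tag in `r1` can exceed 1.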
Assumption I: Social classification can be based on folksonomies. Through its tags, a resource folksonomy reflects the collective intelligence of users in giving meaning to the resource. The most popular tags are the most important tags for the resource (= Power Tags). Power Tags are only observable in broad folksonomies, because only there can the same tag be assigned to a resource multiple times. Folksonomies deliver concept candidates for social classification.
Method I: Social classification can be based on folksonomies. Aim: finding tag pairs for the construction of a social classification. Step 1: Calculating Power Tags for a resource. The number n of Power Tags depends on the type of tag distribution: power law → n = exponent; inverse-logistic distribution → n = number of tags left of the turning point. [Figure: example tag distributions, power law and inverse-logistic.]
Method I: Social classification can be based on folksonomies. Step 2: Calculating the co-occurrence between Power Tags and the tags of the platform folksonomy. Basis = Power Tags I from the resource level; Power Tags II = co-occurring tags from the platform level. Such a tag pair is most valuable for social categorization because it reflects collective user intelligence. [Figure: Power Tags I and Power Tags II.]
Research Question I: How to build social classifications (automatically)? Step 3: The determination of Power Tags I and II can be carried out automatically: 1) identify the distribution type; 2) label the first n tags as Power Tags I; 3) identify the co-occurring tags; 4) identify the distribution type (of the co-occurring tags); 5) extract the first n tags as Power Tags II; 6) combine Power Tags I and Power Tags II into tag pairs. Step 4: Intellectual determination of the relationship between Power Tags I and Power Tags II → collaborative or individual.
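The automatic part of this procedure (labelling the first n tags, counting co-occurrences, pairing) can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used in the study: identifying the distribution type (sub-steps 1 and 4) is not implemented, so the two cut-offs are passed in as the hypothetical parameters `n_first` and `n_second`; a "post" is assumed to be the set of tags one user assigned to one resource; all data and function names are invented for illustration.

```python
from collections import Counter


def first_n_tags(counts, n):
    """Return the n most frequent tags, most frequent first.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _count in ranked[:n]]


def cooccurring_tags(posts, power_tag):
    """Count the tags that share a post with power_tag on the platform level."""
    counts = Counter()
    for tags in posts:
        if power_tag in tags:
            counts.update(t for t in tags if t != power_tag)
    return counts


def tag_pairs(resource_counts, posts, n_first, n_second):
    """Combine Power Tags I (resource level) with Power Tags II (platform level)."""
    pairs = []
    for tag_i in first_n_tags(resource_counts, n_first):
        co_counts = cooccurring_tags(posts, tag_i)
        for tag_ii in first_n_tags(co_counts, n_second):
            pairs.append((tag_i, tag_ii))
    return pairs


# Hypothetical resource folksonomy and platform-level posts.
resource_counts = Counter({"android": 40, "phone": 5, "os": 3})
posts = [
    {"android", "mobile", "google"},
    {"android", "mobile"},
    {"android", "google"},
    {"android", "mobile", "linux"},
    {"web2.0", "blog"},
]

pairs = tag_pairs(resource_counts, posts, n_first=1, n_second=2)
print(pairs)
```

The resulting pairs are only candidates; deciding which semantic relation holds between the two tags of a pair remains the intellectual task of Step 4.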
Research Question I: How to build social classifications (automatically)? Examples:
1. a) Power Tags I: Android
1. b) Power Tags II: Mobile, Google
2. a) Power Tags I: Web 2.0
2. b) Power Tags II: Tools, Social, Blog, Socialsoftware, Bookmarks, Community, Tagging, Web, AJAX, online

Android
relation                                 descriptor set
association   related term               Google           RT
association   related term               mobile           RT

Web 2.0
relation                                 descriptor set
hierarchy     broader term               web              BT
meronymy      narrower term partitive    blog             NTP
meronymy      narrower term partitive    bookmarks        NTP
meronymy      narrower term partitive    tagging          NTP
meronymy      narrower term partitive    community        NTP
meronymy      narrower term partitive    ajax             NTP
association   related term               online           RT
synonymy      used for                   Socialsoftware   UF
Assumption II: Power Tags are the most relevant tags. To build social classifications based on Power Tags, an important precondition must be fulfilled: Power Tags ARE the most relevant tags for a resource. Problem: relevance judgments as well as tagging behaviour are highly subjective and error-prone (regarding spelling etc.). Is the collective intelligence of users capable of “ironing out” overly personal and erroneous tags, so that all users are satisfied with the high-frequency tags?
Method II: Power Tags are the most relevant tags. Investigation of 30 resources downloaded from delicious in February 2010. Participants: 20 students of Information Science at the HHU Düsseldorf. All resources were tagged with “folksonomy” (to guarantee that the students are technically able to judge the relevance of the tags) and were tagged by at least 100 users (to guarantee that broad tag distributions can be used as test sample). User evaluation: tag is relevant for the resource = indicated with 1; tag is not relevant for the resource = indicated with 0. Students had access to the resource. Students did not know the delicious rank of the tags. Outcome: a relevance distribution of the tags for every resource, based on the student judgments.
Research Question II: Are Power Tags the most relevant tags for a resource? Determination of relevance: a tag counts as relevant if 50% or more of the students judged it relevant. Extraction of the Top 10 delicious tags. How many students called these Top 10 tags relevant? Calculation of the relative frequency of the students' relevance judgments → Pearson ≈ 0.49, N = 30.
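The relevance criterion can be written down as a short sketch. It assumes that the judgments for one resource are stored as a list of 0/1 values per tag, one value per student; the data, the tag names and the function names are hypothetical. The Pearson correlation reported on the slide is not reproduced here, because the slide does not state which two variables were correlated.

```python
def relative_relevance(judgments):
    """Share of students who judged the tag relevant (1) rather than not (0)."""
    return sum(judgments) / len(judgments)


def is_relevant(judgments, threshold=0.5):
    """A tag counts as relevant if 50% or more of the students marked it with 1."""
    return relative_relevance(judgments) >= threshold


# Hypothetical judgments of four students for the top tags of one resource.
judgments_by_tag = {
    "folksonomy": [1, 1, 1, 1],
    "tagging": [1, 1, 0, 1],
    "web2.0": [1, 0, 0, 1],
    "toread": [0, 0, 1, 0],
}

relevant_tags = [tag for tag, j in judgments_by_tag.items() if is_relevant(j)]
print(relevant_tags)
```

A tag judged relevant by exactly half of the students still counts as relevant, matching the wording "50% and more".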
Research Question II: Are Power Tags the most relevant tags for a resource? Result: only the first two tags are relevant, which is a strong indication for Power Tags. Problems in the relevance judgments: bias towards German tags; no unification of spelling variants (→ solution: tag gardening, NLP); no combination of phrase tags.
Assumption III: Tag distributions on resource level become stable. Studies showed that the shape of tag distributions remains stable after reaching a particular number of tags and users: Kipp & Campbell (2006); Maarek et al. (2006); Halpin, Robu, & Shepherd (2007); Maass, Kowatsch, & Münster (2007); Maier & Thalmann (2007).
Assumption III: Tag distributions on resource level become stable. If this assumption is true and “stable” is taken to mean that (a) no rank permutations of tags appear anymore and (b) the relative number of tags does not change anymore, it means that: Power Tags I and II are like a controlled vocabulary for a resource; users have reached consensus in describing and tagging the resource, which is visualized in the Power Tags; tags in the Long Tail of the distribution may be synonyms, tags with typing errors, narrower concepts, etc.
Open Research Question III: When do tag distributions become stable? To automate classification processes we need to know after which number of tagging users a tag distribution remains stable, i.e. when no changes in the ranking of tags appear anymore. After that point we can extract the Power Tags for the social classification of the particular resource.
Open Research Question & Method III: When do tag distributions become stable? Comparison of the tag distribution with n users and the final tag distribution (downloaded at a point in time). Calculation of the relative frequency of every tag, rel. freq(t_1 … t_n), for particular user numbers. Calculation of the average distance between the final tag distribution (fd) and the tag distribution with n users (td): subtraction of $\sum \mathrm{rel.\,freq}(t_n, fd)$ of the final distribution and $\sum \mathrm{rel.\,freq}(t_n, td)$ of the tag distribution with n users. Stability is achieved when $\sum \mathrm{rel.\,freq}(t_n, fd) - \sum \mathrm{rel.\,freq}(t_n, td) < \text{threshold value}$.
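One possible reading of this stability check is sketched below. The slide leaves the exact distance measure open, so the sketch makes an explicit assumption: the distance is taken as the mean absolute difference between the relative frequencies of each tag in the two distributions. That interpretation, the threshold of 0.01, the sample counts and all function names are hypothetical; both distributions are assumed to be non-empty.

```python
from collections import Counter


def relative_frequencies(counts):
    """Convert absolute tag counts into relative frequencies that sum to 1."""
    total = sum(counts.values())
    return {tag: count / total for tag, count in counts.items()}


def average_distance(counts_n, counts_final):
    """Mean absolute difference of relative frequencies over all tags.

    Assumption: this per-tag mean is only one way to read the
    "average distance" named on the slide.
    """
    rel_n = relative_frequencies(counts_n)
    rel_final = relative_frequencies(counts_final)
    tags = set(rel_n) | set(rel_final)
    total = sum(abs(rel_final.get(t, 0.0) - rel_n.get(t, 0.0)) for t in tags)
    return total / len(tags)


def is_stable(counts_n, counts_final, threshold):
    """The distribution with n users is stable if its distance is below the threshold."""
    return average_distance(counts_n, counts_final) < threshold


# Hypothetical tag counts for one resource at three points in time.
final = Counter({"a": 60, "b": 30, "c": 10})
early = Counter({"a": 5, "b": 5})
later = Counter({"a": 30, "b": 15, "c": 5})

print(is_stable(early, final, threshold=0.01))
print(is_stable(later, final, threshold=0.01))
```

A per-tag comparison is used because relative frequencies summed over all tags of a distribution always equal 1, so the two sums would only differ if they were restricted to a subset of tags.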
Conclusion. Social classification can be based on folksonomies: Power Tags are concept candidates. The extraction of Power Tags I and II pairs can be carried out automatically. The determination of the relationship inherent in tag pairs requires intellectual processing. Power Tags are the most relevant tags. The relevance of tags can be enhanced through unification and combination of similar tags (here: not synonyms but spelling variants) → tag gardening. Ongoing research: when do tag distributions become stable?
Conclusion. [Figure: workflow diagram with the stages: What type of tag distribution? → Tag distribution stable? → Extraction of Power Tags I & II → Pairs of relevant Power Tags → Candidate vocabulary → Definition of concepts and of semantic relations → Intellectual structuring → Social knowledge organization system. The stages are labelled as either automatic processing or intellectual processing.]
Comments? Questions? Isabella Peters: isabella.peters@uni-duesseldorf.de. Greetings from Düsseldorf! This presentation is available on SlideShare: http://www.slideshare.net/isabellapeters.
References
Halpin, H., Robu, V., & Shepherd, H. (2007). The Complex Dynamics of Collaborative Tagging. In C. L. Williamson, M. E. Zurko, P. F. Patel-Schneider, & P. J. Shenoy (Eds.), Proceedings of the 16th International WWW Conference, Banff, Alberta, Canada (pp. 211–220). New York: ACM.
Kipp, M., & Campbell, D. (2006). Patterns and Inconsistencies in Collaborative Tagging Systems: An Examination of Tagging Practices. In Proceedings of the 17th Annual Meeting of the American Society for Information Science and Technology, Austin, Texas, USA.
Maarek, Y., Marnasse, N., Navon, Y., & Soroka, V. (2006). Tagging the Physical World. In Proceedings of the Collaborative Web Tagging Workshop at WWW 2006, Edinburgh, Scotland.
Maass, W., Kowatsch, T., & Münster, T. (2007). Vocabulary Patterns in Free-for-all Collaborative Indexing Systems. In Proceedings of International Workshop on Emergent Semantics and Ontology Evolution, Busan, Korea (pp. 45–57).
Maier, R., & Thalmann, S. (2007). Kollaboratives Tagging zur inhaltlichen Beschreibung von Lern- und Wissensressourcen. In R. Tolksdorf & J. Freytag (Eds.), Proceedings of XML Tage, Berlin, Germany (pp. 75–86). Berlin: Freie Universität Berlin.
Peters, I. (2009). Folksonomies: Indexing and Retrieval in Web 2.0. Berlin: De Gruyter, Saur.
Peters, I., & Stock, W. G. (2010). "Power Tags" in Information Retrieval. Library Hi Tech, 28(1), 81–93.
Peters, I., & Weller, K. (2008). Tag Gardening for Folksonomy Enrichment and Maintenance. Webology, 5(3), Article 58, from http://www.webology.ir/2008/v5n3/a58.html.
Stock, W. G. (2006). On Relevance Distributions. Journal of the American Society for Information Science and Technology, 57(8), 1126–1129.