Clustering Techniques for Collaborative Filtering and the Application to Venue Recommendation
Manh Cuong Pham, Yiwei Cao, Ralf Klamma
Information Systems and Database Technology, RWTH Aachen, Germany
I-KNOW 2010, Graz, Austria, September 01, 2010

Agenda
  • Introduction
  • Clustering techniques for collaborative filtering
  • Case study: venue recommendation
      • Data sets: DBLP and CiteSeerX
      • User-based
      • Item-based
  • Conclusions and Outlook

Introduction
  • Recommender systems help users deal with information overload
  • Components of a recommender system [Burke 2002]
      • Set of users, set of items (products)
      • Implicit/explicit user ratings on items
      • Additional information: trust, collaboration, etc.
      • Algorithms for generating recommendations
  • Recommendation techniques [Adomavicius and Tuzhilin 2005]
      • Collaborative Filtering (CF) [Breese et al. 1998]
          • Memory-based algorithms: user-based, item-based [Sarwar 2001]
          • Model-based algorithms: Bayesian network [Breese 1998]; clustering [Ungar 1998]; rule-based [Sarwar 2000]; machine learning on graphs [Zhou 2005, 2008]; PLSA [Hofmann 1999]; matrix factorization [Koren 2009]
      • Content-based recommendation [Sarwar et al. 2001]
      • Hybrid approaches [Burke 2002]

Clustering and Collaborative Filtering
  • Figure: user clustering and item clustering of the rating matrix (Cluster 1, Cluster 2), with item-based CF applied per cluster
  • Problems: large-scale data; sparse rating matrix; diversity of users and items
  • Previous approaches: clustering based on ratings
      • K-means, Metis, etc. [Rashid 2006, Xue 2005, O’Connor 2001]
  • Our approach
      • Clustering based on additional information: relationships between users, items
      • Improvement on both efficiency and accuracy
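
The idea on this slide can be sketched as follows, assuming user clusters are already available (for example from a co-authorship network). Neighbours for user-based CF are searched only inside the target user's cluster. The data and the names `ratings`, `cluster_of` and `predict` are hypothetical, and the similarity-weighted average is one common prediction rule, not necessarily the one used in the talk.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors stored as dicts."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def predict(user, item, ratings, cluster_of):
    """Similarity-weighted average over neighbours in the same cluster only."""
    num = den = 0.0
    for other, vec in ratings.items():
        # Skip the user itself and everyone outside the user's cluster.
        if other == user or cluster_of[other] != cluster_of[user]:
            continue
        if item not in vec:
            continue
        s = cosine(ratings[user], vec)
        num += s * vec[item]
        den += abs(s)
    return num / den if den else 0.0

# Hypothetical toy data: users u1..u3 share a cluster, u4 is alone.
ratings = {
    "u1": {"A": 1.0, "B": 0.5},
    "u2": {"A": 1.0, "C": 0.8},
    "u3": {"B": 1.0, "C": 0.2},
    "u4": {"A": 1.0, "C": 0.1},
}
cluster_of = {"u1": 0, "u2": 0, "u3": 0, "u4": 1}
score = predict("u1", "C", ratings, cluster_of)
```

Because `u4` sits in another cluster, its rating for `C` never enters the prediction for `u1`. That restriction is the source of the efficiency gain: only cluster members are compared.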

Evaluation: Venue Recommendation
  • Recommend venues (conferences, journals, workshops) to researchers
  • User-based CF
      • Populate user-item matrix using venue participation history
      • Ratings: normalized venue publication counts
      • User clustering: co-authorship network
  • Item-based CF
      • Similarity between venues based on citation
      • Similarity measure: cosine
      • Venue clustering: similarity network
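
A minimal sketch of how the user-item matrix might be populated from participation history. The slide does not say how the publication counts are normalized, so this example assumes each author's counts are divided by that author's total number of papers. The author names and the function `build_ratings` are hypothetical.

```python
from collections import Counter, defaultdict

def build_ratings(publications):
    """publications: iterable of (author, venue) pairs, one pair per paper."""
    counts = defaultdict(Counter)
    for author, venue in publications:
        counts[author][venue] += 1
    ratings = {}
    for author, venue_counts in counts.items():
        total = sum(venue_counts.values())
        # Assumed normalization: share of the author's papers per venue.
        ratings[author] = {v: c / total for v, c in venue_counts.items()}
    return ratings

pubs = [
    ("alice", "I-KNOW"),
    ("alice", "I-KNOW"),
    ("alice", "WWW"),
    ("bob", "WWW"),
]
ratings = build_ratings(pubs)
```

With this normalization every author's ratings sum to 1, so prolific and occasional authors become comparable when similarities are computed.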

Data Sets
  • DBLP (http://www.informatik.uni-trier.de/~ley/db/)
      • 788,259 author names
      • 1,226,412 publications
      • 3,490 venues (conferences, workshops, journals)
  • CiteSeerX (http://citeseerx.ist.psu.edu/)
      • 7,385,652 publications (including publications in reference lists)
      • 22,735,240 citations
      • Over 4 million author names
  • Combination
      • Canopy clustering [McCallum 2000]
      • Result: 864,097 matched pairs
      • On average, a venue cites 2,306 times and is cited 2,037 times

User-based CF: Author Clustering
  • Data: DBLP
      • Two test cases, for the years 2005 and 2006
      • Clustering of co-authorship networks
          • 2005 network: 478,108 nodes; 1,427,196 edges
          • 2006 network: 544,601 nodes; 1,686,867 edges
      • Prediction of venue participation
  • Clustering algorithm
      • Density-based algorithm [Clauset 2004]
      • Obtained modularity: 0.829 and 0.82
      • Cluster size distribution follows a power law
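
The slide reports modularity values for the resulting partitions. As a reminder of what that number measures, the sketch below computes the standard modularity Q = Σ_c [ L_c/m − (d_c/2m)² ] of a given partition of an undirected, unweighted graph. Here m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. It evaluates a partition only and is not the clustering algorithm itself. The toy graph and the function name `modularity` are hypothetical.

```python
def modularity(edges, community_of):
    """Modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = {}  # community -> number of edges with both ends inside
    degree = {}    # community -> sum of node degrees
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())

# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {n: "a" for n in range(6)}
q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
```

Splitting the toy graph into its two triangles gives Q = 5/14, while putting every node into one community gives Q = 0. Values above 0.8, as reported on the slide, indicate a far stronger community structure than this small example has.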

User-based CF: Performance
  • Precision for 1,000 randomly chosen authors
  • Precision computed at the 11 standard recall levels: 0%, 10%, ..., 100%
  • Results
      • Clustering performs better
      • Improvement is not significant
      • Better efficiency
  • Further improvement
      • Different networks: citation
      • Overlapping clustering
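
A sketch of precision at the 11 standard recall levels for a single ranked recommendation list. It assumes the usual information-retrieval convention of interpolated precision (the maximum precision at any recall greater than or equal to the level), which the slide does not spell out. The function name and the toy ranking are hypothetical.

```python
def eleven_point_precision(ranked, relevant):
    """Interpolated precision at recall 0.0, 0.1, ..., 1.0 for one ranking."""
    relevant = set(relevant)
    points = []  # (recall, precision) measured after each relevant hit
    hits = 0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    levels = [i / 10 for i in range(11)]
    # Interpolation: best precision among points with recall >= level.
    return [max((p for r, p in points if r >= level - 1e-12), default=0.0)
            for level in levels]

ranked = ["a", "x", "b", "y", "c"]   # recommended venues, best first
relevant = {"a", "b", "c"}           # venues the author actually attended
curve = eleven_point_precision(ranked, relevant)
```

Averaging such curves over all test authors gives one precision value per recall level, which is how two recommenders are usually compared in this setting.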

Item-based CF: Venue Network Creation and Clustering
  • Knowledge network
      • Aggregate bibliographic coupling counts at venue level
      • Undirected graph G(V, E), where V: venues, E: edges weighted by cosine similarity
      • Threshold:
      • Clustering: density-based algorithm [Newman 2004, Clauset 2004]
      • Network visualization: force-directed paradigm [Fruchterman 1991]
  • Knowledge flow network (for venue ranking, see Pham & Klamma 2010)
      • Aggregate bibliographic coupling counts at venue level
      • Threshold: citation counts >= 50
      • Domains from Microsoft Academic Search (http://academic.research.microsoft.com/)
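
One possible reading of the knowledge-network construction, as a sketch: each venue is described by how often its papers cite each reference, venues are compared by the cosine of these citation profiles, and an edge is kept when the similarity reaches a threshold. The exact aggregation and the threshold value are not given on the slide, so the profile definition, the `threshold` parameter, and the function `venue_network` are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

def venue_network(citations, threshold):
    """citations: iterable of (citing_venue, cited_paper) pairs."""
    profile = defaultdict(Counter)
    for venue, cited in citations:
        profile[venue][cited] += 1

    def cosine(a, b):
        dot = sum(c * b[k] for k, c in a.items() if k in b)
        na = math.sqrt(sum(c * c for c in a.values()))
        nb = math.sqrt(sum(c * c for c in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    edges = {}
    for u, v in combinations(sorted(profile), 2):
        w = cosine(profile[u], profile[v])
        if w >= threshold:  # keep only sufficiently similar venue pairs
            edges[(u, v)] = w
    return edges

# Hypothetical toy input: V1 and V2 cite the same papers, V3 does not.
citations = [("V1", "p1"), ("V1", "p2"), ("V2", "p1"), ("V2", "p2"), ("V3", "p9")]
edges = venue_network(citations, threshold=0.5)
```

The resulting weighted, undirected edge list is the kind of input a community-detection or force-directed layout step would take.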
Knowledge Network: the Visualization
Knowledge Network: Clustering
Interdisciplinary Venues: Top Betweenness Centrality
High Prestige Series: Top PageRank

Conclusions and Future Research
  • Clustering and recommender systems
      • Advantage of using additional information for clustering
      • Application of clustering to both user-based and item-based CF
      • Key issue: impact of the communities (clusters) on the quality of recommendations; non-overlapping vs. overlapping communities
  • Outlook
      • Further evaluation: trust network clustering, paper and potential collaborator recommendation
      • Datasets: Epinions, Last.fm, etc.
      • Digital libraries in Web 2.0: Mendeley, ResearchGate, etc.


