Denis: studying/working to be a faculty/researcher
  • (Denis Parra || Denis Parra-Santander), PhD Student
  • http://www.sis.pitt.edu/~dparra/
  • March 18th, 2011
  • PAWS Lab, School of Information Sciences, University of Pittsburgh
What is this presentation about?
  • A short introduction of myself
  • A description of my research interests and what I have been doing about them in recent years
I.1 Where are you from?
  • I am from Chile, a country that looks like a chile pepper but where, paradoxically, people don't eat much spicy food.
  • Chile ≠ [red hot chile pepper] && Chile ≠ México
I.2 Are you from Santiago, the capital?
  • Good try. One third of the 16 million Chileans live in Santiago.
  • But Chile is a looong country: the north is hot and dry, and the south is very cold.
  • I live in Valdivia, a city with rainy weather.
  • [Figure labels: Very Hot! / Here I Live! Valdivia / Very Cold!]
I.3 Which activities do you like to do?
  • I like playing tennis, running & rowing.
  • I like writing poetry. Check some poems here in Spanish (translated to English).
  • I like reading novels; my favorite authors are J. L. Borges, Fyodor Dostoyevsky & James Joyce (right now I'm reading a novel by Roberto Bolaño).
  • I like listening to music, from Blues to Lady Gaga, passing by Pink Floyd, Radiohead and Los Jaivas.
  • I like watching movies like "A Clockwork Orange" by S. Kubrick and "Underground" by E. Kusturica. I also like surrealistic movies like "The Holy Mountain" by Alejandro Jodorowsky.
I.4 OK, but now let's talk about work
  • (1997 - 2002) I have a BS in Engineering with emphasis in Informatics from Universidad Austral de Chile. This is a 6-year program; my undergrad thesis was titled "SPORAS: An Adaptive Web Platform based on a Multiagent System and Ontologies" (my first link to Dr. Brusilovsky's field, Adaptive Hypermedia).
  • Then I worked on several e-learning projects, developing on open-source LMSs such as Dokeos and Moodle (2003-2004).
  • Later on, I worked as IT Manager and consultant for an aquaculture company, Aqua Cards, in the south of Chile (2005-2007).
  • I was also teaching OOP (Java), Matlab, and Introduction to Software Engineering (2004, 2006-2007).
  • In 2007 I co-founded a company, Perceptum TI.
I.5 And what about research?
  • In 2008 I started the PhD program and joined the PAWS lab (led by Dr. Brusilovsky), so here is where this presentation starts.
  • Tag-based recommendations
  • Spreading Activation for recommender systems
  • Related projects: CourseAgent, TagTheMap, Conference Navigator, Latent Communities
  • This presentation: Walk the Talk: Mapping explicit
I.6.1 Tag-based recommendations
  • Main topic: the lack of ratings for most items in many systems pushes us to look for alternative ways to apply user-based and item-based Collaborative Filtering. We explore 2 variants: neighbor-weighted and tag-based BM25.
  • Presented a workshop paper at HT'09, p1
  • Presented a short paper (poster) at RecSys'09, p2
  • Presented a short paper at WI'09, p3
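
To make the tag-based BM25 idea concrete, here is a minimal sketch that scores items against a user's tags with the plain Okapi BM25 formula. It is an illustration only and not necessarily the variant used in the papers: the function name `bm25_scores`, the parameter defaults `k1=1.2` and `b=0.75`, and the toy tag counts are all assumptions.

```python
import math

def bm25_scores(query_tags, item_tags, k1=1.2, b=0.75):
    """Score every item against a set of user tags with Okapi BM25.

    item_tags maps item -> {tag: count}. Each item plays the role of a
    "document" and the user's tags play the role of the "query".
    """
    n_items = len(item_tags)
    lengths = {item: sum(tags.values()) for item, tags in item_tags.items()}
    avg_len = sum(lengths.values()) / n_items
    scores = {}
    for item, tags in item_tags.items():
        total = 0.0
        for tag in set(query_tags):
            freq = tags.get(tag, 0)
            if freq == 0:
                continue
            # Number of items annotated with this tag (document frequency).
            df = sum(1 for other in item_tags.values() if tag in other)
            # Smoothed IDF that stays non-negative.
            idf = math.log((n_items - df + 0.5) / (df + 0.5) + 1.0)
            norm = freq + k1 * (1 - b + b * lengths[item] / avg_len)
            total += idf * freq * (k1 + 1) / norm
        scores[item] = total
    return scores

# Hypothetical toy data: three items and the tags users assigned to them.
items = {
    "a": {"rock": 3, "indie": 1},
    "b": {"jazz": 2},
    "c": {"rock": 1, "pop": 4},
}
scores = bm25_scores(["rock", "indie"], items)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Items that share none of the user's tags get a score of zero, so the ranking needs no ratings at all, which is the point of the approach.
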
I.6.2 Spreading activation
  • Presented a paper in a workshop of RecSys 2009, p4
  • Looking for a way to apply spreading activation for recommendations in order to:
    • Make use of the multidimensional network structure of folksonomies (users, items, tags)
    • Find a scalable algorithm (compared to the state-of-the-art FolkRank, SVD and LDA-based approaches) that makes use of local topology/neighborhood
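
The idea of spreading activation over a folksonomy can be sketched as follows. This is a generic textbook-style version under stated assumptions, not the algorithm from the workshop paper: the graph construction, the `decay` factor, the number of `pulses`, and the toy (user, tag, item) triples are all hypothetical.

```python
from collections import defaultdict

def build_graph(assignments):
    """Build an undirected graph from (user, tag, item) tag assignments."""
    graph = defaultdict(set)
    for user, tag, item in assignments:
        u, t, i = ("user", user), ("tag", tag), ("item", item)
        for a, b in ((u, t), (t, i), (u, i)):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def spread(graph, source, pulses=2, decay=0.5):
    """Push activation outwards from source, using only local neighborhoods."""
    activation = defaultdict(float)
    frontier = {source: 1.0}
    for _ in range(pulses):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            neighbors = graph[node]
            if not neighbors:
                continue
            # Each node splits its decayed energy evenly among its neighbors.
            share = decay * energy / len(neighbors)
            for neighbor in neighbors:
                next_frontier[neighbor] += share
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return activation

def recommend(graph, user, top_n=5, pulses=2, decay=0.5):
    """Rank items the user has not touched yet by accumulated activation."""
    source = ("user", user)
    activation = spread(graph, source, pulses=pulses, decay=decay)
    seen = {node for node in graph[source] if node[0] == "item"}
    scored = [(node[1], score) for node, score in activation.items()
              if node[0] == "item" and node not in seen]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))[:top_n]

# Hypothetical toy folksonomy.
triples = [("ana", "rock", "album1"),
           ("ben", "rock", "album2"),
           ("ben", "jazz", "album3")]
graph = build_graph(triples)
print(recommend(graph, "ana"))
```

Because each pulse only visits the neighbors of currently active nodes, the cost depends on the local neighborhood rather than on the size of the whole network.
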
I.6.3 Related Projects
Part II: So, finally
  • This project is based on the work of my internship at Telefónica Research (Barcelona, Spain) in the summer of 2010.
  • Paper submitted to UMAP 2011: "Walk the Talk: Analyzing the relation between implicit and explicit feedback for preference elicitation"
  • (I am co-author with Dr. Xavier Amatriain)
II.1 Introduction (1/2)
  • Explicit feedback: scarcity (people are not especially eager to rate)
  • Implicit feedback: is less scarce, but (Hu et al., 2008):
    • There's no negative feedback
    • Noisy
    • Preference vs. confidence
    • Lack of evaluation metrics
II.1 Introduction (2/2)
  • Which variables better account for the number of times a user listens to online albums?
  • Is it possible to map implicit behavior to explicit preference (ratings)?
  • Study with Last.fm users:
    • Part I: demographics and online music consumption
    • Part II: rating 100 albums collected from their Last.fm user profile
II.2 About last.fm
II.2.1 Survey Screenshots
  • Requirements: 18 y.o., scrobblings > 5000
II.2.2 Survey Part I
  • Prerequisites: 18 years old & 5,000 minimum playcount (scrobblings)
  • # Users: 151 users started, 127 completed, 114 after filtering outliers.
  • 82% were male and 18% were female.
  • From 23 different countries; the main ones were Spain (25 users), U.S. (15 users), and UK (16 users).
  • 80% used the internet 20 or more hours per week. 50% of users listened to music for over 20 hours per week.
  • 9% did not attend music concerts. 30% went to 11 or more concerts a year.
  • 35% said that they only read music magazines or blogs sometimes, but 20% did it every week.
  • 50% of our subjects admitted rating music online never or seldom.
  • 45% of our subjects said they bought 1 to 10 physical records a year. However, a non-negligible 18% said they did not buy any.
  • 35% of our subjects reported never buying music online; 8% said they do it once a month or more.
  • 14% preferred to listen to single tracks, while over 45% preferred listening to full albums. The other 40% reported listening to music either way.
II.2.3 Survey Part II
  • For item (album) sampling, we accounted for:
    • Implicit Feedback (IF): playcount for a user on a given item. Changed to scale [1-3]; 3 means more listened to.
    • Global Popularity (GP): global playcount for all users on a given item. Changed to scale [1-3]; 3 means more listened to.
    • Recentness (R): time elapsed since the user played a given item. Changed to scale [1-3]; 3 means listened to more recently.
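
The slide does not say how raw values were mapped to the [1-3] scale. One plausible way, shown here purely as an assumption, is to split each variable into three equally sized groups (terciles). The function names `to_three_levels` and `recentness_levels` and the sample numbers are hypothetical.

```python
def to_three_levels(values):
    """Map raw values to 1, 2 or 3 using tercile cut points (an assumed scheme)."""
    ordered = sorted(values)
    n = len(ordered)
    low_cut = ordered[n // 3]          # values below this get level 1
    high_cut = ordered[(2 * n) // 3]   # values at or above this get level 3
    levels = []
    for value in values:
        if value < low_cut:
            levels.append(1)
        elif value < high_cut:
            levels.append(2)
        else:
            levels.append(3)
    # Note: with many tied values the cut points can coincide and a level may stay empty.
    return levels

def recentness_levels(elapsed):
    """Recentness is inverted: a short elapsed time means level 3 (most recent)."""
    return [4 - level for level in to_three_levels(elapsed)]

# Hypothetical playcounts and days since last play for six albums.
playcounts = [5, 10, 20, 40, 80, 160]
days_since_play = [1, 2, 30, 60, 300, 400]
print(to_three_levels(playcounts))
print(recentness_levels(days_since_play))
```

The same `to_three_levels` helper would serve both IF and GP, since for both a larger raw count maps to a higher level.
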
II.3 General Analysis
  • Initial assumption: rating and IF (# playcount) must be strongly correlated.
II.3.1 Distribution of ratings
  • Average rating:
    • Considering 0s: 3.206316
    • Not considering 0s: 3.611144
II.3.2 Implicit Feedback
II.3.3 Recentness
II.3.4 Global Popularity
Effect of Track or CD
II.3.5 General Analysis - Findings
  • We see a strong positive correlation between ratings and implicit feedback.
  • We see some level of positive correlation between ratings and recentness.
  • We don't expect a significant relation between ratings and global popularity.
  • On demographic data: only the preference for listening to tracks or albums shows a significant effect (using ANOVA).
II.4 Regression Analysis
  • Including Recentness increases R² by more than 10% [1 -> 2]
  • Including GP increases R², but not much compared to RE + IF [1 -> 3]
  • Not including GP, but including the interaction between IF and RE, improves the variance of the DV explained by the regression model [2 -> 4]
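
The comparison of nested regression models can be sketched in outline as follows. This runs on synthetic data, not on the study's data: the generated ratings, the coefficients, and the helper names `fit_ols` and `r_squared` are assumptions, and model 3 is read here as IF + RE + GP.

```python
import random

def fit_ols(X, y):
    """Least squares via the normal equations, solved by Gauss-Jordan elimination."""
    n = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(n):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][n] / A[i][i] for i in range(n)]

def r_squared(X, y):
    """Share of the variance of y explained by the fitted linear model."""
    beta = fit_ols(X, y)
    pred = [sum(b * v for b, v in zip(beta, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot

random.seed(0)
# Synthetic albums: (IF, RE, GP), each already on the [1-3] scale.
data = [(random.randint(1, 3), random.randint(1, 3), random.randint(1, 3))
        for _ in range(300)]
# Made-up rating process, chosen only so that the sketch has something to fit.
ratings = [1.0 + 0.6 * imp + 0.3 * rec + 0.1 * imp * rec + random.gauss(0, 0.5)
           for imp, rec, pop in data]

# Design-matrix rows for the four models; the leading 1.0 is the intercept.
models = {
    1: lambda imp, rec, pop: [1.0, imp],
    2: lambda imp, rec, pop: [1.0, imp, rec],
    3: lambda imp, rec, pop: [1.0, imp, rec, pop],
    4: lambda imp, rec, pop: [1.0, imp, rec, imp * rec],
}
r2 = {k: r_squared([f(*row) for row in data], ratings) for k, f in models.items()}
for k in sorted(r2):
    print(f"model {k}: R^2 = {r2[k]:.3f}")
```

On training data, R² can never decrease when a predictor is added to a nested model, so the interesting question is how large each increase is, which is what the slide reports.
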
II.4.1 Regression Analysis
  • We tested the conclusions of the regression analysis by predicting the score, using RMSE and 10-fold cross-validation.
  • The results of the regression analysis are supported.
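
A minimal sketch of evaluating a linear model with RMSE under 10-fold cross-validation is given below. It uses synthetic data and a hand-written least-squares solver; the helper names (`fit_ols`, `rmse`, `k_fold_rmse`), the shuffling seed, and the data-generating process are assumptions made for illustration.

```python
import math
import random

def fit_ols(X, y):
    """Least squares via the normal equations, solved by Gauss-Jordan elimination."""
    n = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(n):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][n] / A[i][i] for i in range(n)]

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def k_fold_rmse(X, y, k=10, seed=0):
    """Average held-out RMSE over k folds of a shuffled split."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]  # k disjoint folds covering every row
    scores = []
    for test in folds:
        held_out = set(test)
        train = [j for j in idx if j not in held_out]
        beta = fit_ols([X[j] for j in train], [y[j] for j in train])
        preds = [sum(b * v for b, v in zip(beta, X[j])) for j in test]
        scores.append(rmse([y[j] for j in test], preds))
    return sum(scores) / k

random.seed(1)
# Synthetic (IF, RE) pairs on the [1-3] scale and made-up ratings.
pairs = [(random.randint(1, 3), random.randint(1, 3)) for _ in range(200)]
ratings = [1.0 + 0.6 * imp + 0.4 * rec + random.gauss(0, 0.5) for imp, rec in pairs]

cv_if_only = k_fold_rmse([[1.0, imp] for imp, rec in pairs], ratings)
cv_if_re = k_fold_rmse([[1.0, imp, rec] for imp, rec in pairs], ratings)
print(f"IF only: {cv_if_only:.3f}   IF + RE: {cv_if_re:.3f}")
```

Unlike R² on the training data, the held-out RMSE can get worse when a useless predictor is added, which is why it is a fairer check of the regression conclusions.
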
II.4.2 Regression Analysis: Including track or CD
  • Including this variable, which seemed to have an effect in the general analysis, helped to improve the accuracy of the model.
II.5 Conclusions
  • Using a linear model, implicit feedback (playcount) and recentness can help to predict explicit feedback (ratings).
  • Global popularity doesn't show a significant improvement in the prediction task (discussion).
  • Our model can help to relate implicit and explicit feedback, helping to evaluate and compare explicit and implicit recommender systems.
  • Ongoing work?
THANKS for spending your time listening to this talk
  • Questions? dap89@pitt.edu
Survey part I results
Graphics comparing % of ratings given 2 variables

Current steps to be a researcher and faculty
