Cold Start Context-Based Hotel Recommender System
Asher Levi, Osnat (Ossi) Mokryn, Christophe Diot, Nina Taft
Hotel Domain
• A user cold start problem
• Contextual information
• Domain data (Venere, TripAdvisor)
 • Metadata (name, price, location)
 • Reviews - anonymous
   • Text, trip intent, nationality
 • Ratings
   • Over 87% of the ratings are in the range of [3-5]
 • 3,800 hotels and 140,000 reviews


Can you guess ratings from reading reviews?

• Mechanical Turk workers' estimations
• 50 reviews, 3715 estimations

"The hotel was really dirty, the room was small, the location was
bad but the staff was great…"
    1★    2★    3★

    Rate Difference      Average Difference   Count
    Estimation > Rate    4                    (39.7%)
    Estimation < Rate                         (60.3%)
    Total                38                   3715 (100%)




In a Nutshell
• We know that:
 • Users are generous with the star ratings while
   expressing their real opinion in writing
 • Previous visits might have different intents
  • In a different context a user might rate the same
    hotel differently
• Do the context groups have different
  needs?
 • Can we identify them?

Can we couple text analysis and user context to yield a
better recommendation?
Common Traits
• A trait in psychology is a basic characteristic of a
  person
 • Introvert vs. extravert
• Common traits
 • Chinese year of birth determines a person's traits, for a group
   of people




We defined common traits in text




Feature weight
• For each feature we assign a weight that
  reflects its importance for each context group.
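
The speaker notes describe the weight as 1 plus the deviation divided by the standard deviation whenever the deviation exceeds the standard deviation. A minimal sketch of that rule follows. Several details are assumptions rather than the authors' implementation: the function name `feature_weight`, the use of the population standard deviation across groups, the fallback to a base weight of 1, and the sign of the deviation. The notes state it as the cross-group average minus the group frequency; the sketch uses the group frequency minus the average so that a feature mentioned more often by a group receives the larger weight.

```python
from statistics import mean, pstdev


def feature_weight(freq_by_group, group):
    """Weight of one feature for one context group (illustrative sketch).

    freq_by_group maps each context group to the frequency of the feature
    in the reviews written by that group.
    """
    freqs = list(freq_by_group.values())
    avg = mean(freqs)
    std = pstdev(freqs)  # assumption: population standard deviation
    # Assumption: deviation is measured as group frequency minus average.
    deviation = freq_by_group[group] - avg
    if std > 0 and deviation > std:
        return 1.0 + deviation / std
    return 1.0  # base weight when the feature is not a trait of the group


# Hypothetical frequencies of "wifi" per trip intent.
wifi = {"single": 0.30, "family": 0.05, "group": 0.05,
        "couple": 0.05, "business": 0.05}
single_weight = feature_weight(wifi, "single")
family_weight = feature_weight(wifi, "family")
```

With these made-up frequencies the feature counts as a trait only for the single travellers, and every other group keeps the base weight.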




Common Traits
• Examples of common traits per group:
 • Single traveller: wifi, tv, price, supermarket.
 • Family: air conditioning, car, space, shuttle, breakfast.
 • Group: bar, money, bus stop, shopping, party.
 • Couple: coffee, view, balcony, breakfast.
 • Business: Internet, park, bar, shopping.




User Preferences
• Preferences for different hotel aspects
 • Room, Location, Service etc.
• Cluster features that relate to each aspect
 • Unsupervised Community Detection - Spin Glass
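
According to the speaker notes, the clustering runs on a graph whose nodes are features and whose links carry the PMI (pointwise mutual information) of two features in the reviews. The sketch below only builds those weighted edges; it does not implement spin glass itself. The function name `pmi_edges`, the representation of a review as a set of features, and the choice of the natural logarithm are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_edges(reviews):
    """Return {(feature_a, feature_b): pmi} for pairs that co-occur.

    reviews is a list of sets, each holding the features mentioned
    in one review (an assumed representation).
    """
    n = len(reviews)
    single = Counter()
    pair = Counter()
    for feats in reviews:
        single.update(feats)
        # Sort so that each unordered pair has one canonical key.
        pair.update(combinations(sorted(feats), 2))
    edges = {}
    for (a, b), count in pair.items():
        p_ab = count / n
        p_a = single[a] / n
        p_b = single[b] / n
        edges[(a, b)] = math.log(p_ab / (p_a * p_b))
    return edges


# Hypothetical toy reviews.
toy_reviews = [
    {"bed", "bathroom"},
    {"bed", "bathroom"},
    {"metro", "bus stop"},
    {"metro", "bus stop"},
]
edges = pmi_edges(toy_reviews)
```

The resulting weighted edge list could then be handed to a community detection routine, for example the spin glass implementation in python-igraph (`Graph.community_spinglass`), to obtain the aspect clusters.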




Spin Glass Communities

• Number of communities is determined by the algorithm
• Communities sizes differ, and are also determined by algorithm

Communities (figure labels): Room, Location, Facilities, Experience, Service, Food
BUILDING A PERSONALIZED HOTEL SCORE

9/13/2012
Preprocessing (from text reviews):
 • Assign weights to features for each intent
 • Assign weights to features per nationality
 • Cluster hotels features to aspects
 • Build opinion lexicon with orientation (WordNet)

User input:
 • User intent
 • User nationality
 • User preferences

Building personalized score:
 • Select relevant feature weight for intent
 • Select relevant feature weight for nationality
 • For each aspect, take features in that cluster and assign weight
 • Build feature weight
 • Give semantic orientation for feature
 • Build sentence, review score
 • Build final hotel score

Output is ranked order list of hotels
User's Hotel Score
• The user selects:
 • Purpose of the trip
 • Nationality
 • Aspect preference




Feature weight Based Scoring




Example
  Alice: Bathroom weight = 1, then 2, then 4
  Bob:   Bathroom weight = 1, then 2

  Alice's preferred aspect: Room (includes Bathroom)
  Bob's preferred aspect: Location
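
The Alice and Bob example can be sketched as a product of per-user weights, since the speaker notes say the intent, nationality and aspect weights are multiplied. The function name `user_feature_weight` is hypothetical. The multiplier of 2 for a feature in the user's preferred aspect is inferred from the step from 2 to 4 in the example, and the intent weight of 1 assumes bathroom is not a trait of either user's trip intent.

```python
def user_feature_weight(intent_w, nationality_w, aspect_w, base=1.0):
    """Combine the three context weights of one feature for one user."""
    return base * intent_w * nationality_w * aspect_w


# Bathroom is a trait of their shared nationality (weight 2).
# Alice prefers the Room aspect, which contains bathroom; Bob prefers Location.
alice_bathroom = user_feature_weight(intent_w=1.0, nationality_w=2.0, aspect_w=2.0)
bob_bathroom = user_feature_weight(intent_w=1.0, nationality_w=2.0, aspect_w=1.0)
```

Multiplying rather than adding lets two users from the same context group still end up with different weights once their aspect preferences differ.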
Hotel Orientation Score
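
The speaker notes describe this score as a hierarchy: a feature score is the semantic orientation times the user's feature weight, a sentence score sums its feature scores, a review score sums its sentence scores, and the hotel orientation score averages the review scores. The sketch below follows that description under assumed data shapes: a review is a list of sentences, a sentence is a list of `(feature, orientation)` pairs, and features without an explicit weight default to 1. All names are hypothetical.

```python
def sentence_score(sentence, weights):
    """Sum of orientation * user weight over the features in a sentence."""
    return sum(orientation * weights.get(feature, 1.0)
               for feature, orientation in sentence)


def review_score(review, weights):
    """Sum of the sentence scores of one review."""
    return sum(sentence_score(sentence, weights) for sentence in review)


def hotel_orientation_score(reviews, weights):
    """Average review score over all reviews of a hotel."""
    if not reviews:
        return 0.0
    return sum(review_score(r, weights) for r in reviews) / len(reviews)


# Hypothetical user for whom bathroom has weight 4.
weights = {"bathroom": 4.0}
reviews = [
    [[("bathroom", 1), ("staff", 1)]],          # one positive sentence
    [[("bathroom", -1)], [("location", 1)]],    # one negative, one positive
]
score = hotel_orientation_score(reviews, weights)
```

Because the weight multiplies the orientation, a negative remark about a highly weighted feature pulls the score down as strongly as a positive remark lifts it.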




Bias Adjustment
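
The speaker notes outline the nationality bias of a hotel: for every review of hotel h written by nationality n, subtract the overall average score and the hotel bias, then average the results. The sketch below is one reading of that outline, not the authors' formula. It assumes ratings arrive as `(hotel, nationality, rating)` tuples, that the overall average is the mean of all ratings, and that the hotel bias is the hotel's mean rating minus that overall average. The function names are hypothetical.

```python
from statistics import mean


def hotel_bias(ratings, hotel, global_avg):
    """Assumed definition: mean rating of the hotel minus the global mean."""
    hotel_ratings = [r for h, _, r in ratings if h == hotel]
    return mean(hotel_ratings) - global_avg


def nationality_bias(ratings, hotel, nationality, global_avg):
    """Average residual of one nationality's ratings for one hotel."""
    b_h = hotel_bias(ratings, hotel, global_avg)
    residuals = [r - global_avg - b_h
                 for h, n, r in ratings
                 if h == hotel and n == nationality]
    return sum(residuals) / len(residuals) if residuals else 0.0


# Hypothetical ratings.
ratings = [
    ("h1", "DE", 5.0), ("h1", "DE", 5.0),
    ("h1", "FR", 4.0), ("h1", "FR", 4.0),
    ("h2", "DE", 3.0), ("h2", "FR", 3.0),
]
global_avg = mean(r for _, _, r in ratings)
de_bias = nationality_bias(ratings, "h1", "DE", global_avg)
fr_bias = nationality_bias(ratings, "h1", "FR", global_avg)
```

The trip-intent bias mentioned in the notes could be computed the same way by swapping the nationality field for the intent.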




Hotel Score




Validation
• Verify the usefulness of nationality and bias
 • Queries to the system with the tested parameter and without it
 • Number of queries executed was 2500
 • Calculate the distance for each query result (Jaccard distance)

    Parameter      Top 10    Top 20
    Nationality    16.6%     15%
    Bias Score     9%        8%
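
The Jaccard distance between two result lists is one minus the size of their intersection divided by the size of their union. A small sketch of how the top-k results of a query with and without a parameter might be compared follows; the function name and the hotel identifiers are made up for illustration.

```python
def jaccard_distance(results_a, results_b):
    """1 - |A ∩ B| / |A ∪ B| for two collections of hotel ids."""
    a, b = set(results_a), set(results_b)
    union = a | b
    if not union:
        return 0.0  # two empty result lists are treated as identical
    return 1.0 - len(a & b) / len(union)


# Hypothetical top-4 results with and without the nationality parameter.
with_param = ["h1", "h2", "h3", "h4"]
without_param = ["h1", "h2", "h3", "h5"]
distance = jaccard_distance(with_param, without_param)
```

Averaging this distance over all test queries would give one number per parameter and list length, in the spirit of the table above.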



Evaluation
• Human evaluation

• We present the user a list of six hotels
 • Recommendation from our system
 • Top rated hotels from TripAdvisor
 • Random order


• We obtained 150 evaluations


Evaluation

• For each hotel in the results the user answered:




Evaluation Results
   Would you select this hotel?




Evaluation Results
How well is this recommendation matching your expectations?




Conclusions
• The Mechanical Turk experiment shows that text
  carries more information than ratings
• Common traits can be found by pre-processing
  large samples of text
• With the use of traits we improved
  recommendations
• Future uses:
 • Can group traits help identify whether an individual belongs to a
   group?
 • Can a typical user per product be identified?

Cold Start Context-Based Hotel Recommender System

Asher, Ossi, Christophe, Nina

Thanks! Questions?


Editor's Notes

  1. This work was done with Ossi Mokryn, Christophe Diot and Nina Taft. This is a paper about a cold start context-based hotel recommender system, so we are building a hotel recommender system. Why cold start? Unlike restaurants or movies, for example, in the hotel domain we don't have enough ratings for hotels, which leads to a user cold start problem. We are going to use the user context to overcome the user cold start problem.
  2. So we saw that users are generous with the star ratings while expressing their real opinion in writing. Think about someone who goes to a hotel with her family and kids and then, on a different occasion, goes to the same hotel for a business trip. Her needs for these two trips will probably be different, and the reviews that she writes might be different. It is logical to think that users have different needs in different contexts, and we want to identify those needs for each context group. So the main question here is: can we couple text analysis and user context to yield a better recommendation?
  3. So how are we going to do it? With the use of common traits. We borrow this term from psychology: a trait is a basic characteristic of a person, for example a shy person, a kind person, etc. Common traits are defined for groups. For example, in the Chinese tradition the year of birth determines a person's traits. Everyone who was born this year, the year of the dragon, shares certain characteristics; according to the tradition they will be, for example, brave and flexible.
  4. We defined a common trait for each context group as the typical words that appear in the text written by users from this group. For example, when we look at reviews that were written by single travelers, we can see that they write a lot about wifi (a lot more than the other groups), which means that wifi is a very important feature for this group, so wifi is a common trait for this context group. We extracted the hotel features from the review text, for example service, parking, pool, etc. Then, if the average appearance of a feature in all the context groups minus its frequency for context group c is larger than the standard deviation, this feature is a trait for this group.
  5. In practice, for each feature that we extracted we assign a weight that reflects its importance for each context group. This is the weight function of feature f for context group c. The deviation is the average frequency of a feature in the text of all the groups minus the frequency of this feature for this group. When the deviation is higher than the standard deviation, the weight is 1 plus the deviation divided by the standard deviation.
  6. To make our recommendation better we also need to know the user's preferences for the hotel aspects. First let's define aspect. As I told you before, we extracted features from the reviews' text; an aspect is a group of features with the same orientation. For example, the hotel aspect Room includes the features bathroom, bed, room size, tv, etc. To find all the features that belong to an aspect, we built a graph in which each node corresponds to a feature and the links are the PMI (pointwise mutual information) of the two features in the reviews. We used an unsupervised community detection algorithm called spin glass, from the area of statistical mechanics. The algorithm minimizes the function that we see here. The basis of this function is to reward internal links that exist and external links that don't exist, and to penalize internal links that don't exist and external links that exist. Each community that we find then corresponds to a hotel aspect.
  7. We found 6 different clusters and labeled them manually. Each feature can end up in only one cluster. The number of communities and their sizes are determined by the algorithm. The identification of these clusters is important, as it determines the particular hotel aspects that we chose to ask users their preferences for. Suppose, for example, that a user specifies that room is of most importance to her. The room aspect identifies a large number of features that are often used to discuss things inside a hotel room; thus reviews in which these features occur frequently are more important to this user than to a user who cares more about location.
  8. So now we are going to build a personalized hotel score.
  9. Now I am going to give an overview of the system. In the preprocessing phase we extracted the features from the review text, then we assigned a weight to the features for each context group, we clustered the features into hotel aspects, and we built an opinion lexicon with orientation. That was the preprocessing. Now, at run time, the user gives us her input. Think about what happens when you go online to look for a hotel: first of all you filter the hotels by price and other metadata, then you read the reviews about the hotels. When you read the reviews, imagine that you are wearing personalized glasses and you look for the things that are important for you. We try to use these glasses to read the hotel reviews, and we give a high weight to things that are important for you. So in real time we build a personalized hotel score and recommend to you the hotels that match your needs.
  10. When the user uses the system she selects the purpose of the trip, her nationality and her preferences for each of the hotel aspects.
  11. So how are we going to create these glasses? With weights: we are going to give each feature a weight that reflects how important this feature is for the user. After the user selects her input we have three different weights for the features: a weight for the trip intent, for the nationality and for the hotel aspect preferences. We multiply those weights and get a weight for this user for each feature. The weights are multiplied to allow differentiation of users within our groups.
  12. Let's see an example with the feature bathroom. The base weight for all features is 1. Now Alice and Bob, with the same nationality, use the system, and bathroom is a trait for their nationality, so the weight of bathroom for their nationality is 2. For Alice the room aspect is the most important one, and for Bob the location. The feature bathroom belongs to the aspect room, so bathrooms are more important for Alice, and the weight of bathroom for Alice is multiplied up to 4. In this example we saw how the user's input influences the feature weight.
  13. The hotel orientation score is the average score of all the hotel's reviews. The review score is the sum of all the sentence scores, and the sentence score is the sum of all the feature scores. The feature's score is the semantic orientation score multiplied by the feature's weight for user u. The review score should reflect the relative importance of the given review for the user. Reviews that are both important and positive are deemed most relevant, thereby receiving the highest scores.
  14. The last parameter is the bias adjustment. It is based on the hotel ratings; it is a rating of the hotel for a specific user, and it contains the bias of the hotel, the bias of the purpose of the trip and the bias of the nationality of the user. For example, if Germans rate a Best Western hotel 0.5 higher than all the other nationalities, then the nationality bias of this hotel will be 0.5. The calculation of the biases is pretty straightforward. Let's look at the bias of nationality for a hotel: for each review that was written on hotel h by nationality n, we subtract the average score of all the hotels and the hotel bias, we sum all that and divide it by the number of reviews. The hotel orientation score is between -40 and 80 and the bias term is between 0 and 5. Biases are included primarily to break ties, or to differentiate hotels when their scores are very close.
  15. The hotel score is the hotel orientation score plus the bias adjustment.
  16. The effect of considering the intent of the trip is clear intuitively and from our results. We want to further verify the usefulness of the nationality and bias parameters. We tested the effect of those parameters on the search results. We made queries to the system with the tested parameter and without it, then we counted how many different hotels were found between the results. We can see that the nationality context group affects 15-16% of the results, and the bias score affects only 8-9% of the results; indeed, the bias score breaks ties and differentiates between hotels whose orientation scores are very close.
  17. We used human evaluation. We implemented our system and made it available to the public for use. The user makes a query and then we present the user a list of six hotels. Some are results from our system, and some are the top rated hotels from TripAdvisor. In order to avoid biasing the users, the six hotels are presented in random order, and thus the user is unaware of the source of the recommendations. We obtained 150 evaluations.
  18. For each hotel the users were asked 3 different questions. They needed to rate on a scale of 1-5 how well this recommendation matches their needs. The second question was whether they would select this hotel, and the last one was what the main reason motivating their decision was.
  19. For more than 60% of the hotels recommended by our system, users said that they would select the hotel, compared to more than 50% for the star rating recommendation. The other important thing is that for almost 16% of the hotels recommended by our system users said they wouldn't go, compared to more than 26% of the hotels in the star rating recommendation. You can clearly see that the users were more satisfied with the results from our system.
  20. These are the ratings that the evaluators gave to the hotels that were recommended to them. You can see that the hotels recommended by our method received more 4 and 5 ratings than those of the other method, and similarly our recommended hotels received fewer 1 and 2 ratings than the other method. People were more satisfied with our system, and the approval rate is higher.