In this work we describe an approach to multi-objective recommender system evaluation based on a previously introduced 3D benchmarking model. The benchmarking model takes user-centric, business-centric and technical constraints into consideration in order to provide a means of comparing recommender algorithms in similar scenarios. We present a comparison of three recommendation algorithms deployed in a user study using this 3D model and compare it to standard evaluation methods. The proposed approach simplifies benchmarking of recommender systems and allows for simple multi-objective comparisons.
A 3D Approach to Recommender System Evaluation
Alan Said, Brijnesh J. Jain, Sahin Albayrak
{alan, jain, sahin}@dai-lab.de
CSCW 2013 – San Antonio, TX, USA
Traditional recommender system evaluation only measures one type of quality, e.g. recommendation accuracy or rating prediction error.
We propose to evaluate and benchmark additional recommendation qualities:
- User Requirements
  - recommendation accuracy
  - perceived quality, etc.
- Business Values
  - Retention
  - Churn, etc.
- Technical Constraints
  - Scalability
  - Speed, etc.
By defining a recommendation scenario, each of the three factors can be represented by a quality important in the specific use case.
[Figure: the 3D benchmarking model; recoverable axis labels: User Requirements, Business Models]
In each scenario, different concepts have different importance.
We represent the quality of an algorithm a as a function E, a vector of cost functions:
$E(a) = (E_1(a), \dots, E_n(a))$
In order to allow for simple comparison, we formulate the utility function as:
$U(a) = w^T E(a) = \sum_{i=1}^{n} w_i E_i(a)$
where w is the vector of weights defining the importance of each axis. The resulting value represents the quality of the recommendation algorithm in the defined use case.
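The utility described above can be sketched in a few lines of Python, assuming it is the weighted sum of the per-axis values. This is a minimal illustration, not the authors' implementation: the function name `utility`, the axis order (user requirements, business values, technical constraints) and the example numbers are assumptions, and each score is assumed to be already normalized to [0, 1] with higher meaning better.

```python
from typing import Sequence


def utility(weights: Sequence[float], scores: Sequence[float]) -> float:
    """Return the weighted sum of per-axis quality scores."""
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    return sum(w * e for w, e in zip(weights, scores))


# Hypothetical scores for one algorithm on the three axes:
# (user requirements, business values, technical constraints).
example_scores = (0.5, 0.75, 0.25)
equal_weights = (1.0, 1.0, 1.0)

print(utility(equal_weights, example_scores))  # 1.5
```

Collapsing the vector into a single number is what makes two algorithms directly comparable within one scenario; changing the weights changes the scenario, not the measurements.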
We conducted a movie recommendation user study with 132 users providing feedback on 3 recommendation algorithms. Each user rated a number of movies and received 10 recommendations from one of the 3 algorithms. The algorithms were tuned to provide traditional recommendations, diverse recommendations, or random recommendations, respectively.
Users were asked if they would watch the recommended movies (user requirement) and whether they would consider using the
system again (business value). The technical constraint is represented by the time the algorithm took to recommend movies.
The results of the user study are shown with different weights: when all three axes are similarly weighted, when the user requirements are more important, when the business values are more important, and finally when the technical constraints are more important than the other values.
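How the weighting can change a comparison is illustrated by the short Python sketch below. The scores are made-up placeholders, not the results of the user study; the algorithm labels follow the three tunings described earlier, and the weight values in each profile are assumptions chosen only to emphasize one axis at a time.

```python
# Placeholder per-axis scores in [0, 1], higher is better (NOT study results).
# Axis order: user requirements, business values, technical constraints.
SCORES = {
    "traditional": (1.0, 0.5, 0.5),
    "diverse": (0.5, 1.0, 0.25),
    "random": (0.0, 0.25, 1.0),
}

# Hypothetical weight profiles: equal weighting, then one emphasized axis each.
PROFILES = {
    "equal": (1, 1, 1),
    "user": (4, 1, 1),
    "business": (1, 4, 1),
    "technical": (1, 1, 4),
}


def rank(weights):
    """Order algorithm names by weighted-sum utility, best first."""
    totals = {
        name: sum(w * e for w, e in zip(weights, scores))
        for name, scores in SCORES.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


for profile, weights in PROFILES.items():
    print(profile, rank(weights))
```

With these placeholder numbers the top-ranked algorithm differs between profiles, which is the kind of scenario-dependent outcome the different weightings are meant to expose.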
We presented a three-dimensional model for the evaluation of recommender systems that takes user-centric values, technical constraints and business values into consideration. The model simplifies the evaluation and benchmarking of recommendation algorithms in predefined scenarios, e.g. where different qualities of algorithms are sought.
We evaluated the model through a user study comparing 3 different recommendation algorithms and presented different interpretations of the obtained qualities.
Further explanation of the 3D evaluation concept [RUE'12, Said et al. 2012]
Poster abstract [CSCW'13, Said et al. 2013b]
User-Centric Evaluation of a K-Furthest Neighbor Collaborative Filtering Recommender Algorithm [CSCW'13, Said et al. 2013a]
Presentation: Wednesday Feb 27, 10AM. Track 4.
Technische Universität Berlin www.dai-lab.de