1. Deconstructing Popularity Bias in Recommender Systems: Origins, Impacts, and Mitigation
Amit Jaspal
Trust & Responsibility in Recommendation Systems, WSDM 2025
2. Amit's Introduction
Thank you for the opportunity to speak!
Engineering Manager and Research Scientist at Meta, leading the e-commerce recommendations team
Building recommender and information systems in industry for the last 14 years
E-commerce recommendations at Meta
Video recommendations at Meta
Ads recommendations at Meta
Newsfeed recommendations at LinkedIn
Apache Solr at Cloudera
Hurricane search engine at D. E. Shaw
Research fellow at NCSA and TDIL Labs
3. What is Popularity Bias?
Popularity bias refers to the tendency of a recommender system to over-recommend popular items at the
expense of less popular ones. In other words, already-popular items get disproportionate exposure, while long-tail
items are under-represented.
Not a problem unique to recommender systems, but the dynamic, feedback-driven nature of recommender systems makes it worse.
Examples of Popularity Bias in other domains
Academic Research/Citations
Financial Markets/Stock Trading
Book Publishing/Best Seller Lists
Hiring and Job Portals
4. Sources of Popularity Bias in Recommender Systems
Inherent Audience Size Imbalance (Data Bias):
Some items are naturally more appealing to a broader audience.
Item popularity often follows a long-tail distribution inherently.
Even bias-free algorithms will see more interactions with these items.
Model Bias (Algorithmic Bias):
Machine learning models learn patterns from training data, including existing popularity biases.
Collaborative filtering and similar methods tend to amplify popularity signals.
Models may over-generalize from popular item interactions, leading to biased predictions.
Closed Feedback Loop (Systemic Bias):
Dynamic recommendation systems operate in a closed loop.
Recommendations influence user interactions, which become training data for future models.
This creates a feedback loop that can accumulate and exacerbate popularity bias over time.
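A toy simulation of this loop, as a sketch only (the item counts, k, and click rates below are arbitrary illustrative choices): items that reach the top-k collect new interactions, which keeps them in the top-k.
```python
import numpy as np

rng = np.random.default_rng(0)
counts = rng.poisson(10.0, size=100).astype(float) + 1  # initial interactions, roughly uniform
for step in range(20):
    top_k = np.argsort(counts)[-10:]    # the model "learns" to recommend the current head
    clicks = rng.poisson(5.0, size=10)  # only recommended items collect new interactions
    counts[top_k] += clicks             # feedback: clicks become the next round's training data
head_share = counts[np.argsort(counts)[-10:]].sum() / counts.sum()
print(f"top-10 items' share of interactions: {head_share:.0%}")  # grows far beyond the initial ~10%
```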
6. Why Does Popularity Bias Matter?
For Users:
Reduced novelty and serendipity - Recommendations become predictable and less engaging.
Limited personalization - Users may not discover items truly aligned with individual preferences, especially niche interests.
Decreased user satisfaction and trust in the system over time.
For Item Providers (Especially Long-Tail):
Reduced visibility and sales opportunities for less popular items.
Unfair competition - Popular items dominate regardless of quality or relevance to specific users, e.g., clickbait.
The item-side cold-start problem is aggravated.
System-Level:
Reinforcement loops - Bias can worsen over time due to feedback cycles.
The system behaves suboptimally, catering only to popular items on one side and, on the other, to users who are content engaging only with popular items.
7. Measuring Popularity Bias
Gini Coefficient
A statistical measure of inequality within a distribution, computed from the Lorenz Curve.
The Lorenz Curve is a graphical representation of inequality, showing the cumulative distribution of a resource (e.g., wealth, recommendation exposure) across a population.
The Gini Index is the ratio of the area between the Line of Equality and the Lorenz Curve to the total area under the Line of Equality (a minimal computation is sketched below).
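A minimal sketch of this computation, assuming only a vector of per-item exposure counts is available (the function name and the example numbers are illustrative):
```python
import numpy as np

def gini_coefficient(exposures):
    """Gini coefficient of item exposures: 0 = perfectly equal, near 1 = maximally concentrated."""
    x = np.sort(np.asarray(exposures, dtype=float))  # ascending order traces out the Lorenz Curve
    n = len(x)
    cum = np.cumsum(x)                               # unnormalized cumulative exposure
    # Closed form for the area between the Line of Equality and the Lorenz Curve,
    # normalized by the total area under the Line of Equality.
    return (n + 1 - 2 * np.sum(cum) / cum[-1]) / n

print(gini_coefficient([1000, 20, 10, 5, 1]))  # ~0.78: exposure concentrated on one item
print(gini_coefficient([10, 10, 10, 10, 10]))  # 0.0: perfectly equal exposure
```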
Recall breakdown by item popularity bucket, e.g., recall@k for head items vs. recall@k for tail items (sketched below).
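A minimal sketch of the bucketed metric, assuming hypothetical inputs: recs maps each user to a ranked list of recommended items, truth maps each user to held-out relevant items, and pop maps items to interaction counts; the 20% head cutoff is an arbitrary illustrative choice:
```python
def recall_at_k_by_bucket(recs, truth, pop, k=20, head_frac=0.2):
    """Recall@k computed separately for head (most popular) and tail items."""
    ranked = sorted(pop, key=pop.get, reverse=True)  # items, most popular first
    head = set(ranked[: max(1, int(len(ranked) * head_frac))])
    hits = {"head": 0, "tail": 0}
    totals = {"head": 0, "tail": 0}
    for user, relevant in truth.items():
        top_k = set(recs.get(user, [])[:k])
        for item in relevant:
            bucket = "head" if item in head else "tail"
            totals[bucket] += 1
            hits[bucket] += int(item in top_k)
    # A large head/tail gap in recall@k is a symptom of popularity bias.
    return {b: hits[b] / totals[b] if totals[b] else 0.0 for b in ("head", "tail")}
```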
8. Mitigation Strategies
Key Mitigation Goals:
Promote long-tail item visibility.
Improve fairness and diversity.
Maintain or improve recommendation accuracy (or minimize accuracy loss).
Categorization by Processing Stage:
Pre-processing: Modify training data before model training.
In-processing (Modeling): Integrate debiasing directly into the model training process.
Post-processing: Adjust recommendation lists after model prediction.
9. Mitigation Strategies - Pre & Post-processing
Pre-processing
Data Sampling: Down-sample popular-item interactions or up-sample long-tail interactions (sketched after this list).
Item Exclusion: Remove highly popular items from the training data or candidate pool (use with caution).
Balanced Dataset Creation: Aim for a more uniform distribution of item interactions in training data.
Data Augmentation: Enrich data with side information to provide more context beyond popularity.
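As a concrete illustration of the Data Sampling bullet above, a minimal down-sampling sketch, assuming interactions are (user, item) pairs; the per-item cap is a hypothetical tuning knob:
```python
import random

def downsample_popular(interactions, cap=1000, seed=42):
    """Keep at most `cap` interactions per item: trims the head, leaves the tail untouched."""
    shuffled = list(interactions)
    random.Random(seed).shuffle(shuffled)  # shuffle so the kept subset is a random sample
    kept, per_item = [], {}
    for user, item in shuffled:
        if per_item.get(item, 0) < cap:
            kept.append((user, item))
            per_item[item] = per_item.get(item, 0) + 1
    return kept
```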
Post-processing
Re-scaling (Score Adjustment): Adjust predicted scores based on item popularity (sketched after this list).
Re-ranking: Re-order the initial ranked list to promote less popular items.
Rank Aggregation / Slotting: Combine rankings from biased and debiased models.
Post-filtering: Remove top-k popular items from the final recommendation list.
False Positive Correction (FPC): Probabilistically correct scores based on past unclicked recommendations.
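A minimal sketch of the Re-scaling bullet above; the log-popularity penalty and the trade-off parameter alpha are illustrative assumptions, not a prescribed formula:
```python
import numpy as np

def rescale_scores(scores, item_counts, alpha=0.1):
    """Subtract a popularity penalty from predicted scores; larger alpha boosts the tail more."""
    penalty = alpha * np.log1p(np.asarray(item_counts, dtype=float))
    return np.asarray(scores, dtype=float) - penalty

# Re-ranking then amounts to re-sorting candidates by the adjusted scores;
# alpha is tuned to balance accuracy loss against added tail exposure.
```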
10. Mitigation Strategies - In-processing (Model-Level)
Causal Inference Methods:
Counterfactual reasoning - Estimate what recommendations would look like without the influence of popularity.
Model ranking as a cause-and-effect relationship to disentangle popularity from genuine user preference.
Reducing Memorization:
Remove item-ID features or apply large dropout rates.
Use metadata-based features to improve generalization.
Re-weighting Approaches:
Adjust item weights during training to balance popular and unpopular items.
Inverse Propensity Scoring (IPS) - Weight interactions inversely proportional to item popularity (see the sketch at the end of this slide).
Regularization-based Approaches:
Add regularization terms to the loss function to penalize popularity bias.
Encourage models to learn from less popular items.
Examples: Popularity-aware regularization, information neutrality regularization.
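A minimal sketch of IPS-style re-weighting, assuming empirical item frequency is used as the propensity estimate; the exponent eta and the weight clip are illustrative variance-control knobs:
```python
import numpy as np

def ips_weights(item_counts, eta=1.0, clip=100.0):
    """Per-interaction loss weights inversely proportional to estimated exposure propensity."""
    counts = np.asarray(item_counts, dtype=float) + 1.0  # add-one smoothing keeps weights finite
    propensity = counts / counts.sum()                   # popularity as a crude propensity estimate
    weights = 1.0 / np.power(propensity, eta)
    return np.minimum(weights / weights.mean(), clip)    # normalize, then clip to control variance

# During training, each example's loss is multiplied by its item's weight,
# so gradients from long-tail interactions count more than those from head items.
```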
11. Evaluation and Datasets
Offline Evaluation (Dominant Approach):
Static Split: Train/test split on historical data (snapshot view).
Dynamic/Longitudinal Split: Simulate dynamic system evolution over time.
Metrics: Combine accuracy metrics (NDCG, Recall) with bias-related metrics (Gini, Coverage); a coverage sketch follows below.
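For instance, catalog coverage (aggregate diversity) is simple to report alongside NDCG or Recall; a minimal sketch, assuming one top-k list per user:
```python
def catalog_coverage(top_k_lists, catalog_size):
    """Fraction of the catalog that appears in at least one user's top-k list."""
    recommended = set()
    for recs in top_k_lists:
        recommended.update(recs)
    return len(recommended) / catalog_size

# Example: 3 users, catalog of 10 items -> coverage 0.4
print(catalog_coverage([[1, 2], [2, 3], [1, 4]], 10))
```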
Online Evaluation (User Studies, A/B Tests):
A/B tests: Deploy debiasing methods in real-world systems and measure user behavior
(clicks, engagement).
User studies: Gather user perceptions, subjective feedback on debiased recommendations.
More resource-intensive but crucial for real-world validation.
Datasets
MovieLens, LastFM, BookCrossing, etc. - widely used benchmarks.
All exhibit skewed popularity distributions but vary in size, density, and bias levels.
12. Challenges in Addressing Popularity Bias
Accuracy vs. fairness trade-off
Reducing popularity bias can come at a cost to user experience (especially in the short term).
Careful tuning of parameters to manage this trade-off is critical.
Defining fairness goals
What constitutes a fair distribution of recommendations remains unclear.
Measurement
Pre-launch vs. post-launch inconsistency in metrics, because training data is shaped by the feedback loop.
Lack of multi-stakeholder evaluation of recommender systems in A/B tests.
Lack of measurement of long-term metrics (e.g., retention) versus short-term metrics (e.g., clicks and watch time).
14. References
[1] Abdollahpouri, H., Mansoury, M.: Multi-sided exposure bias in recommendation. In: Proceedings of the International Workshop on Industrial
Recommendation Systems in conjunction with ACM KDD 2020 (2020)
[2]Banerjee, A., Patro, G.K., Dietz, L.W., Chakraborty, A.: Analyzing near me services: potential for exposure bias in location-based retrieval.
In: 2020 IEEE International Conference on Big Data, pp. 36423651(2020)
[3]Boratto, L., Fenu, G., Marras, M.: Connecting user and item perspectives in popularity debiasing for collaborative recommendation. Inf.
Process. Manag. 58(1), 102387 (2021)
[4]Channamsetty, S., Ekstrand, M.D.: Recommender response to diversity and popularity bias in user profiles.In: Proceedings of the 13th
International FLAIRS Conference, pp. 657660 (2017)
[5] Chen, J., Dong, H., Wang, X., Feng, F., Wang, M., He, X.: Bias and debias in recommender system: a survey and future directions. ACM
Trans. Inf. Syst. 31, 139 (2020)
[6] Deldjoo, Y., Bellogin, A., Di Noia, T.: Explaining recommender systems fairness and accuracy through the lens of data characteristics. Inf.
Process. Manag. 58(5), 102662 (2021)
[7] Yalcin, E., Bilge, A.: Investigating and counteracting popularity bias in group recommendations. Inf. Pro-cess. Manag. 58(5), 102608 (2021)
[8] Yang, Y., Huang, C., Xia, L., Huang, C., Luo, D., Lin, K.: Debiased contrastive learning for sequential recommendation. In: Proceedings of
the ACM Web Conference 2023, WWW 23, pp. 10631073 (2023b)
[9] Zanon, A.L., da Rocha, L.C.D., Manzato, M.G.: Balancing the trade-off between accuracy and diversity in recommender systems with
personalized explanations based on linked open data. Knowl. Based Syst. 252, 109333 (2022)