Deconstructing Popularity Bias in Recommender
Systems: Origins, Impacts, and Mitigation
Amit Jaspal
Trust & Responsibility in Recommendation Systems, WSDM 2025
Amit's Introduction
 Thank you for the opportunity to speak!
 Engineering Manager and Research Scientist at Meta, leading the ecommerce recommendations team
 Building recommender and information systems in industry for the last 14 years
 Ecommerce recommendations at Meta
 Video recommendations at Meta
 Ads recommendations at Meta
 Newsfeed recommendations at LinkedIn
 Apache Solr at Cloudera
 Hurricane search engine at D. E. Shaw
 Research fellow at NCSA and TDIL Labs
What is Popularity Bias?
 Popularity bias refers to the tendency of a recommender system to over-recommend popular items at the
expense of less popular ones. In other words, already-popular items get disproportionate exposure, while long-tail
items are under-represented.
 Not a problem unique to recommender systems, but the dynamic nature of recommender systems makes it worse.
 Examples of Popularity Bias in other domains
 Academic Research/Citations
 Financial Markets/Stock Trading
 Book Publishing/Best Seller Lists
 Hiring and Job Portals
Sources of Popularity Bias in Recommender Systems
 Inherent Audience Size Imbalance (Data Bias):
 Some items are naturally more appealing to a broader audience.
 Item popularity often follows a long-tail distribution inherently.
 Even bias-free algorithms will see more interactions with these items.
 Model Bias (Algorithmic Bias):
 Machine learning models learn patterns from training data, including existing popularity biases.
 Collaborative filtering and similar methods tend to amplify popularity signals.
 Models may over-generalize from popular item interactions, leading to biased predictions.
 Closed Feedback Loop (Systemic Bias):
 Dynamic recommendation systems operate in a closed loop.
 Recommendations influence user interactions, which become training data for future models.
 This creates a feedback loop that can accumulate and exacerbate popularity bias over time.
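This closed loop can be sketched with a toy simulation (all names and parameter values are hypothetical): the "model" simply ranks items by historical interaction count, a user clicks somewhere in the shown slate, and the click is fed back as the next round's training data. Even when every item starts out identical, exposure concentrates:

```python
import random

def simulate_feedback_loop(n_items=100, steps=200, k=10, seed=0):
    """Toy closed-loop simulation: each step recommends the current
    top-k items by interaction count, the user clicks one of them,
    and the click feeds back into the counts used for the next step."""
    rng = random.Random(seed)
    counts = [1] * n_items  # uniform prior: every item starts equal
    for _ in range(steps):
        # "model" = rank by historical interactions (pure popularity)
        top_k = sorted(range(n_items), key=lambda i: -counts[i])[:k]
        clicked = rng.choice(top_k)   # user clicks within what was shown
        counts[clicked] += 1          # click becomes future training data
    return counts

counts = simulate_feedback_loop()
top_share = sum(sorted(counts, reverse=True)[:10]) / sum(counts)
# in this toy run, all 200 new clicks land on 10 of the 100 items
```

Because items outside the initial slate are never shown, they can never be clicked, so the loop locks in the early leaders.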
Why Does Popularity Bias Matter?
 For Users:
 Reduced novelty and serendipity - Recommendations become predictable and less engaging.
 Limited personalization - Users may not discover items truly aligned with their individual preferences, especially niche
interests.
 Decreased user satisfaction and trust in the system over time.
 For Item Providers (Especially Long-Tail):
 Reduced visibility and sales opportunities for less popular items.
 Unfair competition - Popular items dominate regardless of quality or relevance to specific users, e.g., clickbait.
 Item side cold start problem.
 System-Level:
 Reinforcement loops - Bias can worsen over time due to feedback cycles.
 The system behaves suboptimally, catering only to popular items on one side and, on the other, to users content to
engage with only popular items.
Measuring Popularity Bias
 Gini Coefficient
 Statistical measure of inequality within a distribution, computed using Lorenz
Curve
 Lorenz Curve is a graphical representation of inequality, showing the cumulative
distribution of a resource (e.g., wealth, recommendation exposure) across a
population.
 The Gini index can be computed as twice the area between the Lorenz Curve and the Line
of Equality
 Recall breakdown by item-popularity bucket, e.g., recall@k for head items vs. recall@k for tail items
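As a sketch of the Gini computation (trapezoidal integration of the Lorenz curve; the function name is illustrative):

```python
def gini(exposures):
    """Gini coefficient of an exposure distribution, derived from the
    Lorenz curve: 0 = perfectly equal exposure, ->1 = all exposure on
    a single item."""
    xs = sorted(exposures)   # ascending order, as the Lorenz curve requires
    n = len(xs)
    total = sum(xs)
    cum = 0.0           # cumulative share of exposure
    lorenz_area = 0.0   # area under the Lorenz curve
    for x in xs:
        prev = cum
        cum += x / total
        lorenz_area += (prev + cum) / (2 * n)   # trapezoidal rule
    # area between the line of equality (area 0.5) and the Lorenz
    # curve, normalized by the maximum possible 0.5
    return (0.5 - lorenz_area) / 0.5

print(gini([1, 1, 1, 1]))   # uniform exposure -> 0.0
print(gini([0, 0, 0, 10]))  # maximally skewed -> 0.75 (approaches 1 as n grows)
```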
Mitigation Strategies
 Key Mitigation Goals:
 Promote long-tail item visibility.
 Improve fairness and diversity.
 Maintain or improve recommendation accuracy (or minimize accuracy loss).
 Categorization by Processing Stage:
 Pre-processing: Modify training data before model training.
 In-processing (Modeling): Integrate debiasing directly into the model training process.
 Post-processing: Adjust recommendation lists after model prediction.
Mitigation Strategies - Pre & Post-processing
 Pre-processing
 Data Sampling: Down-sample popular item interactions or up-sample long-tail item interactions.
 Item Exclusion: Remove highly popular items from the training data or candidate pool (use with caution).
 Balanced Dataset Creation: Aim for a more uniform distribution of item interactions in training data.
 Data Augmentation: Enrich data with side information to provide more context beyond popularity.
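As an illustration of the data-sampling idea, a minimal sketch that caps per-item interactions (names and the cap value are hypothetical choices):

```python
import random

def cap_interactions(interactions, max_per_item, seed=0):
    """Pre-processing by down-sampling: keep at most `max_per_item`
    (user, item) interactions per item, sampled uniformly, so head
    items no longer dominate the training set. Tail items are untouched."""
    rng = random.Random(seed)
    by_item = {}
    for user, item in interactions:
        by_item.setdefault(item, []).append((user, item))
    kept = []
    for rows in by_item.values():
        if len(rows) > max_per_item:
            rows = rng.sample(rows, max_per_item)
        kept.extend(rows)
    return kept

logs = [(f"u{i}", "head_item") for i in range(50)] + [("u0", "tail_item")]
capped = cap_interactions(logs, max_per_item=5)
# head_item shrinks from 50 rows to 5; tail_item keeps its single row
```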
 Post-processing
 Re-scaling (Score Adjustment): Adjust predicted scores based on item popularity.
 Re-ranking: Re-order the initial ranked list to promote less popular items.
 Rank Aggregation / Slotting: Combine rankings from biased and debiased models.
 Post-filtering: Remove top-k popular items from the final recommendation list.
 False Positive Correction (FPC): Correct scores probabilistically based on past unclicked
recommendations
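The re-scaling idea can be sketched as a popularity penalty subtracted from predicted scores before re-ranking; the log1p penalty used here is one common choice, not the only one, and all names are illustrative:

```python
import math

def rescale_and_rerank(scored_items, popularity, lam=0.1):
    """Post-processing score adjustment: subtract a popularity penalty
    from each predicted score, then re-rank. lam=0 recovers the original
    ranking; larger lam pushes long-tail items up."""
    adjusted = [(item, score - lam * math.log1p(popularity.get(item, 0)))
                for item, score in scored_items]
    return sorted(adjusted, key=lambda pair: -pair[1])

scored = [("blockbuster", 0.90), ("niche_gem", 0.85)]
pop = {"blockbuster": 1000, "niche_gem": 3}
reranked = rescale_and_rerank(scored, pop, lam=0.1)
# the tail item now outranks the slightly higher-scored head item
```

Tuning `lam` is exactly the accuracy-vs-exposure trade-off discussed later: too large and relevance suffers, too small and nothing changes.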
Mitigation Strategies - In-processing (Model-Level)
 Causal Inference Methods:
 Counterfactual reasoning - Estimate recommendations without popularity influence.
 Model ranking as a cause-and-effect relationship to disentangle popularity from user preference.
 Reducing Memorization:
 Remove ID features or apply heavy dropout.
 Use metadata-based features to improve generalization.
 Re-weighting Approaches:
 Adjust item weights during training to balance popular and unpopular items.
 Inverse Propensity Scoring (IPS) - Weight items inversely proportional to their popularity.
 Regularization-based Approaches:
 Add regularization terms to the loss function to penalize popularity bias.
 Encourage models to learn from less popular items.
 Examples: Popularity-aware regularization, information neutrality regularization.
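A minimal sketch of IPS with a popularity-based propensity estimate, written as a plain weighted log loss (function and variable names are hypothetical; a real implementation would sit inside the model's training loop):

```python
import math

def ips_log_loss(examples, item_counts, total_interactions, clip=10.0):
    """Self-normalized, clipped IPS training loss. Each example is a
    (predicted_prob, label, item) triple, weighted by the inverse of a
    popularity-based propensity estimate so tail-item errors count more.
    Clipping the weights limits the variance IPS is known for."""
    loss, weight_sum = 0.0, 0.0
    for prob, label, item in examples:
        propensity = item_counts[item] / total_interactions
        w = min(1.0 / propensity, clip)
        # standard log loss, re-weighted by the IPS weight
        loss += -w * (label * math.log(prob) + (1 - label) * math.log(1 - prob))
        weight_sum += w
    return loss / weight_sum   # self-normalization

counts = {"head": 90, "tail": 10}
examples = [(0.9, 1, "head"), (0.5, 1, "tail")]
loss = ips_log_loss(examples, counts, total_interactions=100)
# the tail item's ~10x weight makes its error dominate the loss
```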
Evaluation and Datasets
 Offline Evaluation (Dominant Approach):
 Static Split: Train/test split on historical data (snapshot view).
 Dynamic/Longitudinal Split: Simulate dynamic system evolution over time.
 Metrics: Combine accuracy metrics (NDCG, Recall) with bias-related metrics (Gini,
Coverage).
 Online Evaluation (User Studies, A/B Tests):
 A/B tests: Deploy debiasing methods in real-world systems and measure user behavior
(clicks, engagement).
 User studies: Gather user perceptions, subjective feedback on debiased recommendations.
 More resource-intensive but crucial for real-world validation.
 Datasets
 MovieLens, LastFM, BookCrossing, etc. are widely used.
 All exhibit skewed popularity distributions but vary in size, density, and bias levels.
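The head/tail recall breakdown mentioned earlier can be sketched as follows (names hypothetical); an aggregate recall number can look healthy while tail recall is near zero, so reporting both is the point:

```python
def recall_at_k_by_bucket(recommended, relevant, head_items, k=10):
    """Recall@k computed separately for head and tail items. A large gap
    between the two buckets is a symptom of popularity bias that a
    single aggregate recall metric hides."""
    top_k = set(recommended[:k])
    buckets = {"head": [0, 0], "tail": [0, 0]}   # [hits, relevant count]
    for item in relevant:
        bucket = "head" if item in head_items else "tail"
        buckets[bucket][1] += 1
        buckets[bucket][0] += item in top_k      # bool adds as 0/1
    return {name: (hits / n if n else None)
            for name, (hits, n) in buckets.items()}

recs = ["h1", "h2", "t1", "h3"]
result = recall_at_k_by_bucket(recs, relevant={"h1", "t1", "t2"},
                               head_items={"h1", "h2", "h3"}, k=3)
# {'head': 1.0, 'tail': 0.5}
```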
Challenges in Addressing Popularity Bias
 Accuracy vs. fairness trade-off
 Reducing popularity bias can come at a cost to user experience (especially short term)
 Careful parameter tuning to manage the trade-off is critical
 Defining fairness goals
 What constitutes a fair distribution of recommendations remains unclear
 Measurement
 Pre-test vs. post-launch inconsistency in metrics, because training data comes from a feedback loop
 Lack of multi-stakeholder evaluation of recommender systems in A/B tests
 Lack of measurement of long-term metrics (e.g., retention) vs. short-term metrics (e.g., clicks and watch time)
Questions and Discussions
