Why your Sentiment is Wrong
             A Statistical Analysis of the Sentiment Task




T.R. Fitz-Gibbon, Chief Scientist
Text Analytics Summit, May 2011
Subjectivity



    “There is no true interpretation of
    anything; interpretation is a vehicle in
    the service of human comprehension.
    The value of interpretation is in enabling
    others to fruitfully think about an idea.”

                             - Andreas Buja
2                                         © 2011 Networked Insights
Subjectivity


    Subject: iPhone 4 battery life, not so good?



    “I would never get 7 hours just browsing with Wi-Fi. Not even 6
    hours. I think my record has been 5 hours 30 minutes something.”

    “Anyway 5 hours is enough for me. It was enough with 3GS. Maybe
    with next iPhone I will have better luck with the battery.”


Sentiment Analysis vs. Semantic Analysis


    Sentiment Analysis             Semantic Analysis
    Ignores most Data              Analyzes all Data
    Results Determined by Chance   Results Determined by Data
    High Subjective Error          Communicates Subjectivity
The Sentiment Task




    Methods: Natural Language Processing, Machine Learning,
             Manual Analysis by Experts, Manual Analysis by Crowdsourcing

    Labels:  Positive, Negative, Neutral
The Sentiment Task




    Methods: Natural Language Processing, Machine Learning,
             Manual Analysis by Experts, Manual Analysis by Crowdsourcing

    Labels:  Happy, Sad, Indifferent
The Sentiment Task




    Methods: Natural Language Processing, Machine Learning,
             Manual Analysis by Experts, Manual Analysis by Crowdsourcing

    Labels:  Intends to Purchase, Intends to Renew, Intends to Cancel
Why Sentiment Fails


Causes                          Effects
1. Narrow view of meaning       - Ignores data


2. Low statistical confidence   - Results left up to chance




1. Why Sentiment Fails - Narrow View of Meaning

    Percentage of posts that contain sentiment



    Data is based on a 500-post sentiment study we conducted around
    smartphones. Each post was classified by 20 people and assigned
    to a sentiment category by majority vote.

    Only about 10% of posts were found to contain sentiment.
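
The majority-vote labeling described above can be sketched in a few lines. This is a minimal illustration, not the study's actual procedure: the function names, the label strings, the sample votes, and the choice to require a strict majority (more than half of the readers) are all assumptions.

```python
from collections import Counter

def majority_label(votes):
    """Return the label picked by more than half of the readers, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None

def share_with_sentiment(posts):
    """Fraction of posts whose majority label is positive or negative."""
    labels = [majority_label(votes) for votes in posts]
    return sum(label in ("positive", "negative") for label in labels) / len(posts)

# Hypothetical votes from 20 readers on each of two posts (made-up data).
post_a = ["neutral"] * 13 + ["negative"] * 5 + ["positive"] * 2
post_b = ["negative"] * 12 + ["neutral"] * 8

print(majority_label(post_a))                  # neutral
print(share_with_sentiment([post_a, post_b]))  # 0.5
```

A post with no strict majority gets no label here; a real study would need an explicit rule for such ties.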




2. Why Sentiment Fails - Statistical Confidence

Confidence intervals for a sample of four readers

    When three out of four readers agree on the sentiment of a post,
    we can only be 35% confident that a majority of all readers
    would agree.

    Normally, statistical significance at the 95% level is desired
    (for research and opinion polls). This example is only
    statistically significant at the 35% level.

    Thus, in this case we have not yet sampled enough readers to
    conclude that we know the sentiment of this post.
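
The slide does not say how the 35% figure was computed. As a rough cross-check, one plausible approach is an exact two-sided binomial test of the hypothesis that readers are split 50/50; the function name and the choice of test are assumptions. It gives a confidence of 37.5% for three of four readers, in the same range as the slide's number and far below 95%.

```python
from math import comb

def two_sided_binomial_p(k, n, p0=0.5):
    """Exact two-sided p-value: total probability, under p0, of every
    outcome that is no more likely than observing k agreements in n."""
    pmf = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    return sum(q for q in pmf if q <= pmf[k] * (1 + 1e-9))

# Three of four readers agree; the null hypothesis is an even split.
p_value = two_sided_binomial_p(3, 4)
confidence = 1 - p_value

print(p_value)     # 0.625
print(confidence)  # 0.375
```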




2. Why Sentiment Fails - Statistical Confidence


     Inter-Reader Agreement   Sample Size for 95% Confidence
                       90%    < 10
                       75%    ~ 20
                       65%    ~ 50
                       55%    ~ 500
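
The deck does not state how these sample sizes were derived. A minimal sketch using the standard normal approximation for a proportion is shown below; the function name and the formula choice are assumptions. It asks how many readers are needed before the observed agreement sits at least one confidence margin above 50%. The results have the same order of magnitude as the table, whose figures look more conservative.

```python
from math import ceil
from statistics import NormalDist

def readers_needed(agreement, confidence=0.95):
    """Smallest n such that agreement - 0.5 >= z * sqrt(p * (1 - p) / n),
    using the normal approximation to the binomial."""
    if not 0.5 < agreement < 1:
        raise ValueError("agreement must be strictly between 0.5 and 1")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = agreement - 0.5
    return ceil(z**2 * agreement * (1 - agreement) / margin**2)

for agreement in (0.90, 0.75, 0.65, 0.55):
    print(f"{agreement:.0%} agreement -> {readers_needed(agreement)} readers")
```

The required sample grows roughly with the inverse square of the distance from 50%, which is why near-even splits need hundreds of readers.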




Why Sentiment Fails - Putting it all Together
    [Chart: expected outcome of sentiment analysis compared with the
    correct value. X-axis: (positive) sentiment as a percent.
    Y-axis scale: x 10^4.]
What is the Alternative



               Semantic Analysis

                Topic Discovery

                 Topic Trending


Topic Discovery - Apple Topic Tree

    [Figure: Apple topic tree]
Topic Trending - Apple
    “iPhone Problems” and “Support” are both high before
    “Back to the Mac” and quite low after.

    [Chart: conversation volume by topic]
    Topics plotted: Support, Apple Products; Problem iPhone;
    Buy iPhone; Mac OS, Mac Pro

    Chart annotations:
      - The tail end of antenna-gate, signal issues, glass cracking,
        daylight savings bug
      - Apple's big “Back to the Mac” event
      - Conversation becomes all about buying iPhones and Macs
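
Topic trending of this kind amounts to counting posts per topic per time period. The sketch below is an illustration only: it assumes each post has already been assigned a topic (the deck does not describe how topics are discovered), and the function name, the weekly bucketing, and the sample dates are made up.

```python
from collections import Counter
from datetime import date

def weekly_topic_volume(posts):
    """Count posts per (ISO year, ISO week, topic).

    `posts` is an iterable of (date, topic) pairs.
    """
    counts = Counter()
    for day, topic in posts:
        year, week, _ = day.isocalendar()
        counts[(year, week, topic)] += 1
    return counts

# Hypothetical posts with pre-assigned topics (made-up data).
posts = [
    (date(2010, 10, 11), "Problem iPhone"),
    (date(2010, 10, 12), "Problem iPhone"),
    (date(2010, 10, 25), "Buy iPhone"),
]

for (year, week, topic), volume in sorted(weekly_topic_volume(posts).items()):
    print(year, week, topic, volume)
```

Plotting each topic's weekly counts as a line gives a trend chart like the one this slide describes.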
Topic Trending - Moms and CPG

    - “baby, baby clothes” experiences a lone spike around Christmas
    - Moms discussing fabric care
    - “saving money” drives this acceleration around “cloth diapers”
Sentiment Analysis vs. Semantic Analysis


Sentiment Analysis             Semantic Analysis            Semantic Value
Ignores most Data              Analyzes all Data            Understand the Entire Conversation
Results Determined by Chance   Results Determined by Data   Understand the Actual Conversation
High Subjective Error          Communicates Subjectivity    Actually Understand the Conversation
What if I need Sentiment Analysis


     3 Questions to Ask your Provider:

       -   What is the inter-reader agreement of your manually
           scored sentiment data?

       -   When you manually score/label posts with sentiment, how
           many readers score each post?

       -   For what type of posts was your solution designed and tested?
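
For the first question, it helps to agree on what "inter-reader agreement" means. The deck does not define it; one simple definition, assumed here, is the share of readers who chose the most common label for a post, averaged over posts. The function names and sample votes below are hypothetical.

```python
from collections import Counter

def modal_agreement(votes):
    """Share of readers who chose the most common label for one post."""
    counts = Counter(votes)
    return max(counts.values()) / len(votes)

def mean_agreement(posts):
    """Average modal agreement across posts."""
    return sum(modal_agreement(votes) for votes in posts) / len(posts)

# Hypothetical votes from four readers on two posts (made-up data).
posts = [
    ["positive", "positive", "positive", "negative"],  # 3 of 4 agree
    ["neutral", "neutral", "neutral", "neutral"],      # all agree
]

print(modal_agreement(posts[0]))  # 0.75
print(mean_agreement(posts))      # 0.875
```

Other measures, such as chance-corrected agreement statistics, will give different numbers, so it is worth asking a provider which one they report.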



We fuel insights, helping brands and their
       agencies make better marketing decisions.




T.R. Fitz-Gibbon, Chief Scientist
608-237-1867
tr.fitzgibbon@networkedinsights.com
networkedinsights.com

Founded in 2006 by industry leaders and seasoned entrepreneurs in the
fields of social media and customer intelligence. Headquartered in
Madison, WI, with offices in New York and Chicago.
