This project is about designing a system that can help detect insults on social media platforms. It also involves creating a library of words containing insults to help organizations, social media platforms, CEO/managers, bloggers and so on to filter unwanted or insulting comments from social commentary.
Detecting insults in social media conversations
1. Text Mining: Detecting Insults in Social Media Conversations
Group 7:
Dinesh Reddy Srirangapalle
Oluwafunke Balogun
Rajsingh Rathore
Firdous Farooque Shaikh
Aditya Trivedi
2. Introduction
• Social media has become a very powerful platform.
• About 1.8 billion people worldwide use some form of social media.
• 73% of the US population uses social media.
• As mobile devices spread and become cheaper, more and more people are joining these platforms. The number of active social media users is growing every day.
• Social media now plays a big role not only in people’s personal lives but also in their professional lives.
3. Pros and cons of social media
Pros:
• Social media helps people connect and keep in touch.
• It helps people stay updated on current topics and news.
• It helps people discover new trends and topics.
• It helps organizations better understand their customers.
Cons:
• It gives people the freedom to say anything to anyone with few or no adverse effects.
• It has been shown to have negative effects on small children, such as social awkwardness and an inability to connect with real people.
• Another serious concern is cyberbullying.
4. Problem Statement
With the growth in the number of social media users, we have seen a growth in crimes against kids and in the number of cyberbullying cases.
According to cyberbullying.org (http://cyberbullying.org/facts/):
• In the U.S., the average kid starts using the internet as early as age 11-12.
• 36% of parents take no action to limit or monitor their kids' activities on the internet.
• 25% of high school students have encountered cyberbullying at some point in their lives.
As a result, we are seeing an increase in:
• Kids dropping out of school
• Lower performance in school
• Disorders like anxiety and depression
In some extreme cases, victims of cyberbullying have resorted to suicide.
5. Purpose
We know that social media plays an important role in our society.
The purpose of our project is to:
• MONITOR insults
• PREVENT insults
while at the same time maintaining:
• FREEDOM
• FLEXIBILITY
on social media platforms.
The project also tries to cover “slang” and other unconventional insults, which are usually hard to detect and are not found in a standard English dictionary.
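As a rough illustration of why slang and disguised spellings are hard to catch with a dictionary lookup, here is a minimal sketch that normalizes a token before checking it against a word list. The lexicon `SLANG_INSULTS`, the substitution table `LEET_MAP`, and both function names are hypothetical and not taken from the project.

```python
import re

# Hypothetical slang lexicon; a real one would be curated from labeled comments.
SLANG_INSULTS = {"loser", "noob", "moron", "dumbo"}

# Assumed character substitutions that people use to disguise words.
LEET_MAP = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def normalize(token: str) -> str:
    """Lowercase, undo simple substitutions, and squeeze long letter runs."""
    token = token.lower().translate(LEET_MAP)
    token = re.sub(r"[^a-z]", "", token)
    # Collapse three or more repeats of a letter into one ("looooser" -> "loser").
    return re.sub(r"(.)\1{2,}", r"\1", token)


def find_slang_insults(comment: str) -> list[str]:
    """Return lexicon entries found in the comment, in order of appearance."""
    hits = []
    for raw in comment.split():
        word = normalize(raw)
        if word in SLANG_INSULTS:
            hits.append(word)
    return hits
```

A plain dictionary match would miss "l0000ser" and "n00b"; after normalization both map to lexicon entries. The trade-off is that aggressive normalization can also create false matches, so the rules would need tuning on real data.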
6. Significance of the project
• In the last couple of decades, the internet has changed our society forever.
• There are still few strict laws and rules limiting what people can do and say on social media platforms.
• There is a need for a mechanism that monitors and blocks insults on social media.
• Our project tries to provide a way for social media to be what it was meant to be, not what some rude and mean people have made it.
7. The design problem and scope for the project
Scope:
• To build a system that can detect whether a comment is insulting or not
• To design a machine learning system that can classify online posts and comments as insulting or not
The design problem of our project is:
Analysis and design considerations of a system to detect insults in social media conversations.
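The slides do not say which learning algorithm the classifier uses. As one possible sketch of the scope above, the following uses a small multinomial Naive Bayes model written with the standard library only. The class name, the training comments, and the choice of Naive Bayes are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesInsultClassifier:
    """Multinomial Naive Bayes with add-one smoothing (an illustrative choice)."""

    def fit(self, comments, labels):
        self.word_counts = {0: Counter(), 1: Counter()}
        self.class_counts = Counter(labels)
        for text, label in zip(comments, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def predict(self, text: str) -> int:
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label in (0, 1):
            # Log prior plus the smoothed log likelihood of each known token.
            score = math.log(self.class_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:
                    score += math.log((self.word_counts[label][tok] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Tiny made-up training set: 1 = insult, 0 = non-insult.
train_comments = [
    "you are an idiot",
    "what a stupid loser",
    "shut up you moron",
    "thanks for sharing this",
    "great article, well written",
    "i agree with your point",
]
train_labels = [1, 1, 1, 0, 0, 0]

model = NaiveBayesInsultClassifier().fit(train_comments, train_labels)
```

Calling `model.predict("you stupid idiot")` returns the insult label for this toy data. A real system would need far more training comments and a held-out set to measure accuracy.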
8. Assumptions and Limitations of the project
Assumptions:
• Assumptions related to the data set
• Assumptions related to the tools that we are using
Limitations:
Our project is limited by:
• Limitations of the tools
• Limitations of the data set
• Limited use of Semantria because of the trial version
• Knowledge and skill of the team
9. Literature Review
Data Analytics:
The science of examining raw data to draw conclusions that support better decisions
Social Media Analytics:
Analyzing data collected from blogs and social media websites for organizational or social benefit
Reputation System:
Computes and publishes reputation scores for a set of objects within a domain or community
Text Mining:
Similar to “text analysis”. Helps derive high-quality information from text. It uses patterns and trends to analyze
10. Systems Development & Methodology
• Get the Dataset
• Understand the Dataset

Variable   Description                           Role       Level
Insult     1 = Insult, 0 = Non-Insult            Target     Unary
Date       Time at which the comment was made    Rejected   Interval
Comment    Social conversation                   Input      Text
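To make the variable roles concrete, here is a small sketch that reads data in this three-variable layout, keeps Comment as the input and Insult as the target, and ignores the rejected Date column. The sample rows, the timestamp format, and the function name `load_comments` are made up for illustration and are not the project's actual data.

```python
import csv
import io

# Made-up sample rows in the Insult / Date / Comment layout described above.
# The timestamp format shown here is an assumption.
SAMPLE_CSV = '''Insult,Date,Comment
1,20120618192155Z,"You are a complete fool"
0,20120528192215Z,"I really enjoyed this post, thanks"
0,,"See you at the game tonight"
'''


def load_comments(handle):
    """Return (comment, label) pairs, ignoring the rejected Date column."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append((row["Comment"].strip(), int(row["Insult"])))
    return rows


data = load_comments(io.StringIO(SAMPLE_CSV))

# Share of comments labeled as insults, a quick check of class balance.
insult_share = sum(label for _, label in data) / len(data)
```

Checking the class balance early is useful because a heavily skewed target makes raw accuracy a misleading measure.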
11. Analysis of the Dataset:
Tools used for the analysis:
• SAS E-Miner
• Semantria for Excel
12. Semantria for Excel:
• Semantria is an Excel add-in that we used to carry out sentiment analysis, categorization, entity extraction, and visualization, with easy customization
Semantria for Excel modes:
• Detailed Mode
• Discovery Mode
13. Results & Discussions of Semantria Analysis
Facet Table:
• A combination of facets and attributes is taken into consideration to build a system of insults
14. Themes Breakdown:
• The extracted text and its graphical representation show the various themes into which the insults have been grouped
15. Entity Sentiment Breakdown:
• The Entity Sentiment Breakdown chart extracts frequently mentioned proper nouns from the social conversations, each with a specific sentiment score assigned to it
16. Category Breakdown:
• Based on the ‘Relevancy Score’ technique used by the tool, we get a Category Breakdown, which segments the insults into various categories
17. Queries Breakdown:
• The Queries Breakdown segments the insults into groups defined by queries built from Boolean operators
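The idea of grouping insults with Boolean queries can be sketched without any particular tool. The example below is not Semantria's query syntax or API. It is a hypothetical stand-in where each query combines required words (AND), alternative words (OR), and excluded words (NOT); the query names and word lists are made up.

```python
import re
from dataclasses import dataclass, field


@dataclass
class BooleanQuery:
    """Hypothetical query: all_of AND (any word of any_of) AND NOT none_of."""
    name: str
    all_of: set = field(default_factory=set)
    any_of: set = field(default_factory=set)
    none_of: set = field(default_factory=set)

    def matches(self, comment: str) -> bool:
        words = set(re.findall(r"[a-z']+", comment.lower()))
        if not self.all_of <= words:
            return False  # a required word is missing
        if self.any_of and not (self.any_of & words):
            return False  # none of the alternatives is present
        return not (self.none_of & words)  # fail if an excluded word appears


# Made-up queries, e.g. "you AND (stupid OR dumb OR idiot)".
QUERIES = [
    BooleanQuery("intelligence", all_of={"you"}, any_of={"stupid", "dumb", "idiot"}),
    BooleanQuery("appearance", any_of={"ugly", "fat"}, none_of={"not"}),
]


def categorize(comment: str) -> list[str]:
    """Return the names of all queries that the comment satisfies."""
    return [q.name for q in QUERIES if q.matches(comment)]
```

Word-level NOT is a blunt instrument, since it cannot tell which word a negation applies to. It is shown here only to illustrate how Boolean operators partition comments.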
24. Recommendation
• Large social media platforms should carry out in-house analysis to help add to the library of words and set a benchmark for smaller social media platforms to follow
• CIOs should tap into the usefulness of social media and learn which tool works best for their organization's environment to carry out in-house analysis
• Further research can be conducted on these topics with a larger dataset to allow for more accuracy and insight
25. Conclusion
• Our analysis shows the need for more sophisticated tools and expertise in order to carry out text mining and achieve the desired results.
• Text mining is still at an early stage as an analysis technique and is still gaining acceptance.
• Vendors are also working on more user-friendly text mining tools while embedding text mining into data mining tools. The likes of SAS have already done this.
• The analysis in our project is restricted to the data set we have. We did not aim to create a library but instead to show how word parsing and filtering can be done in text mining.
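Word parsing and filtering can be illustrated with a few lines of standard-library Python. This is only a sketch of the general technique, not a reproduction of what SAS E-Miner or Semantria do internally. The stop-word list and the function names are assumptions.

```python
import re
from collections import Counter

# Small illustrative stop list; real tools ship much larger ones.
STOP_WORDS = {"a", "an", "the", "is", "are", "you", "and", "to", "of", "so", "such"}


def parse(comment: str) -> list[str]:
    """Split a comment into lowercase word tokens."""
    return re.findall(r"[a-z']+", comment.lower())


def filter_terms(tokens, min_length=3):
    """Drop stop words and very short tokens."""
    return [t for t in tokens if t not in STOP_WORDS and len(t) >= min_length]


def term_frequencies(comments):
    """Count how often each kept term appears across all comments."""
    counts = Counter()
    for comment in comments:
        counts.update(filter_terms(parse(comment)))
    return counts


freqs = term_frequencies([
    "You are such a loser",
    "The loser is so clueless",
])
```

The resulting counts show which terms survive filtering and how often they occur, which is the kind of input a word library or a classifier could be built from.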