Text Mining with R for Social Science Research
Internetlivestats.com
mostly solved
• Spam detection (Classification): "Let's go to Agra!" vs. "Buy V1AGRA …"
• Part-of-speech (POS) tagging: "Colorless green ideas sleep furiously." → ADJ ADJ NOUN VERB ADV
• Named entity recognition (NER): "Einstein met with UN officials in Princeton" → PERSON ORG LOC
making good progress
• Sentiment analysis: "Best roast chicken in San Francisco!" / "The waiter ignored us for 20 minutes."
• Coreference resolution: "Carter told Mubarak he shouldn't run again."
• Word sense disambiguation (WSD): "I need new batteries for my mouse."
• Parsing: "I can see Alcatraz from the window!"
• Machine translation (MT): "第13届上海国际电影节开幕…" → "The 13th Shanghai International Film Festival…"
• Information extraction (IE): "You're invited to our dinner party, Friday May 27 at 8:30" → Party / May 27 / add
still really hard
• Question answering (QA): "Q. How effective is ibuprofen in reducing fever in patients with acute febrile illness?"
• Paraphrase: "XYZ acquired ABC yesterday" / "ABC has been taken over by XYZ"
• Summarization: "The Dow Jones is up" / "The S&P500 jumped" / "Housing prices rose" → "Economy is good"
• Dialog: "Where is Citizen Kane playing in SF?" / "Castro Theatre at 7:30. Do you want a ticket?"
Source: Dan Jurafsky
non-standard English
Great job @justinbieber! Were
SOO PROUD of what youve
accomplished! U taught us 2
#neversaynever & you yourself
should never give up either?
segmentation issues
the New York-New Haven Railroad
idioms
dark horse
get cold feet
lose face
throw in the towel
neologisms
unfriend
Retweet
bromance
tricky entity names
Where is A Bug's Life playing …
Let It Be was recorded …
… a mutation on the for gene …
Source: Dan Jurafsky (modified)
sarcasm
A: I love Justin Bieber. Do you
like him to?
B:Yeah. Sure. I absolutely love
him.
http://www.alchemyapi.com/
https://www.congress.gov/resources/display/content/The+Federalist+Papers#TheFederalistPapers-10
1:10pm
Non-Stop
Adair; Mosteller & Wallace; Fung; Collins et al.
Corpus
Document
Term
Source: Chris Manning
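The corpus / document / term hierarchy is usually stored as a document-term matrix: one row per document, one column per term, each cell a count. The workshop's R script is not reproduced in these slides, so the following is only a minimal standard-library Python sketch of the idea. The three-document corpus and all variable names are made up for illustration.

```python
from collections import Counter

# Hypothetical three-document corpus, for illustration only.
corpus = {
    "doc1": "the people of the union",
    "doc2": "the power of the states",
    "doc3": "union of states",
}

# Count how often each term occurs in each document.
counts = {name: Counter(text.split()) for name, text in corpus.items()}

# The vocabulary is every distinct term in the corpus, sorted for stable columns.
vocabulary = sorted(set().union(*counts.values()))

# Document-term matrix: one row per document, one column per term.
dtm = {name: [c[term] for term in vocabulary] for name, c in counts.items()}

print(vocabulary)
for name, row in dtm.items():
    print(name, row)
```

Most cells in a real document-term matrix are zero, which is why text-mining libraries typically store it in a sparse format.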
Tokenize → Clean → Stem → Filter
Then a hurricane came, and devastation reigned
then a hurricane came and devastation reigned
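The four preprocessing steps named on this slide can be sketched roughly as follows, using only the Python standard library. This is not the workshop's R code. The stop-word list is a tiny illustrative stand-in, and `stem` is a deliberately crude suffix stripper standing in for a real stemmer such as Porter's.

```python
import string

# Illustrative stop-word list; real lists are much longer.
STOPWORDS = {"a", "and", "the", "then"}

def tokenize(text):
    """Split raw text into whitespace-separated tokens."""
    return text.split()

def clean(tokens):
    """Lowercase each token and strip punctuation; drop tokens left empty."""
    table = str.maketrans("", "", string.punctuation)
    cleaned = (tok.lower().translate(table) for tok in tokens)
    return [tok for tok in cleaned if tok]

def stem(tokens):
    """Crude suffix stripping (hypothetical stand-in for a real stemmer)."""
    stems = []
    for tok in tokens:
        for suffix in ("ing", "ed", "s"):
            # Only strip when at least three characters would remain.
            if tok.endswith(suffix) and len(tok) - len(suffix) >= 3:
                tok = tok[: -len(suffix)]
                break
        stems.append(tok)
    return stems

def remove_stopwords(tokens):
    """Filter out very common words that carry little topical signal."""
    return [tok for tok in tokens if tok not in STOPWORDS]

sentence = "Then a hurricane came, and devastation reigned"
tokens = clean(tokenize(sentence))
print(" ".join(tokens))
print(remove_stopwords(stem(tokens)))
```

The order of the steps matters: stop-word lists are normally lowercase, so filtering before cleaning would miss capitalized words such as "Then".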
GitHub site
1:20pm
Code Lines: 1-49
Code Lines: 50-79
Federalist Paper 1: Before
Federalist Paper 1: After
Code Lines: 71-88
Federalist Paper 1: After
Code Lines: 89-104
Code Lines: 142-149
Code Lines: 151-165
1:30pm
Code Lines: 167-171
Code Lines: 173-188
Code Lines: 189-201
Code Lines: 202-207
Uncomment (Ctrl + Shift + C) and run lines 107-139, then rerun lines 141-206.
Code Lines: 107-139
1:50pm - 2pm
Bayes' Theorem
these slides
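For reference, Bayes' theorem can be written for the authorship setting as below, where $A$ is a candidate author and $w_1,\dots,w_n$ are the words of a paper. The second expression adds the "naive" assumption that words are conditionally independent given the author. The notation is an assumption chosen for illustration, not taken from the slides.

```latex
P(A \mid w_1,\dots,w_n) = \frac{P(A)\,P(w_1,\dots,w_n \mid A)}{P(w_1,\dots,w_n)}
\qquad
P(A \mid w_1,\dots,w_n) \propto P(A)\prod_{i=1}^{n} P(w_i \mid A)
```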
Code Lines: 208-219
Update
Code Lines: 231-241
Code Lines: 242-248
Code Lines: 250-273
Code Lines: 275-290
This will take about 4 minutes, depending on your computer.
Code Lines: 295-308
Source: David Blei (link to article)
Code Lines: 295-308
Open the index.html file in the “Federalist” folder in your working directory.
Use Firefox; Chrome and IE are not supported.
Code Lines: 321-349
Code Lines: 350-370
• Naïve Bayes predicts 9 of the 12 papers as written by Madison.
• K-NN predicts only 4 of the 12 papers as written by Madison.
• Why? How stable are these results?
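To make the Naïve Bayes side of this comparison concrete, here is a minimal multinomial Naïve Bayes sketch with add-one smoothing, in standard-library Python. It is not the workshop's R code and does not reproduce the 9-of-12 result. The training sentences are invented toy data rather than text from the Federalist Papers, and the function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy training data: (author, text). Not the Federalist corpus.
training = [
    ("Hamilton", "upon the union upon the government"),
    ("Hamilton", "upon energy in the executive"),
    ("Madison", "whilst the powers of the states"),
    ("Madison", "the powers whilst factions"),
]

def train(docs):
    """Collect per-author document counts, word counts, and the vocabulary."""
    doc_counts, word_counts = Counter(), {}
    for author, text in docs:
        doc_counts[author] += 1
        word_counts.setdefault(author, Counter()).update(text.split())
    vocab = {w for c in word_counts.values() for w in c}
    return doc_counts, word_counts, vocab

def predict(text, doc_counts, word_counts, vocab):
    """Return the author with the highest log posterior (add-one smoothing)."""
    total_docs = sum(doc_counts.values())
    scores = {}
    for author, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[author] / total_docs)  # log prior
        for word in text.split():
            if word in vocab:  # ignore words never seen in training
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[author] = score
    return max(scores, key=scores.get)

model = train(training)
print(predict("upon the executive", *model))
print(predict("whilst the states", *model))
```

With so little training text, one or two distinctive words decide the outcome. Changing the training set or the smoothing can flip a prediction, which is one reason to ask how stable the results are.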
Code Lines: 371-373
2:30pm
Source: Richard Heimann
The Beige Book
GitHub
Source: Richard Heimann
https://github.com/wesslen/BeigeBookSentimentAnalysis
First six records of BB.sentiment
First six records of BB.sentiment (updated)
Raw Scored Sentiment
Scaled Scored Sentiment
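One common way to get from raw to scaled sentiment scores is to count lexicon hits per document and then standardize the counts. The slides do not say how BB.sentiment was scored or scaled, so both the positive-minus-negative count and the z-score scaling below are assumptions. The lexicons and report snippets are invented for illustration and are not taken from the Beige Book.

```python
import re
import statistics

# Hypothetical mini-lexicons; real sentiment lexicons contain thousands of words.
POSITIVE = {"strong", "growth", "improved", "gains"}
NEGATIVE = {"weak", "decline", "slowed", "losses"}

def raw_score(text):
    """Number of positive words minus number of negative words."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def scale(scores):
    """Center scores on their mean and divide by the sample standard deviation."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    return [(s - mean) / sd for s in scores]

# Made-up report snippets, for illustration only.
reports = [
    "Retail sales showed strong growth and hiring improved.",
    "Manufacturing slowed and exports were weak.",
    "Loan demand was mixed, with modest gains.",
]
raw = [raw_score(r) for r in reports]
scaled = scale(raw)
print(raw)
print([round(s, 2) for s in scaled])
```

Scaling leaves the ordering of the documents unchanged and puts the scores in standard-deviation units, which makes them easier to compare across time periods or lexicons.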
Stanford Deep Learning NLP class materials
https://projectmosaic.uncc.edu/events-list/
GNIP access
http://www.r-bloggers.com/setting-up-the-twitter-r-package-for-text-analytics/
AlchemyAPI
Taste Analytics Signals
SAS Enterprise Miner
SAS Sentiment Analysis
Hamilton Soundtrack Amazon Reviews
R tm package
Python nltk package
Python gensim package
Mallet
Introductory Text Mining Class
Coursera Natural Language Processing Class
Coursera Text Mining & Analytics Course
Deep Learning for Natural Language Processing
https://www.kaggle.com/c/word2vec-nlp-tutorial/details/part-1-for-beginners-bag-of-words
http://www.alchemyapi.com/developers/getting-started-guide/twitter-sentiment-analysis
https://eight2late.wordpress.com/2015/09/29/a-gentle-introduction-to-topic-modeling-using-r/
http://www.r-bloggers.com/sentiment-analysis-on-donald-trump-using-r-and-tableau/
Follow this link for all R “text” blogs on the R-bloggers website


Editor's Notes

  • #2: Three questions: Experience with R
  • #12: Set the bedrock for the United States government… if you want to know the original structure of the US government, this is your document.
  • #17: https://eight2late.wordpress.com/2015/05/27/a-gentle-introduction-to-text-mining-using-r/