Text analytics using KH Coder
Some examples and principles
Stuart Palmer
Multidimensional Scaling plot of the text content (404k+ words) from 11.2k
tweets about 'safety at work' (28 Feb - 31 July 2020)
KH Coder
 Collect your text
 Tidy it up (spelling?)
 Consider unit of analysis (sentence or paragraph)
 Mark up if required
 Convert to plain text
KH Coder processes
- (Pre-)pre-processing
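
The preparation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of KH Coder: the function names and the sample text are hypothetical, and the sentence splitter is deliberately naive.

```python
import re
import unicodedata

def to_plain_text(raw):
    """Normalise Unicode, unify line endings, and trim trailing spaces."""
    text = unicodedata.normalize("NFKC", raw)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return "\n".join(line.rstrip() for line in text.split("\n")).strip()

def split_paragraphs(text):
    """Treat one or more blank lines as a paragraph boundary."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def split_sentences(paragraph):
    """Naive split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]

# Hypothetical sample text with mixed line endings.
sample = "Safety at work matters.  Wear a helmet!\r\n\r\nReport hazards early."
plain = to_plain_text(sample)
paragraphs = split_paragraphs(plain)
sentences = [s for p in paragraphs for s in split_sentences(p)]
print(paragraphs)
print(sentences)
```

Choosing paragraphs or sentences as the unit changes which words count as appearing "together", so it is worth deciding before any analysis is run.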
 Word extraction (stemming, parts of speech)
 Supports three methods
 Supports a stop word dictionary
KH Coder processes
- Pre-processing
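
As a rough illustration of word extraction with a stop word list, the sketch below tokenises a sentence, drops stop words, and applies a toy suffix stripper. The slides do not name the three extraction methods, so the stemmer here is a stand-in and not one of them; `STOP_WORDS`, `crude_stem` and `extract_words` are hypothetical names.

```python
import re
from collections import Counter

# Illustrative stop word list; a real dictionary would be much longer.
STOP_WORDS = {"a", "an", "and", "at", "is", "of", "the", "to"}

def crude_stem(word):
    """Toy suffix stripper; a real stemmer handles far more cases."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def extract_words(text):
    """Lower-case, tokenise, remove stop words, then stem what is left."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

counts = Counter(extract_words("Workers reported the hazards and the report is helping"))
print(counts.most_common(3))
```

Stemming merges "reported" and "report" into one entry, which is why the choice of extraction method affects every later frequency and distance calculation.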
Distances
https://earth.google.com/web/
 Supports three distance measures
KH Coder processes
- Choose distance measure
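
To show how the choice of distance measure changes the result, here is a minimal sketch of three common measures. The slides do not name the three measures, so using Jaccard, Euclidean and cosine is an assumption; the two words and their document occurrences are made-up data.

```python
import math

def jaccard_distance(a, b):
    """1 minus the share of documents that contain both words."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def euclidean_distance(x, y):
    """Straight-line distance between two occurrence vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def cosine_distance(x, y):
    """1 minus the cosine of the angle between two occurrence vectors."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm = math.sqrt(sum(xi * xi for xi in x)) * math.sqrt(sum(yi * yi for yi in y))
    if norm == 0:
        return 1.0
    return 1.0 - dot / norm

# Hypothetical data: which of four documents contain each word.
docs_with_safety = {"d1", "d2", "d3"}
docs_with_helmet = {"d2", "d3", "d4"}
safety_vec = [1, 1, 1, 0]
helmet_vec = [0, 1, 1, 1]

jac = jaccard_distance(docs_with_safety, docs_with_helmet)
euc = euclidean_distance(safety_vec, helmet_vec)
cos = cosine_distance(safety_vec, helmet_vec)
print(jac, euc, cos)
```

The same pair of words gets three different distances, so plots built from different measures are not directly comparable.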
Projections
(Dimensional reduction)
https://en.wikipedia.org/wiki/Winkel_tripel_projection
https://en.wikipedia.org/wiki/Mercator_projection
https://en.wikipedia.org/wiki/Mollweide_projection
https://en.wikipedia.org/wiki/Gall%E2%80%93Peters_projection
https://en.wikipedia.org/wiki/Azimuthal_equidistant_projection
https://en.wikipedia.org/wiki/Bernard_J._S._Cahill
 Supports four methods
KH Coder processes
- Choose dimensional reduction
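
Like a map projection, dimensional reduction places items on a flat plot while trying to preserve their distances. The sketch below is a pure-Python version of classical multidimensional scaling, chosen only because it is one of the simplest variants; the slides do not say which four methods are meant. The function name, the power-iteration approach and the sample distance matrix are all assumptions made for illustration.

```python
import math
import random

def classical_mds(dist, dims=2, iterations=200, seed=0):
    """Embed a symmetric distance matrix in `dims` dimensions."""
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]

    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Power iteration finds the dominant remaining eigenvector of B.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        start_norm = math.sqrt(sum(x * x for x in v))
        v = [x / start_norm for x in v]
        for _ in range(iterations):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
        scale = math.sqrt(max(eigenvalue, 0.0))
        for i in range(n):
            coords[i][k] = v[i] * scale
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * v[i] * v[j]
    return coords

# Hypothetical distances between four items (corners of a 3 x 4 rectangle).
dist = [
    [0.0, 3.0, 5.0, 4.0],
    [3.0, 0.0, 4.0, 5.0],
    [5.0, 4.0, 0.0, 3.0],
    [4.0, 5.0, 3.0, 0.0],
]
coords = classical_mds(dist)
for point in coords:
    print([round(c, 3) for c in point])
```

The resulting coordinates are only defined up to rotation and reflection, so the position of an item on the plot matters less than which items it sits close to.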
More examples
7,074 words from the @WorkSafe_Vic Twitter account between 7 Jan 2020
and 28 Apr 2020
35,087 words that mention the @WorkSafe_Vic Twitter account between
7 Jan 2020 and 28 Apr 2020
Thank you for your time
