Aadish Chopra
Natural Language Processing and Its
Application in Video Transcript Analysis & Survey Building
Video Transcripts
The same utterance in a healthcare transcript can be transcribed in several ways:
Doctor: How are you ? (Smiles)
Doctor: How are you ?
Doctor: Hwwrru ?
Transcripts: tf-idf
Exploratory analysis via term frequency-inverse document frequency (tf-idf)
This tells us what each transcript is talking about
Word frequency vectors can be formed for each transcript
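A minimal sketch of tf-idf weighting, assuming whitespace tokenization and the plain log(N / df) form of inverse document frequency. The transcripts and the function name `tf_idf` are invented for illustration; libraries may use smoothed variants of the formula.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: tf-idf weight} dict per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            # Term frequency times inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical transcripts, invented for illustration.
transcripts = [
    "doctor how are you feeling today",
    "patient reports chest pain today",
    "doctor prescribes rest for chest pain",
]
scores = tf_idf(transcripts)
for index, weights in enumerate(scores):
    top = sorted(weights, key=weights.get, reverse=True)[:3]
    print(index, top)
```

Terms that occur in few transcripts get the highest weights, which is why they hint at what each transcript is about.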
Transcripts: Bag of Words
Two approaches can be followed:
• Word frequency
• Manual
• Open source libraries
Merits
• Computation is less expensive
Demerits
• Poor in situations where context is meaningful
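The demerit can be seen in a small sketch, assuming whitespace tokenization. The two sentences are invented for illustration: they mean different things, yet their bags of words are identical because word order is discarded.

```python
from collections import Counter

def bag_of_words(text):
    """Count how often each lowercase token occurs, ignoring word order."""
    return Counter(text.lower().split())

# Hypothetical sentences with opposite meaning but the same words.
a = bag_of_words("the doctor thanks the patient")
b = bag_of_words("the patient thanks the doctor")

same_bag = a == b
print(a)
print(same_bag)
```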
Transcripts: BOW
Open source libraries whose Java implementations are available in both R and Python:
https://wordnet.princeton.edu/
http://mpqa.cs.pitt.edu/lexicons/subj_lexicon/
http://www.wjh.harvard.edu/~inquirer/homecat.htm
https://medium.com/mlreview/understanding-lstm-and-its-diagrams-37e2f46f1714
Example of Bag of Words
A look into the bag of words approach
Type           Len  Word           Stemmed  Pos     Priorpolarity
Strongsubject  1    acrimoniously  N        Anypos  Negative
Weaksubject    1    active         N        Adj     Positive
Strongsubject  1    acumen         N        Noun    Positive
Strongsubject  1    adamant        N        Adj     Negative
Weaksubject    1    admission      N        Noun    Positive
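A minimal sketch of how such a lexicon could be used to score text, assuming only the five rows listed above. The weighting (strong entries count double) and the function name `polarity_score` are assumptions made for illustration, not part of the lexicon itself.

```python
# Rows copied from the lexicon excerpt above: word -> (type, prior polarity).
LEXICON = {
    "acrimoniously": ("strong", "negative"),
    "active": ("weak", "positive"),
    "acumen": ("strong", "positive"),
    "adamant": ("strong", "negative"),
    "admission": ("weak", "positive"),
}

# Assumed weights: a strongly subjective word counts twice as much as a weak one.
TYPE_WEIGHT = {"strong": 2, "weak": 1}
POLARITY_SIGN = {"positive": 1, "negative": -1}

def polarity_score(text):
    """Sum the signed weights of all lexicon words found in the text."""
    score = 0
    for token in text.lower().split():
        token = token.strip(".,!?")
        if token in LEXICON:
            strength, polarity = LEXICON[token]
            score += TYPE_WEIGHT[strength] * POLARITY_SIGN[polarity]
    return score

print(polarity_score("The doctor was adamant."))
print(polarity_score("An active patient with acumen"))
```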
Word2vec and LSTM
Word2vec is particularly useful for capturing the
meaning of words. It learns a vector for each word from
the context words around the center word.
LSTM (long short-term memory) models are resource intensive
and are typically trained on a GPU, since their essential elements
are memory cells inside a recurrent neural network
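The idea of context words around a center word can be sketched as follows. This only generates the (center, context) training pairs used by the skip-gram variant of word2vec, not the model itself; the window size and the function name `skipgram_pairs` are assumptions for illustration.

```python
def skipgram_pairs(tokens, window=2):
    """Pair every center word with each word at most `window` positions away."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

tokens = "doctor how are you".split()
pairs = skipgram_pairs(tokens, window=1)
for center, context in pairs:
    print(center, "->", context)
```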
Video Transcripts
What can we find out?
• Emotions: We can tell users what kind of video it is. If we know a user's preferences, we can use cosine similarity to recommend videos whose type of content matches them
  • Comedy, romance, action
• Context: We can tell what a video is about
• Advertisement insertion points: Google's biggest announcement was that advertisers will soon be able to target viewers based on their Google search history, in addition to the viewing behaviors that YouTube was already targeting
• From healthcare videos we can infer how a patient and a doctor interact
• Unusual events, such as two ads merged into one video, can easily be detected
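A minimal sketch of the cosine similarity idea from the Emotions point, assuming each video and each user is described by scores for (comedy, romance, action). All names and numbers are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical scores in the order (comedy, romance, action).
user_preferences = [0.9, 0.1, 0.3]
videos = {
    "video_a": [0.8, 0.2, 0.1],
    "video_b": [0.1, 0.9, 0.2],
    "video_c": [0.2, 0.1, 0.9],
}

# Rank videos from most to least similar to the user's preferences.
ranked = sorted(
    videos,
    key=lambda name: cosine_similarity(user_preferences, videos[name]),
    reverse=True,
)
print(ranked)
```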
Survey
Problem statement: FocusVision has a fixed number of question types for a survey.
Suppose a customer, John, comes to us for the first time from the Healthcare category.
After John builds his survey, we can create a few more questions in that category with his help.
We can recommend questions based on their similarity, measured with word vectors.
Or, if we know the category of the survey, we can suggest our own custom template.
For example, a question can fall into any of the following categories:
• Healthcare
• Market Research
• Greetings
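A minimal sketch of recommending questions by word-vector similarity, assuming each question is represented by the average of its word vectors. The three-dimensional vectors, the repository questions, and the function names are all invented for illustration; a real system would load trained embeddings such as word2vec.

```python
import math

# Hypothetical word vectors, invented for illustration only.
WORD_VECTORS = {
    "doctor": [0.9, 0.1, 0.0],
    "patient": [0.8, 0.2, 0.0],
    "visit": [0.7, 0.1, 0.1],
    "price": [0.0, 0.9, 0.1],
    "brand": [0.1, 0.8, 0.2],
}

def question_vector(question):
    """Average the vectors of the known words in a question."""
    vectors = [WORD_VECTORS[w] for w in question.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0

def recommend(new_question, repository, top_n=1):
    """Return the repository questions most similar to the new question."""
    target = question_vector(new_question)
    ranked = sorted(
        repository,
        key=lambda q: cosine(target, question_vector(q)),
        reverse=True,
    )
    return ranked[:top_n]

# Hypothetical question repository.
repository = [
    "which brand has the best price",
    "how was your last doctor visit",
]
suggestions = recommend("did the doctor listen to the patient", repository)
print(suggestions)
```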
Recommendation Engine
We will first build a repository of questions and then evolve our model using each user's interaction parameters.
Until enough interactions have been collected, the model might suffer from the cold start problem.
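One possible way to soften the cold start, sketched under assumptions: when a customer has no interaction history yet, fall back to a category template, and otherwise rank by past interactions. The template contents, the history format, and the function name `suggest_questions` are hypothetical.

```python
from collections import Counter

# Hypothetical category templates, invented for illustration.
TEMPLATES = {
    "Healthcare": ["How would you rate your last appointment?"],
    "Market Research": ["How likely are you to recommend this product?"],
}

def suggest_questions(category, accepted_history, top_n=3):
    """Suggest questions for a customer.

    accepted_history: questions the customer accepted in earlier surveys.
    """
    if not accepted_history:
        # Cold start: no interactions yet, so fall back to the category template.
        return TEMPLATES.get(category, [])[:top_n]
    # Otherwise prefer the questions this customer accepted most often.
    counts = Counter(accepted_history)
    return [question for question, _ in counts.most_common(top_n)]

print(suggest_questions("Healthcare", []))
print(suggest_questions("Healthcare", ["q1", "q2", "q1"], top_n=1))
```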
THANK YOU