How Does word2vec Work? 
Andrew Koo - Insight Data Science
word2vec (Google, 2013) 
• Use documents to train a neural network model that maximizes the conditional probability of the context given the word
• Apply the trained model to each word to get its corresponding vector
• Calculate the vector of each sentence by averaging the vectors of its words
• Construct the similarity matrix between sentences
• Use PageRank to score the sentences in the graph
1. Use documents to train a neural network model maximizing the conditional probability of context given the word
The goal is to find the parameters (θ) that maximize the conditional probability of the context (c) given the word (w), where D is the set of all (w, c) pairs.
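The formula shown on this slide did not survive extraction. One common way to write the skip-gram objective that matches the description is given here; the softmax form of p(c | w; θ), the vectors v_w and v_c, and the context set C are assumptions and are not taken from the slide.

```latex
\arg\max_{\theta} \prod_{(w,c) \in D} p(c \mid w; \theta),
\qquad
p(c \mid w; \theta) = \frac{e^{v_c \cdot v_w}}{\sum_{c' \in C} e^{v_{c'} \cdot v_w}}
```

Here v_w and v_c are the vectors for the word and the context (both part of θ), and C is the set of all possible contexts.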
For example, the context "I ate a ____ at McDonald's last night" is more likely given the word "Big Mac".
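The set D of (w, c) pairs is usually built by sliding a window over the text. The sketch below shows one way to do that, assuming whitespace tokenization, a symmetric window of two tokens, and "Big_Mac" pre-joined into a single token. None of these choices come from the slides, and `skipgram_pairs` is a hypothetical helper.

```python
def skipgram_pairs(tokens, window=2):
    """Return (word, context) pairs for every token and its neighbours."""
    pairs = []
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((word, tokens[j]))
    return pairs


# Assumption: "Big_Mac" is treated as one token.
tokens = "I ate a Big_Mac at McDonald's last night".split()
D = skipgram_pairs(tokens, window=2)
```

With this window, the pairs for "Big_Mac" include its two neighbours on each side: "ate", "a", "at" and "McDonald's".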
2. Apply the model to each word to get its corresponding vector

word        vector
The         (0.12, 0.23, 0.56)
Cardinals   (0.24, 0.65, 0.72)
will        (0.38, 0.42, 0.12)
win         (0.57, 0.01, 0.02)
the         (0.53, 0.68, 0.91)
world       (0.11, 0.27, 0.45)
series      (0.01, 0.05, 0.62)
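In word2vec, "applying the model" to a word amounts to reading one row of the trained input weight matrix. The sketch below illustrates this with the toy vectors from the slide, paired with the words in listing order (an assumption). The names `W`, `one_hot` and `embed` are hypothetical.

```python
# Toy vocabulary and embedding table built from the slide's illustrative values.
vocab = ["The", "Cardinals", "will", "win", "the", "world", "series"]
W = [
    [0.12, 0.23, 0.56],
    [0.24, 0.65, 0.72],
    [0.38, 0.42, 0.12],
    [0.57, 0.01, 0.02],
    [0.53, 0.68, 0.91],
    [0.11, 0.27, 0.45],
    [0.01, 0.05, 0.62],
]
word_to_index = {word: i for i, word in enumerate(vocab)}


def one_hot(index, size):
    """Input encoding of a word: 1.0 at its index, 0.0 elsewhere."""
    return [1.0 if k == index else 0.0 for k in range(size)]


def embed(word):
    """Multiply the one-hot vector by W, which selects a single row of W."""
    x = one_hot(word_to_index[word], len(vocab))
    dim = len(W[0])
    return [sum(x[i] * W[i][d] for i in range(len(vocab))) for d in range(dim)]


cardinals_vector = embed("Cardinals")
```

In practice the multiplication is skipped and the row is indexed directly, for example `W[word_to_index["Cardinals"]]`.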
3. Calculate the vector of sentences by averaging the vectors of their words

word        vector
The         (0.12, 0.23, 0.56)
Cardinals   (0.24, 0.65, 0.72)
will        (0.38, 0.42, 0.12)
win         (0.57, 0.01, 0.02)
the         (0.53, 0.68, 0.91)
world       (0.11, 0.27, 0.45)
series      (0.01, 0.05, 0.62)

sentence vector: (0.28, 0.33, 0.49)
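The averaging step can be checked directly against the numbers on the slide. The sketch below is plain Python; `average` is a hypothetical helper name.

```python
# Word vectors for "The Cardinals will win the world series", as listed on the slide.
word_vectors = [
    [0.12, 0.23, 0.56],
    [0.24, 0.65, 0.72],
    [0.38, 0.42, 0.12],
    [0.57, 0.01, 0.02],
    [0.53, 0.68, 0.91],
    [0.11, 0.27, 0.45],
    [0.01, 0.05, 0.62],
]


def average(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


sentence_vector = average(word_vectors)
rounded = [round(x, 2) for x in sentence_vector]
print(rounded)  # [0.28, 0.33, 0.49]
```

The component sums are 1.96, 2.31 and 3.40; dividing each by 7 gives the sentence vector shown on the slide after rounding to two decimals.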
4. Construct the similarity matrix between sentences

Stack the sentence vectors S1 to S5 as the rows of a matrix; matrix * matrix.T gives the similarity matrix:

        S1      S2      S3      S4      S5
S1      1       0.366   0.243   0.564   0.720
S2      0.366   1       0.623   0.132   0.189
S3      0.243   0.623   1       0.014   0.523
S4      0.564   0.132   0.014   1       0.002
S5      0.720   0.189   0.523   0.002   1
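The diagonal of 1s suggests the sentence vectors are scaled to unit length before the product, which makes each entry a cosine similarity. The sketch below works under that assumption. The three sentence vectors are made up for illustration and are not from the slides.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def similarity_matrix(vectors):
    """Equivalent to matrix * matrix.T on row-normalized vectors."""
    unit = [normalize(v) for v in vectors]
    return [[sum(a * b for a, b in zip(u, w)) for w in unit] for u in unit]


# Hypothetical sentence vectors, one row per sentence.
sentences = [
    [0.28, 0.33, 0.49],
    [0.10, 0.80, 0.05],
    [0.60, 0.20, 0.30],
]
S = similarity_matrix(sentences)
```

The result is symmetric with 1s on the diagonal, the same shape of matrix as on the slide.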
5. Use PageRank to score the sentences in the graph
• Rank the sentences, with the underlying assumption that summary sentences are similar to most other sentences
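The scoring step can be sketched as a weighted PageRank over the similarity matrix from step 4, with sentences as nodes and similarities as edge weights. The damping factor of 0.85, the fixed iteration count, and dropping the diagonal are assumptions; the slides do not specify them.

```python
def pagerank(weights, damping=0.85, iterations=100):
    """Power iteration for weighted PageRank on a similarity matrix."""
    n = len(weights)
    # Drop self-similarity so a sentence does not vote for itself.
    w = [[0.0 if i == j else weights[i][j] for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in w]  # total edge weight leaving each node
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1.0 - damping) / n
            + damping * sum(w[j][i] / out_sum[j] * scores[j] for j in range(n))
            for i in range(n)
        ]
    return scores


# Similarity matrix for S1..S5 from the slide.
S = [
    [1.0, 0.366, 0.243, 0.564, 0.720],
    [0.366, 1.0, 0.623, 0.132, 0.189],
    [0.243, 0.623, 1.0, 0.014, 0.523],
    [0.564, 0.132, 0.014, 1.0, 0.002],
    [0.720, 0.189, 0.523, 0.002, 1.0],
]
scores = pagerank(S)
ranking = sorted(range(len(S)), key=lambda i: scores[i], reverse=True)
```

Under these assumptions S1 (index 0) ranks first and S4 (index 3) ranks last. This matches the intuition on the slide: S1 has the largest total similarity to the other sentences, and S4 the smallest.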



