This talk took place during the Engineering and Tech Exhibition organised by the University of Bristol on 9th October 2019.
In this talk, Rad Paluszak from SUSO Digital shows how one of the biggest and most sophisticated companies in the world (Google) uses machine learning in its flagship product, Search. The talk gives some background on how Google's algorithm works from a web developer and SEO perspective, and on which elements of it are gradually moving towards being driven by machine learning and neural matching. You will also learn how Google dynamically creates training data sets at scale and ensures they're reliable.
Machine Learning in Google Algorithm - Where? What? How?
2. Machine Learning in Google Algorithm
• Machine Learning
• Deep Learning
• Hummingbird
• RankBrain
• Natural Language Processing & Understanding
• Neural Matching
• Crawling and Indexing
• Penalty Process
• Large Scale Training Data Sets
• Summary
What am I talking about?
3. Rad Paluszak
• “SEO” birthday - 2010 (Caffeine update)
• Web developer “at heart”
• Algorithms <3
• Machine Learning <3
• Data Mining <3
• “Technical SEO Artist”
• @radpaluszak
• rad@paluszak.me
• rad@susodigital.com
Director of Technology
4. Machine Learning
Machine learning is a method of data analysis that
automates analytical model building. It is a branch of
artificial intelligence based on the idea that systems can
learn from data, identify patterns and make decisions with
minimal human intervention.
5. Machine Learning
Machine learning is a technology which, instead of
programming the computers in a very precise manner,
allows them to perform tasks based on what they learn
through data analysis.
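The contrast between programming a rule and learning it from data can be sketched with a toy example. This is only an illustration, not something shown in the talk: the data points and the helper names `fit_line` and `predict` are invented, and an ordinary least squares fit stands in for "learning".

```python
def fit_line(points):
    """Learn y = slope * x + intercept from (x, y) pairs via least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned line to an unseen input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data; nobody tells the program the rule behind it.
training_data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
model = fit_line(training_data)
print(predict(model, 5.0))
```

The relationship between input and output is never written into the program; it is recovered from the examples and then applied to an input the program has not seen.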
6. Deep Learning
Deep learning is part of a broader family of machine learning
methods based on learning data representations (e.g. ontology), as
opposed to task-specific algorithms.
A deep neural network (DNN) is an artificial neural network (ANN)
with multiple layers between the input and output layers. The DNN
finds the correct mathematical manipulation to turn the input into the
output, whether it be a linear relationship or a non-linear
relationship. The network moves through the layers calculating the
probability of each output.
Image credit: Glosser.ca, CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0), from Wikimedia Commons
https://commons.wikimedia.org/wiki/File:Colored_neural_network.svg
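The description of a DNN moving through its layers and calculating the probability of each output can be sketched as a small forward pass. This is a minimal illustration under stated assumptions: the layer sizes, the weights and the choice of ReLU and softmax are hypothetical and hand-picked, whereas a real network would learn its weights from data.

```python
import math


def relu(values):
    """Non-linear activation applied between layers."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw output scores into probabilities that sum to 1."""
    highest = max(values)
    exps = [math.exp(v - highest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """One fully connected layer; weights holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass inputs through every layer: ReLU in between, softmax at the end."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return softmax(activations)


# Hypothetical weights for illustration only.
layers = [
    ([[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]], [0.0, 0.1, 0.0]),  # 2 inputs -> 3 hidden
    ([[1.0, -1.0, 0.5], [-0.5, 1.0, 1.0]], [0.0, 0.0]),  # 3 hidden -> 2 outputs
]
probabilities = forward([1.0, 2.0], layers)
print(probabilities)
```

Each entry in `probabilities` corresponds to one output class, which is the "probability of each output" the slide refers to.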
7. Hummingbird
Codename given to a significant algorithm
change in Google Search in 2013.
Its name was derived from the speed and
accuracy of the hummingbird.
Conversational Search
Natural Language Processing
Query Intent
Semantic Model Analysis
New Ranking Signals
Importance of Authority
Long-tail Focused
17. Google Search Results
• Over 70k searches every SECOND
• 15% of queries have never been seen by Google
• ~500 million brand new queries a day
https://www.cnet.com/news/google-search-scratches-its-brain-500-million-times-a-day/
18. RankBrain
RankBrain is an artificial intelligence system, the use
of which was confirmed by Google on 26 October
2015. It helps Google to process search results and
provide more relevant search results for users.
If RankBrain sees a word or phrase it isn’t familiar
with, the machine can make a guess as to what words
or phrases might have a similar meaning and filter the
results accordingly, making it more effective at
handling never-before-seen search queries or
keywords.
19. RankBrain
Understands the similarity of queries based on
multi-dimensional vector space analysis and the
proximity of one query to another.
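Proximity between queries in a vector space is commonly measured with cosine similarity. The sketch below is an assumption for illustration, not Google's implementation: the three-dimensional vectors are invented, and `cosine_similarity` and `most_similar` are hypothetical helper names. Real systems use learned vectors with far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical query vectors, made up for this example.
query_vectors = {
    "cheap flights to bristol": [0.9, 0.1, 0.2],
    "low cost plane tickets bristol": [0.8, 0.2, 0.3],
    "how to train a neural network": [0.1, 0.9, 0.7],
}


def most_similar(query, vectors):
    """Return the other query whose vector lies closest to the given one."""
    target = vectors[query]
    scored = [
        (other, cosine_similarity(target, vector))
        for other, vector in vectors.items()
        if other != query
    ]
    return max(scored, key=lambda pair: pair[1])


nearest, score = most_similar("cheap flights to bristol", query_vectors)
print(nearest, round(score, 3))
```

Two queries that share almost no words can still end up close together, which is how a never-before-seen query can be mapped to results for a familiar one.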
20. Key | Value
First Name | [Rad, Radek, Radoslaw]
Last Name | [Paluszak]
Gender | [M]
Profession | [Google, SEO, CTO, Programming, Web, Webdev, …]
Company | [TSI, SUSO Digital, Search Logistics, The Search Initiative …]
Events | [Engineering and Tech Exhibition, Chiang Mai SEO, Affiliate World, Marketing Insights]
Universities Associated | [Poznan University of Technology, Warwick Business School, University of Bristol]
Related People | [Matt Diggity, Craig Campbell, Matthew Woodward, …]
… | …

Key | Value
Name | [Engineering and Tech Exhibition]
Organiser | [University of Bristol, Lucy Browning]
Address | [Colston Hall, Colston St, Bristol, BS1 5AR]
Talk Title | [Web, Machine Learning, Google, SEO, Marketing, Algorithms, Programming]
Related Topics | [SEO Ecommerce, Linkbuilding]
Related Entity | [SUSO Digital, Google, University of Bristol]
Coordinates | [51° 27' 21.096" N, 2° 35' 51.18" W]
… | …
21. Key | Value
First Name | [Rad, Radek, Radoslaw]
Last Name | [Paluszak]
Gender | [M]
Work-related Entities | [Google, SEO, CTO, Programming, Web, Webdev, …]
Company | [TSI, SUSO Digital, Search Logistics, The Search Initiative …]
Events | [Engineering and Tech Exhibition, Chiang Mai SEO, Affiliate World, Marketing Insights]
Universities Associated | [Poznan University of Technology, Warwick Business School, University of Bristol]
Related People | [Matt Diggity, Craig Campbell, Matthew Woodward, …]
… | …

Key | Value
Name | [Engineering and Tech Exhibition]
Organiser | [University of Bristol, Lucy Browning]
Address | [Colston Hall, Colston St, Bristol, BS1 5AR]
Talk Topics | [Web, Machine Learning, Google, SEO, Marketing, Algorithms, Programming]
Related Topics | [SEO Ecommerce, Linkbuilding]
Related Entity | [SUSO Digital, Google, University of Bristol]
Coordinates | [51° 27' 21.096" N, 2° 35' 51.18" W]
… | …
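The two key/value tables above can be read as entity records whose shared values link them together. The sketch below is a hedged illustration rather than a known internal representation: it copies a subset of the values from the slides (elided "…" entries are left out), and `shared_values` is a hypothetical helper.

```python
# Entity records modelled as plain dictionaries of key -> list of values.
person = {
    "First Name": ["Rad", "Radek", "Radoslaw"],
    "Last Name": ["Paluszak"],
    "Company": ["TSI", "SUSO Digital", "Search Logistics", "The Search Initiative"],
    "Events": ["Engineering and Tech Exhibition", "Chiang Mai SEO"],
    "Universities Associated": [
        "Poznan University of Technology",
        "Warwick Business School",
        "University of Bristol",
    ],
}

event = {
    "Name": ["Engineering and Tech Exhibition"],
    "Organiser": ["University of Bristol", "Lucy Browning"],
    "Related Entity": ["SUSO Digital", "Google", "University of Bristol"],
}


def shared_values(entity_a, entity_b):
    """Values that appear in both entities, regardless of which key holds them."""
    values_a = {value for values in entity_a.values() for value in values}
    values_b = {value for values in entity_b.values() for value in values}
    return sorted(values_a & values_b)


links = shared_values(person, event)
print(links)
```

The overlapping values are what connect the person entity to the event entity, even though they sit under different keys in each record.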
22. RankBrain in Search Results
[Diagram stages: User Query, Query Parsing, NLP, Intent Analysis, Relevance Matching, Post-processing, Results, Results Satisfaction Analysis (CTR, BR)]
29. Neural Matching
Neural Matching – AI method to better
connect words to concepts.
Introduced in 2018 (officially confirmed in
September 2018).
Affects ~30% of all queries.
31. “Google always tries to predict
your site structure and assess
what’s worth crawling and what
is not.”
34. I didn’t learn anything
new about the site
structure.
Googlebot image credits:
http://www.thesempost.com/blocking-googlebot-with-bad-bot-scripts-wordfence/
https://www.seroundtable.com/google-crawl-report-problem-19894.html
Scheduler for search engine crawler:
https://patents.google.com/patent/US7725452B1/en
35. “Manual Penalty process starts
with a classification of
suspicious patterns detected by
machine learning algorithms.”
38. Training Data Set – Search Quality Evaluators
https://static.googleusercontent.com/media/guidelines.raterhub.com/en//searchqualityevaluatorguidelines.pdf
41. Thank You!
RAD PALUSZAK
Director of Technology at SUSO Digital
rad@susodigital.com
rad@paluszak.me
@radpaluszak
Email Subject:
Bristol Machine Learning Talk 2019