Screaming Frog + XPath
Crawl specific page elements on your own or
your competitors' websites
@Urlaubspiraten #SEO4Pirates Nr.6
Sabine Langmann
Level 1
23.08.2018 Sabine Langmann Bit.ly/abjsd
What is this about?
We'd like to: crawl specific
elements on our own web pages
or those of our competitors
We use: Screaming Frog's
Custom Search + XPath
Level 2
Who am I?
Geek. SEO. Cake.
http://sabine-langmann.com
https://www.linkedin.com/in/sabine-langmann/
@SabTheLa
@sababeille
Slides available at:
/SabineLangmann
Level 3
XPath
XPath (XML Path Language) is a
query language
for selecting nodes
from an XML document.
Wikipedia
Simple Syntax
node         any page element (e.g. h2, a, p, div)
//           selects a node anywhere in the document
attribute    a property of a node (e.g. class, id)
@            addresses a specific attribute
count()      counts the addressed nodes
//node[@attribute="attribute_name"]
bit.ly/2o3vJ5O
count(//node[@attribute="attribute_name"])
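To make the two patterns above concrete, here is a minimal sketch in Python. It is not how Screaming Frog evaluates XPath internally; it uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath, needs a leading `.` before `//`, requires well-formed markup, and has no `count()` (so `len()` stands in for it). The sample page and the class name `teaser` are made up for illustration. Note that the quoted part of the pattern is matched against the attribute's value.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (not taken from the slides).
PAGE = """
<html><body>
  <div id="intro"><p>Hello</p></div>
  <div class="teaser"><p>One</p></div>
  <div class="teaser"><p>Two</p></div>
</body></html>
""".strip()

root = ET.fromstring(PAGE)

# //node[@attribute="value"] -> every <div> whose class attribute equals "teaser".
# ElementTree needs the leading "." to mean "anywhere below the root".
teasers = root.findall('.//div[@class="teaser"]')
teaser_texts = [div.find("p").text for div in teasers]

# count(//node[@attribute="value"]) -> ElementTree has no count(), so use len().
teaser_count = len(teasers)

print(teaser_texts)
print(teaser_count)
```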
Level 4
Some use cases
Urlaubspiraten
vs.
Sonnenklar TV
How many images?
How many H2, H3, etc.?
How many words?
How many links to which pages?
Step #1
Step #2
What am I searching for?
Step #3
In
<div class="htmlContent">
I'm searching for links, i.e.
<a> tags
Suitable XPath selector:
//div[@class="htmlContent"]//a
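As a rough illustration of what this selector picks up, the sketch below runs it against a small, invented page with Python's standard-library `ElementTree` (an assumption for demonstration only; it needs well-formed markup and a leading `.`). The URLs and link texts are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: one navigation link outside the content area, two links inside it.
PAGE = """
<html><body>
  <a href="/home">Home</a>
  <div class="htmlContent">
    <p>Read our <a href="/deals/mallorca">Mallorca deal</a>.</p>
    <ul><li><a href="/deals/rome">Rome deal</a></li></ul>
  </div>
</body></html>
""".strip()

root = ET.fromstring(PAGE)

# All <a> tags at any depth inside <div class="htmlContent">.
links = root.findall('.//div[@class="htmlContent"]//a')

# Collect the link targets and anchor texts.
hrefs = [a.get("href") for a in links]
anchors = [a.text for a in links]

print(hrefs)
print(anchors)
```

The navigation link is ignored because it sits outside the addressed `<div>`, which is the point of scoping the selector to one container.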
Step #4
Result
etc.
In
<div class="container">
I'm searching for
the number of <h3> headings
Suitable XPath selector:
count(//div[@class="container"]//h3)
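The same count can be sketched in Python for any heading level. This assumes standard-library `ElementTree`, which has no `count()`, so the helper below (a hypothetical name, `count_in_container`) wraps `len()` around the node list. The sample markup is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: headings inside and outside <div class="container">.
PAGE = """
<html><body>
  <h3>Footer teaser</h3>
  <div class="container">
    <h2>Top deals</h2>
    <h3>Flights</h3>
    <div><h3>Hotels</h3></div>
  </div>
</body></html>
""".strip()

root = ET.fromstring(PAGE)


def count_in_container(tree_root, tag):
    """Emulate count(//div[@class="container"]//tag) with len()."""
    return len(tree_root.findall(f'.//div[@class="container"]//{tag}'))


h2_count = count_in_container(root, "h2")
h3_count = count_in_container(root, "h3")

print(h2_count, h3_count)
```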
Result
Gutscheinsammler
vs.
Sparwelt
Which star rating?
How many words in the intro?
How many active coupon codes?
Suitable XPath selector:
//button[@data-vouchertype="is__discounts"]//span
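This selector addresses a node by a `data-` attribute and returns the text of the `<span>` inside it. The slides do not show the underlying markup, so the page below is a guess made purely for illustration; the sketch uses Python's standard-library `ElementTree` with well-formed input.

```python
import xml.etree.ElementTree as ET

# Hypothetical filter buttons; the real page structure is not shown in the slides.
PAGE = """
<html><body>
  <button data-vouchertype="is__all"><span>All (12)</span></button>
  <button data-vouchertype="is__discounts"><span>Discounts (7)</span></button>
</body></html>
""".strip()

root = ET.fromstring(PAGE)

# <span> elements inside the button whose data-vouchertype equals "is__discounts".
spans = root.findall('.//button[@data-vouchertype="is__discounts"]//span')

# itertext() also collects text from nested child elements, if there are any.
span_texts = ["".join(span.itertext()).strip() for span in spans]

print(span_texts)
```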
Result
How many active coupon codes?
Suitable XPath selector:
count(
//div[@class="media-list vouchers-active"]
//div[@class="col-xs-12 col-sm-5"]
//span[@class="text"]
)
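A sketch of this chained count, again with standard-library `ElementTree` and `len()` in place of `count()`. The markup and coupon texts are invented. One thing worth knowing: `@class="..."` compares the whole attribute string, so a `<div>` with the same classes in a different order, or with an extra class, would not match.

```python
import xml.etree.ElementTree as ET

# Hypothetical coupon listing with an active and an expired block.
PAGE = """
<html><body>
  <div class="media-list vouchers-active">
    <div class="row">
      <div class="col-xs-12 col-sm-5"><span class="text">10% off</span></div>
    </div>
    <div class="row">
      <div class="col-xs-12 col-sm-5"><span class="text">Free shipping</span></div>
    </div>
  </div>
  <div class="media-list vouchers-expired">
    <div class="row">
      <div class="col-xs-12 col-sm-5"><span class="text">Old code</span></div>
    </div>
  </div>
</body></html>
""".strip()

# The three steps of the selector, each narrowing the previous match.
SELECTOR = (
    './/div[@class="media-list vouchers-active"]'
    '//div[@class="col-xs-12 col-sm-5"]'
    '//span[@class="text"]'
)

root = ET.fromstring(PAGE)

# One matching <span> per active coupon; the expired block is never entered.
active_codes = len(root.findall(SELECTOR))

print(active_codes)
```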
Result
Kino.de
vs.
Serienjunkies
How many words in the description?
Suitable XPath selector:
//section[@class="smb-post-body"]//p
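The selector only returns the paragraphs; the word count is a second step on the extracted text. A minimal sketch of both steps, assuming standard-library `ElementTree`, well-formed markup, and a simple whitespace split as the definition of a "word". The sample description is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical article page: a description section and an unrelated comments section.
PAGE = """
<html><body>
  <section class="smb-post-body">
    <p>A retired hitman returns for <em>one last job</em>.</p>
    <p>Critics loved it.</p>
  </section>
  <section class="comments"><p>First!</p></section>
</body></html>
""".strip()

root = ET.fromstring(PAGE)

# Step 1: all <p> elements inside <section class="smb-post-body">.
paragraphs = root.findall('.//section[@class="smb-post-body"]//p')

# Step 2: join the text of each paragraph (including nested tags) and split on whitespace.
description = " ".join("".join(p.itertext()) for p in paragraphs)
word_count = len(description.split())

print(word_count)
```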
Result
How many words in the description?
Suitable XPath selector:
//section//p
Result
Level 5
Recap:
Which data do I need?
Can I crawl the respective elements?
What is the right XPath selector?
That's it!