SOB 2022
Crawl capacity management
on Envato Elements
Gastón Riera - @gastonriera
Heads up:
Slides in English
Speako en Español
At the end I'll share a 90%
discount for Elements! 
That's how I used to look,
all well dressed and all.
Gastón Riera
Everything you need to get your
creative projects done.
The big names:
Other very cool products:
The two things I like the most about working at Envato:
- Being sustainable and caring about the community
- Fully remote (ANZ/MX) and working from abroad
Let's get into SEO 
The problem:
A big part of the site was
not being indexed 
It looked like Google was not recrawling a
large number of pages!
And it was taking very long to recrawl
some pages.
Was Google actually
crawling the site?
Yes, it was.
But it wasn't getting to
crawl the entire site.
And why was that?
No idea.
Just kidding.
We came up with two
theories to work on.
We needed to work on:
- Content quality
- Internal linking
I'll get to them in a bit.
What did we do? 
The basics!
- Noindex
- Redirects
- Nofollow
- Crawl paths (more/less)
Gastón, that's nothing new!
I know!
How are we using them?
Let's get to it.
Battle_1: Content quality
Content is not just text on the
page, but everything on it.
Every page is content.
Two options:
1. Add content focussing on quality over
quantity.
2. Remove content from Google's index.
We already had 9M+ items!
Do you know what reduces the
content quality of any site?
DUPLICATE CONTENT!
Noindex and remove duplicates,
RUTHLESSLY
Noindex a good part of
the items library.
-> Several million fewer
discoverable pages!
Why did we decide to
noindex instead of
a fancier solution?
Ask me later.
A few tips on how to decide what to noindex:
- Use Google's "Crawled - currently not indexed" status as a proxy
- Check for duplicate titles/URLs/content descriptions
- Just a different image doesn't make it a different page in
the eyes of Google!
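The duplicate check from the tips above could be sketched roughly like this. It is only an illustration: the page records, field names (`url`, `title`, `description`) and the normalisation rule are assumptions, not something shown in the talk.

```python
import re
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace so near-identical strings compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def find_duplicate_groups(pages):
    """Group URLs whose normalized title and description are identical."""
    groups = defaultdict(list)
    for page in pages:
        key = (normalize(page["title"]), normalize(page["description"]))
        groups[key].append(page["url"])
    # Only groups with more than one URL are duplicate candidates.
    return [urls for urls in groups.values() if len(urls) > 1]


# Hypothetical sample data: two items that differ only by image and trivial formatting.
pages = [
    {"url": "/item/blue-logo-1", "title": "Blue Logo", "description": "A clean logo template."},
    {"url": "/item/blue-logo-2", "title": "Blue  logo!", "description": "A clean logo template"},
    {"url": "/item/red-flyer", "title": "Red Flyer", "description": "A bold flyer template."},
]

duplicate_groups = find_duplicate_groups(pages)
print(duplicate_groups)
```

Each group it returns is a set of candidates for keeping one page and noindexing the rest; which one to keep is a separate decision.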
Merged two translations that ended up being way
more similar than intended
-> A few million pages removed from Google.
Why did the redirected path
have 15% of the site's traffic
and 20x the destination's?
Ask me later.
Other big things we did
- Turned Tag pages into Search pages
- Search pages are noindex by default
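A "noindex by default" rule for search pages might look something like the sketch below. The allowlist of useful queries and the function name are hypothetical; the talk does not say how useful search pages are chosen.

```python
# Hypothetical allowlist: search queries judged valuable enough to be indexed.
USEFUL_SEARCH_QUERIES = {"wedding invitation", "logo animation"}


def robots_meta_for_search(query: str) -> str:
    """Return the robots meta content for a search page: noindex unless allowlisted."""
    normalized = " ".join(query.lower().split())
    if normalized in USEFUL_SEARCH_QUERIES:
        return "index, follow"
    return "noindex, follow"


print(robots_meta_for_search("Wedding  Invitation"))
print(robots_meta_for_search("asdf qwerty"))
```

The value it returns would go into the page's `<meta name="robots" content="...">` tag.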
The overall result? Decreased the index size by half
without impacting organic traffic.
50%
Battle_2: Internal linking
Out of many tactics:
1. Reduce the number of crawl paths
2. Nofollow on links to low-value pages
Basically,
be intentional and smart
about crawl paths.
Link only to valuable pages
Added links between related
search pages
10% Organic traffic!
Note that
if it's a useful search page,
it will not have a noindex.
Remove hreflang when you're uncertain
of the quality in other languages
15% size of index!
Remember:
hreflang annotations are bidirectional,
so remove them on every
language.
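Because hreflang annotations only count when both pages point at each other, one way to check a clean-up is to look for one-sided annotations. The sketch below assumes the annotations have already been extracted into a dictionary of page URL to `{language: alternate URL}`; that structure and the sample URLs are made up for illustration.

```python
def missing_return_links(hreflang_map):
    """Return (source, target) pairs where target does not annotate source back."""
    missing = []
    for source, alternates in hreflang_map.items():
        for target in alternates.values():
            if target == source:
                continue  # self-referencing annotation, nothing to reciprocate
            back_links = hreflang_map.get(target, {}).values()
            if source not in back_links:
                missing.append((source, target))
    return sorted(missing)


# Hypothetical state after removing hreflang from the Spanish page only:
# the English page still points at it, so the pair is left one-sided.
hreflang_map = {
    "/en/logos": {"en": "/en/logos", "es": "/es/logos"},
    "/es/logos": {"es": "/es/logos"},
}

leftovers = missing_return_links(hreflang_map)
print(leftovers)
```

An empty result means every remaining annotation is reciprocated; anything listed is a page that still needs its annotation removed or restored.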
What are valuable pages?
In short, pages we want Google to index.
As for nofollow:
- Nofollow on links to noindex pages
- Filters and facets, all nofollow
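The two nofollow rules above could be applied when rendering internal links, roughly as in this sketch. The facet parameter names and the set of noindexed paths are assumptions for the example, not Envato's actual configuration.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical query parameters that mark a filtered or faceted listing.
FACET_PARAMS = {"sort", "color", "page"}


def link_rel(href: str, noindex_paths: set) -> str:
    """Return "nofollow" for links to noindex pages or faceted URLs, else an empty string."""
    parsed = urlparse(href)
    query_params = parse_qs(parsed.query, keep_blank_values=True)
    has_facet = any(param in FACET_PARAMS for param in query_params)
    if has_facet or parsed.path in noindex_paths:
        return "nofollow"
    return ""


noindex_paths = {"/tag/old-tag"}  # assumed to come from wherever noindex decisions are stored
print(link_rel("/search/logo?sort=new", noindex_paths))
print(link_rel("/tag/old-tag", noindex_paths))
print(repr(link_rel("/item/blue-logo", noindex_paths)))
```

Keeping the decision in one function means a page that later becomes indexable stops receiving nofollow links without touching any templates.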
The overall result? Google re-crawled more pages.
60%
So, why crawl capacity
management?
Crawl budget stayed the
same.*
*On average, over the last 2 years.
BONUS TRACK
and unpopular opinion.
We learnt that
- Sitemaps didn't help indexing AT ALL
- Helpful only for debugging
1st month 1 USD
Go to SOB22.com
Yeah, no kidding, I did
register that domain to
share the discount.
Gracias! - Thank You!

Crawl Capacity Management - SEontheBeach 2022