London | 20-24 Feb, 2012 | #seslondon


Identifying, Removing & Preventing
         Duplicate Content




                                        @SamuelCrocker




Introduction

       Personally responsible for delivery for enterprise
        level clients in travel, retail/ecommerce,
        restaurants, entertainment, etc.

       Dealt with some really large, really ugly sites
        suffering from a number of these issues








Before we begin








Single Site Issues








                       Old Architecture & Hoarders*




*The worst offenders




The First Step is to Clean House








        Target Suspected Duplicate Content First

Known Properties




Unknown Properties








                         An Idealistic Approach*




Step 1: Identify duplicate content

*In my experience a short- to mid-term fix only makes these issues worse and more expensive in the long term.




                        Now The Most Critical Part



Step 1: Identify duplicate content
Step 2: Create a sensible I/A

http://www.slideshare.net/a4uexpo/successful-information-architecture-richard-baxter




                      Followed by the Delivery Part



Step 1: Identify duplicate content
Step 2: Create a sensible I/A
Step 3: Create mapping and rewrite rules*

*You're going to need a lot of time for this: which page currently ranks, which page converts better, etc.
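A minimal sketch of Step 3, assuming an Apache server where simple `Redirect 301` lines are acceptable. The URLs in `MAPPING` and the function names are hypothetical, not from the talk. It flattens redirect chains so each old URL redirects once, and sets aside old URLs that carry a query string, since those usually need a hand-written rewrite rule instead.

```python
from urllib.parse import urlsplit

# Hypothetical mapping: old/duplicate URL -> the page you decided to keep.
MAPPING = {
    "http://www.example.com/old/red-shoes.html": "http://www.example.com/shoes/",
    "http://www.example.com/shoes/": "http://www.example.com/footwear/",
    "http://www.example.com/products.php?cat=shoes": "http://www.example.com/footwear/",
}


def flatten(mapping):
    """Resolve chains (A -> B, B -> C becomes A -> C)."""
    flat = {}
    for old in mapping:
        target, seen = mapping[old], {old}
        # The `seen` set only guards against looping forever on a cycle;
        # a cyclic mapping is still a mistake that needs fixing by hand.
        while target in mapping and target not in seen:
            seen.add(target)
            target = mapping[target]
        flat[old] = target
    return flat


def redirect_rules(mapping):
    """Return (rules, manual): Redirect lines, plus URLs needing manual rules."""
    rules, manual = [], []
    for old, new in sorted(flatten(mapping).items()):
        parts = urlsplit(old)
        if parts.query:
            # Path-based Redirect lines do not match on the query string.
            manual.append(old)
        else:
            rules.append(f"Redirect 301 {parts.path} {new}")
    return rules, manual
```

Deciding which page goes on the right-hand side of the mapping (the one that ranks, the one that converts) is still the slow, manual part.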




                             Process is Important



Step 1: Identify duplicate content
Step 2: Create a sensible I/A
Step 3: Create mapping and rewrite rules
Step 4: Create a process and STICK TO IT*

*Obviously processes need to evolve, but don't forget how quickly this can get out of control (again).


        "That's Cool Sam, But In The Real World We Have Budgets"


"That's true [client], but organic traffic already accounts for [x]% of
your online revenue - and by my estimates, getting this problem
sorted could save you [x]% in PPC spend, as well as £x per month
in link building - and contribute an additional [£X] in revenue... It
would also greatly improve usability."

Just out of curiosity, how much money did you spend on TV and Display last year?








           Legacy & Other Orphan Pages

   Ask for a full DB dump? Dig through server logs? Ask for a proper XML sitemap.

   External Links?
      Yes: 301 to most relevant page, update links as possible
      No: Lower priority
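A rough sketch of the triage above, assuming you have already pulled three URL lists: what the XML sitemap says should exist, what the DB dump or server logs show actually exists, and which URLs have external links pointing at them. The function name and sample paths are made up for illustration.

```python
def find_orphans(sitemap_urls, discovered_urls, externally_linked):
    """Split orphan pages into (redirect_first, lower_priority).

    sitemap_urls:      URLs that belong to the intended architecture.
    discovered_urls:   URLs found in the DB dump and/or server logs.
    externally_linked: URLs known to have external links.
    """
    orphans = set(discovered_urls) - set(sitemap_urls)
    # Orphans with external links get a 301 to the most relevant page.
    redirect_first = sorted(u for u in orphans if u in externally_linked)
    # Everything else can wait.
    lower_priority = sorted(orphans - set(redirect_first))
    return redirect_first, lower_priority


# Hypothetical sample data.
SITEMAP = {"/", "/shoes/"}
DISCOVERED = {"/", "/shoes/", "/old-sale.html", "/tmp/test-page.html"}
LINKED = {"/old-sale.html"}
```

With the sample data, `/old-sale.html` lands in the redirect-first list and `/tmp/test-page.html` in the lower-priority list.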







                       PPC Landing Pages




rel=canonical (if actually duplicate)

NOINDEX, FOLLOW also possible if not





Templated Solutions








                     Why Are You Doing it?
PPC Landing Pages?

                                               Possible.

                                               Plausible.


                                               Not really.




                                               Debatable.







                              Why Are You Doing it?
       SEO Benefit?




*I'm not saying you can't rank this way anymore, I'm just saying you have to work a lot harder and it's not a great user experience.




                 What You Should Be Doing


Focus on the high converting pages: create (at a minimum) some unique content.

Focus on the high volume search queries: create (at a minimum) some unique content.

And if you know you're never going to have unique content for some pages - noindex.






Product Descriptions








          Product Descriptions - Big(ish) Budget

Big Budget

Hire loads of in-house copywriters
Outsource quality UNIQUE content
Email newsletters for reviews (UGC)
Incentives for reviews (UGC)
Video product reviews + transcripts
     For every product
     Video Sitemaps








         Product Descriptions - Smaller Budgets

Smaller Budget

   Fiverr
   Interns
   UGC
   Listen to Ralph








Problems from Sites You Don't Control








     Catch & Stop Your Competitors Scraping Your Content

Not Shady Options


    1. Create a script to automatically set up Google Alerts for the first couple of sentences of everything you publish.
    2. Snippet test your own content from time to time.
    3. Use rel=canonical (helps if they scrape the entire source, which is infrequent).
    4. Use rel=author on all internal links (can also be done in content) - makes more sense for bloggers.
    5. Monitor server logs for traffic spikes and carefully block IPs.
    6. When all else fails, DMCA.
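For options 1 and 2, the piece you can safely script is building the exact-match query from whatever you publish. A sketch under assumptions: `alert_query` is a made-up helper, the sentence splitting is deliberately naive, and the 32-word cap is my assumption about how long a search query can usefully be. How you feed the query into an alert or a manual search is up to you; no alerts API is assumed here.

```python
import re


def alert_query(text, sentences=2, max_words=32):
    """Build a quoted exact-match query from the opening sentences of a post."""
    # Collapse whitespace, then split after sentence-ending punctuation.
    normalised = " ".join(text.split())
    parts = re.split(r"(?<=[.!?])\s+", normalised)
    snippet = " ".join(parts[:sentences])
    # Cap the length (assumed limit) and drop quotes that would break the phrase.
    words = snippet.split()[:max_words]
    return '"' + " ".join(words).replace('"', "") + '"'
```

Running the same query by hand from time to time is the "snippet test" in option 2.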





Dealing with Relatively Innocent Image Theft




                                               Oh look, link
                                                 targets!








Less Innocent Image Theft/Hotlinking?








Mobile Sites








            The Single URL SEO Camp
Champions


                 "Mobile copies of websites seem to me to be more likely to cause duplicate content issues, technical challenges, waste engineering resources and draw away attention from real mobile opportunities than to earn slightly higher rankings in mobile searches."
                                             - Rand Fishkin, 2011








                          The Mobile Site Camp
       Champions

                               "In my experience, duplicate content doesn't apply to the mobile paradigm."
                                                       - Bryson Meunier, 2012




Source: http://bit.ly/xwHmeW




           A Mobile Specific Experience is Preferable




    A mobile specific experience is *usually* preferable based on the data I have seen
     (conversions, rankings, keyword targeting, etc.).

Source: Morgan Stanley




       BUT, a Separate Mobile Site Should Serve a Purpose

              Local monthly searches (mobile)   Local monthly searches (desktop)
Car hire                                 8100                              90500
Car rental                               4400                              33100

Desktop: "car hire" is 2.73x more searched than "car rental". Focus on "car hire".
Mobile: "car hire" is 1.8x more searched than "car rental". Target both terms on homepage.

In which case duplicate content shouldn't be an issue anyhow.
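The ratios on this slide are just the search volumes divided through. A quick check using the figures from the table; the dictionary layout and the `ratio` helper are mine, not from the talk.

```python
# Local monthly searches taken from the slide.
VOLUMES = {
    "mobile": {"car hire": 8100, "car rental": 4400},
    "desktop": {"car hire": 90500, "car rental": 33100},
}


def ratio(device, term_a="car hire", term_b="car rental"):
    """How many times more often term_a is searched than term_b on a device."""
    return VOLUMES[device][term_a] / VOLUMES[device][term_b]
```

`ratio("desktop")` is roughly 2.73 and `ratio("mobile")` roughly 1.8, which is the gap that justifies targeting the two audiences differently.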




 You Can't Ignore Mobile, But Use Resource Wisely




 If you can achieve a better experience using style sheets and do not have the resource or need
  to target mobile users differently (with your content and/or UX) then a single URL solution will
  make your life easier.




Problems With International/Multi-lingual Domains








     Translation & Other Issues from International Domains

   Bigger Budget


Localise - do not translate

Never machine translate

Be wary of your local market resources
(check Wikipedia)

ccTLDs for all markets (personal
preference)

Be very careful with your Geotargeting
settings in Webmaster tools

Unique servers/physical addresses
(ideally) for each country




     Translation & Other Issues from International Domains

  Smaller Budget


Manual human translation
preferred, even if direct translation.

If you must auto-translate, block
non-unique pages with Robots.txt

Make sure you try to adhere to one
language per URL

hreflang: use with caution.
     Perhaps most appropriate for
     GB and US situations.
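For the GB/US case, a sketch of what the hreflang annotations could look like when generated from a list of alternates. The helper name and both URLs are hypothetical. The assumption here is that each version of the page carries the full set of tags, including the one pointing at itself.

```python
from html import escape


def hreflang_tags(alternates):
    """Build <link rel="alternate"> tags from {language-region code: URL}."""
    return [
        f'<link rel="alternate" hreflang="{escape(code, quote=True)}" '
        f'href="{escape(url, quote=True)}">'
        for code, url in sorted(alternates.items())
    ]


# Hypothetical GB and US versions of the same page.
ALTERNATES = {
    "en-gb": "http://www.example.co.uk/car-hire/",
    "en-us": "http://www.example.com/car-rental/",
}
```

The same two tags would go in the head of both the GB and the US page.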




Problems from Controlling Multiple Similar Sites, in the
                    same niche.








                 Multiple Brands/Sites, Same Niche?
  Choose a brand
voice and stick to it.




*It can be a hassle but tying this into everything you do can help ensure unique content.




              Multiple Brands/Sites, Same Niche?
Only speak to one
audience at a time








              Multiple Brands/Sites, Same Niche?
 Prioritise and go
with what converts!






Rapid Fire Survival Tips: Questions You Should
             Constantly be Asking








         Why does this property exist?




Who are we trying to target?
When do we want to send users here?
What terms tie into that user journey?




Why did we choose this page/focus?




Know your strategy.
Support it with numbers.




           Why are we joining this new platform?




What benefit do we expect?
What resource would this require?
Where will this content come from?
What about problems on existing sites?




Finally: Is it worth rocking the boat?







                                                      Image Credits
   Dolly: http://www.telegraph.co.uk/science/8169817/Dolly-the-Sheep-reborn-as-four-new-clones-created.html


   Overwhelmed: http://comerecommended.com/blog/2011/09/13/dealing-with-feeling-overwhelmed-at-work/


   Single: http://www.davidwygant.com/blog/being-single-means-no-bitching-allowed/7280/


   Hoarding: http://coverlaydown.com/wp/wp-content/uploads/2011/08/messy.jpg


   Clean Office: http://www.momoy.info/uploads/interior-design/June-09/syzygy-office-02.jpg


   Pandshake: http://www.genzel.ca/wp-content/uploads/2012/02/Stephen-Harper-best-pm-ever.jpg


   Two Camps: http://www.summarynewspaper.com/high-winds-hit-the-two-camps-on-mount-everest/1735.html


   Sumo: http://www.quicksprout.com/images/littlebig.jpg


   Scrapers: http://seomemes.com/post/6314149329/scrapers-gonna-scrape


   Graffiti: http://raymondpward.typepad.com/newlegalwriter/2009/08/graffiti.html


   Pick your battles: http://www.etsy.com/listing/62656583/pick-your-battles-red-8x10-screenprint


