Good old John Mueller. Every time Google’s ‘Search Advocate’ opens his mouth, we all jump to attention. This week he has told us that webpages with similar URLs may not be indexed by Google.
This is pretty big. It should change the way you structure your site, especially for local optimisation.
Google doesn’t like duplicated content. It tends not to index it because it just takes up space. It is noise and doesn’t add anything to the user experience (there are some exceptions, such as product descriptions).
Checking For Duplicate Content
Google’s algorithms have two ways of checking for duplicate content. Firstly they crawl the content on the page. If two pages on a site have different content, they both get indexed.
If two pages have the same content as one another (or with just a couple of words changed), Google only indexes one of them. Because why would it bother to index the other?
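To make ‘the same content with just a couple of words changed’ a bit more concrete, here is a rough Python sketch that compares two bits of copy using overlapping three-word chunks (shingles) and Jaccard similarity. To be clear, this is not Google’s algorithm, which isn’t public. The function names, the sample copy and the 0.5 threshold are all made up for illustration.

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams in the text."""
    words = text.lower().split()
    if len(words) < size:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    return len(a & b) / len(a | b)


def looks_duplicated(text_a, text_b, threshold=0.5):
    """Flag two texts as near-duplicates (the threshold is an arbitrary assumption)."""
    return similarity(text_a, text_b) >= threshold


# Hypothetical page copy: identical apart from the city name.
manchester = ("Join our poker night in Manchester this Friday "
              "with great prizes and free entry for everyone")
salford = manchester.replace("Manchester", "Salford")
bakery = "Our bakery sells fresh bread every morning"

print(looks_duplicated(manchester, salford))  # city-swapped copy
print(looks_duplicated(manchester, bakery))   # unrelated copy
```

Swapping one word only changes the handful of chunks that contain it, so city-swapped pages score high while genuinely different copy scores near zero.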
Secondly, according to John (and he would know), the algorithm looks at the URLs of the pages to decide whether or not to look for duplicated content on the actual page.
If the URLs are really similar, it assumes the content is really similar or duplicated. So it doesn’t even crawl the copy. It just doesn’t index the secondary page.
The example he gives is an events website with lots of pages, one for each city near where an event is being held. Google would look at URLs that differ only by city name, assume the content is the same, and index only one of them.
So for example, you have a poker event in Manchester, which can also be attended by people in Salford, Altrincham, Oldham, Stockport and so on. So you might build a page for each location, with URLs that are identical apart from the city name.
Google would look at those URLs, decide the pages are likely to be duplicates, and index only one.
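If you want to spot this kind of URL pattern on your own site, a quick Python sketch like the one below can help. It groups URLs that are identical apart from their final path segment. The example.com URLs are invented for illustration, and the grouping rule is our own simplification, not a description of how Google actually does it.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def url_pattern(url):
    """Replace the final path segment with a placeholder."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if segments:
        segments[-1] = "{*}"
    return parts.netloc + "/" + "/".join(segments)


def group_by_pattern(urls):
    """Group URLs that share everything except the last path segment."""
    groups = defaultdict(list)
    for url in urls:
        groups[url_pattern(url)].append(url)
    return dict(groups)


# Hypothetical URLs for the poker event example.
urls = [
    "https://example.com/events/poker/manchester",
    "https://example.com/events/poker/salford",
    "https://example.com/events/poker/altrincham",
    "https://example.com/about",
]

for pattern, members in group_by_pattern(urls).items():
    if len(members) > 1:
        print(pattern, "->", len(members), "similar URLs")
```

Any group with more than one member is worth a look: if those pages also carry near-identical copy, they are candidates for a canonical tag or for merging.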
What To Do About It?
This is where your canonical tags come in. Set a rel canonical tag on each of the secondary pages, pointing to the ‘main’ page. In this instance, that would be Manchester.
That tells Google that the main page is the preferred version and that the other pages are clearly linked to it.
In John’s words:
‘So what I would try to do in a case like this is to see if you have this kind of situations where you have strong overlaps of content and to try to find ways to limit that as much as possible.
And that could be by using something like a rel canonical on the page and saying, well, this small city that is right outside the big city, I’ll set the canonical to the big city because it shows exactly the same content.’
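In practice, the rel canonical John mentions is a link element in the head of each small-city page, with its href set to the big-city page. Here is a minimal Python sketch that builds that tag. The example.com URLs and the function name are hypothetical, and how you actually get the tag into your templates will depend on your CMS.

```python
from html import escape


def canonical_tag(main_url):
    """Build the <link rel="canonical"> element pointing at the main page."""
    return f'<link rel="canonical" href="{escape(main_url, quote=True)}">'


# Hypothetical URLs: every nearby-city page points at the Manchester page.
main_page = "https://example.com/events/poker/manchester"
secondary_pages = [
    "https://example.com/events/poker/salford",
    "https://example.com/events/poker/altrincham",
]

for page in secondary_pages:
    # This tag belongs inside the <head> of each secondary page.
    print(page, "->", canonical_tag(main_page))
```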
Another thing you can do is be sure the content you are creating really truly serves a purpose for your user and isn’t trying to dupe Google.
Do you really genuinely need a page for each location? Is the event or service that different? If not, you probably don’t need those extra pages and you are just creating them for murky SEO reasons.
We love to nerd out with interested parties so feel free to get in touch with us for a chat.
Email firstname.lastname@example.org, hit us up on all the socials, or give us a bell on 0161 706 0012