How to avoid duplicate content in Magento?

In search engine optimization (SEO), everyone tries to write high-quality content for customers and, ultimately, for search engines as well. One important aspect of this is avoiding duplicate content, which is a major problem for search engines.

In this post I will clarify what duplicate content is and show a common source of duplicate content in Magento.

Chris / / last updated on

What is Duplicate Content?

Duplicate content means publishing more or less the same content under different URLs. For example, say you have written a blog post about how browsers handle cookies and published it under the URL /2020/09/browsers-restrict-cookies-2020/. If some automatic action of your content management system also publishes the post under /browsers-restrict-cookies-2020/, then this is probably duplicate content.
I write "probably" because there are ways to tell the search engine robot that one of the two pages is the "master page", e.g. via canonical links.
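
To make the canonical link a bit more concrete: it is a link element with rel="canonical" in the head of a page that points to the master URL. A minimal PHP sketch of rendering such an element could look like this. The function name exampleCanonicalTag and the host www.example.test are assumptions for illustration only.

```php
<?php
/**
 * Hypothetical helper: renders a canonical link element for the given URL.
 * The URL is escaped so that characters like "&" are valid in the attribute.
 */
function exampleCanonicalTag($masterUrl)
{
    $escapedUrl = htmlspecialchars($masterUrl, ENT_QUOTES, 'UTF-8');

    return '<link rel="canonical" href="' . $escapedUrl . '" />';
}

// Both URL variants of the post would output the same tag, pointing at the
// URL chosen as the master page (host name assumed for this example).
echo exampleCanonicalTag('https://www.example.test/2020/09/browsers-restrict-cookies-2020/');
```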

A very useful explanation of duplicate content strategies can be found at Yoast, who are very much the experts in SEO for WordPress.
Google has of course also published its own view on this.

How does Duplicate Content affect Magento?

Back to the sources of duplicate content in Magento:
Most of my clients have the Magento configuration option System -> Configuration -> Web -> URL Options -> Add store code to urls activated. This lets you run different store views for different countries or languages, with the German store available at www.store.test/de and the English one at www.store.test/en.

The problem is that CMS pages in Magento are then accessible under two different URLs. Let's assume we have a CMS page enabled for the "DE" store with the URL identifier test-cms-page, and the store code option mentioned above is active. This CMS page with the German content is then available at:

  • https://www.store.test/de/test-cms-page
  • https://www.store.test/test-cms-page

Obviously, this is a real duplicate content problem in Magento!

The solution for Duplicate Content in Magento 1

Fortunately, the fix is easy, and I have implemented it in my Vianetz_Utilities extension for Magento 1:

final class Vianetz_Utilities_Model_CmsNoRouteObserver
{
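    /**
     * Event: cms_controller_router_match_before
     */
    public function run(Varien_Event_Observer $observer)
    {
        $condition = $observer->getEvent()->getData('condition');

        // The request was already flagged as "noRoute" (e.g. the URL has no
        // valid store code), so stop the CMS router from matching the page.
        if ($this->getRequest()->getActionName() === 'noRoute') {
            $condition->setData('continue', false);
        }
    }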

    /**
     * @return \Mage_Core_Controller_Request_Http|\Zend_Controller_Request_Http
     */
    private function getRequest()
    {
        return Mage::app()->getRequest();
    }
}

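To do this, we implement a new observer that checks whether the request's action name is "noRoute". In that case we can either redirect to the URL with the store code or simply show a 404 page. The extension does the latter: setting continue to false stops the CMS router from matching the page, so Magento answers with its regular 404 page.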
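
The redirect alternative mentioned above could be sketched as follows. This is not what the Vianetz_Utilities extension does; the class name Example_CmsStoreCodeRedirectObserver and the helper exampleJoinUrl are hypothetical. The sketch assumes that the CMS router issues a redirect when an observer sets redirect_url on the condition object, and that Mage::getBaseUrl() returns the link base URL of the current store including its store code, which is how Magento 1 should behave as far as I know when the store code option is enabled.

```php
<?php
/**
 * Hypothetical helper: joins a base URL and a CMS page identifier with
 * exactly one slash in between.
 */
function exampleJoinUrl($baseUrl, $identifier)
{
    return rtrim($baseUrl, '/') . '/' . ltrim($identifier, '/');
}

/**
 * Hypothetical redirect variant of the observer (runs only inside Magento 1).
 * Event: cms_controller_router_match_before
 */
final class Example_CmsStoreCodeRedirectObserver
{
    public function run(Varien_Event_Observer $observer)
    {
        $condition = $observer->getEvent()->getData('condition');

        // Only act on requests that Magento has already flagged as noRoute.
        if (Mage::app()->getRequest()->getActionName() !== 'noRoute') {
            return;
        }

        // Assumption: the link base URL already contains the store code,
        // e.g. https://www.store.test/de/
        $target = exampleJoinUrl(Mage::getBaseUrl(), $condition->getData('identifier'));

        // Assumption: the CMS router redirects to this URL if it is set.
        $condition->setData('redirect_url', $target);
    }
}
```

With this variant, a request to /test-cms-page would be redirected to /de/test-cms-page, provided "DE" is the store that Magento resolves for the request. Note that the sketch redirects every noRoute request that reaches the CMS router, not only those for existing CMS pages.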

Same page, but different languages

Let's extend the problem a bit and look at what happens if we also enable the new CMS page for the "EN" store. In this case we have two different URLs:

  • https://www.store.test/de/test-cms-page
  • https://www.store.test/en/test-cms-page

Let's assume now that the pages do not differ much because we were lazy and only translated some elements such as the navigation and footer, while the rest of the page is identical. That is again a classic case of duplicate content in Magento.

To avoid this, all we need to do is choose the "main" page, i.e. the URL for the main language of the page, and set a canonical link to it. This can easily be done in the Design -> Layout -> XML for layout changes section of the CMS page itself.
If the CMS page is mainly written in English, we can set the canonical link on our "DE" page like this:

<reference name="head">
    <action method="addLinkRel">
        <rel>canonical</rel>
        <href>https://www.store.test/en/test-cms-page</href>
    </action>
</reference>

I hope this helps you to avoid some common sources of duplicate content in Magento. Let me know in the comments below!

