Post by emdadul12 on Nov 11, 2024 20:45:57 GMT 10
For Magento sites, we can see that, by default, proper canonical tags are not set. In Magento, all of the paginated URLs in a given series have a canonical tag that points back to the root category page. For example, here is how the canonical tag of “Page 2” of a particular category would look:
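The exact markup depends on the theme, but a sketch of this default behavior (using an illustrative example.com category URL) looks like this:

```html
<!-- Default Magento behavior: page 2 canonicalizes to the root category -->
<!-- On https://example.com/mens-shirts.html?p=2 -->
<link rel="canonical" href="https://example.com/mens-shirts.html" />
```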
Technically, this is not best practice from an SEO standpoint. Canonical tags should only be used to consolidate duplicate content. Since paginated pages are not duplicates of the root version (they contain different products), they should not have canonical tags that point to it. Instead, every page within the pagination series should have its own self-referential canonical tag. This tells Google that the paginated URL contains unique content and should be crawled accordingly.
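By contrast, a self-referential setup keeps the page parameter in the canonical URL. A sketch, again with an illustrative URL:

```html
<!-- Recommended: page 2 canonicalizes to itself -->
<!-- On https://example.com/mens-shirts.html?p=2 -->
<link rel="canonical" href="https://example.com/mens-shirts.html?p=2" />
```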
You might need to have a developer create a custom solution that allows the site’s pagination to utilize self-referential canonical tags instead of pointing to the root category page.
Indexable internal search pages
Another Magento SEO issue is that internal search pages are indexable out of the box. This means that Google can crawl and index these low-quality pages. These pages will generally be in the /catalogsearch/ URL path.
For example, here’s a Magento site where over 4,000 internal search pages have gotten caught in Google’s index:
Internal search pages indexed in Google search results.
In order to ensure that these pages don’t get indexed by Google, you’ll want to be sure the “noindex” tag is applied to them. We recommend having a developer implement this for you and providing this article as a reference point for them.
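For reference, the tag the developer would need to output in the head of internal search pages is the standard robots meta tag. A minimal sketch:

```html
<!-- Keeps search pages out of the index while still letting Google follow links on them -->
<meta name="robots" content="noindex, follow" />
```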
After you’ve implemented the “noindex” tag, you’ll want to be sure that none of your internal search URLs are actually getting indexed. Perform a search for “site:example.com inurl:/catalogsearch/”. If you see URLs appearing in the index, we recommend waiting until Google removes the majority of them. If you don’t see the URLs in the index, you might consider blocking them by using a robots.txt command.
Robots.txt
Within Magento, you can also configure the robots.txt file. You’ll want to use the robots.txt file to limit which pages of your Magento site Google is eligible to crawl. This is especially important to configure if your site uses a faceted navigation that lets users filter by a variety of attributes.
Fortunately, Magento does allow you to control the robots.txt of your site. To do this, you can perform the following steps:
In the Admin sidebar, navigate to Content > Design > Configuration
Find the “Store View” you want to adjust and select “Edit”
Expand the “Search Engine Robots” dropdown
Add your robots.txt commands in the “Edit custom instruction of robots.txt File” field
How you adjust the robots.txt is going to depend on your particular store. Unfortunately, there is no one-size-fits-all option here. The main objective will be to block the crawling of any low value pages (that aren’t indexed) while allowing the crawl of high priority ones.
Below are some general things you might consider blocking in the robots.txt:
Low value pages created by the faceted navigation and sorting options
The site’s internal search pages
Login pages
The user’s shopping cart
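As a rough sketch only, a set of directives covering the items above might look like the following. The /catalogsearch/, /customer/account/login/, and /checkout/cart/ paths are Magento defaults, but the faceted-navigation parameter names are illustrative and will differ per store:

```text
User-agent: *
# Internal search pages
Disallow: /catalogsearch/
# Login pages
Disallow: /customer/account/login/
# Shopping cart
Disallow: /checkout/cart/
# Faceted navigation and sorting parameters (example names only)
Disallow: /*?*product_list_order=
Disallow: /*?*product_list_mode=
```

Remember that robots.txt blocks crawling, not indexing, so only apply rules like these once the pages in question have dropped out of the index.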
Sitemap.xml
Sitemap.xml files ensure that Google has a pathway of discovering all of your site’s key URLs. This means that regardless of the site’s architecture, the sitemap.xml gives Google a way of finding important URLs on the site.
Fortunately, Magento has the capability of creating a sitemap.xml file and does a good job of this with its default settings. You can configure the XML sitemap settings in Magento’s “Catalog” menu, but the defaults should generally be fine.
Once these settings are configured, you may still need to generate your sitemap.xml file so it actually gets published on the site. Fortunately, that process is very straightforward. You can do this by:
Navigate to Marketing > SEO & Search > Site Map
Click the “Add Sitemap” button
For “Filename” add the text “sitemap.xml”
For “Path”, choose the URL path you want to be associated with your sitemap.xml file. This is generally at the “/pub/” URL path
Click “Save & Generate”
Setting up a sitemap.xml on Magento.
This should correctly set up your sitemap.xml on Magento. You’ll then want to be sure to submit the sitemap.xml file to Google Search Console so Google can discover it.
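Alongside submitting it in Search Console, you can also advertise the sitemap location in your robots.txt file (illustrative URL):

```text
Sitemap: https://example.com/sitemap.xml
```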
2. JavaScript rendering
Something else that you’ll want to be mindful of on Magento sites is any content that is loaded through JavaScript. Magento frequently utilizes JavaScript to load key content on the store. While this isn’t inherently a negative thing for SEO, it is something you’ll want to be sure you’re reviewing.
If JavaScript is required to load key content on a page, Google must perform a two-phase indexing process: it first processes the initial HTML, then returns later to render any content loaded via JavaScript. SEOs need to check that second phase to ensure Google was able to “see” all of the content on the page. If any elements are loaded via JavaScript, it’s worth checking whether they’re indexed.
For instance, here’s an example of a product page in Magento where JavaScript is enabled in the browser. We can see thumbnail images, text in tabs, and a related products section at the bottom:
JavaScript enabled on a product page for a clothing site.
However, most of that content is reliant on JavaScript to load. When turning JavaScript off using the Web Developer extension for Chrome, most of those elements do not render. Notice how we can only see the initial three tabs on the page:
Reduced elements on a page when JavaScript is turned off.
Since JavaScript is required to load a lot of the content on the page, we’ll want to ensure that it’s getting indexed properly. Fortunately, we can use tools such as the Mobile-Friendly Test and the Rich Results Test to determine what Googlebot is able to render on the page.
We also like to manually check the index by identifying content that’s loaded via JavaScript, and then using a “site:” search operator with a quoted snippet of that content to verify that Google was able to index it.