I have a one-page website which includes an English main page and a French main page. One can access my website through the following URLs:
ENGLISH VERSION OF MAIN PAGE
www.example.org
www.example.org/index.html
example.org
example.org/index.html
FRENCH VERSION OF MAIN PAGE
www.example.org/fr
www.example.org/fr/index.html
example.org/fr
example.org/fr/index.html
For optimal search engine indexing, should I include all of these URLs in my sitemap (with both `http://` and `https://`)? If not, what would be the set of URLs I should include in my sitemap.xml file?
2 Answers
You should include all unique pages in your sitemap once.
All of the different URLs you listed are just different ways of accessing the same page/content, just like most PHP applications can be accessed via `site.org/` or `site.org/index.php`. Your sitemap should include just one reference to a page.

The best practice is to have one canonical URL per document. And each canonical URL should be added to your sitemap (if you have one).
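As a rough illustration of "one entry per canonical URL", here is a minimal Python sketch that builds such a sitemap. The two URLs in `CANONICAL_URLS` (HTTPS, no `www`, trailing slash) and the helper name `build_sitemap` are assumptions made for the example, not something the question prescribes; substitute whichever variants you settle on.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Assumption for this example: HTTPS, no "www", trailing slash.
CANONICAL_URLS = ["https://example.org/", "https://example.org/fr/"]


def build_sitemap(canonical_urls):
    """Return a sitemap document (str) with one <url> entry per distinct URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for url in canonical_urls:
        if url in seen:  # each page is listed only once
            continue
        seen.add(url)
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap(CANONICAL_URLS)
print(sitemap_xml)
```

The `www` and `index.html` variants are deliberately absent: they are handled by redirects, not by the sitemap.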
So in your case you may want to use one URL for the English main page and one URL for the French main page, and redirect (with HTTP status code 301) from the other URLs to the canonical ones. In addition, you can declare the canonical URL with the `canonical` link relation.

If you need to provide HTTP in addition to HTTPS (instead of enforcing HTTPS), you would of course need to have two URLs per document (one with HTTP, one with HTTPS). But you [should only list one variant in the sitemap](http://www.sitemaps.org/faq.html#faq_http_vs_https "Sitemaps.org FAQ: ‘My site has both "http" and "https" versions of URLs. Do I need to list both?’"), and you should only declare one as `canonical` (ideally the same one that you added to the sitemap).

Which URLs to choose can depend on various factors (usability, SEO, your backend, …), but it seems safe to assume that `index.html` is ballast. You’d have to decide whether to use the `www` subdomain (a common convention) or not. Assuming that you choose to omit it, you could have these canonical URLs:

example.org
example.org/fr

And you would redirect the following URLs with 301 to the canonical URLs listed above:

www.example.org
www.example.org/index.html
example.org/index.html
www.example.org/fr
www.example.org/fr/index.html
example.org/fr/index.html
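The redirect rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: HTTPS is enforced, `www` and `index.html` are dropped, and `/fr` is normalized to `/fr/`. The function names (`canonical_url`, `redirect_for`, `canonical_link_tag`) are hypothetical; on a real site the same logic would normally live in the web server configuration.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

CANONICAL_SCHEME = "https"  # assumption: HTTPS is enforced


def canonical_url(url):
    """Map any variant of a page URL to its single canonical form."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host.startswith("www."):  # drop the www subdomain
        host = host[len("www."):]
    path = parts.path or "/"
    if path.endswith("/index.html"):  # index.html is ballast
        path = path[: -len("index.html")]
    if path == "/fr":  # assumption: the French page lives at /fr/
        path = "/fr/"
    return urlunsplit((CANONICAL_SCHEME, host, path, parts.query, ""))


def redirect_for(url):
    """Return (301, target) if url is not canonical, otherwise None."""
    target = canonical_url(url)
    return None if target == url else (301, target)


def canonical_link_tag(url):
    """Build the <link rel="canonical"> element for a page."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


for variant in ("http://www.example.org/index.html",
                "http://example.org/fr",
                "https://example.org/"):
    print(variant, "->", redirect_for(variant))
```

A request for a canonical URL yields `None` (serve the page), while every other variant yields a 301 to the canonical one, so all eight URLs from the question collapse to two.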