A sitemap is a sort of contents page for your website, containing the URLs of all the sub-pages on a domain. A sitemap tells Google and other search engines about the structure of your website and can also contain metadata such as when a page was last updated (lastmod), how often it changes (changefreq) and how important it is relative to the other pages on the site (priority).
It’s also possible to create a video or image sitemap in order to make these media clearly accessible to Google too. A video sitemap entry features information such as video length, category and age-appropriateness rating, whilst an image entry can contain subject, type and licensing info.
How important are sitemaps?
Google has communicated on numerous occasions that well-structured websites don’t necessarily need a sitemap because Google can already crawl them fairly easily. This is the case when every sub-page can be reached through good internal links and the site follows a clear hierarchy.
In the following cases, however, a sitemap can prove very useful:
- Wide-ranging, complex websites. Here, Google’s crawlers can easily overlook new or recently updated pages.
- Websites with a substantial number of archived pages which aren’t necessarily inter-connected. If you have pages within your website which don’t link to each other, make sure these are listed in the sitemap so that Google knows where to find them.
- New websites with few external backlinks. Googlebot works by following links from page to page. If your website is new and doesn’t have many links leading to it, Google could have trouble finding it.
- Websites with rich media content, Google News content or other sitemap-compatible features. Google is often able to recognize such additional information in sitemaps.
There are several different ways to produce a sitemap and also various different formats which are recognized by search engines.
The most common format is .xml, the standard sitemap file type. Similar to schema.org, it is a shared standard that is supported by the major search engines.
Here’s an example of an .xml-style sitemap:
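As a minimal sketch, the following Python snippet builds and prints a small sitemap with the standard library. The element names (urlset, url, loc, lastmod, changefreq, priority) and the namespace come from the sitemaps.org protocol; the two URLs, the dates and the other values are placeholders chosen for illustration, and build_sitemap is a hypothetical helper name, not part of any library.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return a sitemap document as UTF-8 encoded bytes.

    `entries` is a list of dicts with a required "loc" key and
    optional "lastmod", "changefreq" and "priority" keys.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit the child elements in the order used by the protocol docs.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    ET.indent(urlset)  # pretty-print; requires Python 3.9+
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Placeholder data: URLs, dates and values are illustrative only.
example_entries = [
    {
        "loc": "https://www.domain.com/",
        "lastmod": "2024-01-15",
        "changefreq": "weekly",
        "priority": 1.0,
    },
    {"loc": "https://www.domain.com/blog/", "lastmod": "2024-01-10"},
]

sitemap_xml = build_sitemap(example_entries)
print(sitemap_xml.decode("utf-8"))
```

The output is a urlset element containing one url element per page. Only loc is required for each entry; the other child elements are optional.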
More info: https://www.sitemaps.org/protocol.html
RSS, mRSS and Atom 1.0
Many content management systems such as Joomla or WordPress provide RSS or Atom 1.0 feeds. These are also accepted by Google and can be registered in the Google Search Console.
Although the scope of information delivered to Google is somewhat limited, simple text sitemaps in .txt format (UTF-8 encoded) are also possible. A text sitemap (the file name could simply be sitemap.txt) should contain nothing but a list of URLs, each on its own line.
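A text sitemap is simple enough to generate in a few lines. The sketch below assumes the URLs are already available as a Python list; render_text_sitemap is a hypothetical helper name and the URLs are placeholders.

```python
def render_text_sitemap(urls):
    """Return the body of a text sitemap: one URL per line, nothing else."""
    # Drop surrounding whitespace and skip empty entries.
    cleaned = [url.strip() for url in urls if url.strip()]
    return "\n".join(cleaned) + "\n"


# Placeholder URLs for illustration.
text_sitemap = render_text_sitemap([
    "https://www.domain.com/",
    "  https://www.domain.com/blog/  ",
    "",
])
print(text_sitemap, end="")

# To save the file, write it explicitly as UTF-8, for example:
# pathlib.Path("sitemap.txt").write_text(text_sitemap, encoding="utf-8")
```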
The following sitemap rules and tips apply:
- Be consistent! Google crawls your URLs exactly as you have entered them in the sitemap, so avoid using relative paths or omitting the www. from www. domains!
- Leave session IDs out.
- Make Google aware of translated versions of the same page by listing each version’s canonical URL and marking the alternates with hreflang annotations.
- Divide large sitemaps into several smaller sitemaps so as not to overload the server when Google comes calling. A sitemap file should contain no more than 50,000 URLs and should not exceed 50 MB (uncompressed).
- Use a sitemap index file to list all your maps in one place and register this file with Google, rather than multiple individual sitemaps.
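The last two rules, splitting large sitemaps and listing the parts in an index file, could be sketched as follows. This is an illustration under stated assumptions: the site with 120,000 pages and the sitemap-1.xml, sitemap-2.xml naming scheme are invented for the example, chunk_urls and build_sitemap_index are hypothetical helper names, and the sketch only enforces the URL count, not the file-size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document listing the given sitemap files."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Hypothetical site with 120,000 pages.
all_urls = [f"https://www.domain.com/page-{n}" for n in range(120_000)]
chunks = chunk_urls(all_urls)

# Hypothetical naming scheme for the individual sitemap files.
sitemap_files = [
    f"https://www.domain.com/sitemap-{i}.xml"
    for i in range(1, len(chunks) + 1)
]
index_xml = build_sitemap_index(sitemap_files)

print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]
```

Each chunk would then be written out as its own sitemap file, and only the index file needs to be registered with Google.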
Path and tricks in robots.txt
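A sitemap should ideally sit in your domain’s root directory, because by default a sitemap can only list URLs at or below its own location. An .xml sitemap, for instance, should be accessible via www.domain.com/sitemap.xml to enable Google to locate it quickly. You can also help the crawler by adding the sitemap’s full URL to your website’s robots.txt file, in a line of the form Sitemap: https://www.domain.com/sitemap.xml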
This shows the crawler where the sitemap is saved – whether it’s in the root directory or not.
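To check that a robots.txt file announces the sitemap as intended, the file can be parsed with Python's standard library. The robots.txt content below is a hypothetical example; the Sitemap line takes the full URL of the sitemap, and site_maps() is available from Python 3.8.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: an empty Disallow allows all crawling,
# and the Sitemap line gives the full URL of the sitemap.
robots_txt = """\
User-agent: *
Disallow:

Sitemap: https://www.domain.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Returns a list of sitemap URLs, or None if no Sitemap line was found.
sitemaps = parser.site_maps()
print(sitemaps)  # ['https://www.domain.com/sitemap.xml']
```

For a live site, the same parser can fetch the file itself via set_url() and read() instead of parsing a string.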
Registering your sitemap with Google
For further support, it’s advisable to register your sitemap with Google via the Search Console, where it’s possible to upload video and image sitemaps as well.
You need a free Google account first and you need to have registered your domain with the Google Search Console. You also need to provide proof of ownership of the website. Once ownership has been verified, you can register your sitemap(s) under “Crawl” > “Sitemaps.”
This takes you to the sitemap set-up page.
When Google encounters any problems or errors whilst crawling a sitemap, these are flagged up in the form of warnings. Click on the links to see a more detailed description of the error. Additional sitemaps can be added using the red button in the top right. Existing sitemaps can be tested for errors.
Sitemaps aren’t absolutely necessary for small, simple websites, but they are highly recommended for large websites or big online shops. Remember, however, that a single sitemap should not contain more than 50,000 URLs; anything larger should be divided into several files. The data contained in a sitemap can provide the search engine with important information about the website whilst helping the crawler to understand the domain’s structure. Sitemaps are no guarantee of better rankings, but Google has made it clear that they certainly don’t hurt! Setting up a sitemap and registering it in the Search Console is a simple task: anyone can do it, and we recommend you do too! Good luck!