An XML (Extensible Markup Language) Sitemap is a text file that lists the URLs on a website. It can include extra information (metadata) about each URL: when it was last updated, how important it is relative to other pages on the site, and whether versions of it exist in other languages. This helps search engines crawl your website more efficiently, because changes such as a new page being added or an old one removed are communicated to them directly.
There is no guarantee that an XML Sitemap will get your pages crawled and indexed by search engines, but having one certainly increases your chances, particularly if your navigation or general internal linking strategy doesn’t link to all of your pages.
<urlset> - The Sitemap opens and closes with this tag. It encloses all URL entries and references the current protocol standard.
<url> - This is the parent tag for each URL entry.
<loc> - This tag contains the absolute URL, or the locator of the page.
<lastmod> - This contains the date the file was last modified. It should be in W3C Datetime format, which allows a plain YYYY-MM-DD date with the time portion omitted.
<changefreq> - This indicates how frequently the file is likely to change. Valid values are always, hourly, daily, weekly, monthly, yearly and never.
<priority> - This indicates the file’s importance relative to other URLs on the site. The value ranges from 0.0 to 1.0, with a default of 0.5.
<xhtml:link> - In a Sitemap, this tag is used to provide details of alternate versions of the URL offered in other languages.
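To show how the tags above fit together, here is a minimal sketch of a Sitemap with two entries. The example.com URLs, the date, and the changefreq and priority values are placeholders chosen for illustration, and the hreflang annotations assume a page that exists in English and German.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch: all URLs, dates and values below are placeholders -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <!-- loc is the only required child of url -->
    <loc>https://www.example.com/en/page.html</loc>
    <!-- the remaining tags are optional -->
    <lastmod>2024-01-15</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
    <!-- alternate language versions of this page, including the page itself -->
    <xhtml:link rel="alternate" hreflang="en" href="https://www.example.com/en/page.html"/>
    <xhtml:link rel="alternate" hreflang="de" href="https://www.example.com/de/page.html"/>
  </url>
  <url>
    <loc>https://www.example.com/de/page.html</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://www.example.com/en/page.html"/>
    <xhtml:link rel="alternate" hreflang="de" href="https://www.example.com/de/page.html"/>
  </url>
</urlset>
```

The xhtml namespace declaration on the opening urlset tag is only needed when xhtml:link entries are present. A Sitemap without language alternates can declare just the sitemap namespace.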
NOTE:
Within each url entry, the loc tag is compulsory, while the lastmod, changefreq and priority tags are optional.
Ideally, an XML Sitemap should be added to the root directory of the website. All URLs in the Sitemap must come from the same host.
Only the canonical version of each page URL should be included, so listed URLs should not redirect or return an error status.
Each URL must be less than 2,048 characters long.
While it may seem possible to trick search engines into thinking the content on your page is frequently updated by setting the changefreq tag to daily, it is not advisable to do so. If the changefreq and priority values do not reflect reality, search engine crawlers are likely to ignore them.
If you need help building your sitemap, several sitemap generator tools are available.