PrimeAgile™ Content Management System (CMS) Sitemap

Each site has a sitemap, which helps search engines index the pages on that site.  The sitemap contains a list of the pages, when they were updated, and other information as outlined below.

By default, all PrimeAgile sites have a sitemap located at /sitemap.xml.  For example: http://www.primeagile.com/sitemap.xml.

Sitemap Auto Generation

The sitemaps on PrimeAgile are auto-generated every night. A sitemap can also be generated on demand by clicking on a sitemap button.

Each sitemap adheres to the sitemap protocol recognized by the major search engines. More about the sitemap protocol can be found at: http://www.sitemaps.org/protocol.html.

Sitemap Auto Submission

PrimeAgile CMS sitemaps are automatically submitted to Google and Bing. Every night when sitemaps are generated, each sitemap is checked to determine whether it has changed since it was last submitted. If it has changed, it is resubmitted.
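
The change check described above could be sketched as follows. This is only an illustration, assuming the check compares a content hash of the new sitemap with one stored at the last submission; the function names are hypothetical and not part of PrimeAgile.

```python
import hashlib
from typing import Optional


def sitemap_digest(sitemap_xml: str) -> str:
    """Return a SHA-256 hex digest of the sitemap contents."""
    return hashlib.sha256(sitemap_xml.encode("utf-8")).hexdigest()


def needs_resubmission(sitemap_xml: str, last_digest: Optional[str]) -> bool:
    """True when the sitemap was never submitted or differs from the last submission."""
    return last_digest != sitemap_digest(sitemap_xml)


# Hypothetical usage: the digest would be stored after each successful submission.
stored = sitemap_digest("<urlset></urlset>")
print(needs_resubmission("<urlset></urlset>", stored))
print(needs_resubmission("<urlset><url/></urlset>", stored))
```

Comparing digests rather than whole files keeps the stored state small, at the cost of treating any byte-level difference as a change.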

Sitemap Details

Sitemap URL Set

The URL set encapsulates the file and references the current protocol standard.

Inside the URL set are multiple URL entries, one for each page, unless the page is marked to be removed or left out of the sitemap.
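
As a rough sketch of that structure, the snippet below builds a URL set with one entry per page and skips pages flagged for exclusion. The `exclude` flag, the `build_sitemap` function, and the login URL are assumptions made for the example; only the element names and the namespace come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from dicts with a 'loc' key and an optional 'exclude' flag."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for page in pages:
        if page.get("exclude"):
            continue  # page is marked to be left out of the sitemap
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: the login page is excluded, the CMS page is listed.
print(build_sitemap([
    {"loc": "http://www.primeagile.com/cms/"},
    {"loc": "http://www.primeagile.com/login/", "exclude": True},
]))
```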

NOTE: Google may include in its index anything that is listed in the sitemap. Some pages are of little importance, are used for administrative purposes, or are applications that are specific to individual users, like a login page. Though such a page may exist on the site and be reachable by the public, it may not be something that we would want to use as a landing page. In these cases, the page is marked so that it is not included in the sitemap.

Excluding pages from the sitemap does not mean they won't be indexed. It only means we are not asking for them to be indexed. Additionally, pages that we don't want indexed should be listed in the robots.txt file.
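
For illustration, a robots.txt covering the excluded pages might be produced along these lines. The `robots_txt` function and the login URL are hypothetical; the sketch assumes one `Disallow` rule per excluded page, applied to all crawlers.

```python
from urllib.parse import urlsplit


def robots_txt(excluded_urls):
    """Return robots.txt text that disallows the path of each excluded URL."""
    lines = ["User-agent: *"]
    for url in excluded_urls:
        lines.append("Disallow: " + urlsplit(url).path)
    return "\n".join(lines) + "\n"


print(robots_txt(["http://www.primeagile.com/login/"]))
```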

Sitemap Location Attribute

http://www.primeagile.com/cms/

Per the standard, each location attribute starts with the protocol (http:) and uses the full canonical URL. The canonical URL includes the www, as we have decided to use that form as the canonical URL for each site. Additionally, each page URL ends with a trailing slash. The trailing slash is not required here, as it is with some CMS systems, but it is part of the full URL.

Each location item by definition of the protocol must be less than 2,048 characters. A check to make sure that the URL will be smaller than this is done during the creation of each page to guarantee that the URLs will fit within the specifications.

NOTE: The URL of each page is evaluated at creation time to make sure that it will comply with the sitemap protocol.
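
A creation-time check like the one described could look roughly like this. It is a sketch under stated assumptions: `is_valid_loc` is a hypothetical name, and accepting https alongside http is a choice made for the example rather than something stated above.

```python
from urllib.parse import urlsplit

MAX_LOC_LENGTH = 2048  # the protocol requires the location to be shorter than this


def is_valid_loc(url: str) -> bool:
    """Check a page URL against the conventions described in this section."""
    parts = urlsplit(url)
    return (
        parts.scheme in ("http", "https")      # starts with the protocol
        and parts.netloc.startswith("www.")    # canonical form includes the www
        and parts.path.endswith("/")           # trailing slash convention
        and len(url) < MAX_LOC_LENGTH          # protocol length limit
    )


print(is_valid_loc("http://www.primeagile.com/cms/"))
print(is_valid_loc("http://primeagile.com/cms"))
```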

Sitemap Last Modified

Each time a page is modified, the modification date of that page is recorded. This date is used to provide the last modified date for the page's URL entry in the sitemap. Each date is listed in the W3C Datetime format. Though this is an optional parameter, it is included with each entry for two reasons: it signals to the search engines that they may need to update their records, and it helps searchers who are looking for time-based content, for example something modified in the last 4 weeks.
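
As an illustration of the W3C Datetime format, the helper below formats a recorded modification time. The function name is hypothetical; the sketch assumes modification times are stored as timezone-aware values.

```python
from datetime import datetime, timezone


def w3c_datetime(moment: datetime) -> str:
    """Format a timezone-aware datetime as W3C Datetime (YYYY-MM-DDThh:mm:ssTZD)."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Drop microseconds so the output keeps whole-second precision.
    return moment.replace(microsecond=0).isoformat()


print(w3c_datetime(datetime(2013, 4, 1, 8, 30, tzinfo=timezone.utc)))
# prints 2013-04-01T08:30:00+00:00
```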

Sitemap Page Change Frequency

By definition this is an optional field, but we include it for each page.  Crawlers crawl on their own schedule, not according to what this field says, but it gives them a hint of when to check back for fresh content.  Valid options are:

  • always
  • hourly
  • daily
  • weekly
  • monthly
  • yearly
  • never

Currently each page is set to hourly by default while we consider whether this should be set automatically based on page save history (actual results) or modified manually by the user on a page-by-page basis.  Does it have enough value to make it worth a user's time to update it, or should it just be done automatically?

Archived URLs should be marked never, and pages that change on each visit should be marked always, so we see some value in having it set on a per-page basis.
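
One way the automatic option might work is to derive the value from the average gap between saves. The sketch below is purely hypothetical: the thresholds, the `suggest_changefreq` name, and the `archived` flag are assumptions, not a description of how PrimeAgile behaves.

```python
from datetime import datetime, timedelta

# Upper bounds on the average gap between saves (assumed thresholds).
THRESHOLDS = [
    (timedelta(hours=1), "hourly"),
    (timedelta(days=1), "daily"),
    (timedelta(days=7), "weekly"),
    (timedelta(days=31), "monthly"),
]


def suggest_changefreq(save_times, archived=False):
    """Suggest a change frequency value from a page's save history."""
    if archived:
        return "never"
    if len(save_times) < 2:
        return "hourly"  # not enough history, fall back to the current default
    ordered = sorted(save_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = sum(gaps, timedelta()) / len(gaps)
    for limit, label in THRESHOLDS:
        if average_gap <= limit:
            return label
    return "yearly"


start = datetime(2013, 4, 1)
print(suggest_changefreq([start, start + timedelta(days=7), start + timedelta(days=14)]))
# prints weekly
```

The value always is left out of the heuristic because save history cannot reveal that a page changes on every visit; that case would still need a manual setting.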

Sitemap Priority Tag

The priority is relevant only when comparing one page against another page within a particular site. It can be used to specify that one page is more important to users than another. Because the priority is relative to other pages on the site, there may be some benefit in placing a higher priority on some pages and a lower priority on others in terms of how they are indexed. But remember that it only affects how the pages on this site may be indexed relative to each other and has no bearing on overall placement relative to pages outside of this site.

NOTE: Each page has a sitemap settings section which allows it to be prioritized. The priority of the page also appears in the search results for pages. Pages can be ordered by priority.
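
A small sketch of handling priorities follows. The sitemap protocol defines valid priority values as 0.0 to 1.0; the function names and the page dictionaries are hypothetical.

```python
def format_priority(value: float) -> str:
    """Validate a priority against the protocol range and format it for the sitemap."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("priority must be between 0.0 and 1.0")
    return f"{value:.1f}"


def order_by_priority(pages):
    """Return pages sorted from highest to lowest priority."""
    return sorted(pages, key=lambda page: page["priority"], reverse=True)


# Hypothetical pages with per-page priority settings.
pages = [
    {"loc": "http://www.primeagile.com/about/", "priority": 0.3},
    {"loc": "http://www.primeagile.com/cms/", "priority": 0.8},
]
for page in order_by_priority(pages):
    print(format_priority(page["priority"]), page["loc"])
```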

Auto Submission to the Major Search Engines

Each night after the sitemap is created, it is automatically submitted via an HTTP request to Google, Bing (Yahoo), and Baidu, which together accounted for over 99% of all searches as of April 2013.
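
The sitemap protocol describes submission as a request to `<searchengine_URL>/ping?sitemap=sitemap_url`, with the sitemap URL encoded. The sketch below only builds that request URL. The search engine host is a placeholder, since the actual endpoints are defined by each search engine and may change or be retired over time.

```python
from urllib.parse import quote


def ping_url(search_engine_url: str, sitemap_url: str) -> str:
    """Build a sitemap ping URL in the form given by the sitemap protocol."""
    base = search_engine_url.rstrip("/")
    # URL-encode everything after "/ping?sitemap=".
    return f"{base}/ping?sitemap={quote(sitemap_url, safe='')}"


# "searchengine.example" is a placeholder host, not a real endpoint.
print(ping_url("http://searchengine.example", "http://www.primeagile.com/sitemap.xml"))
```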

Sitemap Success

While a sitemap is only one small element of successfully implementing and maintaining a website, we make sure that it is done properly and maintained properly.  If an error occurs when generating a sitemap, or submitting a sitemap to search engines, our technical staff are notified and resolve the problem so that months don't go by without anyone knowing there was a problem.
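
The notification step could be wired up roughly as below. This is a hypothetical sketch: `run_nightly_job` and its three callables stand in for whatever generation, submission, and alerting mechanisms are actually used.

```python
def run_nightly_job(generate, submit, notify) -> bool:
    """Generate and submit a sitemap, notifying staff if either step fails."""
    try:
        sitemap = generate()
        submit(sitemap)
    except Exception as error:
        # Report the failure right away instead of letting it go unnoticed.
        notify(f"Sitemap job failed: {error}")
        return False
    return True


# Hypothetical usage with stand-in callables.
alerts = []
ok = run_nightly_job(lambda: "<urlset/>", lambda sitemap: None, alerts.append)
print(ok, alerts)
```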

Multilingual and Multinational Site Annotations in Sitemaps

Localization annotations can be added to a sitemap to relate the language versions of a page, for example:

  http://www.example.com/en
  http://www.example.com/de

as described here:

http://googlewebmastercentral.blogspot.com/2012/05/multilingual-and-multinational-site.html.
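
As a sketch of what such annotations look like, the snippet below writes one URL entry per language version, each listing every alternate (including itself) as an `xhtml:link` element with `rel="alternate"` and an `hreflang` attribute. The `build_localized_sitemap` function is hypothetical, and the namespace prefixes are written as literal attributes to keep the example short.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_localized_sitemap(alternates):
    """Build sitemap XML from a dict mapping hreflang codes to URLs."""
    # Namespace declarations are set as plain attributes on the root element.
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS, "xmlns:xhtml": XHTML_NS})
    for loc in alternates.values():
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        for lang, href in alternates.items():
            ET.SubElement(
                url,
                "xhtml:link",
                {"rel": "alternate", "hreflang": lang, "href": href},
            )
    return ET.tostring(urlset, encoding="unicode")


print(build_localized_sitemap({
    "en": "http://www.example.com/en",
    "de": "http://www.example.com/de",
}))
```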