We recommend using both traditional and search engine sitemaps.
The Yahoo! sitemap protocol is less popular than the Google protocol, but this chapter demonstrates
code that allows both to be created with the same PHP code. Thus, it is worthwhile to support both
formats, because doing so requires minimal effort. Because the Yahoo! sitemap protocol uses only a
subset of the information that the Google sitemap protocol does, any extra information is simply
ignored when the Yahoo! sitemap is created.
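The idea can be sketched as follows. This is not the chapter's actual code; the URL array, field names, and function names are illustrative. One array of URL records, whose keys mirror the Google sitemap tags, feeds both generators; the Yahoo! generator reads only the 'loc' field and ignores everything else:

```php
<?php
// Sketch only: the URLs and function names below are hypothetical.
$urls = array(
    array(
        'loc'        => 'http://www.example.com/',
        'lastmod'    => '2007-01-15',
        'changefreq' => 'daily',
        'priority'   => '1.0',
    ),
    array('loc' => 'http://www.example.com/products.php'),
);

// Google format: one <url> entry per page; metadata tags are optional.
function makeGoogleSitemap($urls) {
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($urls as $url) {
        $xml .= "  <url>\n";
        foreach (array('loc', 'lastmod', 'changefreq', 'priority') as $tag) {
            if (isset($url[$tag])) {
                $xml .= "    <$tag>" . htmlspecialchars($url[$tag]) . "</$tag>\n";
            }
        }
        $xml .= "  </url>\n";
    }
    return $xml . "</urlset>\n";
}

// Yahoo! format (urllist.txt): one URL per line; the extra fields
// in each record are simply ignored.
function makeYahooSitemap($urls) {
    $lines = '';
    foreach ($urls as $url) {
        $lines .= $url['loc'] . "\n";
    }
    return $lines;
}

echo makeGoogleSitemap($urls);
echo makeYahooSitemap($urls);
?>
```

Because both functions consume the same array, adding a page to the site means adding one record, and both sitemap files stay in sync.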
Google and Yahoo! also support reading news feeds in the RSS and Atom formats. These formats
may suffice for blogs and certain content management systems, which often provide them by default.
The problem with these implementations is that they usually enumerate only the newest content,
which is really suitable only for a blog. If you are doing search engine optimization for a blog, feel
free to skip this chapter and use the feed functionality provided by your blog application instead.
It would theoretically be possible to create an RSS or Atom feed containing all URLs as a sitemap
for a site that is not a blog, but this is probably not what Yahoo! or Google expects, and we would
not recommend it.
Using Google Sitemaps
Google has a very elaborate standard for providing a sitemap. It allows a webmaster to provide informa-
tion in several formats, but the preferred format is an XML-based standard specified by Google. Google
claims that using Google Sitemaps will result in “a smarter crawl because you can tell [them] when a page
was last modified or how frequently a page changes." For more information regarding Google Sitemaps,
consult Google's documentation; there is also a Google-run Sitemaps blog.
However, according to Google, “using this protocol does not guarantee that your web pages will be
included in search indexes,” and “… using this protocol will not influence the way your pages are
ranked by Google.” Creating a sitemap for your site entails the following:
Creating a Google account, if you don't have one.
Creating a sitemap file.
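To make the second step concrete, here is what a minimal Google sitemap file looks like, with a hypothetical URL. Only the `loc` element is required; `lastmod`, `changefreq`, and `priority` are optional hints. The namespace shown is the joint sitemaps.org version of the schema; consult Google's documentation for the exact schema it currently expects:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2007-01-15</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>
```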
Let’s also note one lesser-known benefit of using sitemaps: mitigation of the damage caused by
content theft and scraper sites. Unfortunately, there are unsavory characters on the web who,
without permission, lift content from your web site and place it on theirs.
These sites are affectionately called “scraper sites,” but when it happens to you, they’re called
much less affectionate terms. One of the most difficult challenges search engines face is attributing
content that is duplicated in several places to its original author. As discussed in Chapter 5, search
engines aim to filter duplicate content from their indices. When you get filtered as a result of
scrapers stealing your content, it can be particularly difficult to resolve. If a well-ranked scraper
site (they do exist) gets spidered with your content before you do, your web site content may be
deemed the duplicate! Because search engine sitemaps get your new web pages spidered more
quickly, they may help you avoid some of these content-theft snafus.
Chapter 9: Sitemaps