Disabling URL-Based Session Handling for Spiders
As discussed in Chapter 5, PHP’s trans_sid feature, which performs automatic modification of URLs
and forms to include session variables, is used to preserve session state for those users who do not
accept cookies. However, this has the side effect of sending spiders a potentially infinite amount of
duplicate content. For this reason, nowadays many web sites turn this feature off altogether.
However, because this feature can be turned on and off dynamically in PHP code, cloaking can be employed
to enable it for human visitors and disable it for search engine spiders. This lets the site accommodate
users who do not accept cookies without confusing a visiting spider with an infinite number of semantically
meaningless URL variations that differ only in their session IDs. Using the cloaking toolkit in
this chapter, it would be implemented as follows:
This code should be placed at the top of a PHP script, before any headers or output are sent to the client.
if ($isSpider)  // placeholder: result of the spider check from this chapter's cloaking toolkit
  ini_set('session.use_trans_sid', 0);
else
  ini_set('session.use_trans_sid', 1);
Your site must also remain functional without a session, because search engine spiders will
not accept cookies in any case.
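To make the idea concrete, here is a minimal, self-contained sketch of the same toggle. It does not reproduce this chapter's cloaking toolkit: the function names isLikelySpider() and transSidSettingFor() and the short list of crawler signatures are hypothetical stand-ins, and matching on the User-Agent header alone is a simplification.

```php
<?php
// Hypothetical stand-in for the chapter's spider check: looks for a few
// well-known crawler names in the User-Agent header. The list is illustrative
// only and is not the data used by the cloaking toolkit.
function isLikelySpider(string $userAgent): bool
{
    $signatures = ['googlebot', 'bingbot', 'slurp'];
    foreach ($signatures as $signature) {
        // stripos() performs a case-insensitive substring search.
        if (stripos($userAgent, $signature) !== false) {
            return true;
        }
    }
    return false;
}

// Returns the value for session.use_trans_sid:
// '0' (no session IDs in URLs) for spiders, '1' for everyone else.
function transSidSettingFor(bool $isSpider): string
{
    return $isSpider ? '0' : '1';
}

// This must run before any output is sent and before session_start() is called.
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
ini_set('session.use_trans_sid', transSidSettingFor(isLikelySpider($userAgent)));
```

Keeping the decision in a small function that returns the setting makes the logic easy to check in isolation, separately from the call that changes the PHP configuration. Note also that when session.use_only_cookies is enabled, which is the default in current PHP versions, session IDs are not added to URLs regardless of this setting.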
Other Cloaking Implementations
The preceding implementation of cloaking works if you have the source code to your application and you
are willing and able to modify it. If not, there are cloaking toolkits that allow you to easily and dynamically
serve different content to various user agents. One such toolkit is KloakIt from Volatile Graphix, Inc.
You can find it at
. It is written by Dan Kramer and utilizes the same cloaking data that
he provides.
Geo-targeting isn’t very different from cloaking, so you’ll probably feel a little déjà vu as you read this
section. After creating the database table
, you’ll create a class named
that includes the necessary geo-targeting features.
Chapter 11: Cloaking, Geo-Targeting, and IP Delivery