<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>link love &#187; Cloaking</title>
	<atom:link href="http://www.vdgraaf.info/category/cloaking/feed" rel="self" type="application/rss+xml" />
	<link>http://www.vdgraaf.info</link>
	<description>Just another WordPress weblog</description>
	<lastBuildDate>Thu, 28 Jan 2010 08:37:35 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Canonical tag creates new legal cloaking possibilities</title>
		<link>http://www.vdgraaf.info/canonical-tag-creates-new-legal-cloaking-possibilities.html</link>
		<comments>http://www.vdgraaf.info/canonical-tag-creates-new-legal-cloaking-possibilities.html#comments</comments>
		<pubDate>Thu, 25 Dec 2008 14:41:59 +0000</pubDate>
		<dc:creator>Peter van der Graaf</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Cloaking]]></category>

		<guid isPermaLink="false">http://www.vdgraaf.info/canonical-tag-creates-new-legal-cloaking-possibilities.html</guid>
		<description><![CDATA[Google, Yahoo and Live Search have introduced a great new way to serve linkers different content than other visitors or search engines. Now that the search engines support the rel=canonical tag, they will need to implement many new spam detection methods as well.
If you do not yet know about the tag please read [...]]]></description>
			<content:encoded><![CDATA[<p>Google, Yahoo and Live Search have introduced a great new way to serve linkers different content than other visitors or search engines. Now that the search engines support the rel=canonical tag, they will need to implement many new spam detection methods as well.<br />
If you do not yet know about the tag please read Rand Fishkin&#8217;s <a href="http://www.seomoz.org/blog/canonical-url-tag-the-most-important-advancement-in-seo-practices-since-sitemaps">rel=canonical post</a> or the information from Google on the <a href="http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html">new canonical tag</a>.</p>
<p><span id="more-120"></span></p>
<p><strong>How does it work?</strong><br />
With the new canonical tag you indicate that there might be more than one URL <em>within your website</em> presenting this specific information, and that if search engines have to choose between those URLs when presenting the page to visitors, you prefer the given URL. For instance, I use WordPress to rewrite to nice URLs, but this post is available under http://www.vdgraaf.info/?p=120 as well.</p>
<p>If I placed <code>&lt;link rel="canonical" href="http://www.vdgraaf.info/canonical-tag-creates-new-legal-cloaking-possibilities.html" /&gt;</code> in the head of this page, Google would only show that URL to visitors. And just like with a 301 redirect, Google would attribute all link juice to that one remaining page.</p>
<p>The canonical tag is intended for sites that cannot do a proper 301 redirect for one reason or another. For instance, there may still be slight differences between the pages under different URLs, but too few for Google to see them as unique. The same article might be available under different categories, an affiliate ID in the URL might just change some form values, or you might use almost the same text for the French-speaking population of Canada as you do for people in France. I can think of many more, but all these legitimate reasons allow for slight changes to the page served. And it is logical that search engines now allow you to indicate which version to choose.</p>
<p><strong>One page, different audiences</strong><br />Your website is created for several audiences and in the ideal situation you could change nuances depending on the visitor type. The most important audiences we want to differentiate right now are &#8220;linkers&#8221;, &#8220;navigators&#8221; and &#8220;searchers&#8221;.</p>
<ul>
<li><strong>People linking to your website</strong><br />A clean webpage without much commercial intent is far more likely to receive inbound links than a page with a lot of branding and calls to action. A page that shows your good side (for instance with a reference to your support for good causes, or a reference to the specific link partner) will give you a far higher success rate on your link building effort. Use a URL like http://www.vdgraaf.info/i-am-a-good-boy.html for them.</li>
<li><strong>People navigating from within your website</strong><br />People that navigate through your website follow a certain path and you can offer more specific content depending on where they came from. http://www.vdgraaf.info/so-you-clicked-the-banner.html would be good for this audience.</li>
<li><strong>People from search engines</strong><br />A good landing page for search engines gives the answer to what the visitor searched for and shows a clear call to action for the most logical next step. Besides changing the appearance of the page that you would like search engines to serve, you could also use slightly different code to be more search engine friendly. You should make this version the canonical one by adding <code>&lt;link rel="canonical" href="http://www.vdgraaf.info/this-one-is-for-searchers.html" /&gt;</code> to the head section of all your versions.</li>
</ul>
<p><strong>Does the tag give real cloaking abilities?</strong><br />It is logical that Google has implemented, or will implement, a check on the similarity between different versions. Because the tag is on the page and not in an external file like robots.txt, the page has to be spidered first. It is a small extra step to do some text calculation as well. The canonical instruction will probably only be followed when the page would normally trigger the duplicate content filter. It will however assign all accumulated links to that one remaining occurrence.</p>
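<p>To make the &#8220;text calculation&#8221; step concrete, here is a minimal Python sketch of one common way to score the similarity of two page versions: the Jaccard index over word shingles. It is only an illustration. Nothing here is known to be what Google actually does, and the function names and the shingle size of 3 are assumptions made for this example.</p>

```python
import re


def shingles(text, size=3):
    """Return the set of word n-grams (shingles) of the given size."""
    # Assumption: lowercase alphanumeric words are a good enough tokenisation.
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) == 0:
        return set()
    if size >= len(words):
        # Text shorter than one shingle: treat the whole text as one shingle.
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Jaccard index of the two shingle sets: 1.0 is identical, 0.0 is disjoint."""
    a = shingles(text_a, size)
    b = shingles(text_b, size)
    if not a and not b:
        return 1.0
    return len(a.intersection(b)) / len(a.union(b))
```

<p>A score near 1.0 means the versions are nearly identical. Where a search engine would draw the line before ignoring the canonical hint is unknown, so any threshold you pick is a guess.</p>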
<p>So use this tag with care, but with the same content you can still create many different versions. Have fun!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.vdgraaf.info/canonical-tag-creates-new-legal-cloaking-possibilities.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>How to implement cloaking</title>
		<link>http://www.vdgraaf.info/how-to-implement-cloaking.html</link>
		<comments>http://www.vdgraaf.info/how-to-implement-cloaking.html#comments</comments>
		<pubDate>Sat, 04 Nov 2006 08:59:05 +0000</pubDate>
		<dc:creator>Peter van der Graaf</dc:creator>
				<category><![CDATA[Cloaking]]></category>
		<category><![CDATA[Tutorials]]></category>

		<guid isPermaLink="false">http://www.vdgraaf.info/how-to-implement-cloaking.html</guid>
		<description><![CDATA[Cloaking is showing search engines different content than human visitors see. To use cloaking effectively you need to detect whether your visitor is a searchbot.
In this post I will try to explain&#8230;
the process of implementing a cloaking script.

User agent or IP-based detection?
I will always recommend IP-based search engine detection. It is too easy for search engines [...]]]></description>
			<content:encoded><![CDATA[<p>Cloaking is showing search engines different content than human visitors see. To use cloaking effectively you need to detect whether your visitor is a searchbot.</p>
<p>In this post I will try to explain&#8230;<br />
<strong>the process of implementing a cloaking script.</strong></p>
<p><span id="more-57"></span></p>
<p><strong>User agent or IP-based detection?</strong><br />
I will always recommend IP-based search engine detection. It is too easy for search engines to detect user-agent-based cloaking: they only have to request the page with a normal browser user agent. It is very important to have an up-to-date IP database! I only work with the best IP database providers and always re-synchronise to keep up to date.</p>
<p><strong>Search engine IP databases</strong><br />
Depending on your programming knowledge you can use several IP database providers that always have an up to date database. I&#8217;ve only listed the ones I can recommend. These list every engine by name and bot purpose.</p>
<ul>
<li><a title="Cloaking database" href="http://searchbotbase.com/" target="_blank"><strong>Fantomaster<br />
</strong></a>Fantomaster provides several ways to help you detect search engine spiders. The SpiderSpy service provides both database access and scripting help to make it easy for you to keep your database up to date.<br />
<em>price $258 per year</em></li>
<li><a title="Cloaking database" href="http://www.ip-delivery.com/" target="_blank"><strong>IP-delivery.com</strong></a><br />
A quicker script than Fantomaster and a guarantee of being up to date. But at almost four times the price, I&#8217;ve never tried it. Some friends swear this is the best.<br />
<em>price $995 per year</em></li>
<li><strong><a title="How to verify Googlebot" href="http://googlewebmastercentral.blogspot.com/2006/09/how-to-verify-googlebot.html" target="_blank">Reverse DNS lookup</a></strong><br />
You can also check every IP address visiting you and see whether it belongs to the Google domain. This article provides Google&#8217;s explanation of <a title="Googlebot detection" href="http://googlewebmastercentral.blogspot.com/2006/09/how-to-verify-googlebot.html" target="_blank">how to detect their crawling spiders</a>.</li>
</ul>
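<p>The reverse DNS check from the last item can be sketched in a few lines of Python. This is a minimal sketch, assuming googlebot.com is the only domain you accept; the function name and the injectable lookup parameters are hypothetical and only there so the logic can be tried without network access. It does a reverse lookup on the IP address, checks the domain of the returned hostname, and then resolves that hostname forward again to confirm it points back at the same IP address.</p>

```python
import socket

# Assumption: only hostnames under googlebot.com count as Googlebot.
ALLOWED_SUFFIXES = (".googlebot.com",)


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for a visiting IP address."""
    # The lookups are injectable so the logic can be exercised offline.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip).rstrip(".").lower()
        if not host.endswith(ALLOWED_SUFFIXES):
            return False
        # The hostname must resolve back to the same IP address,
        # because whoever controls an IP range controls its PTR records.
        return ip in forward_lookup(host)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return False
```

<p>Note that socket.gethostbyname_ex only returns IPv4 addresses, so visitors on IPv6 would need a different forward lookup. DNS lookups are also slow, so you would want to cache the result per IP address.</p>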
<p><strong>Using default cloaking scripts</strong><br />
Fantomaster and IP-delivery.com provide free scripts with their service and you can easily make a function that gives back a simple yes or no result. You can check if the visitor is a search engine and you can also detect each search engine separately. Then you can program the different content you want to show them.</p>
<p><strong>Htaccess example</strong><br />
My favourite way to use Fantomaster SpiderSpy is by making a RewriteMap for mod_rewrite on Apache servers. You implement this by doing the following.</p>
<p>Use the following instructions to generate the RewriteMap:</p>
<ul>
<li>Generate an empty text file with write rights for PHP (for instance rewritemap.txt).</li>
<li>Customize the following php script:
<p class="postmetadata alt"><small>&lt;?php<br />
 <br />
// Fill in your SpiderSpy credentials and the path PHP is allowed to write to.<br />
$fantomaster_user = "";<br />
$fantomaster_pass = "";<br />
$rewritemap_url = "/home/vhosts/yourdomain.com/www/rewritemap.txt";<br />
 <br />
// Download the raw XML bot list using HTTP basic authentication.<br />
function doRequest($user_pass) {<br />
  $ch = curl_init();<br />
  curl_setopt($ch, CURLOPT_URL, 'http://fantomaster.com/dardanelles/registerdb/fabotbasexml.cgi');<br />
  curl_setopt($ch, CURLOPT_USERPWD, $user_pass);<br />
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);<br />
  curl_setopt($ch, CURLOPT_CRLF, true);<br />
  $data = curl_exec($ch);<br />
  // Read the error before closing the handle; it cannot be read afterwards.<br />
  $error = curl_error($ch);<br />
  curl_close($ch);<br />
 <br />
  if ($data) {<br />
      return $data;<br />
  } else {<br />
      return $error;<br />
  }<br />
}<br />
 <br />
$rawxml = trim(doRequest($fantomaster_user.':'.$fantomaster_pass));<br />
$matches_host = array();<br />
$matches_ip   = array();<br />
// The XML element names that belong around each capture group are missing from the two patterns below; restore them from the feed format before use.<br />
preg_match_all('%([^&lt;]*)%', $rawxml, $matches_host);<br />
preg_match_all('%([^&lt;]*)%', $rawxml, $matches_ip);<br />
 <br />
// Drop duplicate hostnames and IP addresses.<br />
$matches_host = array_unique($matches_host[1]);<br />
$matches_ip   = array_unique($matches_ip[1]);<br />
 <br />
// Build the map: one key per line, followed by the value "spider".<br />
$spidertxt='';<br />
foreach($matches_host as $host){<br />
 $spidertxt.=$host.' spider'."\n";<br />
}<br />
foreach($matches_ip as $ip){<br />
 $spidertxt.=$ip.' spider'."\n";<br />
}<br />
 <br />
if($f = fopen($rewritemap_url, 'w')){<br />
 fwrite($f, $spidertxt);<br />
 fclose($f);<br />
}else{<br />
 echo 'failed to write';<br />
}<br />
 <br />
?&gt; </small></li>
<li>After running this script you should have a text file with IP addresses and hostnames, each followed by &#8220;spider&#8221;. This text file must be specified in your httpd.conf (server wide) or vhost.conf (just for this domain) file before you can use it in a RewriteCond. The problem with shared hosting is that you probably cannot edit these files yourself. The conf file should contain the following line:
<p class="postmetadata alt"><small>RewriteMap allBots txt:/home/vhosts/yourdomain.com/www/rewritemap.txt</small></p>
</li>
<li>Then you can address allBots from a rewrite condition in your .htaccess file.
<p class="postmetadata alt"><small>RewriteEngine On<br />
RewriteBase /<br />
RewriteCond   ${allBots:%{REMOTE_HOST}} =spider [OR]<br />
RewriteCond   ${allBots:%{REMOTE_ADDR}} =spider<br />
RewriteRule    ^(.*)$  otherurl.php  [L] </small></p>
<p><a title="RewriteGuide" href="http://httpd.apache.org/docs/2.0/misc/rewriteguide.html" target="_blank">More on mod_rewrite and RewriteRules can be found here.</a></p>
<p>The rule above internally rewrites any request from a searchbot to otherurl.php, so the bot never sees a redirect.</li>
<li>Update the list every 12 hours by running a cronjob on the PHP file you generated.</li>
<li>And you&#8217;re done!</li>
</ul>
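<p>For readers who prefer Python over PHP, the map-writing part of the steps above could look roughly like this. It is a sketch, not the SpiderSpy script: fetching and parsing the bot list is left out, the function names are made up for this example, and the host and IP lists are assumed to be already extracted. It writes one key per line followed by &#8220;spider&#8221;, and swaps the file into place in one step so Apache never reads a half-written map.</p>

```python
import os
import tempfile


def build_rewrite_map(hosts, ips, value="spider"):
    """Return RewriteMap text: one 'key value' line per unique host or IP."""
    seen = set()
    lines = []
    for key in list(hosts) + list(ips):
        key = key.strip()
        if key and key not in seen:
            seen.add(key)
            lines.append(f"{key} {value}")
    return "\n".join(lines) + "\n" if lines else ""


def write_rewrite_map(path, text):
    """Write to a temporary file in the same directory, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    # mkstemp creates the file readable by its owner only; assumption:
    # the web server runs as a different user and needs read access.
    os.chmod(tmp_path, 0o644)
    # os.replace renames the file in a single step on the same filesystem.
    os.replace(tmp_path, path)
```

<p>Calling write_rewrite_map(path, build_rewrite_map(hosts, ips)) from the cronjob would then refresh the file that the RewriteMap line points to.</p>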
<p><strong>Cloaking examples</strong><br />
You can probably think of many ways to use cloaking to your advantage. <a href="http://www.vdgraaf.info/when-should-i-use-cloaking.html">Here are some examples</a>.</p>
<p>Whatever you use cloaking for, use it wisely. My tip of the day: Always keep your IP database up to date! If search engines detect sloppy cloaking, they will punish it and your site will probably be removed from the index.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.vdgraaf.info/how-to-implement-cloaking.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>When should I use cloaking?</title>
		<link>http://www.vdgraaf.info/when-should-i-use-cloaking.html</link>
		<comments>http://www.vdgraaf.info/when-should-i-use-cloaking.html#comments</comments>
		<pubDate>Fri, 13 Oct 2006 08:39:17 +0000</pubDate>
		<dc:creator>Peter van der Graaf</dc:creator>
				<category><![CDATA[Cloaking]]></category>
		<category><![CDATA[Tutorials]]></category>

		<guid isPermaLink="false">http://www.vdgraaf.info/when-should-i-use-cloaking.html</guid>
		<description><![CDATA[There are many good and bad ways to use cloaking. Google says there are cases in which it approves of cloaking, and it even says that a good way to identify Googlebot is to do a reverse DNS lookup on the visiting IP address and check whether the hostname is in the googlebot.com domain.
When should you cloak?


Cloaking javascript and CSS
Cloaking to get [...]]]></description>
			<content:encoded><![CDATA[<p>There are many good and bad ways to use cloaking. Google says there are cases in which it approves of cloaking, and it even says that a good way to identify Googlebot is to do a reverse DNS lookup on the visiting IP address and check whether the hostname is in the googlebot.com domain.</p>
<p><strong>When should you cloak?</strong></p>
<p><span id="more-52"></span></p>
<ul>
<li><a href="#javascript">Cloaking javascript and CSS</a></li>
<li><a href="#linkbuilding">Cloaking to get your competitors links</a></li>
</ul>
<p>First some basic questions answered:</p>
<p><strong>What is Cloaking?</strong><br />
Cloaking is showing a search engine different content than normal visitors see.</p>
<p><strong>What does Google consider plausible reasons to cloak?</strong><br />
When you have a website that is hard for a crawler to index, you can make an alternative version with the same content available.<br />
Or when you have a password-protected area you want Google to index, you can let Googlebot in without a password. Just add &#8220;noarchive&#8221; to your robots meta tag; otherwise people can simply read the protected pages in the Google cache.<br />
There are more reasons, but the rule is: &#8220;If it improves the searchbot crawlability of the same content a user sees&#8221;.</p>
<p><strong>Why should I cloak?</strong><br />
Normally you shouldn&#8217;t! Make good sites with good code and both users and Google should see the same. If your code is unfriendly for search engines, rewrite the code and don&#8217;t cloak.</p>
<p>There are situations where you want to draw a user&#8217;s attention to one place and Google&#8217;s attention to another. Normally I&#8217;d use images to draw a user&#8217;s attention to my &#8220;call to action&#8221; and headers to draw Google&#8217;s attention to the most important text. So in most situations you don&#8217;t need cloaking for this.</p>
<p>If you&#8217;re a good boy, don&#8217;t use cloaking! If you&#8217;re a bad boy (or girl), do it!</p>
<p><strong>But there are good spammy tactics?</strong><br />
Of course there are many fun ways to use cloaking to your advantage. Most of the time you want both linkable content and search engine friendly content. Serving both from one page forces a compromise, while you want the best of both. Here are some fun tactics with cloaking.</p>
<p><em>I hope Matt Cutts doesn&#8217;t read this post, because an algorithm could detect the following tactics; Google just doesn&#8217;t detect them yet. My blog is probably too small to be noticed by Matt Cutts (the Google spamcop), but if he does read it: &#8220;Matt please leave a comment!&#8221;</em></p>
<p><a name="javascript"></a><strong>Javascript and Stylesheet tactics</strong><br />
With external javascript and stylesheets you can change the appearance and visibility of your content. Google sees this and devalues hidden content.</p>
<p>I tried to disallow my .JS and .CSS files in robots.txt so they would not be crawled, but Google disobeys this and still reads them. Then I tried IP cloaking (detecting whether an IP belongs to a search engine and serving it different javascript and stylesheet files) and it worked.</p>
<p>You can hide a block of content, you can show an H1 header as normal inline text, you can hide links and much more, all by the use of javascript or css. Currently the detection of cloaking isn&#8217;t very sophisticated and not many spammers are caught algorithmically. When you get caught cloaking, it is mostly because someone ratted on you to Google. If you just cloak the .JS and .CSS files, people don&#8217;t see any difference between the search engine cache and the normal page, because the cached version still loads the users&#8217; version of the external files.</p>
<p><a name="linkbuilding"></a><strong>Cloaking for linkbuilding</strong><br />
This way is more dangerous because people can detect it more easily and tell Google about it. But many people ask me: &#8220;How do I get links from my competitors to my commercial website?&#8221; Here is a possible answer to this question!</p>
<ul>
<li>Make a non-commercial website about a subject your competitors might be willing to link to and point them to it. Don&#8217;t let it have any link to you or your commercial activities.<br />
<em>For instance: Results of the top 5 SEO companies competition website.<br />
Email: Congratulations you made 1st place!<br />
Result: They will proudly link to it.</em></li>
<li>When they start linking, you want to divert the link love to your commercial activities without losing the links.</li>
<li>Either cloak and place good links to your commercial website or use a 301 redirect to divert all link points.<br />
<em>The 301 redirect shows no cache in Google, so there is no trail to where it is going.<br />
Competitors just don&#8217;t see it show up in Google. From their IP the domain shows the normal site, so they have no reason not to link to it.</em></li>
<li>The links will remain and could even grow, but the love is diverted!</li>
</ul>
<p><em>I hope these tips were helpful. Most search engines treat cloaking as a violation of their guidelines, so try this at your own risk. Everything they don&#8217;t detect at this moment can be detected in the future.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.vdgraaf.info/when-should-i-use-cloaking.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
