Cross site scripting with Googlebot
Several forums have a password-protected area that only searchbots may enter without logging in. If you want to place a comment or do something else on such a forum, you can have the searchbot do it for you.
How can you get Googlebot to hack for you?
POST to GET
Many forums accept commands through both the POST and GET methods. When a normal visitor sends a command (for instance, posting a comment), it usually goes out as a POST, but if you convert the same form fields to GET variables, the script will often still run.
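To make the point concrete, here is a minimal sketch of why the conversion works. It assumes a script that merges GET and POST parameters into one dictionary before acting on them; the function names are hypothetical and not taken from any real forum software. The second function shows the stricter behaviour that would make the GET version fail.

```python
from urllib.parse import parse_qs


def _first_values(encoded):
    """Parse a URL-encoded string into a flat {name: first value} dict."""
    return {k: v[0] for k, v in parse_qs(encoded, keep_blank_values=True).items()}


def comment_accepted_lenient(method, query_string, body=""):
    """Hypothetical handler that does not care where the fields come from."""
    fields = _first_values(query_string)
    if method == "POST":
        fields.update(_first_values(body))
    # True means the script would go on to store the comment.
    return "comment" in fields


def comment_accepted_strict(method, query_string, body=""):
    """Hypothetical handler that only acts on fields sent in a POST body."""
    if method != "POST":
        return False
    return "comment" in _first_values(body)
```

With the lenient handler, a plain link such as `?comment=hi` is enough to trigger the action. With the strict one, the same link does nothing.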
The Web Developer extension for Firefox has a simple function to convert all POSTs to GETs (Forms -> Convert Form Methods -> POSTs To GETs).
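If you would rather see what that conversion amounts to, the sketch below builds the GET URL by hand with Python's standard library. The field names are the ones from the example URL later in this post; the host `forum.example` and the helper name `form_to_get_url` are placeholders I made up for illustration.

```python
from urllib.parse import quote, urlencode


def form_to_get_url(action, fields):
    """Append form fields to the form's action URL as a query string."""
    # quote_via=quote encodes spaces as %20 instead of the default "+".
    return action + "?" + urlencode(fields, quote_via=quote)


# Placeholder host; field names follow the example URL in this post.
example_fields = {
    "author": "Peter",
    "email": "peter@vdgraaf.info",
    "url": "",
    "comment": "This blog sucks",
    "comment_post_ID": "158",
}
example_url = form_to_get_url("http://forum.example/wp-comments-post.php", example_fields)
```

This is all the browser extension does for you: same field names, same values, moved from the request body into the URL.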
Searchbots
Searchbots follow links (they grab the URLs you link to and queue them for spidering). This means any URL you link to can end up being visited by, for instance, Googlebot.
Cloaking to provide access
Some websites use a database that contains the IP addresses and hostnames of most searchbots, so they can show these bots different content than normal visitors get (aka cloaking). Because searchbots don’t use cookies, content behind a cookie-based login might be unreadable to them. A site can detect whether a visitor is a bot and let it bypass the need for a cookie (or password). This form of “cloaking” is condoned by Google.
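As a rough sketch of the detection step: instead of the static database described above, the code below swaps in a reverse DNS lookup followed by a forward lookup, which is another common way to recognise a bot. The suffix list, the function name and the injectable lookup parameters are my own assumptions for illustration, not how any particular forum does it.

```python
import socket

# Assumed allow-list of hostname suffixes; a real site would maintain its own.
BOT_HOST_SUFFIXES = (".googlebot.com", ".google.com")


def is_known_searchbot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to a bot hostname that resolves back to ip."""
    # The lookups are parameters so the check can be tried without network access.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
        if not host.lower().endswith(BOT_HOST_SUFFIXES):
            return False
        # The forward lookup guards against someone faking the reverse DNS record.
        return ip in forward_lookup(host)
    except OSError:
        # Covers socket.herror and socket.gaierror: treat lookup failures as "not a bot".
        return False
```

Note that a check like this only tells the site that the visitor really is the bot. It says nothing about who told the bot to fetch that URL, which is exactly the gap this post is about.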
Let’s put it all together
- Find a forum that is indexed by Google but that you cannot access yourself.
- Find out which URL posts a comment on that forum. Find out what forum software is used if you can’t visit the forum itself, so you know which variables are needed. Use the Web Developer extension for Firefox if you want to easily convert POSTs to GETs. You might get something like:
http://www.vdgraaf.info/wp-comments-post.php?author=Peter&email=peter%40vdgraaf.info&url=&comment=This%20blog%20sucks&submit=Submit+Comment&comment_post_ID=158
- Link the URL from a place Google visits, but make sure the link will not be clicked by normal visitors.
- And there you have it:
Googlebot has placed a comment in a restricted area, and it is untraceable (at least back to you)!
November 27th, 2006 at 3:45 pm
shit !~
I thought I had made a brilliant discovery once again, and three weeks later it’s out in public !@$@#$%
February 20th, 2007 at 1:38 pm
I still don’t get why this form of cloaking is considered white hat.
As a searcher, it’s so frustrating to constantly run into WebmasterWorld results in the SERPs that are password protected. (I know there is a workaround, but I generally just go somewhere else.)
Nice article though, an interesting concept.