WebWord.com



WebWord Weblog Posting

Posting Date: September 13, 2002
 

Search Engine IP Addresses -- "Search engine IP addresses are handy to know for a number of reasons. You might want to see when your web site was spidered by search engines. You might want to know which pages were spidered. You might want to ban some bots and let others in. You might want to see which bots are obeying your robots.txt files. You might want to redirect bots to other pages or serve them different content (cloaking)." (Comments: Should be quite useful to some folks!)
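
To make the "which pages were spidered" idea concrete, here is a minimal sketch that scans access-log lines and groups the requested paths by bot. It assumes the server writes the common "combined" log format; the names LOG_LINE, BOT_TOKENS, and spidered_pages, and the sample log lines (which use reserved documentation IP addresses), are made up for illustration and do not come from the linked article.

```python
import re
from collections import defaultdict

# Assumed layout (combined log format):
# ip identity user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical list of substrings to look for in the user-agent field.
BOT_TOKENS = ("googlebot", "slurp")


def spidered_pages(lines, bot_tokens=BOT_TOKENS):
    """Return {bot token: set of paths that bot requested}."""
    hits = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent").lower()
        for token in bot_tokens:
            if token in agent:
                hits[token].add(match.group("path"))
    return dict(hits)


# Made-up sample lines for illustration only.
sample = [
    '192.0.2.10 - - [13/Sep/2002:10:00:00 -0400] "GET /index.html HTTP/1.0" '
    '200 1043 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.10 - - [13/Sep/2002:10:00:05 -0400] "GET /robots.txt HTTP/1.0" '
    '200 68 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '198.51.100.7 - - [13/Sep/2002:10:01:00 -0400] "GET /about.html HTTP/1.1" '
    '200 2200 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

if __name__ == "__main__":
    for bot, paths in spidered_pages(sample).items():
        print(bot, sorted(paths))
```

A request for /robots.txt from a bot is a hint that it at least fetched the file; whether it then obeys the rules still has to be checked against the other paths it requested. User-agent strings can be forged, which is one reason the IP addresses are worth knowing as well.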

 

  

Reader Comments...
 

I read my server logs daily in raw form, and they show the user agents (when one says "Googlebot", it's pretty obvious!). Frequently the user-agent field also includes a URL where you can learn more about the agent and decide whether you want it to have access. For instance, I generally don't like users making local copies of my web site, but there is an academic resource out there that indexes sites to help check for student plagiarism, and I'm happy to let that one in.

Knowing the IP addresses of the search engines can also help because you may want content to sit behind a firewall (with access only for those who pay a fee) without charging search engines, so that they can still alert users to the existence of your content. (And yes, you can stop Google from caching your content.)

Posted by: Frank on September 16, 2002 02:05 PM
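
As a rough illustration of the point above about letting search engines past a paid-access gate, the following sketch checks a visitor's IP address against a list of crawler networks using Python's standard ipaddress module. The networks shown are reserved documentation ranges used as placeholders, not real search engine addresses, and the names CRAWLER_NETWORKS, is_known_crawler, and may_view_paid_content are hypothetical.

```python
import ipaddress

# Placeholder ranges (reserved for documentation). Real crawler ranges
# would have to come from the search engines or a maintained list.
CRAWLER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/28"),
]


def is_known_crawler(ip, networks=CRAWLER_NETWORKS):
    """Return True if the address falls inside any listed crawler network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


def may_view_paid_content(ip, is_subscriber):
    """Paying subscribers and listed crawlers get access; everyone else does not."""
    return is_subscriber or is_known_crawler(ip)


if __name__ == "__main__":
    print(may_view_paid_content("192.0.2.44", is_subscriber=False))
    print(may_view_paid_content("203.0.113.5", is_subscriber=False))
```

Matching on the IP address rather than the user-agent string is the design choice here, since anyone can claim to be a crawler in the user-agent field.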

 


 


URL: http://webword.com/weblog/

©1998-2005 by WebWord.com. All rights reserved.
Do not reproduce or redistribute any material from this document,
in whole or in part, without explicit written permission from WebWord.com.