
OneCMS :: Viewing Topic - Robots.txt File

Mehar
Queen

Posts: 8
Rep Level: 1
Posted: 11-21-09 06:17 PM
Subject: Robots.txt File

OneCMS 2.6 ships with a robots.txt file right from the get go! You may have a few questions about what it does, whether you can edit it, etc.
 
What is it?
 
The robots.txt standard was first created in the 90s to tell "robots" which parts of your site they can and can't visit. You can set the rules to apply to all bots or only some via the "User-agent:" field. By default, the field is set to tell all search engines they can't crawl certain parts of your site. If you want the rules to apply to only certain bots, feel free to do a Google, Bing, Ask, etc. search for the appropriate changes.
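
To make the "User-agent:" idea a bit more concrete, here is a rough Python sketch using the standard library's urllib.robotparser. The rules, the /admin/ path, and the bot names (ExampleBot, SomeCrawler) are made up for illustration; they are not the contents of the file OneCMS actually ships.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only, NOT the file shipped with OneCMS.
SAMPLE_RULES = """\
User-agent: *
Disallow: /admin/

User-agent: ExampleBot
Disallow: /
"""


def build_parser(rules_text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


parser = build_parser(SAMPLE_RULES)

# "User-agent: *" applies to every bot that has no more specific section.
any_bot_admin = parser.can_fetch("SomeCrawler", "http://yoursite.com/admin/login.php")
any_bot_home = parser.can_fetch("SomeCrawler", "http://yoursite.com/index.php")

# The made-up "ExampleBot" has its own section that blocks the whole site.
example_bot_home = parser.can_fetch("ExampleBot", "http://yoursite.com/index.php")

print(any_bot_admin, any_bot_home, example_bot_home)
```

This only shows how a well-behaved bot would read the rules; it doesn't stop anything by itself.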
 
Keep in mind that regular users can freely move about your site, assuming they have the appropriate privileges to do so. The robots.txt file will not block regular users from using your site.
 
Can the file be bypassed?
 
The concept of a robots.txt file revolves around the honor system: you tell the bots your guidelines, and it's considered good will that they respect them. Some robots can be programmed to ignore the robots.txt file and crawl your entire site anyway; most of the time these are run by hackers or spammers looking for ways to hack your site or to harvest email addresses. Anyone who accesses the file directly (e.g., http://yoursite.com/robots.txt) will be able to view it and the directories you want bots to stay out of, so relying on it to hide secure files isn't recommended.
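
As a rough illustration of why the file shouldn't be used to hide anything, this small Python sketch pulls out every "Disallow:" path from some robots.txt text, which is all a curious visitor needs to do. The disallowed_paths helper and the sample paths are hypothetical, made up for this example.

```python
def disallowed_paths(robots_text):
    """Return every path named in a Disallow line, in file order."""
    paths = []
    for raw_line in robots_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Made-up file contents for illustration only.
sample = """\
User-agent: *
Disallow: /admin/
Disallow: /private-backups/
"""

revealed = disallowed_paths(sample)
print(revealed)
```

In other words, listing a directory in robots.txt tells everyone it exists, so anything sensitive needs real access control instead.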
 
Can I add to my file?
 
Yes. I won't go into detail here, but these sites should help.
 
http://www.robotstxt.org/
http://en.wikipedia.org/wiki/Robots_exclusion_standard
http://www.invision-graphics.com/robotstxt_validator.html
 
Is it necessary?
 
While the file isn't necessary, it is recommended. If you want to remove the file from your OneCMS installation, simply delete it from your server and you're good to go. If you modified it, we recommend creating a backup in case you want to go back to having a robots.txt file "protect" your site.
 
Questions, comments, etc? Feel free to post below!

