Using robots.txt file with sitemap.xml
« on: March 16, 2008, 06:57:38 PM »
Does anyone know if it is necessary to have the sitemap.xml document in sync with the robots.txt file? Or will the crawlers disregard and not crawl any address that is stated as "disallow" in the robots.txt file, regardless of whether it is listed in the sitemap.xml document or not?
Re: Using robots.txt file with sitemap.xml
« Reply #1 on: March 16, 2008, 08:06:40 PM »
Hello,

Our sitemap generator (both the free online and the standalone version) supports the robots.txt protocol and does not include disallowed URLs in the sitemap.
Re: Using robots.txt file with sitemap.xml
« Reply #2 on: March 17, 2008, 01:04:56 PM »
Thank you. One more question ... is it necessary to place the robots.txt file in the root directory of the site before running your XML Sitemap generator for that same site?
Re: Using robots.txt file with sitemap.xml
« Reply #3 on: March 17, 2008, 10:45:36 PM »
If you are planning to partially restrict crawling of your site in robots.txt, then you should create that file before generating the sitemap, so that our sitemap generator can read it and follow the rules you defined.
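For anyone curious what "reading robots.txt and following the rules" can look like, here is a rough sketch in Python using the standard library's urllib.robotparser. It is not the generator's actual code, only an illustration of the idea. The robots.txt content, the example.com URLs, and the helper name allowed_urls are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""


def allowed_urls(robots_txt, urls, user_agent="*"):
    """Return only the URLs that the given robots.txt rules allow."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Hypothetical list of pages found while crawling the site.
candidates = [
    "http://www.example.com/",
    "http://www.example.com/about.html",
    "http://www.example.com/private/report.html",
    "http://www.example.com/tmp/cache.html",
]

# Only the allowed URLs would be written to sitemap.xml.
sitemap_urls = allowed_urls(ROBOTS_TXT, candidates)
for url in sitemap_urls:
    print(url)
```

With these sample rules, the two URLs under /private/ and /tmp/ are filtered out before the sitemap is written, which is why the robots.txt file needs to exist before the sitemap is generated.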