Configuration - Crawling Problem
« on: July 24, 2011, 12:40:23 AM »
We've noticed that if the 'Starting URL' contains 'www', the sitemap generator does not stay within the starting directory and jumps to directories it is not supposed to crawl. Removing the 'www' fixes this.

The problem is that our Google webmaster site is set up with www in the domain, so Google complains that the sitemap should have www in the URL. Please advise. Thanks.
Re: Configuration - Crawling Problem
« Reply #1 on: July 24, 2011, 01:58:44 PM »
Hello,

You can use the "Exclude URLs" setting to tell the generator not to crawl specific directories.
Re: Configuration - Crawling Problem
« Reply #2 on: July 24, 2011, 02:13:34 PM »
We are using Exclude URLs. The problem is that this setting is ignored when we have www in the Starting URL. Remove www, and the generator honors Exclude URLs and doesn't go beyond the directory.

Also, this problem happens when we're crawling a subdirectory and don't want to crawl beyond that directory. Thanks.
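
To make the expected behavior concrete, here is a minimal sketch of the scope check we would expect, written in Python. It is not the generator's actual code: the function names (normalize_host, start_directory, in_scope) and the example.com URLs are made up for illustration, and it assumes that www and non-www hosts should be treated as the same site.

```python
from urllib.parse import urlsplit


def normalize_host(host):
    """Lowercase the host and drop a leading 'www.' (an assumption of this sketch)."""
    host = (host or "").lower()
    return host[4:] if host.startswith("www.") else host


def start_directory(path):
    """Return the directory part of a path, always ending in '/'."""
    return path[: path.rfind("/") + 1] or "/"


def in_scope(start_url, candidate_url):
    """True if candidate_url is on the same site and inside the starting directory."""
    start = urlsplit(start_url)
    cand = urlsplit(candidate_url)
    if normalize_host(start.hostname) != normalize_host(cand.hostname):
        return False
    return (cand.path or "/").startswith(start_directory(start.path))


# Hypothetical URLs: the www prefix should not change the result.
print(in_scope("http://www.example.com/docs/", "http://example.com/docs/page.html"))
print(in_scope("http://www.example.com/docs/", "http://www.example.com/other/"))
```

With a check like this, a Starting URL with or without www would keep the crawl inside the same directory.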
Re: Configuration - Crawling Problem
« Reply #3 on: July 24, 2011, 02:25:16 PM »
Hello,

Please send me your generator URL and login in a private message so I can check this.
Make sure that you don't specify the domain name in Exclude URLs, i.e. use it like:
folder1/
folder2/
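
As a rough illustration of why the domain should be left out, here is a small Python sketch. It assumes that exclusion entries are matched as plain substrings of the crawled URL, which may not be exactly how the generator matches them; the is_excluded function and the example.com URLs are hypothetical.

```python
def is_excluded(url, patterns):
    """True if any non-empty exclusion pattern occurs anywhere in the URL."""
    return any(p in url for p in patterns if p)


# Hypothetical page URL that uses the www form of the domain.
page = "http://www.example.com/folder1/page.html"

# Domain-free entries match regardless of www.
print(is_excluded(page, ["folder1/", "folder2/"]))

# An entry that hard-codes the non-www domain fails to match the www URL.
print(is_excluded(page, ["http://example.com/folder1/"]))
```

Under that assumption, an entry containing the domain only matches one form of the host, which would look like the setting being ignored when the Starting URL uses the other form.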
Re: Configuration - Crawling Problem
« Reply #4 on: July 24, 2011, 04:50:56 PM »
Sent private message.