what pages should i set to crawl?
« on: October 06, 2007, 06:21:27 PM »
If my website is [ External links are visible to forum administrators only ], is it best to crawl only the pages in this directory and not the pages in other folders that are more for admin etc.? How do I set it to crawl only the .php files directly inside [ External links are visible to forum administrators only ] and not in any other folders like [ External links are visible to forum administrators only ], [ External links are visible to forum administrators only ], [ External links are visible to forum administrators only ], etc.?
Re: what pages should i set to crawl?
« Reply #1 on: October 07, 2007, 11:50:50 AM »
Hello,

Yes, it is a good idea to exclude those folders from the sitemap. You can simply add them to the "Do not parse" and "Exclude URLs" options:
folder1/
folder2/
folder3/
Re: what pages should i set to crawl?
« Reply #2 on: October 07, 2007, 07:17:03 PM »
Do I set "Maximum depth level:" to 1? When I leave it at "0" (zero), there are too many pages it is trying to crawl; if I set it to 1, it is not adding all the pages to the sitemap.
Re: what pages should i set to crawl?
« Reply #3 on: October 07, 2007, 09:58:52 PM »
Below are some of the folders that I need to exclude from the sitemap crawl. Do I type them exactly as below in "Exclude URLs"? What about folder names with spaces in between? Do I just put "Antler 2007/", with a forward slash after the folder name?

_overlay/
_themes/
Aluminium Cases Oct 07/
Antler 2007/
Baby Landing Feb 07
beauty_landing/
Branding/
brands/
briefcase landing/
briefcase landing 2/
briefcase right menu/
briefcases/
driggs/
Riley landing/
business landing/
Carlton/
big images/
landing images/
blue-luggage/
catimages/
Landing may 29th/
cgi-bin/
Folders Final/
conference-folders/
Re: what pages should i set to crawl?
« Reply #5 on: October 09, 2007, 08:48:35 AM »
Do I have to put a "/" (forward slash) at the end of it? Also, what depth level do I set it to: leave it at "0" (zero) or put 1? And how can I stop a current crawl that is trying to crawl my whole site, which I do not want it to do? I only want it to crawl the files in the main directory.
Re: what pages should i set to crawl?
« Reply #6 on: October 09, 2007, 10:52:59 PM »
Hello,

You can omit the ending slash; Sitemap Generator will exclude all URLs that contain the specified line as a substring anyway.
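
To illustrate what substring matching implies for an exclude list, here is a minimal Python sketch. It is not Sitemap Generator's own code: the function name is_excluded, the example.com URLs, and the step that decodes "%20" back to a space before matching are all assumptions made for the illustration. This thread does not say how the tool handles spaces in folder names.

```python
from urllib.parse import unquote


def is_excluded(url, exclude_patterns):
    """Return True if any non-empty pattern occurs anywhere in the URL.

    Assumption for this sketch: the URL is percent-decoded first, so a
    pattern written with a space ("Antler 2007") can match "Antler%202007".
    """
    decoded = unquote(url)
    for pattern in exclude_patterns:
        pattern = pattern.strip()
        if pattern and pattern in decoded:
            return True
    return False


# Hypothetical exclude list: one entry per line, trailing slash optional.
patterns = ["_themes/", "Antler 2007", "cgi-bin", "briefcase landing"]

# Hypothetical URLs a crawler might discover.
urls = [
    "http://example.com/index.php",
    "http://example.com/_themes/style.php",
    "http://example.com/Antler%202007/page.php",
    "http://example.com/cgi-bin/form.php",
    "http://example.com/briefcase%20landing%202/page.php",
]

# Only URLs that match no pattern stay in the sitemap.
kept = [u for u in urls if not is_excluded(u, patterns)]
```

In this sketch only index.php is kept. Note that the single entry "briefcase landing" also removes "briefcase landing 2/", because a substring match catches every URL containing that text; a short entry such as "brands" would likewise match any other URL that happens to contain "brands".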

Ideally, you should have the depth level set to 0 (unlimited), but you can set it to a certain limit (5, for instance, or lower), depending on the site.
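
As a rough picture of why a depth limit of 1 can leave pages out, here is a small Python sketch of a depth-limited crawl over an in-memory link map. It assumes "depth" means the number of clicks from the start page and that 0 means unlimited, as stated above; the crawl function and the example pages are hypothetical, and the real Sitemap Generator may count levels differently.

```python
from collections import deque


def crawl(links, start, max_depth=0):
    """Breadth-first crawl; max_depth=0 means no limit (assumed semantics)."""
    seen = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        page, depth = queue.popleft()
        order.append(page)
        # Do not follow links from pages that already sit at the limit.
        if max_depth and depth >= max_depth:
            continue
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order


# Hypothetical site: which pages each page links to.
links = {
    "/": ["/a.php", "/b.php"],
    "/a.php": ["/a/deep.php"],
    "/a/deep.php": ["/a/deeper.php"],
}

limited = crawl(links, "/", max_depth=1)    # start page plus pages one click away
unlimited = crawl(links, "/", max_depth=0)  # every reachable page
```

With a limit of 1, the pages that are only reachable through /a.php are never visited, which matches the symptom of pages missing from the sitemap. Leaving the depth at 0 and trimming the crawl with "Exclude URLs" entries instead keeps the wanted pages reachable.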