Generator Removing URLs - Not Crawling New Pages
« on: May 09, 2007, 12:26:52 PM »
Hi
I've been using my generator for some time now, and everything was working fine until May.

It has removed over 200 URLs and won't crawl them again...

It looks like this:

Pages scanned: 220 (5,181.4 Kb)
Pages left: 30 (+ 186 queued for the next depth level)

Then it drops all the pages queued for the next level and finishes with 246 pages crawled for the sitemap!

It's not crawling new entries in the directory.

It has removed the links to all the entries from the directory.
It is still crawling all the other pages (categories, subcategories, search results).

[ External links are visible to forum administrators only ]

I've got a few other sites on exactly the same script, and the generator is working fine there.
It is crawling all my links fine on [ External links are visible to forum administrators only ]

I've also tried another free generator from [ External links are visible to forum administrators only ]
and it is working fine, crawling all the pages.

I've updated the script to the newest version.
I've tried reinstalling the script.
I've tried using your free generator.

Still no joy.

Please help

mike

t_a

Re: Generator Removing URLs - Not Crawling New Pages
« Reply #1 on: May 09, 2007, 11:26:15 PM »
I am having the same issue :-(
Thomas A.

t_a

Re: Generator Removing URLs - Not Crawling New Pages
« Reply #3 on: May 10, 2007, 12:28:25 AM »
I am using these excludes:
rss.php?c=
submit.php?c=
?s=
authors?Page=
?Page=
?ArticleId=
addfav
addread
print
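One thing that may be worth checking with a list like this: if the generator treats each exclude entry as a plain substring of the URL (an assumption, since the thread does not say how entries are matched), a short entry such as `print` would also drop any article whose URL happens to contain that word. The sketch below shows the idea; the function name and the sample URLs are made up for illustration.

```python
# Exclude entries quoted in the post above.
EXCLUDES = [
    "rss.php?c=", "submit.php?c=", "?s=", "authors?Page=", "?Page=",
    "?ArticleId=", "addfav", "addread", "print",
]


def matching_excludes(url, excludes=EXCLUDES):
    """Return every exclude entry that occurs anywhere in the URL.

    Assumption: entries are compared as plain, case-sensitive substrings.
    The real generator may match differently.
    """
    return [entry for entry in excludes if entry in url]


# Hypothetical URLs, for illustration only.
samples = [
    "http://example.com/articles/how-to-print-photos.html",
    "http://example.com/category/news?Page=2",
    "http://example.com/articles/garden-tips.html",
]
for url in samples:
    hits = matching_excludes(url)
    print(url, "->", hits if hits else "kept")
```

Running a few of the removed article URLs through a check like this would show whether one of the entries is catching them by accident.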

As for depth level, I have it set to "0" (unlimited).

Here is an example of a removed page (thousands of articles have been removed during the last week or so):
[ External links are visible to forum administrators only ]

Any suggestions?
Thomas A.

t_a

Re: Generator Removing URLs - Not Crawling New Pages
« Reply #4 on: May 10, 2007, 07:11:25 PM »
Here is a page with some 404s that should not be there: [ External links are visible to forum administrators only ]

The strange thing here is that some articles are added to the sitemap.

Example of added link:
[ External links are visible to forum administrators only ]

Example of link returning 404:
[ External links are visible to forum administrators only ]

I just can't seem to figure this out.

I welcome any help I can get here.
« Last Edit: May 10, 2007, 07:18:45 PM by t_a »
Thomas A.
Re: Generator Removing URLs - Not Crawling New Pages
« Reply #5 on: May 11, 2007, 10:33:41 PM »
Hello,

perhaps your site returned an error for some URLs because of the crawling intensity. Try defining a delay between requests in the sitemap generator configuration.
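To test this outside the generator, a small script can request the affected URLs one at a time with a pause in between and record the status code the server returns. This is only a rough sketch using the Python standard library; the function names, the default delay, and the URLs in the usage note are placeholders, not part of the generator.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for a URL, including error codes such as 404."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the code is still what we want.
        return err.code


def check_urls(urls, delay_seconds=1.0, fetch=http_status, sleep=time.sleep):
    """Request each URL in turn, pausing between requests, and map URL -> status."""
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # pause before every request except the first
        results[url] = fetch(url)
    return results
```

For example, `check_urls(["http://example.com/a", "http://example.com/b"], delay_seconds=3)` returns a dictionary of status codes. If pages that answer 404 during a fast crawl answer 200 here, server load is a likely cause; if they still answer 404, the cause is probably elsewhere.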

t_a

Re: Generator Removing URLs - Not Crawling New Pages
« Reply #6 on: May 12, 2007, 08:19:56 AM »
I tried a 3-second delay after every 10 requests, but that did not help. Could it be something else?

Do you want login information?
Thomas A.
Re: Generator Removing URLs - Not Crawling New Pages
« Reply #8 on: May 14, 2007, 11:05:34 PM »
Hi,
sorry for the delay.

I don't have any limitations at all. I use 0 to have all web pages crawled.

I have tried the suggested 1-second pause between requests...

Still no joy! It has even removed my latest entry in the directory (I lost 5 URLs).

All the other pages are being crawled fine.

Admin, please help... any other ideas?

Regards,

mike
Re: Generator Removing URLs - Not Crawling New Pages
« Reply #9 on: May 15, 2007, 01:09:13 AM »
Hello,

please send me a private message with your generator URL and an example URL that is not included in the sitemap.

t_a

Re: Generator Removing URLs - Not Crawling New Pages
« Reply #10 on: May 23, 2007, 04:05:25 PM »
I sent you a PM. I am missing approx. 5000 pages in my sitemap.

Thanks.
Thomas A.