About Crawl and Cron Job
« on: January 17, 2013, 11:22:45 AM »
Hello

I want to know a few things:

1. The generator crawls the site and builds the sitemap the first time. When it runs a second time, will it crawl the whole site from the beginning, or just add the new links to the existing sitemap? Does it remember where it stopped last time?

2. If I run it as a cron job, how many resources will it consume?

3. What is the best schedule for the cron job: daily, every 3 days, or once a week?

Thanks
« Last Edit: January 17, 2013, 11:24:32 AM by sunilkumar4 »
Re: About Crawl and Cron Job
« Reply #1 on: January 17, 2013, 11:25:04 AM »
1. It starts again from the beginning.

2. Don't know.

3. I find once a week works best. You can't be accused of spamming the bots.
Re: About Crawl and Cron Job
« Reply #2 on: January 17, 2013, 12:55:01 PM »
Hello,

2. The same as running it via the web interface. Actual resource usage depends on the number of URLs to index/crawl.

3. That also depends on how often content changes on your site (new pages added).
Re: About Crawl and Cron Job
« Reply #3 on: January 17, 2013, 02:19:48 PM »
Does that mean that if my site has a large number of links, it crawls everything again and consumes resources and bandwidth?
The reason I ask is that when I first ran this script, it consumed a lot of resources and got my hosting account suspended.
It would be good if it remembered previously indexed pages or skipped crawling them again. Is this possible?
Re: About Crawl and Cron Job
« Reply #4 on: January 18, 2013, 08:23:54 PM »
Hello,

Yes, it re-crawls, since links to new pages could be anywhere on the site.
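
To illustrate why skipping previously indexed pages can miss new content, here is a minimal Python sketch. It is not the generator's actual code: the `crawl` function, the `skip` parameter, and the tiny in-memory link graph are all hypothetical and only model the idea that a new link may sit on a page that was already crawled.

```python
from collections import deque

def crawl(site, start, skip=frozenset()):
    """Breadth-first crawl over an in-memory link graph (hypothetical model).

    site: dict mapping each URL to the list of URLs it links to.
    skip: URLs that are treated as already known and are NOT re-fetched.
    Returns the set of URLs that would end up in the sitemap.
    """
    found = set(skip) | {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url in skip:
            continue  # page is not re-fetched, so its links are never seen
        for link in site.get(url, []):
            if link not in found:
                found.add(link)
                queue.append(link)
    return found

# First run: three pages, none of them known yet.
first_site = {"/": ["/a", "/b"], "/a": [], "/b": []}
first = crawl(first_site, "/")

# Later, an existing page ("/b") gains a link to a new page ("/new").
later_site = {"/": ["/a", "/b"], "/a": [], "/b": ["/new"], "/new": []}

full = crawl(later_site, "/")                     # re-crawl everything
incremental = crawl(later_site, "/", skip=first)  # skip known pages

print("/new" in full)         # the full re-crawl discovers the new page
print("/new" in incremental)  # the skipping crawl never sees it
```

In this model the only way to find `/new` is to fetch `/b` again, which is the situation described in the reply above. A crawler could still save bandwidth in other ways, but whether the generator offers such options is not stated in this thread.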