Sitemap - Large Site - Old Content and Daily Updated Content
« on: January 01, 2012, 10:25:18 PM »
Hello Experts,

We have built a fairly large entertainment site. We add around 100 new articles every day.
On the other hand, most of the articles we have already posted are never updated.

What settings do we need for this kind of site in order to generate/update the sitemap.xml file on a daily basis?

Thank you,
MovieIMAX



Re: Sitemap - Large Site - Old Content and Daily Updated Content
« Reply #1 on: January 02, 2012, 09:41:01 AM »
Hello,

The default settings can be used. The sitemap generator will have to re-crawl the whole site anyway, since new pages can be linked from any page.
Re: Sitemap - Large Site - Old Content and Daily Updated Content
« Reply #2 on: January 02, 2012, 03:40:48 PM »
Hello Support,

That's correct. A new page can be linked from any page.

Does that mean the crawler will skip the old pages as it goes through them, or will it visit and scan them as well regardless?

I am asking because the site has more than 300K pages, and I am sure that crawling each and every page again every day will take a very long time to build the sitemap.xml files...

Thank you,
MovieIMAX
Re: Sitemap - Large Site - Old Content and Daily Updated Content
« Reply #3 on: January 03, 2012, 10:37:52 AM »
That's right, all pages are recrawled. If your site is organized so that all new pages are linked from one place (like a "news page"), you can create a full sitemap one time and save it under a different filename (sitemapfull.xml, for instance). Then specify that news page URL as the Starting URL for the generator and limit "depth level" to "1" or "2", so it won't go very deep recrawling all pages, but will still find the new pages for a separate sitemap.
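
To illustrate why a depth limit keeps the daily run small, here is a minimal Python sketch of the idea. It is not the generator's own code. It assumes that "depth level" means the number of link hops from the starting URL; the link graph, the example.com URLs, and the function names (crawl_limited, build_sitemap) are all made up for the example, and real page fetching is replaced by a dictionary lookup.

```python
from collections import deque
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def crawl_limited(start_url, get_links, max_depth):
    """Breadth-first crawl from start_url, following at most max_depth link hops."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # pages at the depth limit are recorded but not expanded
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen


def build_sitemap(urls):
    """Serialize URLs into a sitemaps.org <urlset> document (returned as a string)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(urls):
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical link graph standing in for real HTTP fetches.
LINKS = {
    "https://example.com/news": [
        "https://example.com/a-new",
        "https://example.com/a-old",
    ],
    "https://example.com/a-new": ["https://example.com/deep"],
    "https://example.com/a-old": [],
    "https://example.com/deep": ["https://example.com/deeper"],
}

# URLs assumed to be listed in the one-time full sitemap (sitemapfull.xml) already.
already_in_full = {"https://example.com/news", "https://example.com/a-old"}

found = crawl_limited(
    "https://example.com/news", lambda u: LINKS.get(u, []), max_depth=1
)
new_urls = found - already_in_full
new_sitemap_xml = build_sitemap(new_urls)
```

With a depth limit of 1, only the news page itself is expanded, so pages further down (such as the hypothetical /deep) are never visited. Subtracting the URLs that are already in the full sitemap leaves just the new articles for the separate sitemap.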