Flintstone
PDF Files missing from sitemap
« on: August 30, 2005, 01:06:42 PM »
Hi All,

I have noticed that the Standalone Sitemap Generator crawls my sites perfectly except for one problem: my links to PDF and RTF documents are not included in the sitemaps. Google can access PDF, RTF, DOC, etc., so they should surely be included in the sitemaps.

The HTML documents on the same page are all listed just fine.

Is this a configuration option that I have not yet seen?
« Last Edit: August 30, 2005, 03:30:50 PM by Flintstone »
Re: PDF Files missing from sitemap
« Reply #1 on: August 30, 2005, 01:29:54 PM »
Hi,

Please make sure you have these extensions in the "Do not parse extensions:" field on the configuration page. That way, files of these types will be included in the sitemap but will not be retrieved from the server, because they are not HTML documents.
If you still have the problem with this option set, please PM me the link to your generator instance.
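
For readers who want to picture the behaviour described above, here is a minimal Python sketch. It is not the generator's actual code: the names DO_NOT_PARSE, extension and plan_crawl are hypothetical, and it only models the case where the extensions have been added to the field, so every discovered URL is listed while only the remaining pages are fetched.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical stand-in for the "Do not parse extensions:" field.
DO_NOT_PARSE = {"pdf", "rtf", "doc"}


def extension(url):
    """Return the lowercase file extension of the URL path, without the dot."""
    path = urlparse(url).path
    return posixpath.splitext(path)[1].lstrip(".").lower()


def plan_crawl(urls, do_not_parse=DO_NOT_PARSE):
    """Split discovered URLs into (listed in sitemap, fetched and parsed)."""
    listed, fetched = [], []
    for url in urls:
        listed.append(url)  # assumption: every discovered URL is listed
        if extension(url) not in do_not_parse:
            fetched.append(url)  # only other pages are downloaded for links
    return listed, fetched


listed, fetched = plan_crawl([
    "http://www.example.com/index.html",
    "http://www.example.com/manual.pdf",
])
```

In this sketch the PDF ends up in `listed` but not in `fetched`, which matches the "included but not retrieved" description in the reply.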

Flintstone
Re: PDF Files missing from sitemap
« Reply #2 on: August 30, 2005, 03:30:29 PM »
Yes, that fixed it ... I am surprised that these file types need to be added here to be included in the sitemap, but I can see the logic, as this allows other file types to be listed in the XML index.

This is useful as I created a PHP script to parse the XML and dynamically create a PHP sitemap (I am planning to have it create static HTML files in the future) for other search engines to follow (25 links per page). It is slow, as it has to open every page in order to grab the page title, but it does appear to work ... if only I knew of an app that was already opening every page of my site and could also grab the titles and create the HTML sitemap ... maybe the developer could integrate this as a feature ;)
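
The original script was PHP and is not shown in this thread, so the following is only a rough Python sketch of the same idea: read the `<loc>` entries from the sitemap XML, split them into pages of 25 links, and render each page as a plain HTML list. The sample XML, the example.com URLs and the function names are all assumptions, and the slow title-fetching step is left out, so the URL itself is used as the link text.

```python
import xml.etree.ElementTree as ET
from html import escape

# Assumed sample input; a real sitemap would be read from a file.
SITEMAP_XML = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/manual.pdf</loc></url>
</urlset>"""


def extract_locs(xml_text):
    """Return every <loc> value, whatever namespace the sitemap declares."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter()
            if el.tag.rsplit("}", 1)[-1] == "loc" and el.text]


def paginate(items, per_page=25):
    """Split a list into consecutive chunks of at most per_page items."""
    return [items[i:i + per_page] for i in range(0, len(items), per_page)]


def render_page(urls):
    """Render one sitemap page as an HTML unordered list of links."""
    links = "\n".join(
        f'<li><a href="{escape(u, quote=True)}">{escape(u)}</a></li>'
        for u in urls
    )
    return f"<ul>\n{links}\n</ul>"


pages = [render_page(chunk) for chunk in paginate(extract_locs(SITEMAP_XML))]
```

Matching on the local tag name rather than a fixed namespace is a deliberate choice here, so that the sketch does not depend on which sitemap schema version the generator writes.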

You do a great job with this excellent application ... thanks for today's update.

Mike Ratcliffe
Re: PDF Files missing from sitemap
« Reply #3 on: August 30, 2005, 06:07:12 PM »
Great, Mike! Thanks for your words.
Maybe at some point plain HTML sitemap generation will be added as a feature to this XML generator. :)