New user questions
« on: February 25, 2015, 02:34:05 PM »
Got the software installed and running.  Have a few questions about configuring it:

Exclusions-
 We'd like to exclude the directory /faculty/, and everything in it.
 What's the best pattern to exclude?
 Using 'faculty/' might be too greedy, as that would also exclude '/english/faculty/' (?)
 If we'd like to exclude everything in '/documents/' except '/documents/public/', how would we configure this?

Individual attributes-
 Is it possible to give specific pages a different priority and update frequency, while still having the generator use the server's response for the modification time?

News Sitemap-
Is the behavior of the news sitemap documented anywhere?  i.e. what's the default filter to find news items?
If we'd like to give specific news urls a different priority, how would we do this?

Thanks!


Re: New user questions
« Reply #1 on: February 26, 2015, 05:41:11 AM »
Hello,

1. excluding URLs *starting* with "faculty/" would be:
^faculty/*

To make an exception for documents/public, you can use this:
documents/[^p].*
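
To check how these patterns behave, here is a minimal Python sketch. It assumes the generator treats each exclusion as a regular expression searched against the URL path relative to the site root (no leading slash), which this thread does not confirm. The `is_excluded` helper and the sample paths are hypothetical, and the lookahead pattern at the end is an alternative that was not suggested in the reply.

```python
import re

def is_excluded(path, patterns):
    """Return True if any pattern matches somewhere in the path (assumed semantics)."""
    return any(re.search(p, path) for p in patterns)

# Anchored pattern: only paths that start with "faculty/".
anchored = [r"^faculty/"]
print(is_excluded("faculty/staff.html", anchored))          # True
print(is_excluded("english/faculty/staff.html", anchored))  # False

# Unanchored pattern: also catches "english/faculty/".
print(is_excluded("english/faculty/staff.html", [r"faculty/"]))  # True

# Under regex rules "/*" means "zero or more slashes", so the pattern
# from the reply would also match a path such as "facultyclub/".
print(is_excluded("facultyclub/index.html", [r"^faculty/*"]))    # True

# Pattern from the reply: excludes anything under documents/ whose
# next character is not "p".
by_first_letter = [r"documents/[^p].*"]
print(is_excluded("documents/internal/a.pdf", by_first_letter))  # True
print(is_excluded("documents/public/a.pdf", by_first_letter))    # False
print(is_excluded("documents/private/a.pdf", by_first_letter))   # False

# Alternative (assumption): a negative lookahead keeps only documents/public/.
lookahead = [r"^documents/(?!public/)"]
print(is_excluded("documents/private/a.pdf", lookahead))    # True
print(is_excluded("documents/public/a.pdf", lookahead))     # False
```

If the generator matches plain substrings rather than regular expressions, none of the anchors or character classes above would apply, so it is worth testing on a small crawl first.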

2. You can leave the "lastmod" part of the setting empty:
page.php,,monthly,0.9

(notice 2 commas before the update frequency part)
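
To illustrate the layout of such a line, the Python sketch below splits it into its parts. The field order (URL, lastmod, update frequency, priority) is inferred from the example line and is an assumption, as is the `parse_attributes` helper. An empty lastmod field is read here as "leave it to the server's response".

```python
def parse_attributes(line):
    """Split 'url,lastmod,changefreq,priority' into a dict (assumed field order)."""
    url, lastmod, changefreq, priority = [part.strip() for part in line.split(",")]
    return {
        "url": url,
        # An empty lastmod means no override: the crawler's value is kept.
        "lastmod": lastmod or None,
        "changefreq": changefreq or None,
        "priority": float(priority) if priority else None,
    }

print(parse_attributes("page.php,,monthly,0.9"))
# {'url': 'page.php', 'lastmod': None, 'changefreq': 'monthly', 'priority': 0.9}
```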

3. The news sitemap includes URLs from the last 2 days.
If you have the "Automatic priority" setting enabled, the generator will assign a higher priority to new URLs.
Individual priority can be assigned based on the URL only.
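
The two-day rule can be pictured with a short Python sketch. It only illustrates the idea of an age cutoff; how the generator actually determines a page's date is not stated in this thread, and the `recent_news_urls` helper and sample pages are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def recent_news_urls(pages, now, max_age=timedelta(days=2)):
    """Keep URLs whose modification time is within max_age of now."""
    return [url for url, modified in pages if now - modified <= max_age]

now = datetime(2015, 2, 26, 12, 0, tzinfo=timezone.utc)
pages = [
    ("news/today.html", now - timedelta(hours=3)),
    ("news/last-week.html", now - timedelta(days=7)),
]
print(recent_news_urls(pages, now))  # ['news/today.html']
```
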
Re: New user questions
« Reply #2 on: March 09, 2015, 04:33:34 PM »
Thanks for the info!  I have a few follow up questions:

Can we change the way the news_sitemap works?  i.e.
Instead of all pages younger than 2 days, can we just have it include all pages in /news/ (for example)?

Re: New user questions
« Reply #3 on: March 10, 2015, 06:23:43 AM »
Hello,

No, the news sitemap is created in accordance with Google's guidelines.
Re: New user questions
« Reply #4 on: March 10, 2015, 05:59:01 PM »
Another question :P
Most (all but 11) of the URLs the generator finds on our site have a modified date equal to the request date (because they're PHP files, maybe?).

What can we do to correct this?  Is Apache configured incorrectly, maybe?

Re: New user questions
« Reply #5 on: March 11, 2015, 05:06:38 PM »
You would need to modify your website backend (PHP script) to send an HTTP Last-Modified header for each page.
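
For reference, the header value uses the HTTP date format. The sketch below is in Python only to show what a valid value looks like. The site in question runs PHP, so the actual change belongs in its own scripts. The `last_modified_value` helper is hypothetical, and the timestamp is a fixed example; a real page would use the time its content last changed (for example a file's modification time or a database field).

```python
from email.utils import formatdate

def last_modified_value(timestamp):
    """Format a Unix timestamp as an HTTP date for a Last-Modified header."""
    return formatdate(timestamp, usegmt=True)

print("Last-Modified: " + last_modified_value(0))
# Last-Modified: Thu, 01 Jan 1970 00:00:00 GMT
```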