Hi,
I am not sure if this is the right section of the forum for this question. If not, please move it to the right place.
My question is about the two fields below:
Do not parse URLs: do not fetch pages that contain these substrings in URL (these URLs will still be added to sitemap!)
Exclude URLs: do not include URLs that contain these substrings, one string per line
I have hundreds of URLs like these:
introduce_yourself/1249-the_entry_i_meanthe_introduction.html?p=2263
introduce_yourself/1250-hi_therel.html?p=2264
introduce_yourself/1251-hello_I_am_amy.html?p=2263
These URLs point to the same pages as:
introduce_yourself/1249-the_entry_i_meanthe_introduction.html
introduce_yourself/1250-hi_therel.html
introduce_yourself/1251-hello_I_am_amy.html
So I want to exclude from the sitemap all URLs that contain .html?p= but keep the plain .html pages. How do I do this?
A good example, from a sitemap currently being created for one of my sites, would be:
<url>
<loc>MYDOMAIN/tech_forum/23902-mb_on_new_servers_your_inputs_needed.html</loc>
<priority>0.5</priority>
<lastmod>2007-02-25T22:29:57+00:00</lastmod>
<changefreq>weekly</changefreq>
</url>
<url>
<loc>MYDOMAIN/tech_forum/23902-mb_on_new_servers_your_inputs_needed.html?p=73374</loc>
<priority>0.5</priority>
<lastmod>2007-02-25T22:29:57+00:00</lastmod>
<changefreq>weekly</changefreq>
</url>
<url>
<loc>MYDOMAIN/general_discussion/23714-congrats_mrgovardhan_vt.html</loc>
<priority>0.5</priority>
<lastmod>2007-02-25T22:29:57+00:00</lastmod>
<changefreq>weekly</changefreq>
</url>
<url>
<loc>MYDOMAIN/general_discussion/23714-congrats_mrgovardhan_vt.html?p=73344</loc>
<priority>0.5</priority>
<lastmod>2007-02-25T22:29:57+00:00</lastmod>
<changefreq>weekly</changefreq>
</url>
As you can see, each page is being added to the sitemap twice, as two different URLs. The only difference is the ?p= part after .html, and both URLs lead to the same place. So I want to exclude all URLs containing .html?p= from the sitemap while keeping the plain .html pages. How do I do this?
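To make the behaviour I am hoping for concrete, here is a rough Python sketch. It is only an illustration of plain substring matching, which is how I read the field descriptions; it is not the generator's actual code, and the function name filter_sitemap_urls is made up for this example.

```python
def filter_sitemap_urls(urls, exclude_substrings):
    """Return only the URLs that contain none of the excluded substrings.

    Hypothetical helper: assumes exclusion is a plain substring match,
    one substring per entry, as the field description suggests.
    """
    return [
        url for url in urls
        if not any(sub in url for sub in exclude_substrings)
    ]


# Sample URLs taken from the list above.
urls = [
    "introduce_yourself/1249-the_entry_i_meanthe_introduction.html",
    "introduce_yourself/1249-the_entry_i_meanthe_introduction.html?p=2263",
    "introduce_yourself/1250-hi_therel.html",
    "introduce_yourself/1250-hi_therel.html?p=2264",
]

# The substring I would expect to put in the exclusion field.
kept = filter_sitemap_urls(urls, [".html?p="])

for url in kept:
    print(url)
```

With the single substring .html?p= in the exclusion list, only the two plain .html URLs should remain, which is the result I want in the sitemap.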
Any help would be appreciated.
Thanks