After two weeks of unsuccessfully trying to get a sitemap generated (the site is probably around 250,000 pages), I decided to start over and generate only the XML sitemap, with text, ror.xml and HTML file generation disabled. I have it set to write 50,000 pages per file.

It looks like the crawl_dump.log file rolled over: it was larger when I went to bed, and this morning it's smaller (listings below). The same session file is still present in the data directory. Is this normal, or should I be concerned? If it's not normal, where should I look to get an idea of what happened? Also, all the sitemap files (sitemap.xml, sitemap1.xml, sitemap2.xml, etc.) are still zero length.
Before I went to bed:
-rw-rw-rw- 1 apache apache 42066118 Jun 13 20:38 crawl_dump.log
-rw-rw-rw- 1 apache apache 142 Jun 13 20:38 crawl_state.log
-rw-r--r-- 1 apache apache 3040 Jun 13 07:21 generator.conf
-rw-r--r-- 1 user_x group_x 11 Jun 3 14:14 placeholder.txt
-rw------- 1 apache apache 13 Jun 13 07:22 sess_76vijoaqu5s5r4d0hiem4c9jh6
-rw-rw-rw- 1 user_x group_x 0 Jun 3 06:26 sitemap.html
This morning:
-rw-rw-rw- 1 apache apache 23965561 Jun 14 03:24 crawl_dump.log
-rw-rw-rw- 1 apache apache 142 Jun 14 03:24 crawl_state.log
-rw-r--r-- 1 apache apache 3040 Jun 13 07:21 generator.conf
-rw-r--r-- 1 user_x group_x 11 Jun 3 14:14 placeholder.txt
-rw------- 1 apache apache 13 Jun 13 21:19 sess_76vijoaqu5s5r4d0hiem4c9jh6
-rw-rw-rw- 1 user_x group_x 0 Jun 3 06:26 sitemap.html
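To put a number on the change, here is a rough Python sketch that diffs the file sizes in the two listings above. The function names (`parse_sizes`, `size_changes`) are just mine for illustration and have nothing to do with the generator itself, and it assumes plain `ls -l` output with no spaces in the file names.

```python
def parse_sizes(listing):
    """Map file name -> size in bytes from `ls -l` style lines."""
    sizes = {}
    for line in listing.strip().splitlines():
        fields = line.split()
        # Expected fields: mode, links, owner, group, size, month, day, time, name
        if len(fields) >= 9:
            sizes[fields[8]] = int(fields[4])
    return sizes


def size_changes(before, after):
    """Return {name: size delta} for files whose size differs between listings."""
    old, new = parse_sizes(before), parse_sizes(after)
    return {
        name: new[name] - old[name]
        for name in old
        if name in new and new[name] != old[name]
    }


BEFORE = """
-rw-rw-rw- 1 apache apache 42066118 Jun 13 20:38 crawl_dump.log
-rw-rw-rw- 1 apache apache 142 Jun 13 20:38 crawl_state.log
-rw-r--r-- 1 apache apache 3040 Jun 13 07:21 generator.conf
-rw-r--r-- 1 user_x group_x 11 Jun 3 14:14 placeholder.txt
-rw------- 1 apache apache 13 Jun 13 07:22 sess_76vijoaqu5s5r4d0hiem4c9jh6
-rw-rw-rw- 1 user_x group_x 0 Jun 3 06:26 sitemap.html
"""

AFTER = """
-rw-rw-rw- 1 apache apache 23965561 Jun 14 03:24 crawl_dump.log
-rw-rw-rw- 1 apache apache 142 Jun 14 03:24 crawl_state.log
-rw-r--r-- 1 apache apache 3040 Jun 13 07:21 generator.conf
-rw-r--r-- 1 user_x group_x 11 Jun 3 14:14 placeholder.txt
-rw------- 1 apache apache 13 Jun 13 21:19 sess_76vijoaqu5s5r4d0hiem4c9jh6
-rw-rw-rw- 1 user_x group_x 0 Jun 3 06:26 sitemap.html
"""

changes = size_changes(BEFORE, AFTER)
for name, delta in changes.items():
    print(f"{name}: {delta:+d} bytes")
# prints: crawl_dump.log: -18100557 bytes
```

So crawl_dump.log is the only file whose size changed, and it shrank by roughly 18 MB overnight.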
Thanks in advance...