I placed the script on one of my servers for a very large website that I have (with lots of data). When I ran it, the script consumed the server's entire virtual memory and overloaded the machine. I had to kill the process and restart the server, which took a while because 99% of the virtual memory was already in use.
Keep in mind that I have around 40 million pages that need to be crawled. Is there a setting that lets me specify the maximum amount of memory the script is allowed to use on the server?
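If there's no built-in option, would capping the process from the outside be a reasonable workaround? Here's a minimal sketch of what I mean, assuming a Unix server and that the crawler can be launched as a subprocess (the script name "generate_sitemap.py" and the 512 MB cap are just placeholders):

```python
import resource
import subprocess

MAX_BYTES = 512 * 1024 * 1024  # hypothetical 512 MB cap

def limit_memory():
    # Cap the child's virtual address space so allocations beyond
    # the limit fail instead of exhausting the server's memory.
    resource.setrlimit(resource.RLIMIT_AS, (MAX_BYTES, MAX_BYTES))

# Launch the crawler with the memory cap applied in the child process.
subprocess.run(["python", "generate_sitemap.py"], preexec_fn=limit_memory)
```

With the limit in place, the crawler would hit an allocation failure at the cap rather than dragging the whole server down, though it would still need to handle that failure gracefully.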
How long do you think this would take to crawl and build a sitemap for 40,000,000 links/pages? Days? Weeks?
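For a rough sense of scale, here's the back-of-the-envelope arithmetic I'm working from (the crawl rates are pure assumptions; the real rate depends entirely on the script and the server):

```python
pages = 40_000_000

# Assumed sustained crawl rates; the real figure could be very different.
for rate in (10, 100):  # pages per second
    days = pages / rate / 86_400  # 86,400 seconds per day
    print(f"at {rate} pages/s: ~{days:.0f} days")

# at 10 pages/s: ~46 days
# at 100 pages/s: ~5 days
```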