Search and Replace in a MySQL Database after Moving a Site with WGET

Sometimes you need to move your old website off of a CMS, or at least archive it, and the only way is to use WGET to mirror the website. Wget downloads entire websites, turning dynamic sites into static sites. The following command would download the site http://www.theoldsite.net/mypath…

wget -H -Dimghost.com,theoldsite.net '--restrict-file-name=windows' -A gif,jpg,html,tcl -np --convert-links --html-extension -rx http://www.theoldsite.net/mypath/index.html

That would download the gif, jpg, html, and tcl files, from both imghost.com and theoldsite.net, and make the URLs into Windows-compatible file names ending in “.html”, and converting links into relative links, so the output folder could be moved.

That’s all fine if you just want to link out to the site, but if you link to specific pages within the site, you now have to fix all the URLs. If you’re using a database, and these URLs are in a table, it’s not difficult to fix:

update stories set link=replace(link,'http://www.theoldsite.net/mypath/','/old-site/www.theoldsite.net/mypath/') WHERE tags like '%relevant%';

update stories set link=replace(link,'?','@') WHERE tags like '%relevant%';

update stories set link=concat(link,'.html') WHERE tags like '%relevant%';

Don’t run those commands as-is. You have to alter them to work with your URLs. The basic idea is to replace the left part of the URL with your new site’s URL, and then replace weird characters within the URL, and append ‘.html’ to the URL.