Shoutbox

Website extraction

Website extraction by SmokingCookie on 09-28-2009 at 07:22 PM

I'm wondering, is there some kind of tool that recovers the files from a website (preferably freeware)?

Like saving the index.html, the favicon (okay, that isn't really hard, but still...), image files and JS files.


RE: Website extraction by Menthix on 09-28-2009 at 08:33 PM

quote:
Originally posted by toddy
yes HTH

Explain your posts or just don't bother posting.

The Derren Brown one might have been in T&T, but this one isn't.


RE: Website extraction by iKingsten on 09-28-2009 at 09:06 PM

Could you possibly post the download link or the name of this tool?


RE: Website extraction by Adeptus on 09-29-2009 at 01:11 PM

The best tool to use if you want to create a local copy of all or part of a website is wget.  It is command-line based, so there is a bit of a learning curve before you get it to do exactly what you want: "wget -r -np http://somesite.com/somepage/" is a good starting point (-r downloads recursively, -np keeps it from climbing into the parent directory).  Once you learn how to use it, though, it can do just about anything as far as this kind of stuff goes.
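
If you end up scripting this, here is a rough sketch of how the command line could be put together in Python. The function name build_wget_command and the depth parameter are made up for illustration; -p (page requisites), -k (convert links) and -l (recursion depth) are standard wget options I added on top of the basic command, so adjust them to taste. It only builds and prints the command, it doesn't download anything.

```python
def build_wget_command(url, depth=None):
    """Return the argument list for a recursive wget download of `url`.

    Hypothetical helper for illustration; it only assembles the command.
    """
    cmd = [
        "wget",
        "-r",   # recursive: follow links found in the pages
        "-np",  # no-parent: never climb above the starting directory
        "-p",   # page requisites: also fetch images, CSS and scripts
        "-k",   # convert links so the local copy can be browsed offline
    ]
    if depth is not None:
        cmd += ["-l", str(depth)]  # limit how deep the recursion goes
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    command = build_wget_command("http://somesite.com/somepage/", depth=2)
    # Prints: wget -r -np -p -k -l 2 http://somesite.com/somepage/
    print(" ".join(command))
```

To actually start the download you would hand the list to subprocess.run(), assuming wget is on your PATH.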

I use this Windows build. It is not the latest version, but it works just fine.  If you use this package, you will want to edit the PATH environment variable and add ";%PROGRAMFILES%\GnuWin32\bin" at the end (";%PROGRAMFILES(X86)%\GnuWin32\bin" on 64-bit Windows).

Other than that, the DownThemAll Firefox extension works well for simpler tasks.  It won't recursively mirror a site, but it is fine for grabbing all the images on a page or all linked files of a certain type.
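
If you only want to see which files a single page pulls in (scripts, images, favicon), a few lines of Python can list them. This is just a rough sketch using the standard library, not how wget or DownThemAll work internally; the names AssetLister and list_assets are made up for illustration, and it will not see anything that the page loads dynamically from JavaScript.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AssetLister(HTMLParser):
    """Collects URLs of scripts, images and <link> files (favicon, CSS)."""

    # tag name -> attribute that holds the file's URL
    URL_ATTRS = {"script": "src", "img": "src", "link": "href"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.assets = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative paths against the page address.
                self.assets.append(urljoin(self.base_url, value))


def list_assets(html, base_url):
    """Return absolute URLs of the files referenced by the given HTML."""
    parser = AssetLister(base_url)
    parser.feed(html)
    return parser.assets


if __name__ == "__main__":
    page = """<html><head>
    <link rel="shortcut icon" href="/favicon.ico">
    <script src="js/main.js"></script>
    </head><body><img src="images/logo.png"></body></html>"""
    for url in list_assets(page, "http://somesite.com/somepage/"):
        print(url)
```

Each URL it prints can then be fetched one by one, for example with a plain "wget <url>".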


RE: Website extraction by SmokingCookie on 09-29-2009 at 04:14 PM

Okay, I downloaded wget, DownThemAll and a bunch of other Firefox add-ons/extensions/whatever and it works fine, but the script files I'm looking for aren't there. I guess I'll have to look for something else.