Posted by Thomas Hunter II & filed under Linux, OS X.

I use the following command to recursively download a bunch of files from a website to my local machine. It is great for working with open directories of files, e.g. the directory listings generated by the Apache web server.

The following can be added to your .bash_profile or .bashrc script, depending on which your OS/distro recommends:

function download-web() {
    # -r: recurse, -nH: no hostname directory, --no-parent: never ascend, --reject: skip index listings
    wget -r -nH --no-parent --reject='index.html*' "$@"
}
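
Once you've added it, reload your profile (or open a new terminal) so the function is available in your current shell:

source ~/.bash_profile   # or ~/.bashrc, whichever file you edited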

To invoke the command, run it like so:

download-web http://www.example.com/path/to/files
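
Since the function passes everything after the command name straight through to wget via "$@", you can also hand it more than one directory in a single call (these example URLs are just placeholders):

download-web http://www.example.com/music/ http://www.example.com/photos/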

Either way, it downloads everything linked from the first page, recursing into each child path, into the current directory. It will not download anything above the starting directory, and it will not keep local copies of the index.html files (or their index.html?blah=blah variants, which can get pretty annoying).
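
If you'd rather save the files somewhere other than the current directory, a minimal variation (the download-web-to name and the added -P flag are my own sketch, not part of the original function) could look like this:

function download-web-to() {
    # First argument is the destination directory; the rest are URLs
    local dest="$1"; shift
    wget -r -nH --no-parent --reject='index.html*' -P "$dest" "$@"
}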

This isn’t a simple alias but a bash function, so you can pass the URL (or URLs) as arguments after the command. It should work fine on both OS X and Linux. If you are using OS X, you can follow my guide for Installing WGET on OS X.
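
If wget isn't on your Mac yet, that guide walks through the options; if you happen to use Homebrew, it's a one-liner:

brew install wget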

Thomas Hunter II

Thomas is the author of Advanced Microservices and a prolific public speaker with a passion for reducing complex problems into simple language and diagrams. His career includes working at Fortune 50 companies in the Midwest, co-founding a successful startup, and everything in between.
