Posted by Thomas Hunter II & filed under Linux, OS X.

I use the following command to recursively download a bunch of files from a website to my local machine. It is great for working with open directories of files, e.g. directory listings made available by the Apache web server.

The following can be added to your .bash_profile or .bashrc script, depending on which your OS/distro recommends:

function download-web() {
    # -r: recurse, -nH: no host directory, --no-parent: don't ascend, --reject: skip index pages
    wget -r -nH --no-parent --reject='index.html*' "$@"
}
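After adding it, reload your shell configuration (or open a new terminal) so the function is picked up. You can confirm it is defined with the type builtin:

source ~/.bashrc     # or: source ~/.bash_profile
type download-web    # should report that download-web is a function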

To invoke the command, you run it like so:

download-web http://www.example.com/path/to/files
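Because the function hands all of its arguments ("$@") straight to wget, you can also slip extra wget options in alongside the URL. For example, to throttle the transfer rate and be kind to the server (the value here is just illustrative):

download-web --limit-rate=500k http://www.example.com/path/to/files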

It will then download everything linked from the first page, recursing into each child path, into the current directory. It will not download anything above that directory, and it will not keep a local copy of those index.html files (or index.html?blah=blah, which get pretty annoying).
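To make that concrete, suppose the listing at /path/to/files contained a couple of (hypothetical) files and a subdirectory. Because of -nH the hostname is dropped, but the remote path structure is kept, so under the current directory you would end up with something like:

path/to/files/report.pdf
path/to/files/data/archive.tar.gz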

This isn’t a simple alias but a bash function, so you can add a URL after the command. It should work fine on both OS X and Linux. If you are using OS X, you can follow my guide for Installing WGET on OS X.
