Novell Cool Solutions

Non-Interactive Download Management with Wget



By:

July 14, 2009 12:51 pm


Wget is a GNU utility for downloading files from the Web. Most Linux distributions ship this command-line tool by default. Its great strength is that it can download the entire contents of a website so that you can browse it offline. Wget supports the HTTP, HTTPS and FTP protocols, and it can also retrieve files through proxies.

Most Web browsers require the user to be present while a download runs. Wget does not need that interaction, which is why it is called a non-interactive downloader. You can start a retrieval and disconnect from the system, letting Wget finish the work.
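As a rough illustration of "start it and walk away", here is a minimal sketch of a helper that launches Wget in the background. The function name fetch_in_background and the log file name are made up for this example; -b (go to background) and -o (write messages to a log file) are standard Wget options that this article does not otherwise cover.

```shell
# fetch_in_background URL [LOGFILE]
# Hypothetical helper: start a download that keeps running after you log out.
fetch_in_background() {
    url=$1
    logfile=${2:-wget-log}
    # -b detaches Wget from the terminal right after startup;
    # -o sends its progress messages to the log file instead of the screen.
    wget -b -o "$logfile" "$url"
}
```

You would call it as `fetch_in_background http://www.novell.com/ novell.log` and later check progress with `tail novell.log`.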

In this article I explain the basic usage of Wget and some of its other important options. Let's start with a basic example: ‘wget’ followed by a website name downloads only the index.html of that site.

#wget www.novell.com

If you want to download the contents of the whole website, use the ‘-r’ (recursive) option. Wget follows the links it finds, up to the maximum recursion depth described below, and recreates the directory structure of the original site locally.

#wget -r www.novell.com

-l depth
--level=depth

Set the maximum recursion depth to depth. The default maximum depth is 5.
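To make the depth option concrete, the following sketch wraps a shallow recursive download intended for offline viewing. The helper name mirror_shallow and the default depth of 2 are my own choices; -k (convert links for local viewing), -p (fetch page requisites such as images and stylesheets) and -np (do not ascend to the parent directory) are standard Wget options that go beyond what this article describes.

```shell
# mirror_shallow URL [DEPTH]
# Hypothetical helper: recursive download limited to DEPTH levels (default 2).
mirror_shallow() {
    url=$1
    depth=${2:-2}
    # -r  recurse into links      -l  cap the recursion depth
    # -k  rewrite links so the local copy can be browsed offline
    # -p  also fetch images/CSS needed to display each page
    # -np never climb above the starting directory
    wget -r -l "$depth" -k -p -np "$url"
}
```

For example, `mirror_shallow http://www.novell.com/ 1` would fetch the start page and only the pages it links to directly.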

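If you are behind a proxy server, set the http_proxy environment variable to the URL of the proxy (Wget reads the lowercase variable name, and you can append a port such as :8080 if your proxy needs one). Each request is then passed to the proxy server.

#export http_proxy=http://192.168.1.1/
#wget -r www.novell.com --proxy-user=user --proxy-password=password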

These two options specify the username ‘user’ and the password ‘password’ for authentication on the proxy server. Wget encodes them using the “basic” authentication scheme.
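The proxy setup can be sketched as a small helper that sets the variable for a single download only, so the rest of your shell session is not affected. The name fetch_via_proxy and its argument order are invented for this example; note that Wget looks for the lowercase http_proxy variable, and that a password given on the command line is visible to other users in the process list.

```shell
# fetch_via_proxy PROXY_URL USER PASSWORD URL
# Hypothetical helper: one download through an authenticating HTTP proxy.
fetch_via_proxy() {
    proxy_url=$1
    proxy_user=$2
    proxy_pass=$3
    url=$4
    (
        # The subshell keeps http_proxy from leaking into the calling shell.
        http_proxy=$proxy_url
        export http_proxy
        wget --proxy-user="$proxy_user" --proxy-password="$proxy_pass" "$url"
    )
}
```

A call might look like `fetch_via_proxy http://192.168.1.1:8080/ user password http://www.novell.com/`, where the port 8080 is only a placeholder.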

--no-proxy

Don’t use proxies, even if the appropriate *_proxy environment variable is defined.

--secure-protocol=protocol

Choose the secure protocol to be used. Legal values are auto, SSLv2, SSLv3, and TLSv1.

--no-http-keep-alive

Turn off the “keep-alive” feature for HTTP downloads. By default Wget uses persistent connections, i.e. it reuses one TCP connection for multiple requests to the same server. This option instructs Wget to open a new TCP connection for each request.

--delete-after

This option tells Wget to delete every single file it downloads, after having done so. It is useful for pre-fetching popular pages through a proxy.
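A pre-fetching run could look roughly like the sketch below. The helper name warm_proxy_cache is hypothetical; -nd (do not create directories) is a standard Wget option added here so that no empty directory tree is left behind once the files are deleted.

```shell
# warm_proxy_cache URL...
# Hypothetical helper: pull pages through the configured proxy so that the
# proxy caches them, keeping nothing on the local disk.
warm_proxy_cache() {
    # -r             follow links from each start page
    # -nd            do not recreate the remote directory tree locally
    # --delete-after remove each file once it has been downloaded
    wget -r -nd --delete-after "$@"
}
```

It only makes sense with a proxy configured, for example `warm_proxy_cache http://www.novell.com/` after exporting http_proxy.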

-t number
--tries=number

Set number of retries to number. Specify 0 or inf for infinite retrying. The default is to retry 20 times.
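For flaky connections, the retry count is often combined with resuming. The sketch below assumes a helper name (fetch_stubborn) and defaults of my own; -c (continue a partially downloaded file) and --waitretry (back off between retries, up to the given number of seconds) are standard Wget options not discussed in this article.

```shell
# fetch_stubborn URL [TRIES]
# Hypothetical helper: retry a download a limited number of times,
# resuming the partial file instead of starting over.
fetch_stubborn() {
    url=$1
    tries=${2:-5}
    # -c resumes a partial file; -t sets the number of tries (0 = infinite);
    # --waitretry=10 waits 1s, 2s, ... up to 10s between failed attempts.
    wget -c -t "$tries" --waitretry=10 "$url"
}
```

`fetch_stubborn http://www.novell.com/ 0` would keep retrying until the file is complete.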

--limit-rate=amount

Limit the download speed to amount bytes per second. Amount may be expressed in bytes, kilobytes with the k suffix, or megabytes with the m suffix. For example, --limit-rate=20k will limit the retrieval rate to 20KB/s.
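Before throttling a large download it helps to estimate how long it will take at the chosen cap. The sketch below does that arithmetic and wraps the option in a helper; both function names are invented for this example, and the estimate assumes 1 MB = 1024 KB and a download that runs steadily at the limit.

```shell
# estimate_seconds SIZE_MB RATE_KB
# Rough lower bound on the download time, in whole seconds (rounded down).
estimate_seconds() {
    echo $(( $1 * 1024 / $2 ))
}

# fetch_throttled RATE URL
# Hypothetical helper: download URL with the bandwidth capped at RATE (e.g. 20k).
fetch_throttled() {
    wget --limit-rate="$1" "$2"
}
```

For a 700 MB file at 20 KB/s, `estimate_seconds 700 20` prints 35840, which is just under ten hours.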

FILES

/usr/local/etc/wgetrc

Default location of the global startup file.

.wgetrc

User startup file.
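Options you use all the time can be stored in a startup file instead of being typed on every command line. The sketch below writes a small sample file; the helper name write_sample_wgetrc and the chosen values are only illustrations, while tries, limit_rate and reclevel are the startup-file counterparts of --tries, --limit-rate and --level.

```shell
# write_sample_wgetrc FILE
# Hypothetical helper: write a small example startup file to FILE.
write_sample_wgetrc() {
    cat > "$1" <<'EOF'
# Retry each download up to 10 times (same as --tries=10).
tries = 10
# Cap the bandwidth at 20 KB/s (same as --limit-rate=20k).
limit_rate = 20k
# Maximum recursion depth (same as --level=3).
reclevel = 3
EOF
}
```

To try it without touching your real .wgetrc, write it elsewhere and point the WGETRC environment variable at it, for example `write_sample_wgetrc /tmp/wgetrc.sample` followed by `WGETRC=/tmp/wgetrc.sample wget -r www.novell.com`.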

Source: Wget man page.


Disclaimer: This content is not supported by Novell. It was contributed by a community member and is published "as is." It seems to have worked for at least one person, and might work for you. But please be sure to test it thoroughly before using it in a production environment.
