Wednesday, September 23, 2009

Using a fast, concurrent command-line download manager in Ubuntu

aria2 is a very powerful command-line download manager. I use aria2 to download Linux ISOs. Here is an example of how I used aria2 in Ubuntu to download Ubuntu Netbook Remix.
Step 1: Installing aria2 in Ubuntu Jaunty
This is very simple.
sudo apt-get install aria2
Step 2: Identify the mirror list URL for the download
We need to find mirrors around the world that host the file we want.
For example, Ubuntu Netbook Remix is hosted on many servers across the world. The mirrors page gives us easy access to all the servers hosting Ubuntu:
http://www.ubuntu.com/getubuntu/downloadmirrors#mirrors
The file need not be an Ubuntu ISO; it can be any file that is mirrored on different servers across the world. This step only needs a page that lists the mirror servers hosting the same version of the file, in this example Ubuntu Jaunty.
Step 3: Pick 5 reliable nearby servers from the mirror list
In this step, we choose around 5 reliable servers from the mirror list.
The five servers should all be different from one another.
In this example, say I pick the Japan, Finland, Germany, UK and Belgium servers.
Step 4: Copy the download URL of the file from each of the picked servers and note it down
The URL should be the complete download URL, including the name of the file we intend to download.
For example, to download Ubuntu Netbook Remix from the servers I chose in the previous step (Japan, Finland, Germany, UK and Belgium), I need to note down the full path of the file on each server. Let us say I noted them down in an editor as follows:
  1. Japan: http://ftp.riken.jp/Linux/ubuntu-cdimage/9.04/ubuntu-9.04-netbook-remix-i386.img
  2. Finland: http://www.nic.funet.fi/pub/mirrors/releases.ubuntu.com/9.04/ubuntu-9.04-netbook-remix-i386.img
  3. Germany: http://ftp.uni-kl.de/pub/linux/ubuntu.iso/9.04/ubuntu-9.04-netbook-remix-i386.img
  4. UK: http://mirror.ox.ac.uk/sites/releases.ubuntu.com/releases/9.04/ubuntu-9.04-netbook-remix-i386.img
  5. Belgium: http://ubuntu.mirrors.skynet.be/pub/ubuntu.com/releases/9.04/ubuntu-9.04-netbook-remix-i386.img
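Since every mirror must serve the same file, a quick sanity check is to confirm that all the noted URLs end in the same file name. The snippet below is only a sketch: the variable names are my own, and a matching file name does not prove that the contents are identical.

```shell
# The five mirror URLs noted above, one per line.
urls='http://ftp.riken.jp/Linux/ubuntu-cdimage/9.04/ubuntu-9.04-netbook-remix-i386.img
http://www.nic.funet.fi/pub/mirrors/releases.ubuntu.com/9.04/ubuntu-9.04-netbook-remix-i386.img
http://ftp.uni-kl.de/pub/linux/ubuntu.iso/9.04/ubuntu-9.04-netbook-remix-i386.img
http://mirror.ox.ac.uk/sites/releases.ubuntu.com/releases/9.04/ubuntu-9.04-netbook-remix-i386.img
http://ubuntu.mirrors.skynet.be/pub/ubuntu.com/releases/9.04/ubuntu-9.04-netbook-remix-i386.img'

# Strip everything up to the last slash, then count the distinct file names.
distinct=$(printf '%s\n' "$urls" | sed 's|.*/||' | sort -u | wc -l | tr -d ' ')

if [ "$distinct" -eq 1 ]; then
    echo "OK: all mirrors point to the same file name"
else
    echo "WARNING: $distinct different file names found"
fi
```

If the warning appears, one of the URLs was probably copied for a different release or architecture.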
Step 5: Create a text file and store the 5 URLs separated by tabs
In this step, we open gedit, nano or vi and paste all the URLs we noted down on a single line, each separated from the next by a tab.
For example, in this case I ran gedit urls.txt from the command line and pasted the noted URLs separated by tabs. It is very important that the URLs are separated by tabs and not spaces.
Example:

http://ftp.riken.jp/Linux/ubuntu-cdimage/9.04/ubuntu-9.04-netbook-remix-i386.img	http://www.nic.funet.fi/pub/mirrors/releases.ubuntu.com/9.04/ubuntu-9.04-netbook-remix-i386.img	http://ftp.uni-kl.de/pub/linux/ubuntu.iso/9.04/ubuntu-9.04-netbook-remix-i386.img	http://mirror.ox.ac.uk/sites/releases.ubuntu.com/releases/9.04/ubuntu-9.04-netbook-remix-i386.img	http://ubuntu.mirrors.skynet.be/pub/ubuntu.com/releases/9.04/ubuntu-9.04-netbook-remix-i386.img

Save it and close the editor
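Some editors silently turn tabs into spaces, so it may be safer to create the file from the shell instead. The following is a minimal sketch that writes the same five URLs into urls.txt with printf and then counts the tab-separated fields with awk. The file name urls.txt matches the one used in this post; the variable name fields is just for illustration.

```shell
# Write the five URLs on ONE line; \t is a real tab and \n ends the line.
printf '%s\t%s\t%s\t%s\t%s\n' \
    'http://ftp.riken.jp/Linux/ubuntu-cdimage/9.04/ubuntu-9.04-netbook-remix-i386.img' \
    'http://www.nic.funet.fi/pub/mirrors/releases.ubuntu.com/9.04/ubuntu-9.04-netbook-remix-i386.img' \
    'http://ftp.uni-kl.de/pub/linux/ubuntu.iso/9.04/ubuntu-9.04-netbook-remix-i386.img' \
    'http://mirror.ox.ac.uk/sites/releases.ubuntu.com/releases/9.04/ubuntu-9.04-netbook-remix-i386.img' \
    'http://ubuntu.mirrors.skynet.be/pub/ubuntu.com/releases/9.04/ubuntu-9.04-netbook-remix-i386.img' \
    > urls.txt

# Count the tab-separated fields on the line; 5 means the tabs are in place.
fields=$(awk -F'\t' '{ print NF }' urls.txt)
echo "tab-separated URLs found: $fields"
```

If the count comes out as 1 for a file made in an editor, the separators were most likely saved as spaces.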
Step 6: Start downloading using aria2 from the command line
This is our final step. After saving the tab-separated list of URLs in a text file, we pass that file as input to aria2.
Open a terminal (press Alt+F2 and type gnome-terminal in the text box to launch it).
Change to the directory where the tab-separated URL file is saved. Let us say I saved urls.txt in the Downloads folder; then I type the following to change directory:
cd ~/Downloads
Then use aria2c as follows to start downloading
aria2c -m0 -iurls.txt -c
That is it. Do not close the terminal until the download completes. aria2c allocates space for the full size of the download first, then fetches from all the fully qualified URLs in the input file.
aria2c --> the command-line aria2 download manager
-c --> asks aria2c to continue a partial download from where it stopped last time, for example if we closed the terminal before the download completed
-i --> the file named after -i should contain the tab-separated list of URLs from the different mirrors. By default aria2c opens 5 concurrent connections for a download, each to a different server from the list. If a server fails, aria2c quietly skips it. The number of connections can be increased. Each concurrent connection fetches a segment of the big file from one of the servers, which reduces the load on any single server. It is like a honeybee collecting nectar from several different flowers without stressing any one of them.
-m --> sets how many times aria2c should retry a failed connection; 0 means unlimited
There are other interesting parameters, which can be read with man aria2c.
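The short options used above also have long spellings, which can be easier to read back later. As a sketch, the snippet below only builds and prints the long-option form of the same command so it can be reviewed before pasting; it does not start a download. The long names are the ones listed in man aria2c for -m, -i and -c.

```shell
# Long-option spelling of: aria2c -m0 -iurls.txt -c
#   --max-tries=0    same as -m0 (0 means unlimited)
#   --input-file=... same as -i
#   --continue=true  same as -c
long_form='aria2c --max-tries=0 --input-file=urls.txt --continue=true'

# Print the command for review; paste it into the terminal to run it.
echo "$long_form"
```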
aria2c will use n concurrent connections and download the split segments of the big file from the n different URLs we gave in the file. For example, on a 2 Mbps connection with 5 concurrent connections sharing the bandwidth equally, each connection gets about 410 Kbps (roughly 51 KB/s), and together they use the full download bandwidth.
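The per-connection figure can be worked out with shell arithmetic. This is a rough sketch under my own assumptions: 2 Mbps is taken as 2048 Kbps, the bandwidth is split evenly, and protocol overhead is ignored. Shell arithmetic uses integers, so results are rounded down.

```shell
link_kbps=2048     # 2 Mbps expressed in kilobits per second (assumed 2 * 1024)
connections=5      # number of concurrent connections

per_conn_kbps=$(( link_kbps / connections ))       # kilobits per second
per_conn_kBps=$(( link_kbps / 8 / connections ))   # kilobytes per second (8 bits per byte)

echo "per connection: ${per_conn_kbps} Kbps, about ${per_conn_kBps} KB/s"
```

Note the difference between Kbps (kilobits) and KB/s (kilobytes): the figure of 51 is in kilobytes per second.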
Hope this helps. My next idea is to develop a Qt-based GUI for aria2c. If I get it ready, I may publish it.
Thanks for reading
