Use of Tar

I wrote this post a long time ago. After noticing in my site statistics that it was still being accessed, I rewrote and updated it.

Introduction

Tar (tape archive) refers to two different things. One is a format developed a long time ago, in the early days of UNIX, whose main purpose was to make it easier to back up data to tape. To accomplish this, it gathers many files and packs them into a single file, preserving the directory structure and permissions, among other metadata.
The other is the program used to work with files in the tar format.
Nowadays, the program's functionality extends beyond tape backups: it supports compression (bzip2, gzip, etc.) and can also redirect its output to other programs, devices, files, etc.
A very common example of its use is source code distribution: most source code is distributed in .tar.gz, .tar.bz2, .tar.xz or similar formats.

Tar is also installed by default in most (if not all) Linux distributions.

Compression

Tar is not limited to packing files; it can also apply several compression methods. Among the types of files that tar can handle we find:

  • .tar:  no compression, just packs everything in a single file.
  • .tar.gz:  file compressed using gzip.
  • .tar.bz2: file compressed using bzip2.
  • .tar.lzma: file compressed using lzma.
  • .tar.xz: file compressed using xz.

To select the compression method, you pass the program an option that indicates it. Note that to compress or decompress, the corresponding compression tools or libraries (gzip, bzip2, and so on) must already be installed. A comparison of the different compression methods is beyond the scope of this post, but I will try to cover it some time in the future.
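
To make the difference between packing and compressing concrete, here is a small sketch that archives the same file with and without gzip and compares the sizes. The scratch directory and the file names (sample.txt, plain.tar, packed.tar.gz) are made up for illustration, and it assumes gzip is installed.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A very repetitive text file compresses well, which makes the effect easy to see.
awk 'BEGIN { for (i = 0; i < 20000; i++) print "the same line over and over" }' > sample.txt

tar -cf plain.tar sample.txt        # pack only, no compression
tar -czf packed.tar.gz sample.txt   # pack and compress with gzip

# Compare the archive sizes in bytes.
plain_size=$(wc -c < plain.tar)
gzip_size=$(wc -c < packed.tar.gz)
echo "plain.tar:     $plain_size bytes"
echo "packed.tar.gz: $gzip_size bytes"
```

How much smaller the compressed archive is depends heavily on the data; files that are already compressed, such as most image formats, shrink very little.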

Creating a file

If you check the tar manual (man tar on the command line), you get a list of all the options that the program accepts.
An online version can be found at http://www.openbsd.org/cgi-bin/man.cgi?query=tar (in English). The most commonly used options for creating archives are:

  • -c : creates a new archive.
  • -r : appends files to an already existing .tar; it doesn't work on compressed archives.
  • -f : specifies the name of the archive file.
  • -w : prompts interactively. In OpenBSD tar it lets you rename each file as it is stored; in GNU tar it asks for confirmation before every action.
  • -j : compression using bzip2.
  • -z : compression using gzip.
  • -J : compression using xz.
  • -Z : compression using compress.
  • --lzma : compression using lzma.
  • --lzop : compression using lzop.
  • -v : shows what is being processed in the console.

The basic syntax of tar is:

tar (options) (name_of_output_file) (name_of_files_to_be_packed)

As an example, if we have two files, let's say 1.gif and 2.gif, and a folder named pics, we can create an archive as follows:

tar cf images.tar 1.gif 2.gif pics

The result is images.tar with the files and folders inside it.
If we want to add a file, let's say 3.gif, to images.tar, we can use this:

tar rf images.tar 3.gif
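
To try the create and append steps end to end, here is a small sketch that runs in a scratch directory. The temporary directory and the empty placeholder files are assumptions made for illustration; they simply stand in for real images.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Empty placeholder files standing in for real images.
mkdir pics
touch 1.gif 2.gif 3.gif pics/1.jpg

tar -cf images.tar 1.gif 2.gif pics   # create the archive
tar -rf images.tar 3.gif              # append one more file to it

# List the archive to confirm that everything is inside.
listing=$(tar -tf images.tar)
echo "$listing"
```

Listing the archive with -t after each change is a cheap way to check that the command did what we expected.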

To add compression, for example gzip, we use the corresponding option, z in this case:

tar czf images.tar.gz 1.gif 2.gif pics

We can also use wildcards, which save a lot of time when we deal with many files. If we want to create an archive that is compressed using bzip2 and contains only .gif files, we run the following command:

tar -jcf images.tar.bz2 *.gif

We can even merge (concatenate) .tar files to avoid unpacking and repacking. For example, if we have two files, 1.tar and 2.tar, we can join them like this:

tar --concatenate -f 1.tar 2.tar

In this way, 1.tar will contain the content of 2.tar.
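
A quick way to convince ourselves that the merge worked is to list 1.tar afterwards. The sketch below assumes GNU tar (the --concatenate option is not available in every tar implementation) and uses made-up file names, a.txt and b.txt.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two small archives, each holding a single file.
echo "first" > a.txt
echo "second" > b.txt
tar -cf 1.tar a.txt
tar -cf 2.tar b.txt

# Append the members of 2.tar to 1.tar (2.tar itself is left untouched).
tar --concatenate -f 1.tar 2.tar

# The listing of 1.tar should now show both files.
merged=$(tar -tf 1.tar)
echo "$merged"
```

As with -r, this only works on uncompressed archives.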

Extracting a file

The parameters used to extract a file are the following:

  • -t : lists the contents of an archive.
  • -x : extracts the contents of an archive.
  • -v : shows the details of what is being processed.

So, for example, if we want to extract the contents of 1.tar inside our current directory we use:

tar xf 1.tar

With compressed files, recent versions of tar usually detect the compression on their own when extracting, so the previous command also works. Still, we can name the compression algorithm explicitly to be sure that the correct one is used. For example, for a 1.tar.gz file we run:

tar -xzf 1.tar.gz
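
The automatic detection can be checked with a short round trip: create a gzip-compressed archive, then extract it without -z. This is only a sketch with a made-up file name (readme.txt), and it assumes a reasonably recent GNU tar or bsdtar; older or more minimal implementations may still need the explicit option.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "hello" > readme.txt
tar -czf 1.tar.gz readme.txt   # compress explicitly with gzip
rm readme.txt                  # remove the original so extraction is visible

tar -xf 1.tar.gz               # no -z here: tar works out the compression itself
content=$(cat readme.txt)
echo "$content"
```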

To extract only specific files from a .tar file, we list them at the end of the command. In this case we extract readme.txt:

tar xf 1.tar readme.txt

To list the files instead of extracting them we use -t:

tar tf 1.tar
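
Listing and selective extraction work well together: we list first to see the exact member names, then extract only the one we need. The following sketch uses a scratch directory and two made-up files (readme.txt and notes.txt) to show that only the requested file comes back.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "docs" > readme.txt
echo "other" > notes.txt
tar -cf 1.tar readme.txt notes.txt
rm readme.txt notes.txt        # start from an empty directory again

tar -tf 1.tar                  # list the members to see their exact names
tar -xf 1.tar readme.txt       # extract only readme.txt
ls
```

The name given on the command line has to match the name stored in the archive, including any directory prefix, which is why listing first is useful.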

The -v option (when used with the create and extract commands) prints every file that is processed, with its path, along with any errors that occur. So if we run:

tar xvzf 1.tar.gz

The output will be

1.gif
2.gif
3.gif
pics/1.jpg

An interesting option is -C. It lets us choose an output directory for the extracted files, which saves us from copying or moving them afterwards. A practical example is when you want to compile a new kernel and have to extract its source code. The source code of the Linux 4.0 kernel (the current version at the time of writing this post) is distributed as linux-4.0.tar.xz. If we want to extract it into /usr/src, we run:

tar -C /usr/src -xJf linux-4.0.tar.xz
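
The same idea can be tried without root access or a kernel download. In this sketch the directory names (project, target) and the tiny source file are invented for illustration, and gzip is used instead of xz only because it is more likely to be installed everywhere.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A small stand-in for a source tree.
mkdir project target
echo "int main(void) { return 0; }" > project/main.c
tar -czf project.tar.gz project

# Extract into target/ instead of the current directory.
tar -C target -xzf project.tar.gz
ls target/project
```

Note that the directory given to -C must already exist; tar does not create it for us.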

Deleting Files

We can delete files inside an uncompressed tar file using --delete. For example, if we want to delete 1.gif from the images.tar file, we can run the following:

tar --delete -f images.tar 1.gif
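
Here is a small sketch of the deletion in a scratch directory, with a listing afterwards to confirm the result. It assumes GNU tar, since --delete is not offered by every tar implementation, and the placeholder files stand in for real images.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Placeholder files standing in for real images.
echo "one" > 1.gif
echo "two" > 2.gif
tar -cf images.tar 1.gif 2.gif

# Remove 1.gif from the archive; the file on disk is not affected.
tar --delete -f images.tar 1.gif

remaining=$(tar -tf images.tar)
echo "$remaining"
```

Since the archive is rewritten in place, it is a good idea to keep a copy of anything important before deleting members from it.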

 

I hope this brief post about the use of tar is helpful to all of you who work with this tool on a daily basis. I will try to keep the list updated and to write about file compression.