Introduction to Unix

3.6. Archiving Data

Linux for Programmers and Users, Section 4.6.

3.6.1. cpio

cpio

copy files to and from archives

SYNOPSIS

cpio {-o|--create} [OPTIONS] < name-list [> archive]

cpio {-i|--extract} [OPTIONS] [pattern…] [< archive]

cpio {-p|--pass-through} [OPTIONS] destination-directory < name-list

DESCRIPTION

cpio copies files into or out of a cpio or tar archive, which is a file that contains other files plus information about them, such as their file name, owner, timestamps, and access permissions. The archive can be another file on the disk, a magnetic tape, or a pipe. cpio has three operating modes.

In copy-out mode, cpio copies files into an archive. It reads a list of filenames, one per line, on the standard input, and writes the archive onto the standard output. A typical way to generate the list of filenames is with the find command; you should give find the -depth option to minimize problems with permissions on directories.

In copy-in mode, cpio copies files out of an archive or lists the archive contents. It reads the archive from the standard input. Any non-option command line arguments are file name search patterns; only files in the archive whose names match one or more of those patterns are copied from the archive. If no patterns are given, all files are extracted.

In copy-pass mode, cpio copies files from one directory tree to another, combining the copy-out and copy-in steps without actually using an archive. It reads the list of files to copy from the standard input; the directory into which it will copy them is given as a non-option argument.
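The copy-out and copy-in modes can be tried together in a scratch directory. The following is a minimal sketch, assuming GNU cpio is installed; the directory and file names (src, restore, notes.txt, docs/readme.txt, backup.cpio) are made up for illustration.

```shell
# Work in a throwaway directory so nothing of value is touched.
work=$(mktemp -d)
cd "$work"
mkdir -p src/docs restore
echo "hello" > src/notes.txt
echo "readme" > src/docs/readme.txt

# Copy-out: find supplies the name list, cpio writes the archive.
(cd src && find . -depth -print | cpio -o --quiet > ../backup.cpio)

# List the archive contents without extracting anything.
cpio -it --quiet < backup.cpio

# Copy-in: extract into another directory, creating leading
# directories (-d) and keeping modification times (-m).
(cd restore && cpio -idm --quiet < ../backup.cpio)
```

Because find -depth lists a directory's contents before the directory itself, the -d option is needed on extraction so that cpio can create docs before it has seen the entry for it.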

Note

Before the cp command gained its -p option, cpio -p in combination with find -depth was believed to be the only safe means of copying whole directory trees without changing the ownership, permissions, or timestamps of the files. (It may still be.) Use cd to go to the source directory and then type:

find . -depth -print | cpio -pdm destination-directory
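One way to check that copy-pass mode keeps modification times is to copy a small tree and compare timestamps afterwards. This is a sketch only, assuming GNU cpio; the paths under the temporary directory (src, dest, sub/file.txt) are hypothetical.

```shell
# Scratch source and destination trees (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/src/sub" "$work/dest"
echo "data" > "$work/src/sub/file.txt"

# Give the file an old, easily recognized modification time.
touch -t 200001010000 "$work/src/sub/file.txt"

# Copy-pass: -d creates directories, -m keeps modification times.
cd "$work/src"
find . -depth -print | cpio -pdm --quiet "$work/dest"

# Show the original and the copy side by side.
ls -l sub/file.txt "$work/dest/sub/file.txt"
```

Both lines of the ls output should show the same date. Preserving ownership additionally depends on the privileges of the user running the copy.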

OPTIONS

-a, --reset-access-time

Reset the access times of files after reading them, so that it does not look like they have just been read.

-A, --append

Append to an existing archive. Only works in copy-out mode. The archive must be a disk file specified with the -O or -F (--file) option.

-b, --swap

In copy-in mode, swap both halfwords of words and bytes of halfwords in the data. Equivalent to -sS. Use this option to convert 32-bit integers between big-endian and little-endian machines.

-d, --make-directories

Create leading directories where needed.

-i, --extract

Run in copy-in mode.

-l, --link

Link files instead of copying them, when possible.

-m, --preserve-modification-time

Retain previous file modification times when creating files.

-o, --create

Run in copy-out mode.

-p, --pass-through

Run in copy-pass mode.

--quiet

Do not print the number of blocks copied.

-t, --list

Print a table of contents of the input.

-v, --verbose

List the files processed, or with -t, give an ls -l style table of contents listing.

-V, --dot

Print a “.” for each file processed.

3.6.2. tar

tar

The GNU version of the tar archiving utility

SYNOPSIS

tar <operation> [options]

DESCRIPTION

The tar command is an archiving program designed to store files in, and extract files from, an archive file known as a tarfile. A tarfile may be written to a tape drive; however, it is also common to write a tarfile to a normal file. The first argument to tar must be one of the operations Acdrtux, followed by any options. The final arguments to tar are the names of the files or directories that should be archived.

OPERATIONS

One of the following options must be used:

-A, --catenate, --concatenate

append tar files to an archive

-c, --create

create a new archive.

-d, --diff, --compare

find differences between archive and file system

-r, --append

append files to the end of an archive

-t, --list

list the contents of an archive

-u, --update

only append files that are newer than the copy already in the archive

-x, --extract, --get

extract files from an archive

--delete

delete from the archive (not for use on mag tapes!)

COMMON OPTIONS

-C, --directory DIR

change to directory DIR

-f, --file F

use archive file or device F (default “-”, meaning stdin/stdout)

-j, --bzip2

filter the archive through bzip2; use it to create or extract .bz2 archives

-p, --preserve-permissions

extract all protection information

-v, --verbose

verbosely list files processed

-z, --gzip, --ungzip

filter the archive through gzip

EXAMPLES

tar -xvf foo.tar

verbosely extract foo.tar

tar -xzf foo.tar.gz

extract gzipped foo.tar.gz

tar -cjf foo.tar.bz2 bar/

create bzipped tar archive of the directory bar called foo.tar.bz2

tar -xjf foo.tar.bz2 -C bar/

extract bzipped foo.tar.bz2 after changing directory to bar

tar -xzf foo.tar.gz blah.txt

extract the file blah.txt from foo.tar.gz
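The create, list, and extract operations can be tried end to end in a scratch directory. This is a minimal sketch; the names bar, out, a.txt, and blah.txt are placeholders chosen to match the examples above.

```shell
# Work in a throwaway directory.
work=$(mktemp -d)
cd "$work"
mkdir -p bar out
echo "first" > bar/a.txt
echo "second" > bar/blah.txt

# Create a gzip-compressed archive of the directory bar.
tar -czf foo.tar.gz bar

# List the archive contents without extracting anything.
tar -tzf foo.tar.gz

# Extract a single member into the directory out.
tar -xzf foo.tar.gz -C out bar/blah.txt
```

When extracting a single file, the member name must be written as it is stored in the archive, which is why the listing step is useful: here the file is stored as bar/blah.txt, not blah.txt.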

3.6.3. dump

dump

file system backup

Note

Because of the sophistication of current backup hardware and the heterogeneous mix of systems used, most companies today use commercial backup software.

SYNOPSIS

dump [-level#] [OPTION] files-to-dump

DESCRIPTION

Dump examines files on an ext2/3 file system and determines which files need to be backed up. These files are copied to the given disk, tape or other storage medium for safe keeping.

The files-to-dump is either a mountpoint of a file system or a list of files and directories to be backed up.

OPTIONS

-level#

The dump level (any integer). A level 0, full backup, guarantees the entire file system is copied. A level number above 0, incremental backup, tells dump to copy all files new or modified since the last dump of a lower level.

-f file

Write the backup to file; file may be a special device file like /dev/st0 (a tape drive), an ordinary file, or - (the standard output).

-L label

The user-supplied text string label is placed into the dump header, where tools like restore and file can access it.

-u

Update the file /etc/dumpdates after a successful dump.

An efficient method of staggering incremental dumps to minimize the number of tapes follows:

  1. Always start with a level 0 backup, for example:

    /sbin/dump -0u -f /dev/st0 /usr/src
    

    This should be done at set intervals, say once a month or once every two months, and on a set of fresh tapes that is saved forever.

  2. Perform a level 1 dump on a weekly basis.

  3. Perform higher level dumps (2 - 9) on other days.

After several months or so, the daily and weekly tapes should get rotated out of the dump cycle and fresh tapes brought in.
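The rotation described above can be sketched as a small dry-run script. The mapping used here (level 0 on the first day of the month, level 1 on Sundays, levels 2 to 7 on Monday through Saturday) is only one assumed way to fill in the three steps, and dump_level and dump_command are hypothetical helper names. The script prints the dump command instead of running it.

```shell
# dump_level DAY_OF_MONTH DAY_OF_WEEK
# DAY_OF_WEEK follows `date +%u`: 1 = Monday ... 7 = Sunday.
dump_level() {
    if [ "$1" -eq 1 ]; then
        echo 0                  # monthly full backup
    elif [ "$2" -eq 7 ]; then
        echo 1                  # weekly level 1 dump
    else
        echo $(( $2 + 1 ))      # Monday..Saturday -> levels 2..7
    fi
}

# Print (do not run) the dump command for a given level,
# reusing the device and directory from the example above.
dump_command() {
    echo "/sbin/dump -${1}u -f /dev/st0 /usr/src"
}

# Dry run for today's date.
day_of_month=$(date +%d)
day_of_month=${day_of_month#0}   # strip a leading zero, e.g. 08 -> 8
day_of_week=$(date +%u)
dump_command "$(dump_level "$day_of_month" "$day_of_week")"
```

With this assignment each weekday dump picks up only the changes made since the previous day's dump, because the previous day always has a lower level.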

3.6.4. restore

restore

Restore files or file systems from backups made with dump

SYNOPSIS

restore -C | -i | -r | -t | -x [OPTIONS]

DESCRIPTION

The restore command performs the inverse function of dump(8). A full backup of a file system may be restored and subsequent incremental backups layered on top of it. Single files and directory subtrees may be restored from full or partial backups.

Exactly one of the following flags is required:

-C

This mode allows comparison of files from a dump.

-i

This mode allows interactive restoration of files from a dump. After reading in the directory information from the dump, restore provides a shell-like interface that allows the user to move around the directory tree selecting files to be extracted. The available commands are listed below.

add [arg]

The current directory or specified argument is added to the list of files to be extracted. If a directory is specified, then it and all its descendants are added to the extraction list (unless the -h flag is specified on the command line). Files that are on the extraction list are prepended with a * when they are listed by ls.

cd arg

Change the current working directory to the specified argument.

delete [arg]

The current directory or specified argument is deleted from the list of files to be extracted. If a directory is specified, then it and all its descendants are deleted from the extraction list (unless the -h flag is specified on the command line). The most expedient way to extract most of the files from a directory is to add the directory to the extraction list and then delete those files that are not needed.

extract

All files on the extraction list are extracted from the dump. Restore will ask which volume the user wishes to mount. The fastest way to extract a few files is to start with the last volume and work towards the first volume.

help

List a summary of the available commands.

ls [arg]

List the current or specified directory. Entries that are directories are appended with a “/”. Entries that have been marked for extraction are prepended with a “*”. If the verbose flag is set, the inode number of each entry is also listed.

pwd

Print the full pathname of the current working directory.

quit

Restore immediately exits, even if the extraction list is not empty.

setmodes

All directories that have been added to the extraction list have their owner, modes, and times set; nothing is extracted from the dump. This is useful for cleaning up after a restore has been prematurely aborted.

verbose

The sense of the -v flag is toggled. When set, the verbose flag causes the ls command to list the inode numbers of all entries. It also causes restore to print out information about each file as it is extracted.

-r

Restore (rebuild) a file system.

-t

The names of the specified files are listed if they occur on the backup.

-x

The named files are read from the given media.

3.6.5. gzip

Linux for Programmers and Users, Section 4.12.1.

gzip, gunzip, zcat

compress or expand files

SYNOPSIS

gzip [ -acdfhlLnNrtvV19 ] [-S suffix] [ name … ]

gunzip [ -acfhlLnNrtvV ] [-S suffix] [ name … ]

zcat [ -fhLV ] [ name … ]

DESCRIPTION

Gzip reduces the size of the named files using Lempel-Ziv coding (LZ77). Whenever possible, each file is replaced by one with the extension .gz, while keeping the same ownership modes, access and modification times.

Note

gzip compresses individual files; it does not combine several files into one archive. In a Unix environment, programs such as tar or cpio are used to create archive files, and gzip is used to compress them. Despite the similar names, gzip and gunzip are not the zip and unzip utilities, which both archive and compress and are better supported in Windows than tar and cpio.

OPTIONS

-c, --stdout, --to-stdout

Write output on standard output; keep original files unchanged. (zcat is identical to gunzip -c.)

-d, --decompress, --uncompress

Decompress. (same as gunzip)

-l, --list

For each compressed file, list the following fields:

compressed size: size of the compressed file
uncompressed size: size of the uncompressed file
ratio: compression ratio (0.0% if unknown)
uncompressed_name: name of the uncompressed file

In combination with the --verbose option, the following fields are also displayed:

method: compression method
crc: the 32-bit CRC of the uncompressed data
date & time: time stamp for the uncompressed file

The compression methods currently supported are deflate, compress, lzh (SCO compress -H) and pack. The crc is given as ffffffff for a file not in gzip format.

With --name, the uncompressed name, date and time are those stored within the compressed file if present.

With --verbose, the size totals and compression ratio for all files are also displayed.

-r, --recursive

Travel the directory structure recursively. If any of the file names specified on the command line are directories, gzip will descend into the directory and compress all the files it finds there (or decompress them in the case of gunzip).

-v, --verbose

Verbose. Display the name and percentage reduction for each file compressed or decompressed.

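EXAMPLE

To compress every file in a directory named foo, do the following from the parent directory of foo. Each file is replaced by its own .gz file; gzip does not combine them into one archive. To get a single compressed archive of the directory, use tar -czf instead.

$ gzip -r foo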
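The effect of the -c, -l, -r, and -d options on ordinary files can be seen in a scratch directory. This is a small sketch; the file and directory names (foo, a.txt, b.txt, c.txt) are invented for the demonstration.

```shell
# Work in a throwaway directory.
work=$(mktemp -d)
cd "$work"
mkdir foo
echo "alpha alpha alpha" > foo/a.txt
echo "beta beta beta" > foo/b.txt
echo "gamma gamma gamma" > c.txt

# -c writes to standard output, so c.txt itself is left in place.
gzip -c c.txt > c.txt.gz

# -l reports the compressed size, uncompressed size, and ratio.
gzip -l c.txt.gz

# -r descends into foo and replaces each file with its own .gz file.
gzip -r foo
ls foo

# -dc decompresses to standard output without touching the .gz file.
gzip -dc foo/a.txt.gz
```

After the recursive run, foo holds a.txt.gz and b.txt.gz, one compressed file per original, and no combined archive.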

3.6.6. dd

dd

convert and copy a file at the block level.

Note

dd is not a command for creating or extracting archives, but it is included because it is not covered elsewhere and it is often used together with the commands for working with archives, especially when data is copied over the network as part of the operation.

SYNOPSIS

dd [OPTION]…

DESCRIPTION

bs=BYTES

force ibs=BYTES and obs=BYTES

count=BLOCKS

copy only BLOCKS input blocks

if=FILE

read from FILE instead of stdin

of=FILE

write to FILE instead of stdout

EXAMPLES

Write a compressed tar archive from the local machine to a tape drive on a remote machine:

$ tar -czf - <directory> | ssh remotehost dd of=/dev/rmt/0 bs=32k

Read a tar archive from a tape drive on remote machine and extract it locally:

$ ssh remotehost dd if=/dev/rmt/0 bs=32k | tar -xzf -

Create a file containing all zeros (1 GiB with the block size and count shown). This is useful as the starting point for creating a file that contains a file system image, which is sometimes needed when working with virtualization software.

$ dd if=/dev/zero of=file bs=1024k count=1024
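The relationship between bs, count, and the size of the output can be checked with a much smaller file. The sketch below uses hypothetical file names (disk.img, head.img) in a temporary directory and sends dd's transfer statistics to /dev/null.

```shell
# Work in a throwaway directory.
work=$(mktemp -d)
img="$work/disk.img"

# 64 blocks of 1024 bytes each = 65536 bytes of zeros.
dd if=/dev/zero of="$img" bs=1024 count=64 2>/dev/null

# Copy only the first 4 blocks (4096 bytes) into a second file.
dd if="$img" of="$work/head.img" bs=1024 count=4 2>/dev/null

# Show the resulting sizes in bytes.
wc -c < "$img"
wc -c < "$work/head.img"
```

The output size is always bs multiplied by count, provided the input supplies that much data.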

3.6.7. mt

mt

Control a magnetic tape drive. Typical actions are to fast-forward to a position, rewind to a position or to the beginning of the tape, and eject the tape. See the manual page if more details are needed.