Archives and Archiving Files and Documents

Archiving is different from backups. Think about them separately.

An archive is an organizational strategy for data: a structure for storing data so that it's easy to retrieve in the future.

There are a few different ways to organize information. To use some computer terms: "tables", "time", and "hierarchy".

Tables refers to database tables, where data is organized into records and fields (or rows and columns). A record is one unit of data, like a row in a list. A field is a single piece of information about that record, or part of the data itself, like a column in that row. The useful property of a table is that every row has the same columns, so you can sort and group by any column.
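As a minimal sketch of that property, here is a tiny table held as a list of Python dictionaries. The column names (client, document, year) and the sample rows are made up for illustration. Because every row has the same columns, one small function can sort or group by any of them.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample records; every row has the same three columns.
records = [
    {"client": "acme", "document": "invoice", "year": 2021},
    {"client": "birch", "document": "contract", "year": 2020},
    {"client": "acme", "document": "contract", "year": 2019},
]

def sort_by(rows, column):
    """Return the rows ordered by the given column."""
    return sorted(rows, key=itemgetter(column))

def group_by(rows, column):
    """Return a dict mapping each value of the column to its rows."""
    groups = {}
    # groupby only merges adjacent rows, so sort by the column first.
    for key, members in groupby(sort_by(rows, column), key=itemgetter(column)):
        groups[key] = list(members)
    return groups
```

For example, group_by(records, "client") collects the two "acme" rows under one key and the "birch" row under another.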


A hierarchy is like a filing system of folders.

Chronological organization arranges information by time, so you can retrieve the data from a specific time period.

The computer's file system uses all three methods of organization. Each file has common fields, like the modification time, size, and usually a file extension.
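To make those common fields concrete, here is a rough sketch that reads them with Python's standard library. The function names file_fields and modified_between are invented for this example. The second function shows the chronological side: it picks out the files in one folder that were modified within a given time window.

```python
from datetime import datetime, timezone
from pathlib import Path

def file_fields(path):
    """Collect the fields the file system keeps for every file."""
    p = Path(path)
    info = p.stat()
    return {
        "name": p.name,
        "extension": p.suffix,  # e.g. ".txt", or "" if there is none
        "size": info.st_size,   # size in bytes
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }

def modified_between(folder, start, end):
    """List files directly inside folder with start <= modified < end."""
    matches = []
    for p in Path(folder).iterdir():
        if p.is_file() and start <= file_fields(p)["modified"] < end:
            matches.append(p)
    return matches
```

The start and end arguments are assumed to be timezone-aware datetime values, since the modification times are converted to UTC.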

The files are stored in a hierarchy, and people typically name the folders and files uniformly. Uniform naming breaks each name into fields, so it's easier to sort through the files.
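Here is a small sketch of how a uniform name can be split back into fields. The convention used (client_project_YYYY-MM-DD, with underscores between fields) is only an assumption for the example, not a recommendation, and the names NameFields and parse_name are made up.

```python
from datetime import date
from typing import NamedTuple

class NameFields(NamedTuple):
    """Fields recovered from a name like acme_website_2021-03-04."""
    client: str
    project: str
    day: date

def parse_name(name):
    """Split a uniformly named folder into its fields.

    Assumes exactly three underscore-separated parts and an ISO date;
    anything else raises ValueError.
    """
    client, project, day = name.split("_")
    return NameFields(client, project, date.fromisoformat(day))
```

Once the names are parsed, the folders can be sorted or grouped by any field, just like rows in a table.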

For more info, see the file naming convention articles below.

The file system generally lacks the ability to add extra fields of data. For example, it would be useful to be able to attach major and minor version numbers to every file. While there are some ways to do this, none of them is simple or easy to reach through the user interface.

Consequently, the folder hierarchy is usually used instead of extra fields. It's neither good nor bad; it's just how we do it. For some examples of this, see the folder organizing articles below.

Good archiving can assist backups by breaking the file system into parts. For example, if the folders are organized by client, you can split up the backups by client. Then, you can direct archives for old clients onto specific media, which might be kept offline or stored offsite. With very little work, you can cut down the time required to back up adequately, and that translates into a greater capacity for the entire backup system.
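The split by client can be sketched as below. This assumes one top-level folder per client and a hand-maintained set of old client names. The function plan_backups and the job names "active" and "archive" are hypothetical. The sketch only decides which folders belong to which job; the copying itself is left to whatever backup tool is in use.

```python
from pathlib import Path

def plan_backups(root, inactive_clients):
    """Sort each client folder under root into a backup job.

    Folders named in inactive_clients go to the "archive" job (written
    once to media that can be kept offline); the rest go to the regular
    "active" job.
    """
    plan = {"active": [], "archive": []}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue  # loose files at the top level are ignored here
        job = "archive" if folder.name in inactive_clients else "active"
        plan[job].append(folder)
    return plan
```

Only the "active" list needs to go through the regular backup run, which is where the time savings come from.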