What makes a good backup solution? (2 of 4)

Backup Types

If you have not read part one you may want to start there.

In this section we will look at the types of backups that are available in a modern environment. First we will start with a review of the 4 types of backups that most traditional backup systems can do. Then we will move on to newer techniques and technologies.

The 4 classic backup types:
Full: A full backup is the simplest backup and it’s what it sounds like. Anything in the backup set (drive, folder, system, etc.) is backed up. The backup resets (clears) the archive bit on each file as it goes, making it possible for a future backup to tell that these files have already been backed up.
Differential: A differential backup backs up all files that have the archive bit set (that is, files created or changed since the bit was last cleared) but it leaves the archive bit alone. This means that it will back up any files that have changed since the last backup that reset the archive bit.
Incremental: An incremental backup backs up all files that have the archive bit set and then it resets the archive bit. Each incremental therefore contains only the changes made since the previous full or incremental backup.
Copy: A copy backup typically ignores the archive bit entirely and just makes a copy of all the data in the backup set. Copies are not usually used in normal backup operations but they are useful for creating a one-off backup set for archiving or system migrations without disturbing the archive bit used by other backups.
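
To make the archive bit logic concrete, here is a minimal Python sketch. It is a toy model, not how any particular backup product works: the File class and the run_backup and modify helpers are names I made up for illustration, and the only thing it models is that the operating system sets the archive bit when a file is created or changed.

```python
from dataclasses import dataclass


@dataclass
class File:
    name: str
    # The OS sets the archive bit whenever a file is created or modified.
    archive_bit: bool = True


def run_backup(files, kind):
    """Return the names a backup of this kind would copy, updating archive bits."""
    if kind not in ("full", "differential", "incremental", "copy"):
        raise ValueError(f"unknown backup type: {kind}")
    # Full and copy take everything; differential and incremental
    # take only files whose archive bit is set.
    if kind in ("full", "copy"):
        selected = list(files)
    else:
        selected = [f for f in files if f.archive_bit]
    # Only full and incremental clear the bit on what they backed up.
    if kind in ("full", "incremental"):
        for f in selected:
            f.archive_bit = False
    return [f.name for f in selected]


def modify(files, name):
    """Simulate a user changing a file: the archive bit gets set again."""
    for f in files:
        if f.name == name:
            f.archive_bit = True


files = [File("a.doc"), File("b.xls"), File("c.txt")]
print(run_backup(files, "full"))          # ['a.doc', 'b.xls', 'c.txt']
modify(files, "a.doc")
print(run_backup(files, "differential"))  # ['a.doc']
modify(files, "b.xls")
print(run_backup(files, "differential"))  # ['a.doc', 'b.xls']
print(run_backup(files, "incremental"))   # ['a.doc', 'b.xls']
print(run_backup(files, "incremental"))   # []
```

Notice that the second differential picks up a.doc again because nothing cleared its bit, while the second incremental has nothing left to do.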

Note: At least Backup Exec is capable of using timestamps instead of the archive bit for differential and incremental backups. I can’t see a reason that this would be advantageous and I can see a number of reasons it might cause problems, so I will ignore this option.

Examples:
Full plus differential: In this scenario a full backup is done at some interval, usually weekly or occasionally monthly, and all backups after that are differential. The advantage of this backup scenario is that it is much quicker than doing a full backup each time but you still only need two backup sets to do a full recovery (the full backup plus the last differential). The disadvantage is that over time the differentials get larger, since a file changed after the full backup will continue to be backed up over and over until a new full backup is run.
Full plus incremental: In this scenario a full backup is done at some interval, usually weekly, and all backups after that are incremental. The advantage of this backup scenario is that it is much quicker than doing a full backup each time or doing differentials after the first backup. The subsequent backup sets are also much smaller than differential backup sets. The disadvantage is that you need every backup set since the full backup was done to recover a system. If any backup set is missing or corrupt you will lose the changes that were made that day. Recovering a system using incremental backups can be very time consuming because of the large number of sequential recovery operations that must be completed.
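
The difference in recovery effort between the two scenarios can be sketched in a few lines of Python. This is only an illustration under simple assumptions: restore_chain is a hypothetical helper, the history is a list of (label, type) pairs in the order the backups ran, and copy backups are left out because they do not touch the archive bit.

```python
def restore_chain(history):
    """Return the labels of the backup sets needed to restore to the latest run.

    history is a list of (label, kind) tuples, oldest first, where kind is
    "full", "differential" or "incremental".
    """
    kinds = [kind for _, kind in history]
    if "full" not in kinds:
        raise ValueError("no full backup in history")
    # Start from the most recent full backup.
    start = len(kinds) - 1 - kinds[::-1].index("full")
    chain = [history[start][0]]
    tail = history[start + 1:]
    for i, (label, kind) in enumerate(tail):
        is_last = i == len(tail) - 1
        # Every incremental is needed; a differential is only needed
        # if it is the latest run, since a later backup supersedes it.
        if kind == "incremental" or (kind == "differential" and is_last):
            chain.append(label)
    return chain


days = ["Mon", "Tue", "Wed", "Thu"]
with_diffs = [("Sun", "full")] + [(d, "differential") for d in days]
with_incs = [("Sun", "full")] + [(d, "incremental") for d in days]

print(restore_chain(with_diffs))  # ['Sun', 'Thu']
print(restore_chain(with_incs))   # ['Sun', 'Mon', 'Tue', 'Wed', 'Thu']
```

With differentials a Thursday recovery needs two sets no matter how long the week has been. With incrementals it needs all five, and losing any one of them leaves a hole.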

Note: For the most part, if you have the backup window and enough backup media, doing a full backup each time is easiest. If you run into backup window issues or start using too much backup media then weekly full backups (usually on the weekend) with differential or incremental backups for the rest of the week are an option. I personally really dislike incremental backups because they are such a pain to use for recovery but sometimes that’s the only option.

Newer technologies for backups:
The backup technology described above has been around for years. It’s tried and true and very likely will be some part of your backup solution. Looking forward, I want to look at a few technologies that can be layered into a complete backup solution.

Shadow Copy Snapshots (Windows 2003 and newer): Starting with Windows 2003 Microsoft added a technology called Volume Shadow Copy Service (VSS) to Windows. One of the technologies made possible by VSS is Shadow Copy. Shadow Copy allows you to take snapshots of file shares without the need for any special hardware. As I mentioned before, most recoveries are recoveries of accidentally deleted or corrupted files. Using Shadow Copy you can allocate a portion of your online disk space and, on a regular schedule, take an image of the files that have changed. Since this is live it’s a simple point and click process to recover files. If you want you can even allow end users to recover their own files. Like all backups, the trick of properly using Shadow Copy is to allocate enough space for backups without wasting space, as well as making sure that you take backups frequently enough to be useful without causing undue stress on the file server in question.

Snapshots (SAN based): If you use a SAN for your storage, in all likelihood you have the ability to take snapshots or can add snapshots to your SAN as a software upgrade. SAN snapshots essentially take a point in time image of your file system, allowing for recovery of files or entire disk systems.
Snapshots can be very useful as part of a backup solution, especially if you are running into backup time window issues. You can take a snapshot of the data, which is almost instantaneous, and then back the snapshot up rather than locking your live data files during the backup.
There are a few things to keep in mind when using snapshots.
By themselves snapshots do not protect you from hardware failure.
Most SANs are not application aware so backups will be “crash consistent” which is a nice way of saying that the files will be bit for bit as they were at the moment the snapshot was taken much like if the server had had a power failure at that moment. When recovering you may have to do some cleanup to get applications (SQL, e-mail etc) back to a consistent state.

Copying files between systems: As a rudimentary system of backup many people copy files between systems. This can be effective as part of a backup plan but is sorely lacking as a primary backup solution. The problems include usually only having one point in time. Intentionally or accidentally deleted or corrupted files will be deleted or corrupted in the backup as soon as the next copy happens. If the systems are not geographically dispersed you don’t have any protection from disasters affecting the entire data center. The advantage of this type of backup is that it is usually quick to recover files, as long as the problem is noticed before the next copy overwrites the good version.
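
The single point in time problem is easy to demonstrate. The Python sketch below uses a deliberately crude mirror function (a made-up helper that wipes the target and copies the source again) and throwaway temporary folders. Real copy tools are smarter about what they transfer, but any tool that keeps the target identical to the source behaves the same way with deletions.

```python
import shutil
import tempfile
from pathlib import Path


def mirror(source, target):
    """Make target an exact copy of source, dropping anything no longer in source."""
    source, target = Path(source), Path(target)
    if target.exists():
        shutil.rmtree(target)
    shutil.copytree(source, target)


with tempfile.TemporaryDirectory() as work:
    live = Path(work) / "live"      # stands in for the production system
    copy = Path(work) / "copy"      # stands in for the second system
    live.mkdir()
    (live / "report.doc").write_text("important")

    mirror(live, copy)
    print((copy / "report.doc").exists())  # True

    (live / "report.doc").unlink()         # someone deletes the file by accident
    mirror(live, copy)                     # the next scheduled copy runs
    print((copy / "report.doc").exists())  # False
```

Once the second copy has run, the only good version of the file is gone from both systems.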

System Image backups: For recovery of complete systems, having a system image is the ideal solution. This can be done in a number of different ways depending on what you are backing up.
Starting with Windows 2008 and Vista, Microsoft’s built-in backup program no longer backs up selected files. Backups are now done as disk images. The disk images are in the same VHD format used by Hyper-V, Virtual Server, and Virtual PC, so testing the backups is very easy. The nice thing about this is that it makes system recovery very easy. This is even better if your disaster recovery solution relies on a virtualized warm or hot location. The bad thing is that it’s not ideal for recovery of single files. Under Windows 7 and Windows Server 2008 R2 you will be able to natively mount the disks for recovery. There is also a piece of third party software called WinImage that can open and edit VHD files for pulling out single files.
Norton Ghost can be used for making point in time backups but I’m not sure that I would want to use it as an ongoing solution. I have in the past used Ghost before and after major application upgrades on servers to give me a fallback. Once I was sure the application was working as expected I would archive the backup to give me somewhere to start my recovery from without having to manually install and configure applications.
Some commercial backup solutions have a bare metal recovery option. I have never used one of these solutions but it seems like a good idea.
Under Linux, *BSD and I assume other UNIX-like operating systems, dd can be used to take a disk image for backup.
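
For the curious, what dd does when it images a disk is conceptually simple: it reads the source in fixed-size blocks and writes each block to the destination. Here is a rough Python sketch of that idea. The image_device function, the 4 MB block size and the device and file paths in the comment are my own placeholders, and the SHA-256 checksum is an extra I added so the image can be verified later; dd itself does not produce one.

```python
import hashlib


def image_device(source, dest, block_size=4 * 1024 * 1024):
    """Copy source to dest block by block; return (bytes_copied, sha256_hex)."""
    digest = hashlib.sha256()
    copied = 0
    with open(source, "rb") as src, open(dest, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # an empty read means end of device or file
                break
            dst.write(block)
            digest.update(block)
            copied += len(block)
    return copied, digest.hexdigest()


# Hypothetical usage, roughly what "dd if=/dev/sda of=/mnt/backup/sda.img bs=4M" does.
# Reading a raw device normally requires root, and the disk should be idle or
# unmounted so the image is consistent:
#
#   size, checksum = image_device("/dev/sda", "/mnt/backup/sda.img")
```

The same function works on ordinary files, which makes it easy to try out safely before pointing it at a real disk.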


Finally let’s look at what gets backed up:
Everything: This is the simplest and safest solution. If it’s on the disk it gets backed up. The real downside to this is that you will end up backing up a lot of system files that really are not needed.
Select: Only the files that are explicitly chosen are backed up. This is more economical of storage media but you do run the risk of forgetting to add new paths that need to be backed up.
As I mentioned earlier if you use the built in backup for Windows 2008 and Vista a complete drive level backup is the only option now. 

All four parts of this article: one, two, three and four

Minor edit 5/20/2009