Imagine it is the night of the 27th of this month. You are fast asleep when the call comes: all the servers are down and you need to restore them. You do a full backup of your servers on the first Saturday of the month and differential backups on the subsequent Saturdays. During the week, you do incremental backups on each weekday. While still groggy from sleep, you have to figure out, in addition to the full backup, which differential and incremental backups you need to restore, which media they are on and in what sequence to restore them. This is certainly not an easy thing to do when you are trying to sweep the cobwebs from your mind in the middle of the night. It is even worse if you are the stand-in for the backup administrator who is conveniently on vacation.
Take a look at this calendar and see whether you can work out the sequence.
(The answer is at the end of this article.)
Contrast this with the situation where only full backups are run every day. All you need is the last full backup.
In designing backup schemes, the complexity of a restore is often ignored. This is because you rarely do a restore. Since you have to deal with backups every day, you try to speed them up as much as possible by reducing the amount of data to be backed up. However, there is no such thing as a free lunch. What you save up front, you pay back with interest in the end. Take the scheme above: with so many media to restore, the time needed to restore the servers will definitely increase. You might not be able to meet your recovery time objective (RTO). Furthermore, it is no fun trying to do a complex restore when the whole organisation is calling you every few minutes to ask when the servers will be up.
As a side-note, BE 2012 does not allow you to mix differential and incremental backups in the same scheduled backup job. You will get this error message.
When you design your backup scheme, start with the simplest, i.e. just full backups. Full backups are the easiest to restore. You only need the last full backup. If your backup window and storage allow it, just use full backups. However, this scheme may not be possible due to the sheer volume of data to be backed up. A full backup may exceed the backup window on weekdays.
If a full backup is not feasible on weekdays, try doing a full backup at the weekend and differential backups on weekdays. A differential backup backs up all the files modified or added since the last full backup. As you move through the week, the amount of data backed up by the differential backup increases because the changes accumulate. For example, if you do full backups on Sundays and differential backups on weekdays, Wednesday's differential backup will contain all the files added or modified since the full backup on Sunday.
If your data is very volatile, your differential backup might back up as much data as the full backup. Hence, there is a limit to how many differential backups you can do before you need to do another full backup. The advantage of doing differential backups is that you only need the LAST differential backup and the full backup to do a full restore. You don't need any of the previous differential backups.
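To make the trade-off concrete, here is a minimal sketch of how differential sizes grow during the week and what a full restore needs. The function names and the figures (a 500 GB full backup, 40 GB of changes per day) are hypothetical, and the model assumes that each day changes different files, which will not always hold in practice.

```python
def differential_sizes(daily_change_gb, full_size_gb):
    """Size of each day's differential when changes accumulate since the full.

    Simplifying assumption: each day changes different files, so the sizes
    add up until they reach the size of the full backup.
    """
    sizes = []
    accumulated = 0
    for change in daily_change_gb:
        accumulated += change
        sizes.append(min(accumulated, full_size_gb))
    return sizes


def differential_restore_set(differentials_taken):
    """Backups needed for a full restore: the full plus the LAST differential."""
    if not differentials_taken:
        return ["full"]
    return ["full", differentials_taken[-1]]


# Hypothetical figures: 500 GB full on Sunday, 40 GB of changes per weekday.
print(differential_sizes([40, 40, 40, 40, 40], full_size_gb=500))
print(differential_restore_set(["Mon", "Tue", "Wed"]))
```

With these made-up numbers, Friday's differential is five times the size of Monday's, yet a restore on any day still needs only two backups.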
You may find that doing differential backups does not save a lot of time. This is especially so when you are using the archive bit method and have a lot of small files, because each file has to be examined to determine whether its archive bit is on or off. This processing can slow down your job. It is akin to transferring a mixture of blue and red balls from one container to another. Transferring all the balls (full backup) to the other container is a much faster process than picking out and transferring only the red balls (differential backup). You can improve the processing if you use the modified time method together with the USN journal. The USN journal records the changes made to the files and is easier to process. What is said in this paragraph applies equally to incremental backups, which we will cover next.
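The cost of examining every file can be illustrated with a small sketch. It is not how Backup Exec works internally: it simply walks a directory tree and compares each file's modification time with the time of the last backup, which has the same per-file cost as checking an archive bit. The function name and the demo files are made up for illustration.

```python
import os
import tempfile


def files_changed_since(root, last_backup_time):
    """Walk root and return (changed_paths, files_examined).

    A file counts as changed if its modification time is later than
    last_backup_time (seconds since the epoch).
    """
    changed = []
    examined = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            examined += 1  # every file costs a metadata lookup
            if os.stat(path).st_mtime > last_backup_time:
                changed.append(path)
    return sorted(changed), examined


# Demo with three hypothetical files, only one modified after the last backup.
with tempfile.TemporaryDirectory() as root:
    for name, mtime in [("a.txt", 1000), ("b.txt", 1000), ("c.txt", 3000)]:
        path = os.path.join(root, name)
        with open(path, "w") as handle:
            handle.write("data")
        os.utime(path, (mtime, mtime))  # set access and modification times
    changed, examined = files_changed_since(root, last_backup_time=2000)
    print(len(changed), "changed out of", examined, "examined")
```

Even though only one file qualifies for the backup, all three had to be looked at. A change journal avoids this by listing the changed files directly.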
To avoid ballooning differential backups, you can do a full backup followed by incremental backups. An incremental backup backs up the files added or modified since the last full or incremental backup. If the volume of changes to the files is constant, then the size of the incremental backup will be constant. Often, this tempts users to put off doing a full backup for as long as they can. However, the pay-back comes when you need to do a restore. You need the last full backup plus ALL the incremental backups taken since that full backup. Also, these incremental backups have to be restored in sequence, i.e. the first incremental backup first, followed by the second incremental backup and so on. If the full backup is only done once at the beginning of the month, then by the end of the month there will be around 20 incremental backups to restore along with the full backup. When you restore incremental backups, you also get back all the files that were deleted from the resource in the meantime. The longer the gap between full backups, the more of these deleted files you will see.
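The "around 20" figure is easy to check with a short sketch. The month (June 2012), the function name and the schedule are assumptions chosen only for illustration: one full backup on the first Saturday and one incremental on every weekday after it.

```python
from datetime import date, timedelta


def incremental_restore_chain(full_day, failure_day):
    """Return the ordered list of backups needed to restore after a failure.

    Assumes (for illustration only) one full backup on full_day and one
    incremental backup on every weekday after it, up to the day before
    failure_day.
    """
    chain = [("full", full_day)]
    day = full_day + timedelta(days=1)
    while day < failure_day:
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            chain.append(("incremental", day))
        day += timedelta(days=1)
    return chain


# Hypothetical month: full on Saturday 2 June 2012, failure on 30 June 2012.
chain = incremental_restore_chain(date(2012, 6, 2), date(2012, 6, 30))
print(len(chain) - 1, "incrementals to restore, oldest first")
```

The list comes back in restore order, so a missing or unreadable backup anywhere in the chain limits how far forward the restore can go.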
This chore of restoring incremental backups is mitigated if you use BE 2012 with its Simplified Restore Workflow. When you select a particular incremental backup to restore, BE 2012 gathers all the necessary incremental backups and the full backup to restore the server to that particular point in time. See the blog below.
You can also use True Image Restore, which is part of the payable ADBO option, to help in restoring incremental backups. True Image Restore will also gather and restore the full backup and incremental backups to get the resource back to its state at the chosen point in time.
However much these options help with the restore of incremental backups, the time needed to restore all the incremental backups will still be there. Hence it is imperative that you do full backups as often as is feasible to avoid having to handle many incremental backups when you need to restore.
The backup scheme outlined in the first paragraph is an attempt to reduce the number of incremental backups and avoid doing a full backup for as long as possible. As you can see, the penalty is a complex restore.
You may ask: What about deduplication? How does it figure in any backup scheme?
Actually, deduplication should not be a consideration in designing a backup scheme. You still go through the same considerations that were discussed earlier. Deduplication does not mitigate any of the drawbacks of any of the schemes. It only reduces the amount of data that is stored. For example, if you do two full backups, the second full backup will only store the changed data blocks. In a sense, this is like doing an incremental backup although it is a full backup.
Deduplication presents its own problems. The backup may take longer because of the need to deduplicate the data, and the restore will take longer because of the need to rehydrate the data.
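The idea that a second full backup only stores the changed blocks can be shown with a toy sketch. This is not Backup Exec's deduplication engine: it assumes fixed-size blocks identified by a SHA-256 hash, and the 4-byte block size and sample data are deliberately unrealistic.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small so the example is easy to follow


def store_backup(data, block_store):
    """Store data as fixed-size blocks; return how many blocks were new.

    block_store maps a block's SHA-256 digest to the block contents.
    """
    new_blocks = 0
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in block_store:
            block_store[digest] = block
            new_blocks += 1
    return new_blocks


store = {}
first_full = b"AAAABBBBCCCCDDDD"
second_full = b"AAAABBBBXXXXDDDD"  # one block changed since the first full
print(store_backup(first_full, store))   # 4 new blocks stored
print(store_backup(second_full, store))  # 1 new block stored
```

Note that the second backup still had to read and hash all four blocks. Only the storage is reduced, which is why deduplication does not change the restore considerations above.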
In designing a backup scheme, the objective should always be how easy it is to do the restore and not how easy it is to do the backups.
Answer to the restore problem.
You need to restore the full backup (circled in blue), then the last differential backup (circled in green), then the incremental backup (circled in red) taken on the 25th, followed by the incremental backup taken on the 26th.
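The reasoning behind this answer can be sketched in a few lines. The rules are the ones described in this article: start from the last full backup, add the last differential taken after it, then add every incremental taken after that, in order. The catalog below is an assumption: it uses June 2012 as a stand-in month because its first Saturday falls on the 2nd and the 25th and 26th are weekdays, and the function name is made up for illustration.

```python
from datetime import date


def restore_chain(catalog, failure_day):
    """Return the backups to restore, in order, from a list of (kind, day).

    Rules assumed from the article: start from the last full backup, add the
    last differential taken after it, then every incremental taken after
    that differential (or after the full if there is no differential).
    """
    usable = sorted((b for b in catalog if b[1] < failure_day),
                    key=lambda b: b[1])
    fulls = [b for b in usable if b[0] == "full"]
    if not fulls:
        raise ValueError("no full backup available before the failure")
    chain = [fulls[-1]]
    diffs = [b for b in usable
             if b[0] == "differential" and b[1] > chain[0][1]]
    if diffs:
        chain.append(diffs[-1])
    base_day = chain[-1][1]
    chain += [b for b in usable if b[0] == "incremental" and b[1] > base_day]
    return chain


# Hypothetical catalog: full on the first Saturday, differentials on the
# following Saturdays, incrementals on every weekday up to the 26th.
catalog = [("full", date(2012, 6, 2))]
catalog += [("differential", date(2012, 6, d)) for d in (9, 16, 23)]
catalog += [("incremental", date(2012, 6, d))
            for d in range(4, 27) if date(2012, 6, d).weekday() < 5]

chain = restore_chain(catalog, date(2012, 6, 27))
for kind, day in chain:
    print(kind, day.isoformat())
```

Out of the 21 backups in this hypothetical catalog, only four are needed, but working out which four is exactly the job you do not want to do by hand in the middle of the night.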