The Technical Services team for Backup and Recovery has produced a number of documents we call "Blueprints".
These Blueprints are designed to show backup and recovery challenges around specific technologies or functions and how NetBackup solves these challenges.
Each Blueprint consists of:
A Media Server Deduplication Pool (MSDP) provides deduplication at the media server level. You set up disk using Advanced Client and configure it as a Media Server Deduplication Pool, which then deduplicates the data as it streams through the media server. This greatly reduces the amount of disk needed for backups when a customer is not interested in buying an appliance to perform deduplication. NetBackup Appliances also use MSDP to provide deduplication, but MSDP is available standalone with NetBackup Advanced Client. An Appliance offers many further benefits for deduplication, but those are not covered here. The changes made to MSDP are also included in the Appliance 2.6.1 build. The business value to a customer is reduced disk usage together with very easy setup and management of a deduplication pool.
MSDP uses a deduplication hash algorithm to determine the “new” data that needs to be backed up as it comes in from a client. Because deduplication happens at the media server, the data must traverse the LAN; however, this takes the burden of hashing off the client, which may not be sized for this CPU-intensive operation. The hashing can also be shared between media servers to reduce the load on any single media server.
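To make the idea of hash-based deduplication concrete, here is a minimal sketch in Python. It is not how MSDP is implemented: the fixed 4-byte segment size, the use of SHA-256, and the names `DedupStore`, `backup`, and `restore` are all assumptions chosen for illustration. It only shows the general principle that a segment whose fingerprint has already been seen does not need to be stored again.

```python
import hashlib


def split_segments(data: bytes, size: int = 4):
    """Yield fixed-size segments. The tiny size is an assumption for the demo."""
    for i in range(0, len(data), size):
        yield data[i:i + size]


class DedupStore:
    """Hypothetical in-memory pool: fingerprint -> segment bytes."""

    def __init__(self):
        self.segments = {}

    def backup(self, data: bytes, size: int = 4):
        """Store only unseen segments; return (recipe, count of new segments)."""
        recipe = []
        new_segments = 0
        for seg in split_segments(data, size):
            # SHA-256 is an assumption; the source does not name the algorithm.
            fp = hashlib.sha256(seg).hexdigest()
            if fp not in self.segments:
                self.segments[fp] = seg
                new_segments += 1
            recipe.append(fp)
        return recipe, new_segments

    def restore(self, recipe):
        """Rebuild the original data from its list of fingerprints."""
        return b"".join(self.segments[fp] for fp in recipe)


store = DedupStore()
recipe1, new1 = store.backup(b"AAAABBBBAAAA")  # two unique segments
recipe2, new2 = store.backup(b"AAAACCCC")      # only "CCCC" is new
```

In this toy run the second backup adds a single segment to the pool, because "AAAA" was already stored by the first one.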
In 7.6.1 the catalog was restructured to use a SQLite database instead of a flat file at the PO level. This allows the catalog to scale larger and provides increased performance. Each client/policy combination now creates a separate database file, rather than each deduplicated object creating its own file. This also means that corruption affects only a single client/policy rather than the entire database. This is demonstrated in the Test Drive section below.
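The one-database-per-client/policy layout can be pictured with a short sketch using Python's standard `sqlite3` module. The file naming scheme, the `objects` table, and the function names are hypothetical and do not reflect the real MSDP catalog schema; the point is only that each client/policy pair gets its own file, so damage to one file leaves the others readable.

```python
import os
import sqlite3
import tempfile


def catalog_path(root, client, policy):
    """Hypothetical naming scheme: one SQLite file per client/policy pair."""
    return os.path.join(root, f"{client}__{policy}.db")


def record_object(root, client, policy, object_id, size):
    """Insert one object record into that pair's own database file."""
    conn = sqlite3.connect(catalog_path(root, client, policy))
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS objects "
                "(object_id TEXT PRIMARY KEY, size INTEGER)"
            )
            conn.execute(
                "INSERT OR REPLACE INTO objects VALUES (?, ?)",
                (object_id, size),
            )
    finally:
        conn.close()


def list_objects(root, client, policy):
    """Read back the records for a single client/policy pair."""
    conn = sqlite3.connect(catalog_path(root, client, policy))
    try:
        return conn.execute(
            "SELECT object_id, size FROM objects ORDER BY object_id"
        ).fetchall()
    finally:
        conn.close()


def demo():
    root = tempfile.mkdtemp()
    record_object(root, "clientA", "daily", "obj1", 100)
    record_object(root, "clientA", "daily", "obj2", 250)
    record_object(root, "clientB", "weekly", "obj1", 75)
    return sorted(os.listdir(root)), list_objects(root, "clientA", "daily")
```

Calling `demo()` leaves two database files in the temporary directory, one per client/policy combination, regardless of how many objects each one holds.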
From a protection perspective, the entire catalog (each of the databases) is backed up as a “shadow” copy daily, and by default 5 copies of the catalog are now preserved for faster recovery. In addition, writing a change to the primary catalog also writes it to a journal file at the same time for redundancy. A corrupt MSDP catalog is recovered automatically if auto recovery is enabled (which it is by default). Alternatively, the MSDP catalog can be recovered from the command line, either in its entirety or for a single client/policy file as described earlier.
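The protection scheme described above (a journal written alongside the primary, plus a rotating set of shadow copies) can be sketched roughly as follows. This is an illustration under stated assumptions, not the MSDP mechanism: the file names `primary.log` and `journal.log`, the directory layout, and the functions `write_change`, `take_shadow_copy`, and `recover_latest` are all invented for the example. Only the retention count of 5 comes from the text.

```python
import os
import shutil
import tempfile


def write_change(catalog_dir, entry):
    """Append the same change to the primary and to a journal (hypothetical files)."""
    for name in ("primary.log", "journal.log"):
        with open(os.path.join(catalog_dir, name), "a", encoding="utf-8") as f:
            f.write(entry + "\n")


def take_shadow_copy(catalog_dir, shadow_dir, label, keep=5):
    """Copy the whole catalog, then prune so only the newest `keep` copies remain."""
    os.makedirs(shadow_dir, exist_ok=True)
    shutil.copytree(catalog_dir, os.path.join(shadow_dir, label))
    copies = sorted(os.listdir(shadow_dir))  # labels must sort chronologically
    for old in copies[:-keep]:
        shutil.rmtree(os.path.join(shadow_dir, old))
    return sorted(os.listdir(shadow_dir))


def recover_latest(catalog_dir, shadow_dir):
    """Replace the catalog with the most recent shadow copy; return its label."""
    latest = sorted(os.listdir(shadow_dir))[-1]
    shutil.rmtree(catalog_dir)
    shutil.copytree(os.path.join(shadow_dir, latest), catalog_dir)
    return latest


def demo():
    root = tempfile.mkdtemp()
    catalog = os.path.join(root, "catalog")
    shadow = os.path.join(root, "shadow")
    os.makedirs(catalog)
    kept = []
    for day in range(1, 8):  # seven simulated daily cycles
        write_change(catalog, f"change {day}")
        kept = take_shadow_copy(catalog, shadow, f"day-{day:02d}")
    # Simulate corruption of the primary, then recover from the newest copy.
    with open(os.path.join(catalog, "primary.log"), "w", encoding="utf-8") as f:
        f.write("garbage")
    latest = recover_latest(catalog, shadow)
    with open(os.path.join(catalog, "primary.log"), encoding="utf-8") as f:
        lines = f.read().splitlines()
    return kept, latest, lines
```

After seven simulated days only the five most recent copies remain, and recovery restores the primary as it stood at the last shadow copy.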
You can download the full Blueprint from the link below.