TechTip: There’s More Than One Deduplication Approach with NetBackup

Turls · ‎03-17-2009

Data growth rates of 30% to 50% per year are straining the network and storage capacities of organizations, a problem compounded by the large amount of duplicate backup data they routinely store. The added volume increases the time and effort required to recover data from tape, driving many customers to evaluate how they can use disk to improve recovery and reliability.

Most organizations are accustomed to using disk as a staging target to temporarily hold data before moving it to tape. Although disk-based systems are generally faster and more reliable than tape, storing all your backup data on disk for the full retention period can be cost-prohibitive, especially if you plan to keep the data on disk for disaster recovery as well. Data deduplication is one way to address these cost concerns and improve disk-based backup.

Data deduplication defined

In general, data deduplication involves looking for redundant instances of backup data at a sub-file or block level across all backup data sets and locations. Each unique piece of data is stored only once, which allows companies to reduce the amount of storage needed for backups.
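To make the block-level idea concrete, here is a minimal sketch in Python. It is an illustration only, not how PureDisk or any other product is implemented: the fixed 4096-byte block size, the use of SHA-256 as the fingerprint, and the names `deduplicate` and `restore` are all assumptions chosen for the example. Real products often use more elaborate chunking.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, in bytes


def deduplicate(backups, block_size=BLOCK_SIZE):
    """Store each unique block once and keep a per-backup list of fingerprints."""
    store = {}    # fingerprint -> block bytes (each unique block stored once)
    recipes = {}  # backup name -> ordered fingerprints needed to rebuild it
    for name, data in backups.items():
        recipe = []
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            fingerprint = hashlib.sha256(block).hexdigest()
            store.setdefault(fingerprint, block)  # keep only the first copy
            recipe.append(fingerprint)
        recipes[name] = recipe
    return store, recipes


def restore(name, store, recipes):
    """Rebuild a backup by concatenating its blocks in recipe order."""
    return b"".join(store[fp] for fp in recipes[name])


# Two hypothetical nightly backups that differ only in their last block.
backups = {
    "monday": b"A" * 8192 + b"B" * 4096,
    "tuesday": b"A" * 8192 + b"C" * 4096,
}
store, recipes = deduplicate(backups)
logical = sum(len(data) for data in backups.values())
physical = sum(len(block) for block in store.values())
print(f"logical bytes: {logical}, stored bytes: {physical}")
```

In this toy data set the two backups total 24576 bytes, but only three unique blocks (12288 bytes) are kept, and either backup can still be restored in full from its list of fingerprints.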

Based on a company’s backup needs, data deduplication can be deployed at different points in the backup process:

Client-side deduplication takes place at the start of the backup process on the server to be protected.
Target-side deduplication is performed before the backup data is written to disk.

In environments where bandwidth is limited, client-side deduplication can be used to reduce bandwidth consumption and optimize storage. The result is significantly faster backups because up to 90% less data is moved. If LAN or WAN bandwidth is not an issue during backups, organizations may find it easier to deploy target-side deduplication because it requires less change to an existing backup architecture.
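A rough calculation shows why moving less data matters on a constrained link. The sketch below is a back-of-the-envelope estimate only: the function name `transfer_hours`, the 500 GB backup size, and the 100 Mbit/s link are hypothetical, and it ignores protocol overhead and the time spent fingerprinting data on the client. The 90% figure is the upper bound quoted above, not a guaranteed result.

```python
def transfer_hours(data_gb, link_mbps, reduction=0.0):
    """Estimate hours needed to send a backup over a network link.

    data_gb   -- backup size in gigabytes (decimal, 1 GB = 8000 megabits)
    link_mbps -- usable link speed in megabits per second
    reduction -- fraction of data removed before sending (0.9 means 90% less)
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    remaining_gb = data_gb * (1.0 - reduction)
    megabits = remaining_gb * 8 * 1000
    seconds = megabits / link_mbps
    return seconds / 3600


# Hypothetical example: 500 GB nightly backup over a 100 Mbit/s link.
full = transfer_hours(500, 100)
deduped = transfer_hours(500, 100, reduction=0.9)
print(f"without deduplication: {full:.1f} h, with 90% less data: {deduped:.1f} h")
```

Under these assumptions the transfer drops from roughly 11.1 hours to roughly 1.1 hours, which is the difference between missing and meeting a typical overnight backup window.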

Where to start

How do you know where to start when considering deduplication? In general, if network-based backups of physical or virtual servers are causing you to miss backup windows, then you should consider client-side deduplication with NetBackup PureDisk. If you have remote offices with both backup applications and tape or disk infrastructure, then again, you should consider client-side deduplication as a means to centralize and consolidate this data back to your data center before committing it to tape or other backup devices. Finally, if you’re focused on faster recovery, reducing tape utilization in the data center, or electronic vaulting of backup data, then you should consider target-side deduplication, as offered by the NetBackup PureDisk Deduplication option, to reduce the amount of data you move to and from backup devices.
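The guidance above can be summarized as a small set of rules. The sketch below simply restates those rules in Python for readers who prefer a checklist; the `Environment` fields and the `suggest_approaches` function are hypothetical names invented for this example and are not part of NetBackup.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Hypothetical description of a backup environment."""
    missing_backup_windows: bool = False       # network-based backups overrun
    remote_offices_with_backup: bool = False   # remote sites run their own backups
    wants_faster_recovery: bool = False
    wants_less_tape_in_data_center: bool = False
    wants_electronic_vaulting: bool = False


def suggest_approaches(env):
    """Return the deduplication approaches the article's rules point to."""
    suggestions = []
    if env.missing_backup_windows or env.remote_offices_with_backup:
        suggestions.append("client-side")
    if (env.wants_faster_recovery
            or env.wants_less_tape_in_data_center
            or env.wants_electronic_vaulting):
        suggestions.append("target-side")
    return suggestions


# Example: a site that misses backup windows and also wants faster recovery.
site = Environment(missing_backup_windows=True, wants_faster_recovery=True)
print(suggest_approaches(site))
```

The two approaches are not mutually exclusive, so the function returns a list: an environment that matches rules on both sides is a candidate for both.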

Existing NetBackup 6.5 users can access target-side deduplication from PureDisk because the engine is integrated into NetBackup. All they need to do is allocate a single physical or virtual server and storage to begin using deduplication (an additional disk license is required beyond the trial period).

For customers who prefer to tackle these challenges with a storage appliance, NetBackup provides a unique feature through its OpenStorage initiative. When using devices from partners such as Data Domain, Quantum, FalconStor, and others, customers can centrally manage backup data at both central and disaster recovery sites and control advanced backup services like replication. With more features planned for the future, OpenStorage simplifies management of data and helps customers optimize their data center assets.

The ability to efficiently replicate backup data to offsite locations is already changing how customers execute both backup and disaster recovery plans. In the future, the choice will not be whether you use deduplication, but how you use it with your backup application.

