De-dup champion

Without warning, a need for change

A medium-sized group of servers, co-located in a big data centre.
Backups have been running fine locally, using IBM TSM infrastructure.
All of a sudden, some of the servers need to be relocated to another data centre in another location.
The good, well-established backup infrastructure was gone, and a new backup environment had to be built from scratch for the servers in both the local and remote data centres.

Plan it well, compare well

That’s what we had to face late last year.
With 25 to 30 Windows servers, a few of them also hosting SQL databases, we had to find another backup solution.

First of all, the new backup solution had to perform well over the WAN link.
Only a few of the servers were going to be relocated to the new data centre, while the majority of them would stay in the current location.
That meant we had to either distribute the backup service across two separate locations (still under one backup master’s umbrella) or run the backups over the WAN link with only one location hosting the backup service.
Instead of worrying about distributed backup servers and arranging tape hardware, we decided to use source-based de-duplication and back up the de-duplicated data over the WAN link, with a disk-based backup solution to reduce the cost of hardware purchases.
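
To illustrate why source-based de-duplication helps over a WAN link, here is a minimal sketch in Python. It is not how PureDisk is implemented. The fixed segment size, the SHA-256 fingerprint and the names `split_segments` and `backup` are all assumptions made for illustration. The idea is only that the client fingerprints its data locally and sends just the segments the storage pool does not already hold.

```python
import hashlib

# Hypothetical fixed segment size; a real product chooses its own segmenting.
SEGMENT_SIZE = 128 * 1024


def split_segments(data: bytes, size: int = SEGMENT_SIZE):
    """Yield consecutive fixed-size segments of the input."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


def backup(data: bytes, known_fingerprints: set, size: int = SEGMENT_SIZE):
    """Fingerprint data on the client and decide what must cross the WAN.

    Returns (recipe, to_send):
      recipe  - ordered list of fingerprints needed to rebuild the data
      to_send - dict of fingerprint -> segment bytes the pool does not have yet
    """
    recipe = []
    to_send = {}
    for segment in split_segments(data, size):
        fingerprint = hashlib.sha256(segment).hexdigest()
        recipe.append(fingerprint)
        # Skip segments the pool already stores or that are already queued.
        if fingerprint not in known_fingerprints and fingerprint not in to_send:
            to_send[fingerprint] = segment
    return recipe, to_send
```

On the first backup every distinct segment is sent once. On later backups of mostly unchanged data, `to_send` holds only the new segments, so only those and the small list of fingerprints have to travel over the WAN link.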

Secondly, the backup solution had to provide security features, such as encrypting the backup data before it is transmitted.
Because the backup data would be travelling over the WAN link, the new backup solution needed an encryption feature.

And lastly but most importantly, it had to be scalable and flexible enough to work with the existing backup infrastructure in the new data centre.
The new data centre, where the more important and mission-critical servers would be relocated, already had a VERITAS NetBackup Enterprise Server environment running, with well-established tape library hardware.
This could not be wasted: if the new solution was going to be disk-based with source-based de-duplication, it also needed to export to tape for second-level protection and offsite storage.

VERITAS PureDisk fitted perfectly into this scenario: it can export backup data to VERITAS NetBackup, or do a disaster recovery backup to the same, so we did not have to worry about (and possibly spend tens of thousands of dollars on) new tape-based hardware to protect the new backup solution’s data storage space.

Later on we also found that it is easy to manage at the department or location level.
For restores, it has an option to download the data locally or restore it back to the client.
Combined with the Web-based administrator interface, that means we can restore to virtually any system we want, with no need to have an agent installed to do the restore.
That’s flexibility, yes, and it is a scalable solution too, because the storage pool is broken into a few components that can be added later.
More content routers, which hold the backup data, can be added to increase the capacity of the storage pool and at the same time contribute to load balancing.
A dedicated metabase engine also means better performance for de-duplication processing and catalog queries.
And the fun part of running it is that the whole package sits on a PostgreSQL database, which means industry-standard SQL syntax can be used to manage the backup configuration, operation and catalog.
UNIX cron is used for job scheduling, which is very simple and virtually error-proof in initiating backup and maintenance tasks on time.

A software appliance is here

It finally arrived, and we were ready to set it up.
Installing PDOS (PureDisk Operating System) was not very difficult; only a few things had to be arranged at the hardware level before we went ahead with the installation.
The PDOS installation offered several different choices too; we could just make a choice and click next, click next.
One mistake we made, though, was to allocate a small amount of disk space first, planning to allocate more as the amount of backup data increased.
Because PDOS is packaged together with VERITAS Volume Manager and VERITAS File System, we had the flexibility of dynamically adjusting the volume size with newly allocated LUNs, but it was still a bit of work, and we ended up logging a call with Symantec Tech Support, who helped us quickly and efficiently.
Although we tried our best at the design stage to estimate the disk space demands, the data growth rate on the client side was simply exponential, and we found the disk space requirement grew quickly even with de-duplication fully working.
Also, it was not easy to make all the parties understand that the backup would not be lightning fast, because it happens over the WAN link.
Looking at it from a different angle, we are actually very lucky to have backups complete within the backup window over this slow WAN link, all thanks to good de-duplication at the data source before the backup data is sent across to the storage pool.
Come to think of it, it’s actually very well priced for a solution that has an OS, volume manager, file system and backup application all in one package.
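
The sizing problem described above can be roughed out with a small projection. This is only a sketch under stated assumptions, not a PureDisk sizing tool: it assumes client data compounds at a constant monthly rate and that the pool stores the source data divided by a constant de-duplication ratio. The function names and all figures are hypothetical.

```python
def project_pool_usage(source_gb, monthly_growth, dedup_ratio, months):
    """Return projected storage pool usage in GB for each month.

    source_gb      - protected client data at month 0 (assumed figure)
    monthly_growth - growth per month as a fraction, e.g. 0.10 for 10%
    dedup_ratio    - assumed constant ratio of source data to stored data
    """
    usage = []
    current = source_gb
    for _ in range(months):
        usage.append(current / dedup_ratio)
        current *= 1 + monthly_growth  # compounding client-side growth
    return usage


def first_month_over(usage, capacity_gb):
    """Return the first month index whose usage exceeds capacity, or None."""
    for month, used in enumerate(usage):
        if used > capacity_gb:
            return month
    return None
```

Running `first_month_over(project_pool_usage(...), capacity_gb)` with a few pessimistic growth rates gives an idea of how soon an initial allocation would run out, which is the estimate we got wrong at the design stage.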

Yes, it could’ve been even better

To avoid unnecessary headaches and workload, allocating the full capacity from the very start is a good idea in real-life situations; it keeps both the backup servers and the administrators happy and gives good performance in general.
One interesting feature of PureDisk is that it keeps the last copy of data after the retention period has passed.
A simple analogy to a tape-based backup solution goes like this:
A backup is written to a tape with a 30-day retention period, be it a full backup or an incremental/differential one.
After 30 days the tape expires and is treated as an empty tape/resource that can be recycled for new backup data.
By design, PureDisk keeps the last copy of the data segment even after the 30-day retention period, until the data selection itself is removed from the configuration.
That’s another reason to allocate enough space on your storage pool in the first place: disk space will not be freed up as soon as you expect.
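
The difference between the two retention behaviours can be shown with a simplified model. This sketch is an assumption-laden illustration, not PureDisk's actual data removal logic: it treats each backup as a dated version of one file, and the function names are hypothetical.

```python
from datetime import date, timedelta


def expired_tape_style(backup_dates, today, retention_days):
    """Tape analogy: every backup older than the retention period expires."""
    cutoff = today - timedelta(days=retention_days)
    return [d for d in backup_dates if d < cutoff]


def expired_keep_last(backup_dates, today, retention_days):
    """Keep-last behaviour: old backups expire, except the most recent copy."""
    if not backup_dates:
        return []
    newest = max(backup_dates)
    cutoff = today - timedelta(days=retention_days)
    return [d for d in backup_dates if d < cutoff and d != newest]
```

For a file last backed up well over 30 days ago, the tape-style function reports every copy as reclaimable, while the keep-last function always holds on to the newest one. That retained copy is the space that does not come back until the data selection is removed.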

After running it for 6 months, it has proven to be a very good backup solution that de-duplicates at a good ratio and completes backups securely within the desired backup window over the WAN link.

We do have all we want – this is VERITAS NetBackup PureDisk.

I found the following bits interesting.

  • Producing reports for daily backups without VBR
  • Producing and testing a DR solution for PD
  • Integrating into MS Active Directory & LDAP: you don't have to do any recoveries; let the admins/users do their own.

Hello

==========================================================
Backups have been running fine locally, using IBM TSM infrastructure. All of a sudden, some of the servers need to be relocated to another data centre in another location
==========================================================

During this process, did you folks have to convert any existing TSM archive tapes to NetBackup (I mean, transfer data from a TSM tape to a NetBackup tape)? Any advice is greatly appreciated.

Thank you