Backup & Recovery

For many years, tape was the only practical way to do backups: fast, fairly reliable, and allowing for offsite storage of data to cater for DR-type situations. Even with the release and improvement of USB drives, tape still had a speed advantage over the first generation of USB 1.1 drives (ever tried backing up to one of those...?).

At that stage, neither software nor hardware had matured enough in terms of low-, mid-, and high-level storage. Disk was still very expensive, and not many vendors had taken the road of D2D backups for their environments. Over the years this changed...the end result is that many now advocate backing up to disk first before streaming off to tape, or leaving tape out of the equation completely.

So why would disk replace tape in your environment? It's become a lot cheaper, with more advanced features like RAID offering better protection and performance, and software (as well as hardware, in some cases) features like deduplication allowing for faster and more efficient backups. Disk can now take the form of a USB drive, a NAS, a DAS, or a SAN, offering connectivity from iSCSI and CIFS to SCSI, SAS, SATA and FC, far more connectivity than tape can offer. Disk-based backups are now generally regarded as faster than tape, depending on your RAID level and how your disk has been tuned for performance.
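
To illustrate the deduplication idea mentioned above, here is a minimal Python sketch (a toy illustration of fixed-size chunking, not how any particular product implements it): data is split into chunks, each chunk is hashed, and only unique chunks are stored, so two backups that share data only consume the shared space once.

```python
import hashlib

CHUNK_SIZE = 4096  # fixed-size chunking; real products often use variable-size chunks

def dedup_store(data: bytes, store: dict) -> list:
    """Split data into chunks, store one copy of each unique chunk,
    and return the list of chunk hashes needed to rebuild the data."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # stored only if not seen before
        recipe.append(digest)
    return recipe

def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its chunk-hash recipe."""
    return b"".join(store[d] for d in recipe)

store = {}
backup1 = dedup_store(b"A" * 8192 + b"B" * 4096, store)
backup2 = dedup_store(b"A" * 8192 + b"C" * 4096, store)  # shares the "A" chunks
print(len(store))  # 3 unique chunks stored instead of 6
```

The second backup only adds one new chunk to the store, which is why deduplicated backups of mostly unchanged data run faster and use far less space.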

Some of the disk types are discussed below; they are mainly ones I have either tried, or am currently using in our environment:

1. DAS Storage

DAS devices can take the form of either USB drives or arrays connected via SCSI/SAS connections. We've used USB 2.0 drives in absolute emergencies where backups to tape have been impossible due to hardware failure. A severe limitation I have found is speed: in most cases it has been a USB drive attached to a USB 1.1 port, and backups have absolutely crawled along. Still, a backup is better than a failure, but it does come at a cost; backups on subsequent days have been missed as a result of backups running overtime. However, when connecting a USB 2.0 drive to a USB 2.0 port, speeds have rivalled LTO-2 speeds, which are far more adequate.

If you're looking for a cheap solution, USB drives are good enough (configured as Removable B2D in BE 2010 R3), but they could limit you in terms of speed and failure rate...you'll only ever be able to back up to one drive, with no protection.

DAS could also take the form of a RAID array connected via SAS/SCSI, or even FC if you have the money for that. The benefits here are faster drive speeds (up to 6Gb/s 15K SAS drives) and more protection, since configuring RAID increases redundancy. The downside is more cost, depending on factors like the number and type of drives required and your vendor. For examples of these, you'd be looking at EMC's VNX 5100 (for FC block) or HP's D2000-series disk arrays (for SAS connectivity).

With DAS, however, there is no way to share your storage with other servers. If you had a large environment with SQL/Exchange servers that needed fast backups, you'd either have to back up to a central server (much like SCSI-attached tape) or back up each server to internal or DAS disk, which shoots the cost of BE up with more media servers required, and adds complexity in terms of management.

2. NAS Storage

This is one area of storage that has improved over the years. NAS devices are becoming more mainstream and seem to be bridging the gap between entry-level and mid-level storage. They're cheaper than SAN-type storage, and offer the additional benefit of redundancy in the form of RAID configurations, depending on which vendor you're looking at.

This allows multiple servers to target a NAS (via CIFS/iSCSI protocols) simultaneously, cutting down backup times substantially in some cases.

We've got 3 Iomega NAS devices in our environment (2 x Iomega StorCenter Pro ix4-200r and 1 x Iomega StorCenter Pro px4-300r, with more on the way!). Initially, backups to them weren't possible; I could never get earlier BE 2010 releases to access them properly, but since moving to R3 I've been able to configure backups to these devices if need be. The end result has been speeds matching the tape drive on site, and a very viable alternative to tape. I haven't tried the dedupe option offered in BE 2010 R3, as that currently isn't under investigation.

For the price, we've got good redundancy, decent backup speeds and lots of space (at least 6TB). Working out slightly cheaper than tape drives, we're in a push to get more of them in as we move along.

The downside to this is support...if your NAS isn't listed on the BE 2010 (or previous versions) Hardware Compatibility List, chances are good it won't work, and if it DOES work, any issues will get the "It's not a supported configuration" line from Symantec (or any other vendor in the same position). Secondly, backups are still LAN-based...no matter what time they run, you're still pushing backup traffic across the LAN.

3. SAN Storage

The best of the best, if you can afford it...SAN storage is ideally utilised with FC connectivity: coupled with SAN SSO and servers that have very large amounts of data to be backed up, you can end up with LAN-free backups where data doesn't touch your network. SAN backups would be faster (using either 4Gb or 8Gb FC connectivity), allowing even more disks to be targeted, with the ultimate configuration being able to stream off to either another array or an FC-connected library, which completes the LAN-free backup process.

Cost here though is a major consideration. SAN attached arrays from companies like IBM, HP and EMC can cost a lot more than other storage, so the performance/cost issue needs to be properly considered.

We used SAN SSO to back up a file server with a couple of million files and around 600GB of data in total, along with an Exchange server with 4 x 100GB Information Stores, and it cut backup times down by around 60%. Times will vary, so don't quote me on that applying to your environment, but it saved a lot of time in the long run.

 

So where does tape fit into all of this? Off-site storage of backups can be taken care of with a D2D2D strategy, where data recently backed up to disk is duplicated to another array elsewhere on your site, or across the WAN. No tape needed here.
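
As a rough illustration of what such a D2D2D duplication job does (the mount points here are hypothetical, and real backup software handles this as a built-in duplicate job), a minimal Python sketch: copy anything new or changed on the primary backup array over to the second array.

```python
import shutil
from pathlib import Path

def duplicate_backups(primary: str, secondary: str) -> int:
    """Copy any file that is missing or newer on the primary array
    over to the second array, mimicking a simple D2D2D duplicate job."""
    copied = 0
    src_root, dst_root = Path(primary), Path(secondary)
    for src in src_root.rglob("*"):
        if src.is_dir():
            continue
        dst = dst_root / src.relative_to(src_root)
        if not dst.exists() or dst.stat().st_mtime < src.stat().st_mtime:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 preserves timestamps
            copied += 1
    return copied

# Hypothetical mount points for the two arrays:
# duplicate_backups("/mnt/b2d_primary", "/mnt/b2d_dr")
```

Because timestamps are preserved, a second run copies nothing unless the primary side has changed, which is the same incremental behaviour that keeps WAN-based duplication traffic down.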

Cost vs. performance...tape, in my mind, is a better option than disk if you're considering a USB drive, for instance, or if for some reason you have a NAS that isn't on an HCL, which rules it out of the equation.

Some auditing firms and other companies require offsite storage where data is stored in a safe, either in another building or at a storage company. Disk wouldn't do the trick here, and tape is still viable. You could also follow a D2D2T policy, where backups are first done to disk and then streamed off to tape before being vaulted.

Tape isn't entirely dead, and the way the LTO format is being constantly revised, it might never be. But it might very well be playing less of a primary role in backups and eventually be relegated to vault/second-tier, or even third-tier storage, especially as disk becomes cheaper in all formats.

However, if you're not considering disk in your environment, you should be...it's the way of the future (in my opinion).

Why not check out Backup Exec 2010 R3 if you haven't already and see what it has to offer...

Comments

If you are thinking of rotating your disk, check out my article

https://www-secure.symantec.com/connect/articles/how-rotate-external-harddisks

As one of Symantec's competitors points out in a webinar entitled "The future of backup", one of the key issues with tape is that it has not kept up capacity-wise. From memory, so don't quote me, the ratio of how many disks you can back up to a single tape has gone from hundreds of disks per tape to 0.75 disks per tape.

Price per GB of LTO-4 tape here in the UK does compare well with 1TB disk drives (£13 vs £31), but the price of the media is only part of the equation. A hot-swappable SATA disk caddy costs pennies compared to an 8-slot robotic tape library, and of course, reliability is key to that nice warm feeling with backups, which I'm afraid tape has never given me. Our Quantum SuperLoader 3A has been replaced three times in as many years.
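
Working those figures through per gigabyte (assuming LTO-4's native capacity of 800GB, and ignoring the drive/library costs mentioned above):

```python
# Media cost per GB from the figures above (prices converted to pence)
lto4_price_pence, lto4_capacity_gb = 13 * 100, 800   # £13 cartridge, 800GB native LTO-4
disk_price_pence, disk_capacity_gb = 31 * 100, 1000  # £31 drive, 1TB

lto4_p_per_gb = lto4_price_pence / lto4_capacity_gb  # about 1.6p/GB
disk_p_per_gb = disk_price_pence / disk_capacity_gb  # 3.1p/GB

print(f"LTO-4 media: {lto4_p_per_gb:.2f}p/GB, disk: {disk_p_per_gb:.2f}p/GB")
```

On media alone tape is roughly half the price per GB, which is why the library, caddy and reliability costs are what really decide the comparison.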

People's requirements for backups have changed as well. We have three main requirements:

  1. Disaster recovery: the building burns down. The best choice here is replication to another site, and tape scores VERY badly.
  2. Restoring a deleted, corrupted, or accidentally modified file for users: tape doesn't score very well here either, as you probably have to retrieve the tape from off-site. B2D and de-duplication to cheap storage is a much better solution. This is the day-to-day restore task that everyone will do at some point.
  3. Long-term archive: possibly the only place for tape these days, but that said, a big SATA disk farm with de-duplication could probably satisfy this for many years. And honestly, do you think that tape you put away six years ago is still readable, or that you still have the hardware or software to read it?

If our tape drive blew up tomorrow (ignoring we're on maintenance), I wouldn't replace it.

Cheers, Rob.

...we're actually suffering badly enough in our main site to start looking at EMC DataDomain and Avamar as a total replacement for tape.

Should be interesting, to say the least :)

Craig, OK....so you want to go "tapeless" and use DataDomain? That would mean paying double for the disk (once at your primary site and again at your remote replication destination) PLUS paying for a WAN link big enough to handle the replication traffic. And, yes, I realize that you will only be transmitting the changed/new blocks...but consider that you will have new data as well as changed data.

Now you have the data at the remote site...and you lose your DD at your primary site. Now what? Do you perform restores from the remote site back to the primary? Or do you go there with a truck and haul it back to your primary site?

It's a pretty expensive alternative to tape.

SATA disk space is cheap. A 16-disk JBOD enclosure with controllers (perfect for backup) filled with 3TB SATA-2 drives (48TB total) is £3,300. Downsize to 2TB drives, giving 32TB of backup, and the price plummets to £1,600.
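
The per-terabyte arithmetic behind those two builds:

```python
# Price per TB of raw JBOD space, from the figures above
builds = {
    "16 x 3TB (48TB)": (3300, 48),  # £3,300 for 48TB
    "16 x 2TB (32TB)": (1600, 32),  # £1,600 for 32TB
}
per_tb = {name: price / tb for name, (price, tb) in builds.items()}
for name, cost in per_tb.items():
    print(f"{name}: £{cost:.2f}/TB")
```

Interestingly, at those prices the 2TB build is cheaper per terabyte as well (£50/TB vs £68.75/TB), not just cheaper in total.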

Nobody serious about DR and failover considers restoring back from the other site. They failover to that site instantly and then sort out the core problems at their leisure.

Yes, re-seeding the other side is an issue, but how much space are we talking about for the average company? We've got 150 employees, and our data could fit comfortably on two 3TB external USB 3.0 drives, especially if compressed. We'd stick TrueCrypt on there, copy at 140MB/s via USB 3.0 and courier them overseas.

It's not just about cost. In fact, in terms of business continuity, if you are constrained by cost, your managers have missed the point.

Cheers, Rob.

>and you lose your DD at your primary site

Buy another JBOD for your primary site and robocopy the whole lot onto it once a week/month. In case of disaster, you've got 90% of your data and can let replication sort out the rest.

Cheers, Rob.

...vs a week with no backups due to an issue with the backup software? Our client is big enough to afford it all...that's why we're going with it. Besides that, my company is in a strategic partnership with EMC...we have Platinum support so getting things fixed isn't really a problem at all.

We're fully aware of the expenses, and we're fully aware of what having no backups is also capable of...hence DD.

@rob, there's a reason why people use a DataDomain over JBOD....it's called deduplication.  But, if you are going to use that at your primary and remote site you have to be aware of the expense.  And, in CraigV's case, it sounds like the customer is more than willing to cover that.

Cost should be a factor in the discussion.  You should be able to determine the cost of having key applications down for an extended period.  That will give you an idea just how expensive the data protection solution should be.  If you don't take cost into consideration, then every application in your company will say that they absolutely need the highest level of protection and that's just asking for trouble.

Still, just having the data replicated from one site to another is only part of the solution.  The other part (the harder part) is being able to recover from a disaster at your primary site.  If you don't have those details covered up front, then it will all be for nothing.

While this isn't an NBU issue, we have had 2 bad RAID controllers which marked multiple disks bad at the same time. This took down the RAID set we use for our MSDP. We were hit at both our Primary and DR sites within 6 months. This storage system is made by a very reputable vendor, so we were surprised to have this issue not once but twice.

We lost 4 weeks of on-disk retention. All of my backup jobs are duplicated to tape "just in case". I had the "just in case" happen, and I can tell you the management that hated paying for a couple of LTO drives loved that I fought them to spend the extra money. We have 2 very important restores for audit purposes that I would not have been able to do if I didn't have tape.

As for Data Domain, we moved away from them because of the cost. We got 4x the amount of raw disk space for half the price of buying a Data Domain box. We then set up Storage Lifecycle Policies to duplicate data, and this works just like the Data Domain boxes we retired. Data Domain is a good product, but to me, with SLP and MSDP, Data Domains are overpriced now.

Hi Stu52

 

We use Media Server Deduplication Pools (MSDP) with NBU 7. This allows you to present storage to a server as a normal hard drive. While the data is backed up to the MSDP, it is deduplicated. As I said in my previous post, I got 4x the storage space for half the cost of purchasing a Data Domain. The data on the MSDP is then replicated to an MSDP at the DR site and, like a Data Domain, only the deduplicated data is sent across the WAN, not the entire backup set.

If you haven't yet, check out the MSDP options in NBU 7. It saved me a lot of money and gave me a lot more.

 

Well, all I can say is this. I am working at a company with several sites, but the main site does just over 500TB/month, and it is behind in its replication to one of the other sites. The problem is that the current system was undersized. It is currently running near 80% capacity, and we need to add another 900TB of backup volume to it...it's not going to fit.

If bandwidth were not a problem, we would not be running behind in our replication. We have maybe 2 weeks of capacity left, and we need to do something different besides the replication. Now we are looking at going to tape for some of the less critical backup policies, and we have to figure out how we can make that happen.

Personally, I don't see how the cost can be justified. You pay double for the disk storage and then pay even more for the WAN connection. I'm sure it works OK for smaller volumes, but it does not seem to scale very well when you get to 125TB/week or more.

What technology are you using for your replication?

Cheers, Rob.

One problem with disk is that you've got to spin it up every now and then; otherwise, the bearings and so on will seize up. I think disk is good if it's used constantly, but if you need to keep something for years on end, I would not trust disk. Tape would probably last a lot longer if it is properly kept in a climate-controlled environment.

Aha...and this is where CAS systems like EMC's Centera come into play...lots of RAIN protection on your data means it can theoretically last for years.

It has to do with the physics of recording magnetic data.  See W. Curtis Preston's article on this here:  http://backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/380-tape-more-reliable-than-di...

This is why tape is much better for long-term storage than disk.  Don't believe it when a disk vendor tells you that your data will last on the same set of disks "for years"....it won't happen.  Bit-rot is real.

DD prevents bit-rot-related errors with its garbage collection (no other vendor does this that I know of). Doesn't that make the DD archiver an interesting solution?

Tape is more reliable with two caveats: it must be written to at the minimum speed of the tape format or greater, and it must be stored in an appropriately climate-controlled room. Without those two conditions, you have reduced reliability by greater than 25%.

Perhaps tape's cost per GB is lower, but that doesn't account for the library itself, or for moving to newer tape formats every 3 years, let alone the labour costs of managing tape daily and during tape technology refreshes...

 

Teiva-boy

The "garbage collection" or "cleaning" that you mention above is not something that will eliminate the bit-rot problem. Bit rot is something that will occur with any spinning media (disk). Furthermore, other vendors of deduplication devices also perform a clean-up similar to the one that DD performs; this is how space is reclaimed after data ages out.

As for the costs of tape, even when you take those costs into account, tape is still cheaper than disk. And lest you think you can keep disk around indefinitely, disk manufacturers have you migrate your data from older disk to newer disk every 3-4 years. The tape media itself can keep data intact, without bit-rot, for as long as 30 years.

On the subject of garbage collection: my understanding is that if you had overwrite set to six months, any blocks that hadn't been referenced in a backup within six months were marked as free and then re-used. So back up a file today, then delete it, and six months later the reference to it in the catalog is removed and the blocks it used are marked as free. Actually, I would imagine each block has a usage counter that decrements, and only when it hits zero is the block marked as free.
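
That usage-counter idea can be sketched directly: each deduplicated block carries a reference count, every backup referencing it increments the count, and a block is only freed when the last referencing backup ages out. A toy Python sketch of the general idea (an illustration, not BE's actual implementation):

```python
class BlockStore:
    """Toy deduplicated block store with reference-counted garbage collection."""

    def __init__(self):
        self.blocks = {}    # block hash -> data
        self.refcount = {}  # block hash -> number of backups using it

    def add_backup(self, chunks: dict) -> list:
        """chunks maps block hash -> data. Returns the backup's block list."""
        for digest, data in chunks.items():
            self.blocks.setdefault(digest, data)
            self.refcount[digest] = self.refcount.get(digest, 0) + 1
        return list(chunks)

    def expire_backup(self, recipe: list) -> int:
        """Age a backup out; free any block whose counter hits zero."""
        freed = 0
        for digest in recipe:
            self.refcount[digest] -= 1
            if self.refcount[digest] == 0:
                del self.blocks[digest], self.refcount[digest]
                freed += 1
        return freed

store = BlockStore()
mon = store.add_backup({"h1": b"aaa", "h2": b"bbb"})
tue = store.add_backup({"h1": b"aaa", "h3": b"ccc"})  # shares block h1
store.expire_backup(mon)     # h2 freed; h1 kept, Tuesday still needs it
print(sorted(store.blocks))  # ['h1', 'h3']
```

Expiring Monday's backup only frees the block nothing else references, which matches the behaviour described above: a block is re-usable only once its last referencing backup has aged out.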

But in terms of bit rot, does BE do something additional, whereby any block that hasn't been written to for a period of time is read and written back? Is this a housekeeping job, or is it assumed to be part of the backup? So each time a 64k block is re-used (e.g. as part of a backup of a long-standing file), does BE check when it was written and, if it's over a certain age, re-write it?

Cheers, Rob.