
When to switch to FlashBackup?

Seth_Bokelman
Level 5
Certified
Hello, our Symantec sales guys have been talking to us about moving some of our file servers to the Advanced Client so that we can use FlashBackup to increase backup speed on servers with many small files. We've never used Advanced Client, and while I intend to do some testing, does anyone have a guideline about when you should consider this?

For instance, one of our typical file servers has 200GB of data, broken up into about 990,000 files. Is this scenario likely to benefit much from the Advanced Client?

Thanks!
9 REPLIES

Seth_Bokelman
Level 5
Certified
Oh, and if it makes a difference: our backup server is a Sun v480, hooked to an L180 tape library with 3 SCSI LTO-2 drives in it. File servers and the backup system are both on Gigabit Ethernet.

DavidParker
Level 6
Seth,
Good question. I think the answer depends somewhat on how your data is structured and how often it changes.

My company has a system that has about 1.6TB of data on it now (adds about 5-10GB per day) but the existing stuff rarely changes. Initially we saw problems with differential backups, but since things rarely changed we made some adjustments to work around that. Now we run a Full backup once every 2 weeks and in the meantime we have a script that runs once a day that does a user-initiated backup of the past couple days of data.

So, what is the nature of your files? Do they change frequently or are most of them merely being stored? Seems like each file is about 200KB, is that correct? Are you experiencing any specific problems or just long backup times in general?
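
As a quick check on that per-file estimate, here is a minimal sketch of the arithmetic. It assumes decimal units (1 GB = 10^9 bytes) and uses the 200GB / 990,000-file figures from the original post; the function name is only illustrative.

```python
def average_file_size_kb(total_gb: float, file_count: int) -> float:
    """Mean file size in KB, assuming decimal units (1 GB = 10**9 bytes)."""
    total_bytes = total_gb * 10**9
    return total_bytes / file_count / 1000


# Figures from the original post: 200GB spread over about 990,000 files.
avg_kb = average_file_size_kb(200, 990_000)
print(f"Average file size: {avg_kb:.0f} KB")
```

This is only a mean; a real mix of Office documents, e-mail and photos will have many files far smaller than this and a few far larger.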

Another thing you could consider: breaking down the Backup Selection and doing 'chunks' of the server separately with Multistreaming.

DP

Seth_Bokelman
Level 5
Certified
The data doesn't change very much; typically the incrementals are about 4-5GB per day on the server in my example above. The incremental backups are fine for speed, but we've been looking at some of our bigger file servers and wondering how much faster they can go if we eliminate the overhead of smallish files during their weekly full backup. None of them are in the "millions" range in terms of files, but the servers have up to 600GB in total data, typically all in one or two volumes.

No real specific problems, other than long backup times while we wait for the server to feed all the data across the network. We're a University, and the file servers contain the typical mix of Office documents, old e-mails, photos, etc.

Lance_Hoskins
Level 6
Hey Seth,

UNI? :) I'm from Iowa originally, so that's what I'm guessing since you said you're at a university. Anyhow, my first question is what's the hardware configuration like? What kind of network, server, hardware are you running on?

I only ask this as we just upgraded from what used to be the top of the line hardware in its day to some new HP DL380 G5's with SAS SFF drives at 5x146GB RAID 5 at 10K on a 1GB network. These are our file redirect servers which have 4 data volumes each with data such as what you listed below and ranging from 400-900 thousand files each. We went from approximately 3-4MB/sec (compressed) on our full backups to 11-12MB/sec (compressed)! We did some tests around this finding and it was 100% hardware in our case that bumped the throughput (SAS drives being the key--not to mention a fresh set of data laid on the drive with no fragmentation--which we plan to maintain going forward).

Anyhow, back to your original question about the Flash Backup option. I just got through testing it for a couple of clients I have with 50 million 1k files on a single volume. The backup worked well getting me up into the 5-6MB/sec range, but I couldn't get a restore to work for the life of me. I went up to 4th level support with Symantec, but basically got nowhere. It sounds like there's already another issue out there like mine that engineering is looking into, so I'm holding off for right now. Bottom line, test backup and restores when evaluating flash backups.

That's all I've got for now!
Lance

Seth_Bokelman
Level 5
Certified
Yes, I work at UNI, though I graduated from ISU myself. :)

Look up at the second post in this thread for a basic overview of what we're running. We're on NetBackup 6.0 MP3.

Thanks for the additional info. I just realized one thing that the sales guys didn't tell us: it backs up the "empty" space on the drive too. That's rather important to know...

DavidParker
Level 6
Seth,
Thanks for the updates.
I think, depending on how much work you've already done on this, you may be able to speed things up by changing the structure of your Backup Selection.

You said most of the files are in a couple of volumes? How does that look?
\vol1\ ?
\vol2\ ?

How is your Backup Selection setup currently for this server?
Have you tried splitting some of the selection up and running multiple streams?

In my example from my company, we used to run that server as 1 job and it took 2 days to run the full backup. Now we split the folders out to separate jobs and let 4 of them run at once. We shaved about half the time off the backup with this setup.

DP

Seth_Bokelman
Level 5
Certified
Actually most of them are:

Volume 1: Operating System

Volume 2: Data

And they're usually across the same physical platters, so multi-streaming "would be bad" from everything I've read.

We're currently looking at both Advanced Client functionality, as well as the possibility of turning some of the large (400-500GB) servers into SAN Media Servers if we buy some FC drives in our next library.

DavidParker
Level 6
Nah, multistreaming isn't bad.
You just have to be careful with it (obviously don't get too many streams going at once).
It all depends on what your hardware can handle and how fast your network is.
If need be, start small (2 streams) and work your way up.

On my system, before multistreaming, the 1 job would push about 5000Kbps. Now we get 4 jobs running at about 4000Kbps each. No problems.
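
To put those numbers side by side, here is a rough sketch of the aggregate rate. It keeps the unit exactly as posted (Kbps) and assumes the four streams really do sustain their rate at the same time; the function name is made up for illustration.

```python
def aggregate_throughput(streams: int, per_stream_rate: float) -> float:
    """Combined rate of several concurrent streams, in the same unit as the input."""
    return streams * per_stream_rate


before = aggregate_throughput(1, 5000)  # single job, rate as posted
after = aggregate_throughput(4, 4000)   # four concurrent jobs, rate as posted
print(f"Before: {before:.0f}  After: {after:.0f}  Ratio: {after / before:.1f}x")
```

The ratio is an upper bound on the time saved. The elapsed time of the whole backup is set by the slowest stream, so unevenly sized chunks will give less than this.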

eric_lilleness
Level 4
I'd have to admit to struggling with a similar problem.

My main problem statement is that some of our backups take too long. Most of these "problem" backups fall into the category of millions of very small files. Thruput as measured by NBU is well under 1 Mbyte/second; right now I have a backup that has been running for 170 hours. With our millions of small files (via NFS from a netapp to make it worse), directories outnumber files by about 2:1, so we have a lot of overhead (extra disk reads) associated with backing-up these files. It is my belief that Flashbackup would be the best solution.

Given our lack of 24x7 staff, I have picked 8 hours as a goal for backup duration so that when I come in in the morning I could have time to re-run a failed job before it was scheduled to run again.
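
A window like that translates directly into a minimum sustained throughput. The sketch below is only back-of-envelope arithmetic, assuming decimal units (1 GB = 1000 MB) and borrowing the 600GB server size mentioned earlier in the thread as an example; the function names are illustrative.

```python
def required_mb_per_s(total_gb: float, window_hours: float) -> float:
    """Sustained MB/s needed to move total_gb within window_hours (1 GB = 1000 MB)."""
    return total_gb * 1000 / (window_hours * 3600)


def hours_needed(total_gb: float, mb_per_s: float) -> float:
    """Hours needed to move total_gb at a sustained rate of mb_per_s."""
    return total_gb * 1000 / mb_per_s / 3600


# Example: a 600GB full backup against an 8 hour window.
print(f"Needed for 8h window: {required_mb_per_s(600, 8):.1f} MB/s")
# Same data at the 12 MB/s and 1 MB/s rates quoted in this thread.
print(f"At 12 MB/s: {hours_needed(600, 12):.1f} hours")
print(f"At 1 MB/s: {hours_needed(600, 1):.1f} hours")
```

Plugging in your own data size and measured rate shows quickly whether a change in method is needed or whether tuning the existing backup could be enough.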

In my last job I was designing/implementing storage systems. Most folks with large amounts of data ( 3-4 Tbytes or greater) are using SAN media servers. Using this technique you can provision a LUN/logical volume with a filesystem that can stream data at tapedrive speed no problemo; given your 3 drives you could get over 150 Mbytes per second easy & have a 1 hour backup window - probably overkill. Also, to do a SAN Media Server you need to perform some sort of 3rd mirror break-off & mount this 3rd mirror on the SAN Media Server. Most folks on Sun use Veritas Volume Manager "Flashsnap" to do this:

(1) create 3rd mirror (vxvm snapshot, EMC BCV, whatever)
(2) split vxvm diskgroup so that all the 3rd mirrors are in a separate diskgroup
(3) Zone the SAN so that only the SAN Media Server can see these LUNS
(4) Import the diskgroup with the 3rd mirrors onto the SAN Media Server, mount them as filesystems
(5) Do the backup
(6) reverse the process (unmount, import back, re-sync volumes)
(7) repeat

I am pretty sure that the Advanced Client Flashbackup over a LAN will meet your performance requirement for a lot less money & script writing than the SAN media server/Flashsnap option.

Of course, if cost is no object you could use both techniques!!

I would be very interested in the actual thruput values other folks are seeing with the "millions of small files" issue & any before/after FlashBackup values.