We started trying FlashBackups when they didn't actually work from a Solaris master to a Windows client. That's been fixed for many years.
Initially, we were backing up 250GB volumes with about 10 million files on them on Windows directly to SDLT tape and it would take about 24 hours. We switched to DSSUs and cut the time in half. We then went to FlashBackup and cut the time in half again to ~6 hours. We work with a ton of scanned TIFF images and have a few applications with over a billion files.
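For a sense of scale, here is a rough sketch of what those elapsed times imply as throughput. The 250 GB and ~10 million file figures are from above. The 12-hour DSSU figure is inferred from "cut the time in half", and treating GB as decimal gigabytes is an assumption.

```python
# Figures from the post: one 250 GB volume holding ~10 million files.
VOLUME_BYTES = 250 * 10**9   # assumption: decimal gigabytes
FILE_COUNT = 10_000_000

# Elapsed hours per method; 12 is inferred from "cut the time in half".
ELAPSED_HOURS = {
    "direct to SDLT tape": 24,
    "DSSU": 12,
    "FlashBackup": 6,
}


def rates(hours):
    """Return (MB/s, files/s) for one full backup of the volume."""
    seconds = hours * 3600
    return VOLUME_BYTES / seconds / 10**6, FILE_COUNT / seconds


for method, hours in ELAPSED_HOURS.items():
    mb_per_s, files_per_s = rates(hours)
    print(f"{method:20s} {hours:3d} h  {mb_per_s:6.2f} MB/s  {files_per_s:6.0f} files/s")
```

Even the 6-hour case works out to under 12 MB/s, so on volumes like these the file count, rather than the raw amount of data, is most likely what sets the backup window.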
Today we routinely back up 1 TB volumes from both Solaris and Windows clients and we still have a lot of volumes with >40M files per TB and hostile directory structures. All of the clients have GigE connections to a media server. We use LTO-3 drives now front-ended by Decru encryption appliances.
We've run into several roadblocks along the way:
- System administrators forget (conveniently?) that snapshots are an OS responsibility and not a backup responsibility. So when they get a status 156, they ask the backup team why "our" snapshots failed. "Patch the OS and configure it properly" is a common answer from us...
- The parent/child job relationships play havoc with FlashBackups because the snapshots are created by the parent job, not by the child job that actually does the backup. In many cases, the child may not run for a LONG time after the parent creates the snapshots, so we were forced to create a separate policy per mount point. I've had a single client with over 50 policies because of this.
- A large non-FlashBackup restore will usually fail because the master can't enumerate the file list. I haven't tried it lately and we don't have much need to restore an entire volume this way - it's usually small subsets of the data that need to be restored. If I had to do it again, I'd provision a full volume and do an image restore, not a file-based restore.
I haven't had a chance to work with the SAN client to see if that will improve performance. It's on my TODO list...
In general, we're trying to get away from host-based file serving so we'll hopefully phase out FlashBackup completely in favor of NDMP backups from our NetApp filers as our data continues to migrate. We're also working towards full application replication to our DR facilities so that the requirements for tape will diminish. I can't imagine how long it would take to roll a billion files back off of tape :(
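To put a rough number on that last worry, here is a back-of-the-envelope sketch. The restore rates are hypothetical placeholders, not measurements. The only anchor is that 10 million files in about 6 hours is roughly 460 files per second on the backup side, and a file-based restore could well be slower than that.

```python
SECONDS_PER_DAY = 86_400


def restore_days(file_count, files_per_second):
    """Days needed to restore file_count files at a sustained per-file rate."""
    return file_count / files_per_second / SECONDS_PER_DAY


BILLION = 10**9

# Hypothetical sustained restore rates in files per second (assumptions).
for rate in (100, 500, 1000):
    print(f"{rate:5d} files/s -> {restore_days(BILLION, rate):6.1f} days")
```

At anything close to those rates a billion-file restore is measured in weeks, which is the argument for replication over tape for this kind of data.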