speed up backup of mail archive (millions of small files)

bgibson
Level 4
Hi everyone,

Our homegrown email archiving solution stores each message as a single file. Right now we have about 605 GB made up of just under 10 million files.

We run NetBackup 6.5.3 on our master server (a physical server running Windows Server 2003), and the email archiving server is a VM (also Windows Server 2003). For the message archive volume, the VM uses Microsoft's iSCSI initiator to talk to an IBM DS3300, which is a low-to-mid-range iSCSI device. Unfortunately, we have not gotten the DS3300 to do jumbo frames (which might be part of our issue).

The NetBackup master server is connected through a SCSI cable to a Quantum Scalar50 tape library with 2 LTO-4 tape drives. We know the tape drives are incredibly fast: when we back up our main email server (which has about 460 GB attached to an external SCSI array), a full backup takes about 3.5 hours.

When we try to do a full backup of the email archiving server, it takes well over a day (about 29 hours, at an average speed of around 5500 KB/sec), and it is causing issues with other jobs that need to make use of the tape drive(s).
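As a rough sanity check on the figures above, the two backups can be compared directly. This is only a sketch: the sizes, durations, and file count are the approximate values quoted in this post, binary units (1 GB = 1024 * 1024 KB) are an assumption, and the function names are made up for illustration.

```python
# Approximate figures quoted in the post; binary units are an assumption.
KB_PER_GB = 1024 * 1024


def throughput_kb_per_sec(size_gb: float, hours: float) -> float:
    """Average throughput in KB/sec for a backup of size_gb taking `hours`."""
    return size_gb * KB_PER_GB / (hours * 3600)


def files_per_sec(file_count: int, hours: float) -> float:
    """Average number of files processed per second."""
    return file_count / (hours * 3600)


archive_rate = throughput_kb_per_sec(605, 29)      # email archive server
mail_rate = throughput_kb_per_sec(460, 3.5)        # main email server
archive_files = files_per_sec(10_000_000, 29)      # "just under 10 million" files
avg_file_kb = 605 * KB_PER_GB / 10_000_000         # average message size

print(f"archive: {archive_rate:,.0f} KB/sec, mail server: {mail_rate:,.0f} KB/sec")
print(f"archive: {archive_files:.0f} files/sec, average file {avg_file_kb:.0f} KB")
```

With these inputs the archive job works out to roughly 6,000 KB/sec and fewer than 100 files per second, against roughly 38,000 KB/sec for the main email server, which suggests the tape drives themselves are not the limiting factor.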

Where should I focus my efforts in an attempt to speed things up? Here are some thoughts...

1. Do we try and get jumbo frames working on the DS3300?

2. Is it the fact that we have millions of small files, in which case we should use something like FlashBackup for Windows?

3. Can I use multiplexing or multi-streaming to my advantage? All the files are on one logical volume so I am not sure.

4. Do I scrap the DS3300 and make the archiving volume a big vmdk file on our Equallogic SAN which is where the VM lives?

Thanks for any help you can offer.

- Brian

4 REPLIES

Will_Restore
Level 6
FlashBackup for Windows should help with millions of small files.

Nicolai
Moderator
But as Wrobbins suggests, I would try FlashBackup first.

You could use multiple streams to break up the email archive area, but you need some sort of folder structure in place first. E.g.:

E:\ with all 10 million files in one folder (no go; you would get 10 million concurrent backup streams)

E:\emails_2007\  (works; you would specify E:\* as the file selection)
E:\emails_2008\
E:\emails_2009\
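The folder layout above could also be turned into an explicit stream plan. The sketch below assumes the policy has "Allow multiple data streams" enabled and that the `NEW_STREAM` directive in the backup selection list behaves as documented for your NetBackup version. The helper names and the per-folder file counts are hypothetical, and two streams are chosen only because the library has two drives.

```python
from typing import Dict, List


def plan_streams(folder_sizes: Dict[str, int], stream_count: int) -> List[List[str]]:
    """Greedy balancing: give the largest remaining folder to the lightest stream."""
    streams: List[List[str]] = [[] for _ in range(stream_count)]
    loads = [0] * stream_count
    for folder, size in sorted(folder_sizes.items(), key=lambda kv: kv[1], reverse=True):
        lightest = loads.index(min(loads))
        streams[lightest].append(folder)
        loads[lightest] += size
    return streams


def to_selection_list(streams: List[List[str]]) -> List[str]:
    """Render the plan as backup selection lines, one NEW_STREAM per group."""
    lines: List[str] = []
    for folders in streams:
        if not folders:
            continue  # skip empty streams when there are fewer folders than streams
        lines.append("NEW_STREAM")
        lines.extend(sorted(folders))
    return lines


# Hypothetical file counts per folder; replace with real numbers from the volume.
example = {
    r"E:\emails_2007": 2_000_000,
    r"E:\emails_2008": 3_500_000,
    r"E:\emails_2009": 4_500_000,
}
selection = to_selection_list(plan_streams(example, 2))
print("\n".join(selection))
```

Grouping folders by hand like this gives control over how many streams run at once, whereas a wildcard selection starts one stream per match. Because all folders sit on the same logical volume, the streams still compete for the same disks, so any gain needs to be measured.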

Doing a VCB backup, as you suggest in option 4, would also work. You just need to find the solution that fits best for you ;)

EDIT:

Jumbo frames: you need to enable jumbo frame support on all the switches/routers passing the frames. That can be a big challenge.
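To get a feel for what jumbo frames can buy on the wire, here is a back-of-the-envelope sketch. It assumes plain Ethernet without VLAN tags, IPv4 and TCP headers without options, and it ignores iSCSI PDU headers. It only counts header overhead, so it says nothing about the CPU and interrupt savings that jumbo frames can also bring.

```python
# Per-frame overhead on the wire, in bytes (assumptions listed above).
ETHERNET_OVERHEAD = 38  # preamble + SFD 8, header 14, FCS 4, inter-frame gap 12
TCP_IP_HEADERS = 40     # IPv4 header 20 + TCP header 20, no options


def wire_efficiency(mtu: int) -> float:
    """Fraction of wire bytes that carry TCP payload for a given MTU."""
    return (mtu - TCP_IP_HEADERS) / (mtu + ETHERNET_OVERHEAD)


standard = wire_efficiency(1500)  # standard Ethernet MTU
jumbo = wire_efficiency(9000)     # common jumbo frame MTU
gain = jumbo / standard - 1

print(f"MTU 1500: {standard:.1%}, MTU 9000: {jumbo:.1%}, relative gain: {gain:.1%}")
```

Under these assumptions the payload share rises from about 95% to about 99% of the wire bytes, a gain of a few percent.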


thesanman
Level 6
Had a similar issue with 1+ million file volumes on one server. Switched to FlashBackup and multistreaming and got a 3x improvement per volume. Of course your mileage will vary, but my feeling is to concentrate on FlashBackup first.

bgibson
Level 4
For some reason I didn't receive email notices when you all posted your replies so I just wanted to thank you for taking the time to respond. I will look into FlashBackup a little closer and see if it can help me out :)