NBU 7.5 Have very large CIFS Filer Vol - very slow backup

joe1871
Level 3

Hi,

I am writing to look for some way around this problem.  We are using NBU 7.5 between a NetApp filer and a Dell LTO library, with fibre connecting the elements.  We use NDMP as the backup protocol.  NBU is running on a Windows 2008 server that acts as a single master and media server.

The volumes I need to back up are CIFS shares that serve our customer community.  We have thousands of customers creating many, typically small, files.  Right now the volume is 1.35 TB with > 7 million files.  I am running a full backup right now; it has been running for 11 hours and has backed up 6 GB of data.  Needless to say, this is not going to work.

Is there anything I can do to improve the backup speed of this volume?  My understanding is that it is slow because the backup must record metadata for every file.  Is there any way to minimize or change that behavior (I am sure the answer is no, but I am asking anyway...)?  We are trying to archive this data to be sure we have it before deleting a large portion of it.  That will obviously improve speed, but I must get through this one backup first.  Any help would be most appreciated.

 

Joe


3 REPLIES

Yasuhisa_Ishika
Level 6
Partner Accredited Certified

"SET HIST=N" disables file cataloging. If you can accept inability of individual file restore, try it.

http://www.symantec.com/docs/HOWTO85874

Using SMTAPE via "SET TYPE=SMTAPE" may also help.
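
For illustration, these directives go in the policy's Backup Selections list, ahead of the path they apply to. A rough sketch, assuming a volume path of /vol/cifs_vol1 (substitute your real volume):

SET HIST=N
/vol/cifs_vol1

or, for a block-level image of the whole volume:

SET TYPE=SMTAPE
/vol/cifs_vol1

Either way you give up browsing and restoring individual files from the catalog; you restore the whole backup image instead.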

watsons
Level 6

What type of policy are you using?

NDMP (direct backup from the filer) or MS-Windows (backing up the CIFS share)?
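
If you are not sure, you can check from the master server with bppllist; a sketch, assuming the policy is named NDMP_CIFS (adjust the name and the NetBackup install path for your setup):

<install_path>\NetBackup\bin\admincmd\bppllist NDMP_CIFS -U

The "Policy Type" line in the output tells you whether it is NDMP or MS-Windows.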

If you have different qtrees created for different customers, you might want to break the volume into several backup selections instead of backing it up as a whole, and enable multiple data streams, provided you have more than one tape drive. That gives each stream a better chance of succeeding, so a single job failure doesn't mean rerunning the whole thing.
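
As a rough sketch, with made-up qtree names, the Backup Selections could look like the list below, with "Allow multiple data streams" ticked in the policy attributes so each entry runs as its own stream:

/vol/cifs_vol1/customers_a_h
/vol/cifs_vol1/customers_i_p
/vol/cifs_vol1/customers_q_z

With more than one drive, some of these streams can then run in parallel.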

Consider advising your customers to zip up files that can be archived, which reduces the number of files.

jim_dalton
Level 6

You are right in your assumptions, but that may or may not be your issue.

Can you run TreeSize across the CIFS data to see how things are structured?

Use that to shape the backups: e.g. if there's a lot of data in large files, back that up separately from the other data if you can. If you go through this and still end up with a single structure of lots of small files, consider multistreaming, or better, reducing the file count.

I had the same setup as you, with one folder containing millions of files. I removed it from the backup so I could back up the rest of the data. Then we looked at what was in the problem folder: junk from an application that never tidied up after itself. We implemented a tidy-up, millions of files got deleted, and the problem was solved. That one folder took so long to be analysed by NDMP on the filer that it actually timed out (16 hours?) and never completed.
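
If you do need to skip a folder like that on an NDMP backup of a NetApp, one option is the EXCLUDE variable in the Backup Selections. A sketch with a made-up folder name; note that EXCLUDE handling is done by the filer and varies by Data ONTAP version, so test it first:

SET EXCLUDE=app_junk
/vol/cifs_vol1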

Take care when considering ndmp directives.

Jim