Exchange Backups Fail, one failure fails the other jobs queued to run after it

Ytsejamer1
Level 3
Hey everyone,

I have a disk/tape hybrid storage system. 1 TB of disk space is allocated to backup jobs. After completed jobs have sat on the disk resource for some amount of time, a utility moves them off to tape, clearing the space. This is never done to a file while it is active. We access the disk resource via CIFS, served by a hacked Samba service. I know this is the issue, but for now I can't get around it. It is what it is, and the rest of my backups do not have a problem; only very large backup jobs fail on occasion.

However, my Exchange information store backups sometimes work, but most times they do not complete successfully. I have five servers whose backup jobs kick off around 11pm. When the first one completes successfully, the rest fail, because NetBackup decides to do its cleanup after the first one is completed, making the disk storage inaccessible. When I run them manually and choose more than one server to back up, the first one will complete, then NetBackup will run its cleanup jobs and the rest trip all over themselves, failing miserably. What's strange is that sometimes I will not have any problems whatsoever. My public folder jobs (I back up three of the servers) run fine concurrently.

Are there any particular settings that would allow my Exchange backups to continue after the first one completes?  I'm kind of shooting in the dark, trying different client settings to maximize the success rate with the disk resource I have available to write the backups to.

Any ideas?
3 REPLIES

reson8
Level 4

Sorry to say, but your description is super vague. Several things to check are:

1. Are you using a basic DSU or a DSSU?
2. What is the status code of the failed backups after the first successful job?
3. What version of NBU are you running?
4. What version of the NBU client is installed on the Exchange server?
5. What version of Exchange are you running?
6. You state that a utility moves this off to tape? Not third party, correct?


The image cleanup process shouldn't stop access to the Disk Storage Unit unless it has reached its High Water Mark and is trying to delete expired images down to the Low Water Mark level.
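
To make the water mark behaviour described above concrete, here is a minimal Python sketch of the general idea, not NetBackup's actual implementation. The `Image` class, the `cleanup` function, and the example capacity and percentages are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Image:
    # Hypothetical stand-in for a backup image on a disk storage unit.
    name: str
    size_gb: float
    expired: bool


def cleanup(capacity_gb, images, high_pct, low_pct):
    """Simplified model of water-mark cleanup (an assumption, not NBU code).

    Nothing is removed until usage reaches the high water mark. Once it
    does, expired images are removed in list order until usage is at or
    below the low water mark. Returns the names of the removed images.
    """
    used = sum(img.size_gb for img in images)
    high = capacity_gb * high_pct / 100
    low = capacity_gb * low_pct / 100

    removed = []
    if used < high:
        return removed  # below the high water mark: no cleanup

    for img in list(images):  # iterate over a copy while mutating
        if used <= low:
            break
        if img.expired:
            images.remove(img)
            used -= img.size_gb
            removed.append(img.name)
    return removed


# Example values only: a 1000 GB unit with water marks of 98% and 80%.
images = [
    Image("a", 300, True),
    Image("b", 300, False),
    Image("c", 385, True),
]
removed = cleanup(1000, images, high_pct=98, low_pct=80)
```

In this model, unexpired images are never touched, so a unit filled with unexpired data would stay above the low water mark no matter how often cleanup runs.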

 

Ytsejamer1
Level 3
Sorry for lack of details..lemme see if I can fill in the blanks.

1.  Not sure what difference is between Disk Storage Unit (DSU?) and DSSU...

2.  I get all sorts of error codes...mostly 84s on failed jobs.  Sometimes 58, sometimes 53.  I did some further troubleshooting, looking at buffer sizes and whatnot, and it looks like the server is waiting on empty buffers, waiting on full buffers, etc.

3.  NetBackup 6.5.2A...all clients, including the Exchange ones, are on 6.5.2A as well, with a new binary on each Exchange client to allow public folder jobs to run.  This was probably fixed in 6.5.3, but I haven't upgraded yet.

4.  Exchange 2003 Enterprise SP2, all updates.

5.  Correct, not a third party.  We use SAMFS and have a 1TB CIFS share presented for backups.  Once it is finished with files, after so many hours, it'll move them off to tape automatically...the file system shows the files are still there, but in the background they may have been moved to tape or they may still be sitting on disk.  No files that are active will be moved off of active disk.

Interesting note on the water mark...each of our storage units (Quarterly, Weekly, Daily) is set for high/low of 98/80.  And when I looked, all of my Exchange backups completed fine last night...go figure.  YOU FIXED IT! :)

I know the issue pertains to the CIFS share served from SAMFS.  One of our engineers had to hack a version of Samba to get it running on Solaris...as there is no native CIFS support.  I carved out a share on one of our NetApps and loaded that as a test storage unit in NetBackup, and every backup worked perfectly.  I know Samba is the culprit...I'm just trying to minimize the failures in the existing environment for my backups.  I.e., I want to tweak the client and server settings, buffer sizes, etc.

I set one of my Exchange clients to a 256k buffer rather than 16k.  Not sure if that was good or bad, but I don't think it had any positive or negative impact.
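
The NetApp-versus-SAMFS comparison described above could also be repeated outside NetBackup by writing a large file to each share and reading it back. The following is a rough Python sketch under that assumption; `write_and_verify`, the file path, and the sizes are hypothetical and would need adjusting for a real share, and passing this test would not prove the share is healthy under backup load.

```python
import hashlib
import os


def write_and_verify(path, total_mb=64, chunk_kb=256):
    """Write total_mb of data to path in chunk_kb pieces, then read it
    back and compare SHA-256 digests. Returns True if they match."""
    chunk_size = chunk_kb * 1024
    chunk = os.urandom(chunk_size)
    chunks = (total_mb * 1024) // chunk_kb

    written = hashlib.sha256()
    with open(path, "wb") as f:
        for _ in range(chunks):
            f.write(chunk)
            written.update(chunk)
        f.flush()
        os.fsync(f.fileno())  # push the data out to the share

    read_back = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            read_back.update(block)

    return written.hexdigest() == read_back.hexdigest()
```

Running it against a file on each mounted share with a `total_mb` close to the size of a failing job, and trying both 16 KB and 256 KB chunks, would show whether large sequential writes fail on the SAMFS share even without NetBackup involved.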

schmaustech
Level 5
From the information provided, it sounds like you have already isolated the issue to SAMFS.  If the backups work fine on a NetApp CIFS share but not on the SAMFS share, the issue is not with NetBackup but with SAMFS.  You should focus on fixing your issues there; unfortunately, this is not the forum to answer those questions.

Regards,

Benjamin Schmaus