
VSS Writer Fails backing up Exchange 2007 SP1 Cluster

Justin_Howell
Level 3
Since applying Exchange 2007 SP1, about 2/3 of our Exchange IS (w/ GRT) backups say they complete successfully, but only back up one of our two stores. Each one displays the following error:

Writer Name: Microsoft Exchange Writer, Writer ID: {76FE1AC4-15F7-4BCD-987E-8E1ACB462FB7}, Last error: The VSS Writer failed, but the operation can be retried (0x800423f3), State: Failed during backup complete operation (12).
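For reference, the code in that message follows the standard HRESULT layout, so it can be split into its fields. The snippet below is only a minimal sketch: the function name `decode_hresult` is made up for illustration, and the symbolic name in the comment (VSS_E_WRITERERROR_RETRYABLE) is the one the Windows VSS headers appear to use for this value, which matches the "can be retried" wording of the message.

```python
def decode_hresult(hr: int) -> dict:
    """Split a 32-bit HRESULT into its failure bit, facility and code fields."""
    hr &= 0xFFFFFFFF
    return {
        "failure": bool(hr >> 31),        # top bit set means the call failed
        "facility": (hr >> 16) & 0x1FFF,  # same mask as the HRESULT_FACILITY macro
        "code": hr & 0xFFFF,              # low 16 bits identify the specific error
    }

# 0x800423f3 is the value from the job log above
# (assumed to correspond to VSS_E_WRITERERROR_RETRYABLE).
fields = decode_hresult(0x800423F3)
print(fields["failure"], fields["facility"], hex(fields["code"]))
```

Facility 4 is the generic interface facility (FACILITY_ITF), so the code itself does not say which component caused the failure, only that the writer reported a retryable error.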

I just tried the job right now, got that error, and checked the Exchange VSS writers on both nodes of our cluster. Both show Stable and No Error. Prior to applying SP1, we had no errors and the same job ran successfully every night for about four months. I did a cursory search of these forums and some suggested updates to the VSS writers ... specifically, KB940349, which I applied, but it made no difference. The Exchange servers are Windows Server 2003 x64, with SP2 and all critical updates applied.

In the event logs on the cluster, I see two warnings and two errors:

First a warning on the active member, ESE BACKUP, 'surrogate backup' stopped with error 0xFFFFFFFF.

Next, an error on the active member, ESE, the backup has been stopped because it was halted by the client or the connection failed.

Next, a warning on the passive member, MSExchangeRepl, the VSS Writer failed.

Last, an error on the passive member, MSExchangeRepl, VSS Writer failed with code FFFFFFFC when processing the backup completion event. (C = 12 = the State in the BE error??)
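On the question of what those event-log codes mean numerically: both are 32-bit values with the top bit set, so read as signed integers they are small negative numbers. The sketch below only does that arithmetic; the helper name `as_signed32` is hypothetical, and nothing here establishes that the trailing C of FFFFFFFC is related to the state (12) shown in the Backup Exec error.

```python
def as_signed32(value: int) -> int:
    """Interpret a 32-bit unsigned value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

# The two codes quoted from the event logs above.
for raw in (0xFFFFFFFF, 0xFFFFFFFC):
    print(f"{raw:#010x} -> signed {as_signed32(raw)}, low hex digit {raw & 0xF}")
```

So 0xFFFFFFFF reads as -1 and 0xFFFFFFFC as -4. The last hex digit of FFFFFFFC is indeed 12 in decimal, but that may simply be a coincidence.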

On the backup jobs that do back up both stores, there are no errors or warnings in the event viewer.

This problem was brought on by the SP1 upgrade obviously ... is this a VSS problem or a cluster problem or what?
8 REPLIES

Karri
Level 2
Basically the same issue here. Brand new Exchange and BE deployment. Seven storage groups in a CCR cluster. The first storage group fails to back up.
 
There was this previous thread that discussed the issue:
I can manually stop the replication service and back up successfully, but that's not really an option, as the logs don't seem to truncate properly and it's an awful lot of manual work anyway. With the replication service stopped, the backup is taken from the active node, and logs are truncated on the active node but not on the passive node. AFAIK I'm forced to either reseed the storage group or manually clear logs on the passive node. I have yet to try suspending replication for the first (or all) storage groups before the backup to see if it works.
The hotfixes mentioned in that thread seem obsolete and are already included in Windows Server 2003 Service Pack 2.
 
I'm personally waiting a few weeks for either Symantec or Microsoft to come up with a solution before creating a support call. Well, forced to wait, as the BE product keys are yet to be delivered to us.

leonard_boyle
Level 3
We are seeing the same problem with 4 Exchange 2007 SP1 CCR clusters. Backup Exec is at SP2 plus hotfixes 31-35. We see on average one failure of storage group 1 every night. Our Backup Exec jobs are defined to back up all the storage groups on an Exchange server; we go for the passive node, one at a time.
It appears that the failures are always on storage group 1, even if storage group 1 is not the first storage group to back up.
 
The funny thing is that the backup reports finishing with no errors but there is a vss error listed in the job.
Not counting the backup error, this seems to be a problem with Backup Exec.
 
We opened a ticket with Symantec and everything the support person asked for was done already.
They did not indicate that anyone else was having a problem.
 
We now have a Microsoft case open. At least they know that others are having the problem.
 
It appears that Backup Exec has a problem here, and Microsoft and Symantec could handle the errors better, especially as Backup Exec does not fail the job.
 
Does anyone know if this problem has moved on to the dev group?
 
I plan to either reopen my existing case or open a new one.
Does anyone have a Symantec reference number to tie my problem to one already being worked on?
 
Thanks len
 

Ashish_Pandey
Level 3
Partner Accredited
Hi all,
I am also facing the same issue. When I back up my information stores separately, they are backed up successfully, most of the time when the backup is from the active copy. I also spoke to Symantec, and they told me that it is a Microsoft issue and to contact Microsoft about it.

KeyTravel
Not applicable
Hi All,
 
I'm also having exactly the same problem and error message as Justin Howell.
At first Symantec said it is Microsoft's fault, but after reading this forum, I'm sure this has to do with Backup Exec itself.
 
Has anyone had any luck with this issue?
 
thanks

mruiz650
Level 3
My system profile:
 
BE 12 on Win2K3 Server SP2 64-bit
Exchange 2007 SP1 on Win2K3 Server SP2 64-bit
 
I have the same issue. However, I am not using the Continuous Protection option. I am trying to back up a single storage group and I'm getting the writer failure:
 
Writer name: 'Microsoft Exchange Writer'
   Writer Id: {76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}
   Writer Instance Id: {a90f8a48-3dab-479b-9e77-ea328e79220a}
   State: [10] Failed
   Last error: No error
 
I installed the update (KB940349) with no success. Initially all writers passed (vssadmin list writers) after installing the patch, but the Exchange writer now shows as failed after a failed backup.
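For anyone checking writer state after each job, the output format quoted above is regular enough to scan with a short script. This is only a rough sketch: the function names `parse_writers` and `unhealthy` are made up for illustration, and the sample text is copied from the output in this post rather than from a live system.

```python
import re

SAMPLE = """\
Writer name: 'Microsoft Exchange Writer'
   Writer Id: {76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}
   Writer Instance Id: {a90f8a48-3dab-479b-9e77-ea328e79220a}
   State: [10] Failed
   Last error: No error
"""

def parse_writers(text: str) -> list:
    """Parse `vssadmin list writers` style output into one dict per writer."""
    writers = []
    for line in text.splitlines():
        line = line.strip()
        match = re.match(r"Writer name:\s*'(.+)'", line)
        if match:
            # A "Writer name" line starts a new writer record.
            writers.append({"name": match.group(1)})
        elif writers and ":" in line:
            # Any other "Key: value" line belongs to the current writer.
            key, _, value = line.partition(":")
            writers[-1][key.strip().lower()] = value.strip()
    return writers

def unhealthy(writers: list) -> list:
    """Return writers whose state is not Stable or whose last error is set."""
    return [
        w for w in writers
        if "stable" not in w.get("state", "").lower()
        or w.get("last error", "").lower() != "no error"
    ]

for writer in unhealthy(parse_writers(SAMPLE)):
    print(writer["name"], "|", writer["state"], "|", writer["last error"])
```

To run it against a live node, the text could come from `subprocess.run(["vssadmin", "list", "writers"], capture_output=True, text=True).stdout` in an elevated session on the Exchange server. Note that the sample above reports "No error" while the state is Failed, which is why the sketch checks both fields.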
 
No help from Symantec as they say it's a Microsoft issue. 
 
Just wanted to add more fuel to the fire hopefully forcing a fix for this from either Microsoft or Symantec.
 
Still trying to figure this out before I spend $300 on an incident with Microsoft.

bc4393
Level 2

I fixed this.  My backups ran fine without GRT turned on but crapped out with VSS errors when I had it enabled.

 

Turn off AOFO for your Exchange backups. Works like a champ now for me.

Obrycki
Not applicable

I'm also experiencing this issue with Exchange 2007 SP1 on Windows 2008, running with LCR in VMware. We started seeing this issue once we upgraded to BE v12.5; we didn't have it with v12.

 

The workaround we found is to select to back up the Active database instead of the Passive one in the AOFO options of Backup Exec. However, I'd like to switch back to the Passive if this ever gets resolved.

 

Luke_Cassar
Level 5
I know this is really old, but perhaps someone can suggest what might have fixed it for them?

Same sort of scenario: Exchange 2007 SP1 on Win 2003 Server with SP2 and BE 12 (SP3), getting much the same errors as the original poster, and just wondering what anyone was able to do to resolve this?

As we speak I am running a backup of the active database to see if that resolves the issue (after rebooting Exchange due to the failed writer status), but if that doesn't work then I am back to square 1.

Thanks in advance!