Forum Discussion

Altria92
Level 4
14 years ago

Exchange 2010 DAG fails

Until recently, NBU backup jobs for our Exchange 2010 DAG were running successfully. They have now begun failing midway through the backup:

4/27/2011 12:02:52 PM - positioned 0113L3; position time: 00:01:02
4/27/2011 12:02:52 PM - begin writing
4/27/2011 12:04:08 PM - Error bpbrm(pid=3656) from client ex1.bd.com ERR - Terminating backup.      

4/27/2011 12:04:08 PM - Error bptm(pid=1576) socket operation failed - 10054 (at child.c.1293)      
4/27/2011 12:04:08 PM - Error bptm(pid=1576) unable to perform read from client socket, connection may have been broken
4/27/2011 12:04:08 PM - Error bpbrm(pid=3656) from client mem.hunter.cuny.edu: ERR - failure reading file: Microsoft Information Store:\mdb00\Logs_1303919951 (BEDS 0x0: )
4/27/2011 12:04:08 PM - Info bpbrm(pid=3656) DB_BACKUP_STATUS is 13          
4/27/2011 12:04:39 PM - Info bptm(pid=5224) EXITING with status 42 <----------        
4/27/2011 12:04:39 PM - Error bpbrm(pid=3656) could not send server status message       
4/27/2011 12:04:42 PM - end writing; write time: 00:01:50
network read failed(42)
4/27/2011 12:04:47 PM - Info bpbkar32(pid=10268) done. status: 42: network read failed      

I say midway because it does begin to write data (see the "begin writing" line above). The sudden termination is disturbing, as nothing on the network side has changed.

As a side note, when I select only the passive copy of the DAG for backup, it still appears to back up both. In fact, the one that is failing is the passive copy, while the active copy shows as successful.
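
The bare 10054 in the bptm line above is a Winsock error code. The following is a minimal sketch, assuming Python is available on an admin workstation, that maps the three Winsock codes seen in this thread to their standard meanings. The function name `explain_socket_errors` and the `sample` list are illustrative only and are not part of NetBackup.

```python
import re

# Standard Winsock error codes that appear in this thread.
WINSOCK_ERRORS = {
    10053: "WSAECONNABORTED: software caused connection abort",
    10054: "WSAECONNRESET: connection reset by peer",
    10058: "WSAESHUTDOWN: cannot send after socket shutdown",
}

# Matches the two places a code shows up in these logs:
#   "socket operation failed - 10054"  (job details)
#   "(TCP 10053: ..."                   (bpbkar log)
CODE_RE = re.compile(r"(?:failed - |TCP )(\d{5})")


def explain_socket_errors(lines):
    """Return (code, meaning) pairs for each known Winsock code found."""
    found = []
    for line in lines:
        for match in CODE_RE.finditer(line):
            code = int(match.group(1))
            if code in WINSOCK_ERRORS:
                found.append((code, WINSOCK_ERRORS[code]))
    return found


# Two lines copied from the job details above.
sample = [
    "4/27/2011 12:04:08 PM - Error bptm(pid=1576) socket operation failed - 10054 (at child.c.1293)",
    "4/27/2011 12:04:08 PM - Info bpbrm(pid=3656) DB_BACKUP_STATUS is 13",
]

for code, meaning in explain_socket_errors(sample):
    print(code, meaning)
```

A connection reset reported by the media server only says that the client side dropped the socket. It does not say why, so the client-side logs are still needed.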

9 Replies

  • Thanks Zahid, but I have seen that document before and confirmed that the media server is listed on the master server.

    As I mentioned, the backups had been running until SnapManager was installed on the Exchange mail servers. I believe there is a conflict with VSS, which both applications seem to use. The problem is that I cannot set VSS to use a port or communication channel other than the one that seems to be occupied by NetApp SME. I wanted to know whether anyone has encountered such a problem when using two applications under VSS.

  • Please ensure that you have bpfis and bpbkar logs on both nodes.

    Snapshot problems will be logged in bpfis log.

    Please also let us know your NBU version. There are important fixes for Exchange 2010 in NBU 7.0.1.

  • As Marianne said, please share the logs. In the meantime, are you sure that the firewall on the Exchange machine is not blocking the connection?

    If possible, stop the firewall temporarily, just as a test.

  • The version is 7.1. I am not aware of any hotfixes for Exchange 2010 on this NBU version.

    As I mentioned, it was working before the one known change, which was the installation of NetApp SnapManager for Exchange (SME). Below is a snippet from some of the logs:

    bpbkar:

    4:21:40.204 AM: [4136.1544] <2> tar_base::V_Close: closing...
    4:21:40.204 AM: [4136.1544] <4> dos_backup::tfs_reset: INF - Snapshot deletion start
    4:21:40.204 AM: [4136.1544] <2> ov_log::V_GlobalLog: INF - BEDS_Term(): enter - InitFlags:0x00000001
    4:21:40.220 AM: [4136.1544] <16> dtcp_read: TCP - failure: recv socket (620) (TCP 10053: Software caused connection abort)
    4:21:41.234 AM: [4136.1544] <16> dtcp_read: TCP - failure: recv socket (620) (TCP 10053: Software caused connection abort)
    4:21:42.248 AM: [4136.1544] <16> dtcp_read: TCP - failure: recv

    ...........................................................................................

    4:22:17.751 AM: [4136.1544] <16> dtcp_read: TCP - failure: recv socket (616) (TCP 10058: Can't send after socket shutdown)
    4:22:17.751 AM: [4136.1544] <2> dtcp_close: TCP - success: close socket (616)
    4:22:17.751 AM: [4136.1544] <2> dtcp_close: TCP - success: close socket (292)
    4:22:17.751 AM: [4136.1544] <2> dtcp_close: TCP - success: close socket (620)
    4:22:17.751 AM: [4136.1544] <4> OVShutdown: INF - Finished process
    4:22:17.751 AM: [4136.1544] <4> WinMain: INF - Exiting C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe
    4:22:19.780 AM: [4136.1544] <4> ov_log::OVClose: INF - Closing log file: C:\Program Files\Veritas\NetBackup\logs\BPBKAR\050111.LOG

     

    Here is a successful job:

    4/28/2011 4:03:50 AM - Info bpbrm(pid=3540) DB_BACKUP_STATUS is 0          
    4/28/2011 4:03:50 AM - Info bptm(pid=4440) waited for full buffer 281 times, delayed 288 times    
    4/28/2011 4:03:56 AM - Info bptm(pid=4440) EXITING with status 0 <----------        
    4/28/2011 4:03:56 AM - Info bpbrm(pid=3540) validating image for client ex1.db.com       
    4/28/2011 4:04:00 AM - end writing; write time: 00:00:14
    4/28/2011 4:04:05 AM - Info bpbkar32(pid=10716) done. status: 0: the requested operation was successfully completed    
    the requested operation was successfully completed(0)

    Failure:

    M - begin writing
    4/29/2011 4:16:11 AM - Error bpbrm(pid=5956) from client ex1.db.com: ERR - Terminating backup.      
    4/29/2011 4:16:11 AM - Error bptm(pid=2868) socket operation failed - 10054 (at child.c.1293)      
    4/29/2011 4:16:11 AM - Error bpbrm(pid=5956) from client ex1.bd.com: ERR - failure reading file: Microsoft Information Store:\Public Folder Database\Logs_1304064028 (BEDS 0x0: )
    4/29/2011 4:16:11 AM - Info bpbrm(pid=5956) DB_BACKUP_STATUS is 13          
    4/29/2011 4:16:11 AM - Error bptm(pid=2868) unable to perform read from client socket, connection may have been broken
    4/29/2011 4:16:16 AM - Info bptm(pid=5152) EXITING with status 42 <----------        
    4/29/2011 4:16:16 AM - Error bpbrm(pid=5956) could not send server status message       
    4/29/2011 4:16:18 AM - end writing; write time: 00:00:17
    4/29/2011 4:16:23 AM - Info bpbkar32(pid=12744) done. status: 42: network read failed       
    network read failed(42)
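
To compare the successful and failed jobs above side by side, the status markers can be pulled out of the job details. This is a rough sketch only: the regular expressions are written against the lines quoted in this thread, and `summarize_job` is a hypothetical helper, not a NetBackup tool.

```python
import re

# Patterns for the three status markers that appear in the job details.
PATTERNS = {
    "db_backup_status": re.compile(r"DB_BACKUP_STATUS is (\d+)"),
    "bptm_exit": re.compile(r"bptm\(pid=\d+\) EXITING with status (\d+)"),
    "bpbkar_status": re.compile(r"bpbkar32\(pid=\d+\) done\. status: (\d+)"),
}


def summarize_job(lines):
    """Collect the numeric status reported by each marker, if present."""
    summary = {}
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                summary[name] = int(match.group(1))
    return summary


# Lines copied from the successful job (4/28) and the failed job (4/29).
good = [
    "4/28/2011 4:03:50 AM - Info bpbrm(pid=3540) DB_BACKUP_STATUS is 0",
    "4/28/2011 4:03:56 AM - Info bptm(pid=4440) EXITING with status 0 <----------",
    "4/28/2011 4:04:05 AM - Info bpbkar32(pid=10716) done. status: 0: the requested operation was successfully completed",
]
bad = [
    "4/29/2011 4:16:11 AM - Info bpbrm(pid=5956) DB_BACKUP_STATUS is 13",
    "4/29/2011 4:16:16 AM - Info bptm(pid=5152) EXITING with status 42 <----------",
    "4/29/2011 4:16:23 AM - Info bpbkar32(pid=12744) done. status: 42: network read failed",
]

print(summarize_job(good))
print(summarize_job(bad))
```

In the failed job the database backup status (13) differs from the final job status (42). That fits the "failure reading file" line on the client appearing alongside the network read failure reported by the media server.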

  • Similar problems posted previously on NBU and BE forums:

    https://www-secure.symantec.com/connect/forums/backup-exchange-netapp-snapmanager

    https://www-secure.symantec.com/connect/forums/got-open-ticket-w-symantec-will-post-here-too-exchange-2007-w-backupexec-2010

    Neither of them was solved.

    Best to log Support calls with Symantec and NetApp.

  • Check the system event logs for any volsnap errors. It might be that there is not enough room on the drive for the VSS snapshots.

  • One more thing to check:

    Go to NetBackup Server Host Properties, where you will find Master, Media and Clients. Double-click Clients, then double-click the specific client. If the properties open, that should confirm that the firewall is not the bone of contention.

  • It turns out that the issue did show up in the system logs. A corresponding SnapDrive error appeared, stating that snapshot log deletion failed. It coincided with the specific times at which the NBU backups failed. So it seems that SME and NBU are stepping on each other via VSS.

    HTH someone encountering the same issue.
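
For anyone wanting to repeat the timing check described above, here is a minimal sketch of matching NBU failure times against event log entries. The NBU times are taken from the job details earlier in the thread. The SnapDrive times are hypothetical placeholders, since the thread does not quote them, and the five-minute window is an arbitrary assumption.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the NetBackup job details in this thread.
FMT = "%m/%d/%Y %I:%M:%S %p"


def parse(ts):
    """Parse a timestamp such as '4/27/2011 12:04:08 PM'."""
    return datetime.strptime(ts, FMT)


def correlate(nbu_failures, snapdrive_errors, window=timedelta(minutes=5)):
    """Pair each NBU failure with any SnapDrive error within `window` of it."""
    pairs = []
    for failure in nbu_failures:
        for error in snapdrive_errors:
            if abs(failure - error) <= window:
                pairs.append((failure, error))
    return pairs


# Failure times taken from the job details in this thread.
nbu_failures = [
    parse("4/27/2011 12:04:08 PM"),
    parse("4/29/2011 4:16:11 AM"),
]

# HYPOTHETICAL SnapDrive event times, for illustration only.
snapdrive_errors = [
    parse("4/27/2011 12:03:30 PM"),
    parse("4/28/2011 9:00:00 PM"),
    parse("4/29/2011 4:15:40 AM"),
]

for failure, error in correlate(nbu_failures, snapdrive_errors):
    print(f"NBU failure at {failure} is near SnapDrive error at {error}")
```

Matching timestamps only suggests a link. It narrows down where to look and does not by itself prove that the two products are contending for the same snapshot.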