
Netbackup 7.5 with Exchange DAG 2010

karmel
Level 4

Hello Experts,

I am facing an issue with an Exchange backup.

I am running Exchange 2010 in a DAG on Windows 2008 R2.

The Master and Media servers run on the same box, Solaris 10 with NetBackup 7.5. The backup stores data to a Basic Disk I created on the Master/Media server. The Exchange database is small (3 GB).

A Checkpoint firewall sits between the Master/Media server and the Exchange server.

------------------------------------------------

The backup starts just fine, and a couple of snapshot jobs succeed, as the logs below show:

==================

09/21/2012 18:01:10 - Info nbjm (pid=27210) starting backup job (jobid=1287) for client dag-01.example.com, policy MBX2, schedule SRV-DAG
09/21/2012 18:01:10 - Info nbjm (pid=27210) requesting MEDIA_SERVER_ONLY resources from RB for backup job (jobid=1287, request id:{2DFAED00-03FD-11E2-B9C6-0003BA64CAA7})
09/21/2012 18:01:10 - requesting resource Exch-Basic
09/21/2012 18:01:10 - requesting resource SRV-BKP-01.NBU_CLIENT.MAXJOBS.dag-01.example.com
09/21/2012 18:01:10 - requesting resource SRV-BKP-01.NBU_POLICY.MAXJOBS.MBX2
09/21/2012 18:01:10 - requesting resource EXCHANGE_RESOLVER.SRV-BKP-01.MBX2.dag-01.example.com
09/21/2012 18:01:10 - granted resource  SRV-BKP-01.NBU_CLIENT.MAXJOBS.dag-01.example.com
09/21/2012 18:01:10 - granted resource  SRV-BKP-01.NBU_POLICY.MAXJOBS.MBX2
09/21/2012 18:01:10 - granted resource  EXCHANGE_RESOLVER.SRV-BKP-01.MBX2.dag-01.example.com
09/21/2012 18:01:11 - estimated 0 kbytes needed
09/21/2012 18:01:11 - begin Parent Job
09/21/2012 18:01:11 - Info RUNCMD (pid=27721) started
09/21/2012 18:01:11 - Info RUNCMD (pid=27721) exiting with status: 0
Operation Status: 0
09/21/2012 18:01:11 - end Parent Job; elapsed time 0:00:00
Operation Status: 0
Operation Status: 0
09/21/2012 18:01:12 - started process bpbrm (pid=27727)
09/21/2012 18:01:31 - Info bpbrm (pid=27727) dag-01.example.com is the host to restore to
09/21/2012 18:01:31 - Info bpbrm (pid=27727) reading file list from client
09/21/2012 18:03:58 - Info bpbrm (pid=27727) client_pid=11136
09/21/2012 18:03:58 - Info bpbrm (pid=27727) from client dag-01.example.com: TRV - BPRESOLVER has executed on server (MBX-01)
09/21/2012 18:05:58 - Info bpresolver (pid=11136) done.  status: 0
09/21/2012 18:05:58 - Info bpresolver (pid=11136) done. status: 0: the requested operation was successfully completed
Operation Status: 0
Operation Status: 0
Operation Status: 24
Operation Status: 0
09/21/2012 18:33:35 - Info RUNCMD (pid=1644) started
09/21/2012 18:33:35 - Info RUNCMD (pid=1644) exiting with status: 0
Operation Status: 0
Operation Status: 24
socket write failed  (24)

=======================

The backup job starts and stores data to the disk, gets up to 1 GB of data, then stops and throws this error:

09/21/2012 18:10:08 - Info nbjm (pid=27210) starting backup job (jobid=1289) for client MBX-01.example.com, policy MBX2, schedule SRV-DAG
09/21/2012 18:10:09 - estimated 0 kbytes needed
09/21/2012 18:10:09 - Info nbjm (pid=27210) started backup (backupid=dag-01.example.com_1348240208) job for client MBX-01.example.com, policy MBX2, schedule SRV-DAG on storage unit Exch-Basic
09/21/2012 18:10:10 - started process bpbrm (pid=28817)
09/21/2012 18:10:54 - Info bpbrm (pid=28817) MBX-01.example.com is the host to backup data from
09/21/2012 18:10:54 - Info bpbrm (pid=28817) reading file list from client
09/21/2012 18:11:13 - connecting
09/21/2012 18:11:53 - Info bpbrm (pid=28817) starting bpbkar on client
09/21/2012 18:11:53 - connected; connect time: 0:00:00
09/21/2012 18:12:42 - Info bpbkar (pid=9268) Backup started
09/21/2012 18:12:42 - Info bpbrm (pid=28817) bptm pid: 29100
09/21/2012 18:12:43 - Info bptm (pid=29100) start
09/21/2012 18:12:43 - Info bptm (pid=29100) using 262144 data buffer size
09/21/2012 18:12:43 - Info bptm (pid=29100) using 30 data buffers
09/21/2012 18:12:46 - Info bptm (pid=29100) start backup
09/21/2012 18:12:47 - Info bptm (pid=29100) backup child process is pid 29112
09/21/2012 18:12:47 - begin writing
09/21/2012 18:30:42 - Critical bpbrm (pid=28817) from client dag-01.example.com: FTL - socket write failed
09/21/2012 18:30:42 - Error bptm (pid=29112) system call failed - Connection reset by peer (at child.c.1298)
09/21/2012 18:30:42 - Error bptm (pid=29112) unable to perform read from client socket, connection may have been broken
09/21/2012 18:30:42 - Error bptm (pid=29100) media manager terminated by parent process
09/21/2012 18:30:48 - Error bpbrm (pid=28817) could not send server status message
09/21/2012 18:31:08 - Info bpbkar (pid=9268) done. status: 24: socket write failed
09/21/2012 18:31:08 - end writing; write time: 0:18:21
socket write failed  (24)

======================================================
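
For anyone reading the job details above, the time between `begin writing` and the first failure can be worked out with a short script. This is only a sketch in Python, not a NetBackup tool: the function name `time_to_first_error` is made up for illustration, and the line format (`MM/DD/YYYY HH:MM:SS - message`) is assumed from the excerpt above.

```python
import re
from datetime import datetime

# Assumed line format, taken from the job details quoted above:
# "MM/DD/YYYY HH:MM:SS - message"
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) - (.*)$")


def time_to_first_error(log_text):
    """Return (elapsed, message) for the first Error/Critical line
    that follows 'begin writing', or None if there is no such line."""
    start = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines without a timestamp, e.g. "Operation Status: 0"
        stamp = datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S")
        message = match.group(2)
        if message.startswith("begin writing"):
            start = stamp
        elif start is not None and message.startswith(("Critical", "Error")):
            return stamp - start, message
    return None


# Three lines copied from the job details above.
sample = """\
09/21/2012 18:12:47 - begin writing
09/21/2012 18:30:42 - Critical bpbrm (pid=28817) from client dag-01.example.com: FTL - socket write failed
09/21/2012 18:30:42 - Error bptm (pid=29112) system call failed - Connection reset by peer (at child.c.1298)
"""

result = time_to_first_error(sample)
if result:
    elapsed, message = result
    print(f"{elapsed} after 'begin writing': {message}")
```

On the lines quoted here the elapsed time comes out as 0:17:55. If a similar interval shows up on every failed run, that may be worth passing on to whoever looks at the firewall, though the log alone does not prove where the connection was dropped.
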

On the client (the Exchange node), the bpbkar log shows:

  Module: @(#) $Source: src/ncf/tfi/lib/TransporterRemote.cpp,v $ $Revision: 1.54 $ , Function: TransporterRemote::write[2](), Line: 321
  Module: @(#) $Source: src/ncf/tfi/lib/Packer.cpp,v $ $Revision: 1.90 $ , Function: Packer::getBuffer(), Line: 658
  Module: tar_tfi::getBuffer, Function: D:\NB\NB_7.5\src\cl\clientpc\util\tar_tfi.cpp, Line: 311
  Local Address: [::]:0
  Remote Address: [::]:0
  OS Error: 10060 (A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
)
  Expected bytes: 131072

6:16:57.957 PM: [9268.956] <2> tar_base::V_vTarMsgW: FTL - socket write failed
6:16:57.957 PM: [9268.956] <4> tar_backup::backup_done_state: INF - number of file directives not found: 0
6:16:57.957 PM: [9268.956] <4> tar_backup::backup_done_state: INF -     number of file directives found: 5
6:16:57.957 PM: [9268.724] <4> tar_base::keepaliveThread: INF - keepalive thread terminating (reason: WAIT_OBJECT_0)
6:16:57.957 PM: [9268.956] <4> tar_base::stopKeepaliveThread: INF - keepalive thread has exited. (reason: WAIT_OBJECT_0)
6:16:57.957 PM: [9268.956] <2> tar_base::V_vTarMsgW: INF - EXIT STATUS 24: socket write failed
6:16:57.957 PM: [9268.956] <4> tar_backup::backup_done_state: INF - Not waiting for server status
6:16:57.957 PM: [9268.956] <4> dos_backup::tfs_reset: INF - Snapshot deletion start
6:16:57.957 PM: [9268.956] <4> OVStopCmd: INF - EXIT - status = 0
6:16:57.957 PM: [9268.956] <2> tar_base::V_Close: closing...
6:16:57.957 PM: [9268.956] <4> dos_backup::tfs_reset: INF - Snapshot deletion start
6:16:57.957 PM: [9268.956] <2> ov_log::V_GlobalLog: INF - BEDS_Term(): enter - InitFlags:0x00000001
6:17:24.321 PM: [9268.956] <4> OVShutdown: INF - Finished process
6:17:24.336 PM: [9268.956] <4> WinMain: INF - Exiting C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe
6:17:26.364 PM: [9268.956] <4> ov_log::OVClose: INF - Closing log file: C:\Program Files\Veritas\NetBackup\logs\BPBKAR\092112.LOG

=================================================================================
 


karmel
Level 4
Could someone please help me?

V4
Level 6
Partner Accredited

Status 24 is mainly a network problem. Ask your network team to monitor the connection while the backup is running.

Also run SAS and share the output with the Symantec team. They will get back to you with a report on any network loss found in the SAS output.