02-22-2024 09:08 PM
Good day
Kindly assist me with the issue below.
For several days now, all my NetBackup replication jobs have been failing with status code 191. In the NetBackup Administration Console, under Activity Monitor, when I try to restart a job (to retry the replication), the restart option is grayed out.
NetBackup version: 9.1
I would greatly appreciate your support. Please find the job log below.
Feb 22, 2024 3:06:13 PM - requesting resource LCM_KS-BKP-MED02.abc.root.net
Feb 22, 2024 3:06:13 PM - granted resource LCM_KS-BKP-MED02.abc.root.net
Feb 22, 2024 3:06:14 PM - Info nbreplicate (pid=14456) Suspend window close behavior is not supported for nbreplicate
Feb 22, 2024 3:06:14 PM - Info nbreplicate (pid=14456) window close behavior: Continue processing the current image
Feb 22, 2024 3:06:14 PM - started process RUNCMD (pid=14456)
Feb 22, 2024 3:06:14 PM - requesting resource @aaaad
Feb 22, 2024 3:06:14 PM - reserving resource @aaaad
Feb 22, 2024 3:06:15 PM - resource @aaaad reserved
Feb 22, 2024 3:06:15 PM - granted resource MediaID=@aaaad;DiskVolume=PureDiskVolume;DiskPool=DP_PD_MED02;Path=PureDiskVolume;StorageServer=ac1-bkp-med02.abc.root.net;MediaServer=ac1-bkp-med03.abc.root.net
Feb 22, 2024 3:06:16 PM - Info bpdm (pid=1544) started
Feb 22, 2024 3:06:16 PM - started process bpdm (pid=1544)
Feb 22, 2024 3:06:22 PM - Info ac1-bkp-med03.abc.root.net (pid=1544) StorageServer=PureDisk:ac1-bkp-med02.abc.root.net; Report=PDDO Stats for (ac1-bkp-med02.abc.root.net): scanned: 4 KB, CR sent: 0 KB, CR sent over FC: 0 KB, dedup: 100.0%, cache disabled, where dedup space saving:75.0%, compression space saving:25.0%
Feb 22, 2024 3:06:25 PM - Info ac1-bkp-med03.abc.root.net (pid=1544) Using OpenStorage to replicate backup id ghasss1_1708465505, media id @aaaad, storage server ac1-bkp-med02.abc.root.net, disk volume PureDiskVolume
Feb 22, 2024 3:06:26 PM - Info ac1-bkp-med03.abc.root.net (pid=1544) Replicating images to target storage server KS-BKP-MED02.abc.root.net, disk volume PureDiskVolume
Feb 22, 2024 3:17:01 PM - Critical bpdm (pid=1544) Storage Server Error: (Storage server: PureDisk:ac1-bkp-med02.abc.root.net) async_get_job_status: Replication started but failed to complete successfully: __sosend: _crStreamWrite failed to send crc: connection reset by peer. Look at the replication logs on the source storage server for more information. V-454-105
Feb 22, 2024 3:17:01 PM - Error bpdm (pid=1544) <async> wait failed: error 2060014: operation aborted
Feb 22, 2024 3:17:01 PM - Error bpdm (pid=1544) wait failed: error 150
Feb 22, 2024 3:17:01 PM - Error bpdm (pid=1544) <async> cancel failed: error 2060001: one or more invalid arguments
Feb 22, 2024 3:17:01 PM - Error bpdm (pid=1544) copy cancel failed: error 174
Feb 22, 2024 3:17:02 PM - Info ac1-bkp-med03.abc.root.net (pid=1544) StorageServer=PureDisk:ac1-bkp-med02.abc.root.net; Report=PDDO Stats for (ac1-bkp-med02.abc.root.net): scanned: 4 KB, CR sent: 0 KB, CR sent over FC: 0 KB, dedup: 100.0%, cache disabled, where dedup space saving:0.0%, compression space saving:100.0%
Feb 22, 2024 3:17:02 PM - Error nbreplicate (pid=14456) ReplicationJob::Replicate: Replication failed for backup id ghasss1_1708465505: media write error (84)
Feb 22, 2024 3:17:02 PM - Replicate failed for backup id ghasss1_1708465505 with status 84
no images were successfully processed (191)
Thank you
02-23-2024 01:15 AM
Hi
On the source storage server, enable verbose logging (add VERBOSE = 5 to bp.conf) and create the log directories by running the mklogdir script located in /usr/openv/netbackup/logs.
Wait for the job to be retried if it is SLP-based, or kick it off again manually.
Once it has failed, review the bpdm log and possibly a few more, depending on what the first one shows.
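While waiting for the verbose logs, the job details already posted can be narrowed down to the failing lines. The following is only a rough sketch, assuming Python 3 is available and that each line follows the "date - Severity message" layout seen in the job details above. The function name failures is made up for illustration, and the legacy bpdm log files use a different layout, so the pattern would need adjusting for them.

```python
import re

# Matches job-details lines such as:
#   Feb 22, 2024 3:17:01 PM - Error bpdm (pid=1544) wait failed: error 150
# Assumption: only the Error and Critical severities are of interest here.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) - "
    r"(?P<severity>Critical|Error)\s+(?P<message>.*)$"
)


def failures(lines):
    """Return (timestamp, severity, message) for each Error/Critical line."""
    found = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            found.append((match["ts"], match["severity"], match["message"]))
    return found
```

Feeding it the job details from the first post should leave only the bpdm and nbreplicate failure lines, which makes it easier to see that the first failure is the Critical "connection reset by peer" entry and the rest follow from it.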
02-23-2024 03:24 AM
Also check the following:
- service status on the destination storage server (restart if possible)
- connectivity over ports 10102 and 10082 between target and source
The replication.log under the spoold folder on the source would be helpful as well.
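If telnet is not installed, the port check can be sketched in a few lines of Python. This is a minimal example, assuming Python 3 is available on the server it is run from. The names port_open, check_ports and MSDP_PORTS are made up for illustration, and the hostname in the comment is the target storage server from the job log.

```python
import socket

# Ports named in this thread for source/target connectivity.
MSDP_PORTS = (10082, 10102)


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports=MSDP_PORTS, timeout=5.0):
    """Map each port to True (reachable) or False (not reachable)."""
    return {port: port_open(host, port, timeout) for port in ports}


# Example, run from the source storage server:
# print(check_ports("KS-BKP-MED02.abc.root.net"))
```

Like telnet, this only shows that the TCP connection can be opened. It would not reveal a connection that is reset partway through a transfer, so a clean result here does not rule out the network.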
02-23-2024 05:53 AM
Thanks Quebek,
This is a Windows Server 2016 machine. Where do I enable verbose logging on Windows (storage server)?
bptestbpcd from master to client is ok
same to client is ok
I have restarted the service, but the jobs still fail.
02-23-2024 05:54 AM
Hello Hamza_H
I have restarted the same.
Telnet at both ends is also fine, but the error is still there.