SLP fails on a specific MSDP

Aurelien59
Level 4

Hi all,

I have the impression that these SLP policies do not like me very much... Until recently they all worked well, correctly saving the images covered by the SLP policies. But recently, without my knowing why (reboot, loss of config, something else?), status 191 errors have started to appear for backup replications.

I don't know how to explain it better, so I will simply paste the job details of one of the several client backup replications that fail:

24 oct. 2022 13:35:37 - requesting resource LCM_*Remote*Master*:AI-REP-ARCHIVES
24 oct. 2022 13:35:37 - granted resource  LCM_*Remote*Master*:AI-REP-ARCHIVES
24 oct. 2022 13:35:41 - started process RUNCMD (pid=22208)
24 oct. 2022 13:35:42 - Info nbreplicate (pid=22208) Suspend window close behavior is not supported for nbreplicate
24 oct. 2022 13:35:42 - Info nbreplicate (pid=22208) window close behavior: Continue processing the current image
24 oct. 2022 13:35:47 - requesting resource @aaaar
24 oct. 2022 13:35:47 - reserving resource @aaaar
24 oct. 2022 13:35:49 - resource @aaaar reserved
24 oct. 2022 13:35:49 - granted resource  MediaID=@aaaar;DiskVolume=PureDiskVolume;DiskPool=Disk_Pool_REP;Path=PureDiskVolume;StorageServer=srv-nbumed01-ai.process.dkm;MediaServer=srv-nbumed01-ai.process.dkm
24 oct. 2022 13:35:52 - Info bpdm (pid=9604) started
24 oct. 2022 13:35:52 - started process bpdm (pid=9604)
24 oct. 2022 13:36:01 - Info srv-nbumed01-ai.process.dkm (pid=9604) StorageServer=PureDisk:srv-nbumed01-ai.process.dkm; Report=PDDO Stats for (srv-nbumed01-ai.process.dkm): scanned: 5 KB, CR sent: 0 KB, CR sent over FC: 0 KB, dedup: 100.0%, cache disabled
24 oct. 2022 13:36:39 - Info srv-nbumed01-ai.process.dkm (pid=9604) Using OpenStorage to replicate backup id PC-AILOG12147_1666082526, media id @aaaar, storage server srv-nbumed01-ai.process.dkm, disk volume PureDiskVolume
24 oct. 2022 13:36:40 - Info srv-nbumed01-ai.process.dkm (pid=9604) Replicating images to target storage server srvv-nbu1-rep.process.dkm, disk volume PureDiskVolume
24 oct. 2022 13:37:00 - Critical bpdm (pid=9604) Storage Server Error: (Storage server: PureDisk:srv-nbumed01-ai.process.dkm) CALaunchAIRReplicate: Failed to complete launchAIRReplicate webservice (Could not setup replication: get Remote SPA ( srvv-nbu1-rep.process.dkm ) webservice failed, could not determine whether target is PDDE or PDDO (software caused connection abort) ) V-454-61
24 oct. 2022 13:37:00 - Error bpdm (pid=9604) <async> copy image failed: error 2060017: system call failed
24 oct. 2022 13:37:00 - Error bpdm (pid=9604) copy failed: error 174
24 oct. 2022 13:37:00 - Error bpdm (pid=9604) <async> cancel failed: error 2060001: one or more invalid arguments
24 oct. 2022 13:37:00 - Error bpdm (pid=9604) copy cancel failed: error 174
24 oct. 2022 13:37:01 - Info srv-nbumed01-ai.process.dkm (pid=9604) StorageServer=PureDisk:srv-nbumed01-ai.process.dkm; Report=PDDO Stats for (srv-nbumed01-ai.process.dkm): scanned: 4 KB, CR sent: 0 KB, CR sent over FC: 0 KB, dedup: 100.0%, cache disabled
24 oct. 2022 13:37:01 - Error nbreplicate (pid=22208) ReplicationJob::Replicate: Replication failed for backup id PC-AILOG12147_1666082526: media write error (84)
24 oct. 2022 13:37:01 - Replicate failed for backup id PC-AILOG12147_1666082526 with status 84
no images were successfully processed  (191)

I'm sure some information is missing here, so don't hesitate to tell me which logs would be useful to share and how to export them, please.

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions

Hi all

It seems the issue was due to network problems... Our network team made some mistakes and did not give us the right information. The replication server was not able to reach the MSDP because of new firewall rules. I should have tested connectivity (ping) in both directions.
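
For anyone hitting the same "software caused connection abort" message, a quick two-way connectivity test can be sketched as below. This is only an illustration, not an official Veritas tool. The port list is an assumption: 1556 (PBX), 10082 (spoold) and 10102 (spad) are the defaults commonly documented for NetBackup and MSDP, so check them against your own environment. The names `check_tcp`, `check_host` and `ASSUMED_PORTS` are made up for this example. Note that ping only proves ICMP passes; a firewall can still block the TCP ports, which is why the sketch tests TCP.

```python
import socket
import sys

# Assumption: commonly documented defaults (PBX, spoold, spad).
# Adjust to whatever your NetBackup/MSDP setup actually uses.
ASSUMED_PORTS = (1556, 10082, 10102)


def check_tcp(host, port, timeout=3.0):
    """Return (True, "open") if a TCP connection succeeds, else (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "open"
    except OSError as exc:
        # Covers refused connections, timeouts and name-resolution failures.
        return False, f"{type(exc).__name__}: {exc}"


def check_host(host, ports=ASSUMED_PORTS, timeout=3.0):
    """Check every port on one host and return {port: (ok, detail)}."""
    return {port: check_tcp(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # Each command-line argument is a remote host to test from this machine.
    for target in sys.argv[1:]:
        for port, (ok, detail) in check_host(target).items():
            status = "OK  " if ok else "FAIL"
            print(f"{status} {target}:{port} {detail}")
```

The script only tests the direction "this machine to the remote host", so it has to be run twice: once on the source storage server (here srv-nbumed01-ai.process.dkm) with the target (srvv-nbu1-rep.process.dkm) as argument, and once on the target with the source as argument. A port that is open one way and fails the other way points to a one-directional firewall rule.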

 


2 REPLIES

davidmoline
Level 6
Employee

Hi @Aurelien59 

You don't mention which NetBackup version you are running, but if you have recently upgraded to a post-8.1 version, this article may be applicable: https://www.veritas.com/content/support/en_US/article.100044606.html

In summary, it is possible the secure comms process has gone awry, and the solution is to re-establish the trust between the master server and the relevant MSDP storage servers.

If this doesn't help, can you provide more details on the NetBackup versions of the source and destination of the replication job?

Cheers
David
