09-05-2018 05:23 AM
Hi.
Has anybody experienced a random replication issue? The replication runs for a while and then starts to fail with the following output:
30.jul.2018 14:03:43 - requesting resource LCM_*Remote*Master*:from_c*****k001-s*****k001
30.jul.2018 14:03:43 - granted resource LCM_*Remote*Master*:from_c*****k001-s*****k001
30.jul.2018 14:03:44 - Info nbreplicate (pid=2644) Suspend window close behavior is not supported for nbreplicate
30.jul.2018 14:03:44 - started process RUNCMD (pid=2644)
30.jul.2018 14:03:44 - Info nbreplicate (pid=2644) window close behavior: Continue processing the current image
30.jul.2018 14:03:44 - requesting resource @aaaao
30.jul.2018 14:03:44 - reserving resource @aaaao
30.jul.2018 14:03:45 - resource @aaaao reserved
30.jul.2018 14:03:45 - granted resource MediaID=@aaaao;DiskVolume=PureDiskVolume;DiskPool=dp_disk_c*****k001;Path=PureDiskVolume;StorageServer=c*****k001;MediaServer=c*****k001
30.jul.2018 14:03:48 - Info bpdm (pid=27791) started
30.jul.2018 14:03:48 - started process bpdm (pid=27791)
30.jul.2018 14:03:53 - Info c*****k001 (pid=27791) StorageServer=PureDisk:c*****k001; Report=PDDO Stats for (c*****k001): scanned: 5 KB, CR sent: 0 KB, CR sent over FC: 0 KB, dedup: 100.0%, cache disabled
30.jul.2018 14:03:55 - Info c*****k001 (pid=27791) Using OpenStorage to replicate backup id SO*****B001_1532883655, media id @aaaao, storage server c*****k001, disk volume PureDiskVolume
30.jul.2018 14:03:55 - Info c*****k001 (pid=27791) Replicating images to target storage server s*****k001, disk volume PureDiskVolume
30.jul.2018 14:04:27 - Critical bpdm (pid=27791) Storage Server Error: (Storage server: PureDisk:c*****k001) CACompleteAIRReplicate: Failed to complete completeAIRReplicate webservice (Could not send replication complete event: Error sending replication event. Check to ensure events are being processed correctly on target. (too many open files) ) V-454-60
30.jul.2018 14:04:27 - Error bpdm (pid=27791) <async> wait failed: error 2060043: no events available
30.jul.2018 14:04:27 - Error bpdm (pid=27791) wait failed: error 174
30.jul.2018 14:04:28 - Error bpdm (pid=27791) <async> cancel failed: error 2060001: one or more invalid arguments
30.jul.2018 14:04:28 - Error bpdm (pid=27791) copy cancel failed: error 174
30.jul.2018 14:04:28 - Info c*****k001 (pid=27791) StorageServer=PureDisk:c*****k001; Report=PDDO Stats for (c*****k001): scanned: 39655228 KB, CR sent: 29462 KB, CR sent over FC: 0 KB, dedup: 99.9%, cache disabled
30.jul.2018 14:04:28 - Error nbreplicate (pid=2644) ReplicationJob::Replicate: Replication failed for backup id SO*****B001_1532883655: media write error (84)
30.jul.2018 14:04:28 - Replicate failed for backup id SO*****B001_1532883655 with status 84
09-05-2018 06:01 AM
Have you seen this error?
(too many open files)
Please check the OS-level open-file limits in /etc/security/limits.conf.
Check both the source and the target storage servers.
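If it helps, here is a rough Python sketch for that check. It is not a NetBackup tool, and the function names are made up for illustration. It lists the "nofile" entries in limits.conf and prints the open-file limit of the process that runs it. Keep in mind that a daemon that is already running may have different limits than the file suggests; on Linux the effective values for a running process are shown in /proc/<pid>/limits.

```python
import resource  # Unix-only standard library module


def parse_nofile_limits(text):
    """Return (domain, type, value) tuples for every 'nofile' entry.

    Hypothetical helper: expects the usual limits.conf layout of
    <domain> <type> <item> <value>, with '#' starting a comment.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        fields = line.split()
        if len(fields) == 4 and fields[2] == "nofile":
            entries.append((fields[0], fields[1], fields[3]))
    return entries


def current_nofile_limit():
    """Return the (soft, hard) open-file limit of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("this process (soft, hard):", current_nofile_limit())
    try:
        with open("/etc/security/limits.conf") as fh:
            for entry in parse_nofile_limits(fh.read()):
                print(entry)
    except OSError as exc:
        print("could not read limits.conf:", exc)
```

Run it on each Linux/UNIX storage server involved and compare the results.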
PS:
Is this the current date on your NBU master server?
30.jul.2018
09-05-2018 06:47 AM
The master server is Windows Server 2012 R2 Standard (NetBackup 8.1), media appliance 5330, media server UX.
The timestamps in the log are not current; I took an older log. The actual time on the server is OK and synced.