Re: replication to remote master failing

noneatnoone · 04-18-2019

I'm running NetBackup version 8, and replication to the remote master is failing.

Where is the best place to start investigating this? The remote master is in Azure. Replication was working until a few weeks ago, and the remote master itself is running fine with no space issues.

18-Apr-2019 11:25:38 AM - requesting resource LCM_*Remote*Master*
18-Apr-2019 11:25:38 AM - granted resource  LCM_*Remote*Master*
18-Apr-2019 11:25:39 AM - Info nbreplicate (pid=221110) Suspend window close behavior is not supported for nbreplicate
18-Apr-2019 11:25:39 AM - Info nbreplicate (pid=221110) window close behavior: Continue processing the current image
18-Apr-2019 11:25:39 AM - started process RUNCMD (pid=221110)
18-Apr-2019 11:25:40 AM - requesting resource @aaaag
18-Apr-2019 11:25:40 AM - reserving resource @aaaag
18-Apr-2019 11:25:40 AM - resource @aaaag reserved
18-Apr-2019 11:25:40 AM - granted resource  MediaID=@aaaag;DiskVolume=PureDiskVolume;DiskPool=dp_disk_netbackup;Path=PureDiskVolume;StorageServer=netbackup;MediaServer=netbackup
18-Apr-2019 11:25:41 AM - Info bpdm (pid=221132) started
18-Apr-2019 11:25:41 AM - started process bpdm (pid=221132)
18-Apr-2019 11:25:44 AM - Info netbackup (pid=221132) Using OpenStorage to replicate backup id dbe01_1555574752, media id @aaaag, storage server netbackup, disk volume PureDiskVolume
18-Apr-2019 11:25:44 AM - Info netbackup (pid=221132) Replicating images to target storage server netbackupdr, disk volume PureDiskVolume
18-Apr-2019 11:41:46 AM - Critical bpdm (pid=221132) Storage Server Error: (Storage server: PureDisk:netbackup) async_get_job_status: Replication started but failed to complete successfully:  __process_batch: CRStoreDO failed: spool directory out of space or tlog files in spool directory are too old, DO fingerprint 741860d59047dbe4a088edfbfa30b62b. Look at the replication logs on the source storage server for more information. V-454-105
18-Apr-2019 11:41:46 AM - Error bpdm (pid=221132) wait failed: error 150
18-Apr-2019 11:41:46 AM - Error bpdm (pid=221132) <async> cancel failed: error 2060001: one or more invalid arguments
18-Apr-2019 11:41:46 AM - Error bpdm (pid=221132) copy cancel failed: error 174
18-Apr-2019 11:41:46 AM - Info netbackup (pid=221132) StorageServer=PureDisk:netbackup; Report=PDDO Stats for (netbackup): scanned: 4 KB, CR sent: 0 KB, CR sent over FC: 0 KB, dedup: 100.0%, cache disabled
18-Apr-2019 11:41:46 AM - Error nbreplicate (pid=221110) ReplicationJob::Replicate: Replication failed for backup id dbe01_1555574752: media write error (84)
18-Apr-2019 11:41:46 AM - Replicate failed for backup id dbe01_1555574752 with status 84
no images were successfully processed  (191)
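When reading job details like these, it can help to pull out only the Critical and Error lines. Below is a minimal sketch that does this for text in the format pasted above. The function name `failures` and the regular expression are illustrative assumptions, not part of NetBackup, and the pattern is based only on the lines shown in this post.

```python
import re

# Pattern inferred from the job details above, e.g.
# "18-Apr-2019 11:41:46 AM - Error bpdm (pid=221132) wait failed: error 150"
# This is an assumption about the layout, not an official log format.
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-\w{3}-\d{4} \d{1,2}:\d{2}:\d{2} [AP]M) - "
    r"(?P<sev>Critical|Error) (?P<proc>\S+) \(pid=(?P<pid>\d+)\) (?P<msg>.*)$"
)


def failures(job_details: str):
    """Return (timestamp, severity, process, message) for Critical/Error lines."""
    found = []
    for line in job_details.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            found.append((match["ts"], match["sev"], match["proc"], match["msg"]))
    return found


# Three lines copied from the job details above.
sample = """\
18-Apr-2019 11:25:41 AM - Info bpdm (pid=221132) started
18-Apr-2019 11:41:46 AM - Error bpdm (pid=221132) wait failed: error 150
18-Apr-2019 11:41:46 AM - Error bpdm (pid=221132) copy cancel failed: error 174
"""

for ts, sev, proc, msg in failures(sample):
    print(ts, sev, proc, msg)
```

Lines without a severity and pid, such as the final status lines, are skipped by this pattern, so the status codes (84, 191) still need to be read from the full output.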

Amol_Nair · 04-18-2019

To start with, could you verify that the MSDP storage on the target site has enough disk space available?

It may be either full or above the high water mark level that you have set.

The job details you shared show the following messages:

18-Apr-2019 11:25:44 AM - Info netbackup (pid=221132) Replicating images to target storage server netbackupdr, disk volume PureDiskVolume
18-Apr-2019 11:41:46 AM - Critical bpdm (pid=221132) Storage Server Error: (Storage server: PureDisk:netbackup) async_get_job_status: Replication started but failed to complete successfully:  __process_batch: CRStoreDO failed: spool directory out of space or tlog files in spool directory are too old, DO fingerprint 741860d59047dbe4a088edfbfa30b62b. Look at the replication logs on the source storage server for more information. V-454-105

If disk space is available on the target storage, we will need to go through the logs to get more insight into the issue.

noneatnoone · 04-25-2019

Thanks for the reply.

MSDP currently has 322 GB free of 3.99 TB.
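As a rough check of those figures against a high water mark, here is a minimal sketch of the arithmetic. The `HIGH_WATER_MARK` value of 98 is a placeholder assumption, not a value read from this disk pool, and the 1024 GB per TB conversion is also an assumption about how the sizes are reported.

```python
def percent_used(free_gb: float, total_tb: float, gb_per_tb: float = 1024.0) -> float:
    """Percentage of the pool in use, given free space in GB and total size in TB."""
    total_gb = total_tb * gb_per_tb
    return 100.0 * (total_gb - free_gb) / total_gb


# Hypothetical threshold; substitute the high water mark actually set on the disk pool.
HIGH_WATER_MARK = 98.0

used = percent_used(free_gb=322, total_tb=3.99)
print(f"{used:.1f}% used")
if used >= HIGH_WATER_MARK:
    print("at or above the high water mark")
else:
    print("below the high water mark")
```

With these numbers the pool comes out at roughly 92% used, whichever GB-per-TB convention is applied.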

How could I go about cleaning that up, if that is the issue?

Which logs do I need to retrieve and look at for the replication?

Thank you
