07-09-2018 10:33 AM
Hi experts,
I have multiple duplication jobs running between two DDs, one in my DC and another at a remote site. Data for multiple client machines is duplicated between these DDs using SLP without any issues, but the backup of one client machine alone gives trouble during duplication.
When the duplication starts, it runs for several hours, copies over 1 TB of data, and then fails with status 191. Even now, a running duplication for this one client's data is giving trouble. Please see the detailed status of the job below and advise.
A screenshot is also attached for reference.
07/06/2018 02:39:32 - requesting resource LCM_brsnsdd6300_1-stu
07/06/2018 02:39:34 - granted resource LCM_brsnsdd6300_1-stu
07/06/2018 02:39:34 - started process RUNCMD (pid=9592)
07/06/2018 02:39:34 - ended process 0 (pid=9592)
07/06/2018 02:39:35 - begin Duplicate
07/06/2018 02:39:35 - requesting resource brsnsdd6300_1-stu
07/06/2018 02:39:35 - reserving resource @aaabm
07/06/2018 02:39:37 - resource @aaabm reserved
07/06/2018 02:39:37 - granted resource MediaID=@aaabo;DiskVolume=BRS_NEW_DD;DiskPool=brsnsdd6300_1-dp;Path=BRS_NEW_DD;StorageServer=brsnsdd6300_1.na.ad.:smileyhappy:.com;MediaServer=mhlwabkpms02p.na.ad.:smileyhappy:.com
07/06/2018 02:39:37 - granted resource brsnsdd6300_1-stu
07/06/2018 02:39:38 - Info Duplicate (pid=9592) Initiating optimized duplication from @aaabm to @aaabo
07/06/2018 02:39:38 - requesting resource @aaabm
07/06/2018 02:39:38 - granted resource MediaID=@aaabm;DiskVolume=MHL_NEW_DD_1;DiskPool=mhlnsdd6300_1-DP;Path=MHL_NEW_DD_1;StorageServer=MHLNSDD6300_1.na.ad.:cathappy:.com;MediaServer=mhlwabkpms02p.na.ad.:cathappy:.com
07/06/2018 02:39:39 - Info bpduplicate (pid=9592) Suspend window close behavior is not supported for optimized duplications
07/06/2018 02:39:39 - Info bpduplicate (pid=9592) window close behavior: Continue processing the current image
07/06/2018 02:39:42 - Info bpdm (pid=26300) started
07/06/2018 02:39:42 - started process bpdm (pid=26300)
07/06/2018 02:39:44 - Info bpdm (pid=26300) requesting nbjm for media
07/06/2018 02:39:48 - begin writing
07/06/2018 05:49:30 - end writing; write time: 3:09:42
07/06/2018 05:49:31 - begin writing
07/06/2018 13:37:17 - end writing; write time: 7:47:46
07/06/2018 13:37:18 - begin writing
07/07/2018 01:12:54 - Critical bpdm (pid=26300) sts_copy_extent failed: error 2060046 plugin error
07/07/2018 01:12:54 - Critical bpdm (pid=26300) image copy failed: error 2060046: plugin error
07/07/2018 01:12:54 - Error bpdm (pid=26300) cannot copy image from disk, bytesCopied = 18446744073709551615
07/07/2018 01:12:54 - Critical bpdm (pid=26300) sts_get_image_prop failed: error 2060046: plugin error
07/07/2018 01:12:54 - Critical bpdm (pid=26300) Invalid storage device: BRS_NEW_DD file not found
07/07/2018 01:12:55 - Critical bpdm (pid=26300) Invalid image copy for mhlvassql01_1530835561_C3_F1_R1: error 2060018
07/07/2018 01:12:56 - Error bpduplicate (pid=9592) host mhlwabkpms02p.na.ad.crbard.com backup id mhlvassql01_1530835561 optimized duplication failed, media manager - system error occurred (174).
07/07/2018 01:12:57 - Error bpduplicate (pid=9592) Duplicate of backupid mhlvassql01_1530835561 failed, media manager - system error occurred (174).
07/07/2018 01:12:57 - Error bpduplicate (pid=9592) Status = no images were successfully processed.
07/07/2018 01:12:57 - end Duplicate; elapsed time 22:33:22
no images were successfully processed (191)
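To see how long the last fragment had been copying before the plugin error, the timestamps in the job details can be compared directly. This is only a rough sketch: the lines are copied from the status above, and the helper names (`parse`, `summarize`) are made up for illustration, not part of NetBackup.

```python
from datetime import datetime

# A subset of the job detail lines posted above.
JOB_LOG = """\
07/06/2018 02:39:35 - begin Duplicate
07/06/2018 02:39:48 - begin writing
07/06/2018 05:49:30 - end writing; write time: 3:09:42
07/06/2018 05:49:31 - begin writing
07/06/2018 13:37:17 - end writing; write time: 7:47:46
07/06/2018 13:37:18 - begin writing
07/07/2018 01:12:54 - Critical bpdm (pid=26300) sts_copy_extent failed: error 2060046 plugin error
07/07/2018 01:12:57 - end Duplicate; elapsed time 22:33:22
"""


def parse(line):
    """Split one job detail line into (timestamp, message)."""
    stamp, _, message = line.partition(" - ")
    return datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S"), message


def summarize(log_text):
    """Return (completed write phases, time spent in the unfinished one)."""
    completed = 0
    last_begin = None
    first_critical = None
    for line in log_text.splitlines():
        if not line.strip():
            continue
        when, message = parse(line)
        if message.startswith("begin writing"):
            last_begin = when
        elif message.startswith("end writing"):
            completed += 1
            last_begin = None
        elif message.startswith("Critical") and first_critical is None:
            first_critical = when
    stalled = None
    if last_begin is not None and first_critical is not None:
        stalled = first_critical - last_begin
    return completed, stalled


completed, stalled = summarize(JOB_LOG)
print(completed, stalled)  # 2 11:35:36
```

So two write phases finished, and the third had been running for about eleven and a half hours when sts_copy_extent failed.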
07-09-2018 01:34 PM - edited 07-09-2018 01:35 PM
07-09-2018 02:08 PM
Hi, please find below.
Note: other backup images are duplicating fine between these two DDs. Only a few client machines' backup images are not getting duplicated, and these images are fairly large. These duplications run for more than 20 hours and transfer over 1 TB of data, but after some time they end with the error mentioned in the first post.
07-10-2018 01:24 AM
Have you tried to check the DD logs for 7 July on the target Storage Server brsnsdd6300_1?
It almost seems as if the remote storage stopped responding:
Invalid storage device: BRS_NEW_DD file not found
Invalid image copy for mhlvassql01_1530835561_C3_F1_R1: error 2060018
All errors point to DD plugin error :
sts_copy_extent failed: error 2060046 plugin error
The image size mentioned in the job details seems excessive :
cannot copy image from disk, bytesCopied = 18446744073709551615
Is this really the size? Can you verify the image on the source?
Is there any way that the backup can be broken up into smaller 'chunks'?
Logging needed for troubleshooting at NBU end:
bptm and bpdm on source and destination media servers. I suggest level 3 logs.
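A side note on the bytesCopied figure: 18446744073709551615 is exactly 2^64 - 1, which is what -1 looks like when it is printed as an unsigned 64-bit integer. So this is most likely an error or "unknown" value rather than a real image size. That reading assumes bpdm keeps the counter in a 64-bit field, which the log itself does not confirm. A quick check of the arithmetic:

```python
BYTES_COPIED = 18446744073709551615  # value from the job details


def as_signed_64(value):
    """Reinterpret an unsigned 64-bit value as a signed 64-bit integer."""
    return value - (1 << 64) if value >= (1 << 63) else value


print(BYTES_COPIED == 2**64 - 1)   # True
print(as_signed_64(BYTES_COPIED))  # -1
```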
07-10-2018 09:08 AM
Hi Marianne,
I don't find any abnormal errors logged on my source or target DD. Other jobs over 1 TB in size complete fine in the same window, though they take time because of the WAN link.
I am not sure why the image size in the detailed status shows as that large; the actual size of an image for that client machine is not more than 1.6 TB. Please find examples below.
C:\Program Files\Veritas\NetBackup\bin\admincmd>bpimagelist.exe -backupid mhlvassql01_1530576498 -U
Backed Up         Expires        Files          KB  C  Sched Type       On Hold  Policy
----------------  ----------  --------  ----------  -  ---------------  -------  ------------
07/02/2018 20:08  07/23/2018      1335  1663914265  N  Cumulative Incr        0  MHL_VMware_Replicated_Daily_SQL
C:\Program Files\Veritas\NetBackup\bin\admincmd>bpimagelist.exe -backupid mhlvassql01_1530144532 -U
Backed Up         Expires        Files          KB  C  Sched Type       On Hold  Policy
----------------  ----------  --------  ----------  -  ---------------  -------  ------------
06/27/2018 20:08  07/18/2018      1804  1656929202  N  Cumulative Incr        0  MHL_VMware_Replicated_Daily_SQL
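For reference, the KB column above can be converted to terabytes as follows. This is a small sketch that assumes the KB column counts units of 1024 bytes; `kb_to_sizes` is just an illustrative helper.

```python
KB = 1024  # assumption: bpimagelist reports kilobytes of 1024 bytes


def kb_to_sizes(kilobytes):
    """Return the size as (decimal terabytes, binary tebibytes)."""
    size_bytes = kilobytes * KB
    return size_bytes / 10**12, size_bytes / 2**40


for kb in (1663914265, 1656929202):
    tb, tib = kb_to_sizes(kb)
    print(f"{kb} KB = {tb:.2f} TB = {tib:.2f} TiB")
```

Both images come out a little over 1.5 TiB, nowhere near the bytesCopied value shown in the job details.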
Since we are running a VM-level backup for this client, breaking it up into smaller chunks is not possible.
As of now, the duplication job for this client has been running for over 45 hours in Activity Monitor. I suspect it is running fine, but I am not sure whether it will succeed or fail later. I will keep tracking it and post the details if it fails.
07-10-2018 02:01 PM
Log on to your DD and run the ddboost file-replication command with the options that display details (e.g. detailed-file-history) to see whether the backup image was replicated, is in progress, or failed.
Can you also check the status of file-based replication in the console: Replication -> DD Boost -> File Replication. Is there any pending file replication? Currently active jobs may show as pending, but look specifically for the images that are failing to duplicate.
07-10-2018 02:31 PM
Hi,
On checking the File Replication tab of my target DD, I could see that all the backup image IDs are in a success state, except for the currently running duplication jobs. No specific error is listed in the tab you mentioned.
07-10-2018 02:34 PM
Check the stats and performance of your file replication; that will give you an idea of whether it is working as expected. You can use the same command or the GUI to check those details.
Wait for a while, and if the duplication fails again, check the DD logs and the bpdm logs on the media server; you should get more details.
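When going through the logs after a failure, it can help to tally the seven-digit STS error codes so that the most frequent one stands out. Below is a rough sketch. The sample lines are taken from the job details in the first post, and real bpdm log lines may be formatted differently, so the pattern is an assumption that may need adjusting.

```python
import re
from collections import Counter

# Assumption: STS codes appear as "error NNNNNNN", as in the job details.
STS_ERROR = re.compile(r"\berror (\d{7})\b")


def count_sts_errors(lines):
    """Count each seven-digit error code found in the given lines."""
    counts = Counter()
    for line in lines:
        counts.update(STS_ERROR.findall(line))
    return counts


SAMPLE = [
    "07/07/2018 01:12:54 - Critical bpdm (pid=26300) sts_copy_extent failed: error 2060046 plugin error",
    "07/07/2018 01:12:54 - Critical bpdm (pid=26300) image copy failed: error 2060046: plugin error",
    "07/07/2018 01:12:54 - Critical bpdm (pid=26300) sts_get_image_prop failed: error 2060046: plugin error",
    "07/07/2018 01:12:55 - Critical bpdm (pid=26300) Invalid image copy for mhlvassql01_1530835561_C3_F1_R1: error 2060018",
]

for code, hits in count_sts_errors(SAMPLE).most_common():
    print(code, hits)
```

The same function accepts an open log file, since it only iterates over lines.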