05-05-2015 03:33 AM
Need help on NDMP failure with error code 84
All CIFS and NFS streams are running successfully. A job for the path imagenow/osm_01.00001 failed. What could be the reason, given that similar paths from the same policy completed successfully?
05/04/2015 08:48:34 - Info nbjm (pid=28528) starting backup job (jobid=5930722) for client isilonbkp, policy PROD_NFS_IMAGENOW_osm_1, schedule DLY_INCR
05/04/2015 08:48:34 - Info nbjm (pid=28528) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=5930722, request id:{41A92AE4-F264-11E4-B86F-CF01C23B3EE4})
05/04/2015 08:48:34 - requesting resource STU_DDR1_DALMEDIA3
05/04/2015 08:48:34 - requesting resource nbutx2.NBU_CLIENT.MAXJOBS.isilonbkp
05/04/2015 08:48:34 - requesting resource nbutx2.NBU_POLICY.MAXJOBS.PROD_NFS_IMAGENOW_osm_1
05/04/2015 08:48:34 - granted resource nbutx2.NBU_CLIENT.MAXJOBS.isilonbkp
05/04/2015 08:48:34 - granted resource nbutx2.NBU_POLICY.MAXJOBS.PROD_NFS_IMAGENOW_osm_1
05/04/2015 08:48:34 - granted resource MediaID=@aaae7;DiskVolume=ddr1_vw;DiskPool=DDR1_DP; Path=ddr1_vw;StorageServer=ddr1;MediaServer=dalmedia3
05/04/2015 08:48:34 - granted resource STU_DDR1_DALMEDIA3
05/04/2015 08:48:34 - estimated 37417771 kbytes needed
05/04/2015 08:48:34 - Info nbjm (pid=28528) started backup (backupid=isilonbkp_1430747314) job for client isilonbkp, policy PROD_NFS_IMAGENOW_osm_1, schedule DLY_INCR on storage unit STU_DDR1_DALMEDIA3
05/04/2015 08:48:34 - started process bpbrm (pid=12509)
05/04/2015 08:48:34 - connecting
05/04/2015 08:48:35 - connected; connect time: 0:00:00
05/04/2015 08:48:36 - Info bptm (pid=12560) start backup
05/04/2015 08:48:37 - begin writing
05/04/2015 13:25:36 - Critical bptm (pid=12560) image write failed: error 2060046: plugin error
05/04/2015 13:25:36 - Critical bptm (pid=12560) sts_get_image_prop failed: error 2060046: plugin error
05/04/2015 13:25:38 - Error bptm (pid=12560) cannot write image to disk, Invalid argument
05/04/2015 13:25:38 - Error ndmpagent (pid=12559) NDMP backup failed, path = /ifs/HMS/imagenowapp/imagenow/ImageNow6/osm_01.00001
05/04/2015 13:25:41 - Info ndmpagent (pid=12559) isilonbkp: Filetransfer: Transferred 12658688 bytes in 16622.546 seconds throughput of 0.744 KB/s
05/04/2015 13:25:41 - Info ndmpagent (pid=12559) isilonbkp: Filetransfer: Transferred 12658688 total bytes
05/04/2015 13:25:41 - Info ndmpagent (pid=12559) isilonbkp: CPU user=51.328257 sys=4683.290557 ft=16622.538847 cdb=0.000000
05/04/2015 13:25:41 - Info ndmpagent (pid=12559) isilonbkp: maxrss=114340 in=710112 out=57 vol=5126083 inv=3738523
05/04/2015 13:25:41 - Error ndmpagent (pid=12559) isilonbkp: Failed to write data at offset 8462336
05/04/2015 13:25:41 - Info bptm (pid=12560) EXITING with status 84 <----------
05/04/2015 13:25:41 - Info ndmpagent (pid=0) done. status: 84: media write error
05/04/2015 13:25:41 - end writing; write time: 4:37:04
media write error (84)
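The ndmpagent "Filetransfer" line in the job details is worth a second look: roughly 12 MB moved in over four and a half hours. A minimal Python sketch for pulling the byte count and duration out of such a line and recomputing the rate is shown here. The function name and the regular expression are illustrative, not part of NetBackup, and the pattern assumes the line format seen in this job.

```python
import re

# Pattern for the ndmpagent transfer summary as it appears in the job details.
# This is an assumption based on the single line in this thread.
TRANSFER_RE = re.compile(r"Transferred (\d+) bytes in ([\d.]+) seconds")


def throughput_kb_per_s(line):
    """Return KB/s computed from a 'Transferred N bytes in S seconds' line.

    Returns None when the line does not contain a transfer summary.
    """
    match = TRANSFER_RE.search(line)
    if match is None:
        return None
    byte_count = int(match.group(1))
    seconds = float(match.group(2))
    return byte_count / 1024 / seconds


line = (
    "05/04/2015 13:25:41 - Info ndmpagent (pid=12559) isilonbkp: Filetransfer: "
    "Transferred 12658688 bytes in 16622.546 seconds throughput of 0.744 KB/s"
)
rate = throughput_kb_per_s(line)
print(f"{rate:.3f} KB/s")
```

The recomputed figure matches the 0.744 KB/s that ndmpagent reports, so the job spent almost its entire 4:37 write window moving very little data before the write error.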
05-05-2015 05:30 AM
Looks like you will need to increase logging level.
There is no indication of what went wrong with current level 0 logging.
These are the only entries in bptm for this job (PID 12560):
08:48:35.401 [12560] <2> bptm: INITIATING (VERBOSE = 0): -w -c isilonbkp -dpath ddr1_vw -stunit STU_DDR1_DALMEDIA3 -cl PROD_NFS_IMAGENOW_osm_1 -bt 1430747314 -b isilonbkp_1430747314 -st 1 -cj 1 -reqid -1429538155 -jm -brm -hostname isilonbkp -ru root -rclnt isilonbkp -rclnthostname isilonbkp -rl 1 -rp 1209600 -sl DLY_INCR -ct 19 -maxfrag 524288 -eari 0 -mediasvr dalmedia3 -connect_options 0x01030202 -jobid 5930722 -jobgrpid 5930721 -masterversion 760000 -bpbrm_shm_id 136019987 -blks_per_buffer 512 -ndmpport 48116 -df 3
08:48:36.758 [12560] <4> report_client: VBRC 2 12560 1 isilonbkp_1430747314 19 PROD_NFS_IMAGENOW_osm_1 1 DLY_INCR 0 1 1
08:48:36.877 [12560] <2> construct_sts_isid: master_server nbutx2.hms.hmsy.com, client isilonbkp, backup_time 1430747314, copy_number 1, stream_number 1, fragment_number 0, resume_number 0, spl_name NULL
08:48:37.195 [12560] <2> construct_sts_isid: master_server nbutx2.hms.hmsy.com, client isilonbkp, backup_time 1430747314, copy_number 1, stream_number 1, fragment_number 1, resume_number 0, spl_name NULL
08:48:37.284 [12560] <2> io_open_disk: file isilonbkp_1430747314_C1_F1 successfully opened
08:48:37.346 [12560] <4> write_backup: begin writing backup id isilonbkp_1430747314, copy 1, fragment 1, destination path ddr1_vw
.....
13:25:36.321 [12560] <2> delete_image_disk_sts_impl: Deleting disk header for isilonbkp_1430747314_C1_HDR
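For anyone repeating this check, here is a rough Python sketch of filtering a bptm log down to one PID. It assumes the "HH:MM:SS.mmm [pid] <level> text" layout seen in the entries above. The function name is made up for illustration and the sample lines are copied from this thread.

```python
import re

# Layout assumed from the bptm entries quoted in this thread:
# time, [pid], <severity level>, message text.
BPTM_LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\] <(\d+)> (.*)$")


def entries_for_pid(lines, pid):
    """Return (time, level, message) tuples for log lines written by `pid`."""
    matches = []
    for line in lines:
        parsed = BPTM_LINE_RE.match(line.strip())
        if parsed and int(parsed.group(2)) == pid:
            matches.append((parsed.group(1), int(parsed.group(3)), parsed.group(4)))
    return matches


sample = [
    "08:48:37.284 [12560] <2> io_open_disk: file isilonbkp_1430747314_C1_F1 successfully opened",
    "11:33:14.758 [13640] <4> write_backup: successfully wrote backup id isilonbkp_1430549662",
    "13:25:36.321 [12560] <2> delete_image_disk_sts_impl: Deleting disk header for isilonbkp_1430747314_C1_HDR",
]

for time, level, message in entries_for_pid(sample, 12560):
    print(time, level, message)
```

With VERBOSE = 0, a filter like this returns nothing for PID 12560 between 08:48:37 and 13:25:36, which is why a higher logging level is needed to see what the plugin was doing.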
Strangely, none of these entries for PID 12560 can be found in the bptm log:
05/04/2015 13:25:36 - Critical bptm (pid=12560) image write failed: error 2060046: plugin error
05/04/2015 13:25:36 - Critical bptm (pid=12560) sts_get_image_prop failed: error 2060046: plugin error
05/04/2015 13:25:38 - Error bptm (pid=12560) cannot write image to disk, Invalid argument
It also seems that multiple other jobs ran successfully to DD and tape for the same NDMP client at the same time:
11:33:14.758 [13640] <4> write_backup: successfully wrote backup id isilonbkp_1430549662
11:55:25.479 [25889] <4> write_backup: successfully wrote backup id isilonbkp_1430549356
12:03:24.661 [1624] <4> write_backup: successfully wrote backup id isilonbkp_1430548954
How many backups are running simultaneously on the media server to tape and DD?
This may be an indication of media server or DD overload.
Which OS and NBU version on the media server?
These touch files normally help with process overload on an MSDP media server.
See if this helps (extract from http://www.symantec.com/docs/TECH156490):
Another TN (TECH159707) suggests these values:
DPS_PROXYDEFAULTSENDTMO (value of 1800 inside)
DPS_PROXYDEFAULTRECVTMO (value of 1800 inside)
I have also seen suggestions saying 800 in DPS_PROXYDEFAULTRECVTMO.
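As a rough illustration of "value of 1800 inside", a touch file here is just a small text file whose only content is the number. The Python sketch below writes the two files named above into whatever directory it is given. The function name is made up, and the target directory is left as a parameter on purpose: check the tech notes for the exact location on your media server (on UNIX/Linux it is commonly /usr/openv/netbackup/db/config, but treat that as an assumption to verify).

```python
import os
import tempfile


def write_touch_files(config_dir, values):
    """Create one file per name in `values`, containing only its value."""
    written = []
    for name, value in values.items():
        path = os.path.join(config_dir, name)
        with open(path, "w") as handle:
            handle.write(f"{value}\n")
        written.append(path)
    return written


# Values quoted in this post from TECH159707.
TIMEOUTS = {
    "DPS_PROXYDEFAULTSENDTMO": 1800,
    "DPS_PROXYDEFAULTRECVTMO": 1800,
}

if __name__ == "__main__":
    # Dry run against a scratch directory so nothing on a real server changes.
    with tempfile.TemporaryDirectory() as scratch:
        for path in write_touch_files(scratch, TIMEOUTS):
            print(path)
```

Run it against a scratch directory first, then copy the files into place on the media server once the location is confirmed.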
See if the job is successful when you run fewer jobs to this media server.
05-05-2015 04:21 AM
All we can see is a write and plugin error.
The problem seems to be with backup destination, not the source.
What type of storage is STU_DDR1_DALMEDIA3?
Do you have log folders on MediaServer dalmedia3?
You will need bptm and bpdm log folders.
Please copy logs to .txt files (e.g. bptm.txt) and upload them as File attachments.
05-05-2015 04:45 AM
STU_DDR1_DALMEDIA3 is a Data Domain storage unit.
05-05-2015 04:46 AM
bpbrm logs as well
05-05-2015 05:32 AM
1) What version of OneFS are you running on the Isilon devices?
2) Does your model and version of Isilon appear in this list?
https://symwisedownload.symantec.com/resources/sites/SYMWISE/content/live/SOLUTIONS/76000/TECH76495/en_US/nbu_76_hcl.html?__gda__=1430935507_7668372624cef105a9bb22001e3d0229#ndmp_devices-ndmp_devices_-_vendor_compatibility-isilon
05-05-2015 04:54 PM
Most likely you need to engage DataDomain to troubleshoot as well.
Enable VERBOSE=5 bptm logs and DebugLevel=6 ndmpagent (OID=134) logs. That should give you more detail on the exact error code returned by the plugin.
Similar issue: http://www.symantec.com/docs/TECH216451
Sometimes, if the backup data is too large, the OST plugin will time out, so on the Data Domain side you may need to increase the OST_ABANDON_TIMEOUT value (check with Data Domain first).