Forum Discussion

ankur1809
10 years ago

NDMP backup failure

Need help with an NDMP failure with error code 84. All CIFS and NFS streams are running successfully. A job with the file names imagenow/osm_01.00001 failed. What could be a reason for this is similar f...
  • Marianne
    10 years ago

    Looks like you will need to increase logging level.
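
Before raising the level, it can help to confirm what the media server is currently configured with. The sketch below assumes a UNIX or Linux media server where the legacy debug level comes from a VERBOSE entry in /usr/openv/netbackup/bp.conf; that location and the function name bpconf_verbose_level are assumptions for illustration, so check the NetBackup documentation for your version.

```shell
#!/bin/sh
# Print the global VERBOSE level found in a bp.conf-style file.
# Prints "unset" when the file has no plain VERBOSE entry.
# The default path is an assumption; pass another file as the first argument.
bpconf_verbose_level() {
    conf="${1:-/usr/openv/netbackup/bp.conf}"
    # Match only lines that start with VERBOSE, not entries whose names
    # merely end in VERBOSE. If several lines match, report the last one.
    level=$(sed -n 's/^[[:space:]]*VERBOSE[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$conf" | tail -n 1)
    echo "${level:-unset}"
}

# Example:
# bpconf_verbose_level /usr/openv/netbackup/bp.conf
```

The result should agree with the "VERBOSE = 0" value that bptm prints in its INITIATING line.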

    There is no indication of what went wrong with current level 0 logging. 
    These are the only entries in bptm for this job (PID 12560):

    08:48:35.401 [12560] <2> bptm: INITIATING (VERBOSE = 0): -w -c isilonbkp -dpath ddr1_vw -stunit STU_DDR1_DALMEDIA3 -cl PROD_NFS_IMAGENOW_osm_1 -bt 1430747314 -b isilonbkp_1430747314 -st 1 -cj 1 -reqid -1429538155 -jm -brm -hostname isilonbkp -ru root -rclnt isilonbkp -rclnthostname isilonbkp -rl 1 -rp 1209600 -sl DLY_INCR -ct 19 -maxfrag 524288 -eari 0 -mediasvr dalmedia3 -connect_options 0x01030202 -jobid 5930722 -jobgrpid 5930721 -masterversion 760000 -bpbrm_shm_id 136019987 -blks_per_buffer 512 -ndmpport 48116 -df 3
    08:48:36.758 [12560] <4> report_client: VBRC 2 12560 1 isilonbkp_1430747314 19 PROD_NFS_IMAGENOW_osm_1 1 DLY_INCR 0 1 1
    08:48:36.877 [12560] <2> construct_sts_isid: master_server nbutx2.hms.hmsy.com, client isilonbkp, backup_time 1430747314, copy_number 1, stream_number 1, fragment_number 0, resume_number 0, spl_name NULL
    08:48:37.195 [12560] <2> construct_sts_isid: master_server nbutx2.hms.hmsy.com, client isilonbkp, backup_time 1430747314, copy_number 1, stream_number 1, fragment_number 1, resume_number 0, spl_name NULL
    08:48:37.284 [12560] <2> io_open_disk: file isilonbkp_1430747314_C1_F1 successfully opened
    08:48:37.346 [12560] <4> write_backup: begin writing backup id isilonbkp_1430747314, copy 1, fragment 1, destination path ddr1_vw
    .....
    13:25:36.321 [12560] <2> delete_image_disk_sts_impl: Deleting disk header for isilonbkp_1430747314_C1_HDR
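
For anyone wanting to pull the same entries themselves, here is a minimal sketch of filtering a legacy bptm log by process ID. The function name bptm_lines_for_pid and the log path in the example are assumptions, not something taken from this system.

```shell
#!/bin/sh
# Print every line of a legacy bptm log that belongs to one process ID.
# Usage: bptm_lines_for_pid <pid> <logfile>
bptm_lines_for_pid() {
    pid="$1"
    logfile="$2"
    # Legacy log lines carry the PID in square brackets after the timestamp,
    # e.g. "08:48:35.401 [12560] <2> bptm: ...". -F matches the text literally.
    grep -F "[${pid}]" "$logfile"
}

# Example (the log path is an assumption; adjust it to your media server):
# bptm_lines_for_pid 12560 /usr/openv/netbackup/logs/bptm/log.050415
```

PIDs can be reused within a day, so it is worth checking the timestamps of the matched lines against the job's start and end times.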

    Strangely, none of the following entries for PID 12560 can be found in the bptm log:

    05/04/2015 13:25:36 - Critical bptm (pid=12560) image write failed: error 2060046: plugin error 
    05/04/2015 13:25:36 - Critical bptm (pid=12560) sts_get_image_prop failed: error 2060046: plugin error 
    05/04/2015 13:25:38 - Error bptm (pid=12560) cannot write image to disk, Invalid argument 

    It also seems that multiple other jobs ran successfully to DD and tape for the same NDMP client at the same time:

    11:33:14.758 [13640] <4> write_backup: successfully wrote backup id isilonbkp_1430549662 
    11:55:25.479 [25889] <4> write_backup: successfully wrote backup id isilonbkp_1430549356 
    12:03:24.661 [1624] <4> write_backup: successfully wrote backup id isilonbkp_1430548954
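
The comparison above (one PID that began writing and never finished, others that finished) can be sketched as a small log scan. This assumes the write_backup message texts shown in the excerpts; the function name bptm_unfinished_pids is hypothetical.

```shell
#!/bin/sh
# List PIDs that logged "begin writing backup id" in a bptm log but never
# logged "successfully wrote backup id".
# Usage: bptm_unfinished_pids <logfile>
bptm_unfinished_pids() {
    logfile="$1"
    awk '
        # Field 2 is the bracketed PID, e.g. "[12560]"; strip the brackets.
        function pid_of(field) { gsub(/\[|\]/, "", field); return field }
        /write_backup: begin writing backup id/      { began[pid_of($2)] = 1 }
        /write_backup: successfully wrote backup id/ { wrote[pid_of($2)] = 1 }
        END { for (p in began) if (!(p in wrote)) print p }
    ' "$logfile" | sort -n
}
```

A PID listed here may simply belong to a job that is still running, so treat the output as a list of candidates to inspect rather than a list of failures.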
    

     

    How many backups are running simultaneously on the media server to tape and DD?

    This may be an indication of media server or DD overload.

    Which OS and NBU version is on the media server?

    These touch files normally help with process overload on an MSDP media server.
    See if this helps (Extract from http://www.symantec.com/docs/TECH156490 ):

    On a Windows media server:
    -------------------------- 
    Create this empty file. 
    <install path>\netbackup\db\config\DPS_PROXYNOEXPIRE 
    Be certain that there are no extensions on the file. For example, no .txt extension. 
     
    Create these two files and put the value 300 inside each.
    The value 300 must be the only content of each file.
    <install path>\netbackup\db\config\DPS_PROXYDEFAULTSENDTMO 
    <install path>\netbackup\db\config\DPS_PROXYDEFAULTRECVTMO 
     
    Restart nbrmms (NetBackup Remote Manager and Monitor Service) on the media server, 
        or just stop and restart all services. 
    <install path>\netbackup\bin\bpdown -f -v 
    <install path>\netbackup\bin\bpup -f -v 
     
    On UNIX or Linux: 
    -------------------------- 
    # touch /usr/openv/netbackup/db/config/DPS_PROXYNOEXPIRE 
    # echo "300" > /usr/openv/netbackup/db/config/DPS_PROXYDEFAULTSENDTMO 
    # echo "300" > /usr/openv/netbackup/db/config/DPS_PROXYDEFAULTRECVTMO 
     
    Restart nbrmms (NetBackup Remote Manager and Monitor Service) on the media server. 
    # pkill nbrmms 
    # /usr/openv/netbackup/bin/nbrmms 
        Or, stop and restart all services on the MSDP media server. 
    # /usr/openv/netbackup/bin/goodies/netbackup stop 
    # /usr/openv/netbackup/bin/goodies/netbackup start 
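
After creating the files, a quick check can catch a stray extension or a wrong value. This is only a sketch: the function name check_dps_touchfiles is hypothetical, and the directory and expected value are passed in rather than assumed.

```shell
#!/bin/sh
# Check that the three DPS touch files exist under a config directory and
# that both timeout files hold the expected value.
# Usage: check_dps_touchfiles <config_dir> <expected_timeout>
# Returns 0 when everything matches, 1 otherwise.
check_dps_touchfiles() {
    dir="$1"
    expected="$2"
    status=0
    if [ ! -f "$dir/DPS_PROXYNOEXPIRE" ]; then
        echo "missing: DPS_PROXYNOEXPIRE"
        status=1
    fi
    for name in DPS_PROXYDEFAULTSENDTMO DPS_PROXYDEFAULTRECVTMO; do
        if [ ! -f "$dir/$name" ]; then
            echo "missing: $name"
            status=1
        else
            # Ignore the trailing newline that echo writes.
            actual=$(tr -d '[:space:]' < "$dir/$name")
            if [ "$actual" != "$expected" ]; then
                echo "unexpected value in $name: $actual"
                status=1
            fi
        fi
    done
    return "$status"
}

# Example:
# check_dps_touchfiles /usr/openv/netbackup/db/config 300
```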
     

    Another tech note (TECH159707) suggests these values:

    DPS_PROXYDEFAULTSENDTMO (value of 1800 inside)
    DPS_PROXYDEFAULTRECVTMO (value of 1800 inside)

    I have also seen suggestions saying 800 in DPS_PROXYDEFAULTRECVTMO.
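
Since several values are in circulation, trying them one at a time is easier with a small helper. A rough sketch for UNIX or Linux, with a hypothetical function name set_dps_timeouts; it writes the same value to both files, so it does not cover the case where only the receive timeout is changed.

```shell
#!/bin/sh
# Write one timeout value into both DPS proxy timeout files.
# Usage: set_dps_timeouts <config_dir> <value>
set_dps_timeouts() {
    dir="$1"
    value="$2"
    # Refuse empty or non-numeric input so the files never hold stray text.
    case "$value" in
        ''|*[!0-9]*)
            echo "timeout must be a whole number" >&2
            return 1
            ;;
    esac
    for name in DPS_PROXYDEFAULTSENDTMO DPS_PROXYDEFAULTRECVTMO; do
        echo "$value" > "$dir/$name"
    done
}

# Example:
# set_dps_timeouts /usr/openv/netbackup/db/config 1800
```

As with the original files, nbrmms (or all services) on the media server needs a restart afterwards.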

    See if the job is successful when you run fewer jobs to this media server.