NetBackup deduplication backup jobs may fail with status 87

oecheverry
Level 1

Hello,

This is an Oracle-type policy (RMAN).
One or more of the child jobs fail with status code 87.

There are no errors on the disk or in the MSDP operating system.

Any ideas on this problem?

OmarE.

-.---

Sep 25, 2017 2:51:18 PM - Info nbjm (pid=1167) starting backup job (jobid=183534) for client srv0rac-scan, policy Pol_srv_Produccion, schedule Default-Application-Backup
Sep 25, 2017 2:51:18 PM - Info nbjm (pid=1167) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=183534, request id:{E574B314-A22A-11E7-96ED-00219B8D98D4})
Sep 25, 2017 2:51:18 PM - requesting resource msdp_srv0media-nb-stu
Sep 25, 2017 2:51:18 PM - requesting resource sv0oas.NBU_CLIENT.MAXJOBS.srv0rac-scan
Sep 25, 2017 2:51:18 PM - requesting resource sv0oas.NBU_POLICY.MAXJOBS.Pol_srv_Produccion
Sep 25, 2017 2:51:18 PM - granted resource sv0oas.NBU_CLIENT.MAXJOBS.srv0rac-scan
Sep 25, 2017 2:51:18 PM - granted resource sv0oas.NBU_POLICY.MAXJOBS.Pol_srv_Produccion
Sep 25, 2017 2:51:18 PM - granted resource MediaID=@aaaa0;DiskVolume=PureDiskVolume;DiskPool=msdp_srv0media-nb;Path=PureDiskVolume;StorageServer=srv0media-nb;MediaServer=sv0oas
Sep 25, 2017 2:51:18 PM - granted resource msdp_srv0media-nb-stu
Sep 25, 2017 2:51:18 PM - estimated 0 kbytes needed
Sep 25, 2017 2:51:18 PM - Info nbjm (pid=1167) started backup (backupid=srv0rac-scan_1506369078) job for client srv0rac-scan, policy Pol_srv_Produccion, schedule Default-Application-Backup on storage unit msdp_srv0media-nb-stu
Sep 25, 2017 2:51:18 PM - started process bpbrm (pid=22182)
Sep 25, 2017 2:51:26 PM - Info bpbrm (pid=22182) srv0rac-scan is the host to backup data from
Sep 25, 2017 2:51:26 PM - Info bpbrm (pid=22182) reading file list for client
Sep 25, 2017 2:51:26 PM - connecting
Sep 25, 2017 2:51:26 PM - Info bpbrm (pid=22182) listening for client connection
Sep 25, 2017 2:51:28 PM - Info bpbrm (pid=22182) INF - Client read timeout = 300
Sep 25, 2017 2:51:28 PM - Info bpbrm (pid=22182) accepted connection from client
Sep 25, 2017 2:51:28 PM - Info dbclient (pid=19810) Backup started
Sep 25, 2017 2:51:28 PM - Info bpbrm (pid=22182) bptm pid: 22205
Sep 25, 2017 2:51:28 PM - connected; connect time: 0:00:00
Sep 25, 2017 2:51:29 PM - Info bptm (pid=22205) start
Sep 25, 2017 2:51:29 PM - Info bptm (pid=22205) using 262144 data buffer size
Sep 25, 2017 2:51:29 PM - Info bptm (pid=22205) using 30 data buffers
Sep 25, 2017 2:51:29 PM - Info sv0oas (pid=22205) Using OpenStorage client direct to backup from client srv0rac-scan to srv0media-nb
Sep 25, 2017 2:51:31 PM - begin writing
Sep 25, 2017 2:51:34 PM - Info dbclient (pid=19810) dbclient(pid=19810) wrote first buffer(size=262144)
Sep 25, 2017 2:51:58 PM - Info dbclient (pid=19810) dbclient waited 35 times for empty buffer, delayed 35 times
Sep 25, 2017 2:51:58 PM - Info dbclient (pid=19810) done. status: 0
Sep 25, 2017 2:52:01 PM - Critical bptm (pid=22205) sts_close_handle failed: 2060018 file not found
Sep 25, 2017 2:52:01 PM - Critical bptm (pid=22205) cannot write image to disk, media close failed with status 2060018
Sep 25, 2017 2:52:02 PM - Info sv0oas (pid=22205) StorageServer=PureDisk:srv0media-nb; Report=PDDO Stats (multi-threaded stream used) for (srv0media-nb): scanned: 1813025 KB, CR sent: 0 KB, CR sent over FC: 0 KB, dedup: 100.0%, cache disabled
Sep 25, 2017 2:52:02 PM - Critical bptm (pid=22205) sts_close_server failed: error 2060005 object is busy, cannot be closed
Sep 25, 2017 2:52:16 PM - Info bptm (pid=22205) EXITING with status 87 <----------
Sep 25, 2017 2:52:16 PM - Info dbclient (pid=19810) done. status: 87: media close error
Sep 25, 2017 2:52:16 PM - end writing; write time: 0:00:45
media close error (87)

 

3 REPLIES

watsons
Level 6

Is it only failing for Oracle (or database) type of backups?

If it does not fail with filesystem backups, my take is that the server could be running out of available connection ports. As we know, DB backups require more ports.

Run "netstat -an" and count the connections in the TIME_WAIT or CLOSE_WAIT state. Anything more than 100 is considered high. You can either reboot the server or lower the TCP keepalive time on it:

https://www.veritas.com/support/en_US/article.TECH202675
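
For anyone who wants to automate the check, here is a minimal Python sketch that counts connection states in saved "netstat -an" output. It assumes Linux-style output where TCP lines start with "tcp" and the state is the last column. The sample text, addresses and ports are made up for illustration, and the threshold of 100 is simply the rule of thumb mentioned above, not an official limit.

```python
from collections import Counter

# Rule-of-thumb threshold taken from the reply above (not an official limit).
THRESHOLD = 100

def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connection states in `netstat -an` style output."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumption: TCP lines start with "tcp"/"tcp6" (or "TCP") and the
        # connection state is the last whitespace-separated column.
        if fields and fields[0].lower().startswith("tcp"):
            states[fields[-1].upper()] += 1
    return states

# Hypothetical sample output; addresses and ports are illustrative only.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:13724           0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:13724          10.0.0.9:51234          ESTABLISHED
tcp        0      0 10.0.0.5:45012          10.0.0.9:10102          TIME_WAIT
tcp        0      0 10.0.0.5:45013          10.0.0.9:10102          TIME_WAIT
tcp        0      0 10.0.0.5:45014          10.0.0.9:10102          CLOSE_WAIT
udp        0      0 0.0.0.0:68              0.0.0.0:*
"""

counts = count_tcp_states(SAMPLE)
for state in ("TIME_WAIT", "CLOSE_WAIT"):
    flag = "HIGH" if counts[state] > THRESHOLD else "ok"
    print(f"{state}: {counts[state]} ({flag})")
```

To use it on a real server, save the output of "netstat -an" to a file and pass the file contents to count_tcp_states() instead of SAMPLE.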

Marianne
Moderator
Partner    VIP    Accredited Certified

Which NBU version and OS on MSDP media server?
Which NBU version on Oracle client?

Have you tried disabling client-side deduplication as a workaround?

Versions are important, as there are known issues with certain NBU versions,
such as this one: https://www.veritas.com/support/en_US/article.000020792

 

Nicolai
Moderator
Partner    VIP   

Looks like an issue with the MSDP pool:

sts_close_handle failed: 2060018 file not found
Sep 25, 2017 2:52:01 PM - Critical bptm (pid=22205) cannot write image to disk, media close failed with status 2060018
Sep 25, 2017 2:52:02 PM - Info sv0oas (pid=22205) StorageServer=PureDisk:srv0media-nb; Report=PDDO Stats (multi-threaded stream used) for (srv0media-nb): scanned: 1813025 KB, CR sent: 0 KB, CR sent over FC: 0 KB, dedup: 100.0%, cache disabled
Sep 25, 2017 2:52:02 PM - Critical bptm (pid=22205) sts_close_server failed: error 2060005 object is busy, cannot be closed

Is endpoint protection or antivirus software scanning the MSDP area?

Related technotes about the same error message:

https://www.veritas.com/support/en_US/article.000017962

https://www.veritas.com/support/en_US/article.000020792
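
When several child jobs fail, it can help to pull only the Critical lines and their error codes out of each job's details before comparing them with the technotes. The Python sketch below is one way to do that, not an official tool. It assumes the log lines have the same layout as the job details posted in this thread, and that the storage error codes are seven-digit numbers starting with 206, as the two codes seen here are. The function and pattern names are made up for illustration.

```python
import re

# Assumption: Critical lines look like
#   "<timestamp> - Critical <process> (pid=<n>) <message>"
CRITICAL_RE = re.compile(r"- Critical (?P<proc>\S+) \(pid=(?P<pid>\d+)\) (?P<msg>.*)$")
# Assumption: error codes are 7 digits starting with 206 (e.g. 2060018).
STS_CODE_RE = re.compile(r"\b(206\d{4})\b")

def critical_events(job_log: str):
    """Return (process, pid, [codes], message) for each Critical line."""
    events = []
    for line in job_log.splitlines():
        match = CRITICAL_RE.search(line)
        if match:
            codes = STS_CODE_RE.findall(match.group("msg"))
            events.append(
                (match.group("proc"), int(match.group("pid")), codes, match.group("msg"))
            )
    return events

# Lines copied from the job details in the original post.
JOB_LOG = """\
Sep 25, 2017 2:52:01 PM - Critical bptm (pid=22205) sts_close_handle failed: 2060018 file not found
Sep 25, 2017 2:52:01 PM - Critical bptm (pid=22205) cannot write image to disk, media close failed with status 2060018
Sep 25, 2017 2:52:02 PM - Critical bptm (pid=22205) sts_close_server failed: error 2060005 object is busy, cannot be closed
Sep 25, 2017 2:52:16 PM - Info bptm (pid=22205) EXITING with status 87 <----------
"""

for proc, pid, codes, msg in critical_events(JOB_LOG):
    print(proc, pid, ",".join(codes), "|", msg)
```

Running the same function over the details of each failed child job makes it easy to see whether they all report the same codes.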