Error bptm (pid=xxxxx) error requesting media, TpErrno = Robot operation failed
Hello all,
We are currently having issues with one of our tape libraries. It has 6 tape drives, and 5 of them are currently down. We get the error mentioned in the subject, and eventually the affected drive is downed by Veritas.
We are able to run an inventory from the NBU console and move media with robtest, but the issue persists. There are no errors on the physical tape library. We tried bringing up one drive, letting the duplication run on it, waiting until the media was loaded into the drive, and then running robtest. Everything looks okay at first, but about an hour later the error appears and the drive goes down again. This happens on 5 of the 6 drives.
11/09/2019 08:15:46 - requesting resource LCM_mssba572-hcart-robot-tld-1
11/09/2019 08:15:48 - Info nbrb (pid=7708) Limit has been reached for the logical resource LCM_mssba572-hcart-robot-tld-1
11/09/2019 08:17:57 - granted resource LCM_mssba572-hcart-robot-tld-1
11/09/2019 08:18:01 - started process RUNCMD (pid=159288)
11/09/2019 08:18:01 - ended process 0 (pid=159288)
11/09/2019 08:18:03 - begin Duplicate
11/09/2019 08:18:17 - requesting resource mssba572-hcart-robot-tld-1
11/09/2019 08:18:17 - requesting resource @aaaak
11/09/2019 08:18:17 - reserving resource @aaaak
11/09/2019 08:18:19 - awaiting resource mssba572-hcart-robot-tld-1. Waiting for resources. Reason: Tape media server is not active, Media server: mssba572, Robot Type(Number): TLD(1), Media ID: N/A, Drive Name: N/A, Volume Pool: FS_DAILY_POOL, Storage Unit: mssba572-hcart-robot-tld-1, Drive Scan Host: N/A, Disk Pool: N/A, Disk Volume: N/A
11/09/2019 08:20:34 - awaiting resource mssba572-hcart-robot-tld-1. Waiting for resources. Reason: Robotic library is down on server, Media server: mssba572, Robot Type(Number): TLD(1), Media ID: N/A, Drive Name: N/A, Volume Pool: FS_DAILY_POOL, Storage Unit: mssba572-hcart-robot-tld-1, Drive Scan Host: N/A, Disk Pool: N/A, Disk Volume: N/A
11/09/2019 08:20:45 - resource @aaaak reserved
11/09/2019 08:20:45 - granted resource 0938L4
11/09/2019 08:20:45 - granted resource Drive002
11/09/2019 08:20:45 - granted resource mssba572-hcart-robot-tld-1
11/09/2019 08:20:45 - granted resource MediaID=@aaaak;DiskVolume=PureDiskVolume;DiskPool=dp_disk_mssba572;Path=PureDiskVolume;StorageServer=mssba572;MediaServer=mssba572
11/09/2019 08:20:47 - Info bpduplicate (pid=159288) window close behavior: Suspend
11/09/2019 08:29:41 - current media 0938L4 complete, requesting next media Any
11/09/2019 08:30:02 - current media -- complete, awaiting next media Any. Waiting for resources. Reason: Drives are in use, Media server: mssba572, Robot Type(Number): TLD(1), Media ID: N/A, Drive Name: N/A, Volume Pool: FS_DAILY_POOL, Storage Unit: mssba572-hcart-robot-tld-1, Drive Scan Host: N/A, Disk Pool: N/A, Disk Volume: N/A
11/09/2019 08:47:57 - granted resource 0938L4
11/09/2019 08:47:57 - granted resource Drive002
11/09/2019 08:47:57 - granted resource mssba572-hcart-robot-tld-1
11/09/2019 08:55:35 - current media 0938L4 complete, requesting next media Any
11/09/2019 08:55:51 - current media -- complete, awaiting next media Any. Waiting for resources. Reason: Drives are in use, Media server: mssba572, Robot Type(Number): TLD(1), Media ID: N/A, Drive Name: N/A, Volume Pool: FS_DAILY_POOL, Storage Unit: mssba572-hcart-robot-tld-1, Drive Scan Host: N/A, Disk Pool: N/A, Disk Volume: N/A
11/09/2019 09:17:55 - Error bpduplicate (pid=159288) Duplicate of backupid MSGSA578_1573214406 failed, termination requested by administrator (150).
11/09/2019 09:17:55 - Error bpduplicate (pid=159288) Status = no images were successfully processed.
11/09/2019 09:17:55 - end Duplicate; elapsed time 0:59:52
11/09/2019 15:41:35 - Info bptm (pid=166565) start
11/09/2019 15:41:35 - started process bptm (pid=166565)
11/09/2019 15:41:41 - Info bptm (pid=166565) start backup
11/09/2019 15:41:41 - Info bpdm (pid=166601) started
11/09/2019 15:41:41 - started process bpdm (pid=166601)
11/09/2019 15:41:42 - Info bpdm (pid=166601) reading backup image
11/09/2019 15:41:44 - Info bpdm (pid=166601) using 128 data buffers
11/09/2019 15:41:44 - Info bptm (pid=166565) Waiting for mount of media id 0938L4 (copy 2) on server mssba572.
11/09/2019 15:41:44 - Info bpdm (pid=166601) requesting nbjm for media
11/09/2019 15:41:44 - started process bptm (pid=166565)
11/09/2019 15:41:44 - mounting 0938L4
11/09/2019 15:41:44 - Info bptm (pid=166565) INF - Waiting for mount of media id 0938L4 on server mssba572 for writing.
11/09/2019 15:41:50 - begin reading
11/09/2019 15:50:28 - Error bptm (pid=166565) error requesting media, TpErrno = Robot operation failed
11/09/2019 15:50:28 - Warning bptm (pid=166565) media id 0938L4 load operation reported an error
11/09/2019 16:08:44 - Info bptm (pid=166565) Waiting for mount of media id 0938L4 (copy 2) on server mssba572.
11/09/2019 16:08:44 - started process bptm (pid=166565)
11/09/2019 16:08:44 - mounting 0938L4
11/09/2019 16:08:44 - Info bptm (pid=166565) INF - Waiting for mount of media id 0938L4 on server mssba572 for writing.
11/09/2019 16:16:21 - Error bptm (pid=166565) error requesting media, TpErrno = Robot operation failed
11/09/2019 16:16:22 - Warning bptm (pid=166565) media id 0938L4 load operation reported an error
11/09/2019 16:38:37 - Error bpdm (pid=166601) media manager terminated by parent process
11/09/2019 16:38:39 - Error bptm (pid=166565) media manager terminated by parent process
11/09/2019 16:38:48 - Info mssba572 (pid=166601) StorageServer=PureDisk:mssba572; Report=PDDO Stats for (mssba572): read: 65576 KB, CR received: 15633 KB, CR received over FC: 0 KB, dedup: 0.0%
termination requested by administrator (150)
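In case it helps anyone reading a long job details dump like the one above, here is a minimal Python sketch that splits the pasted text into individual entries and keeps only the Error and Warning lines. It is not a NetBackup tool. The function names (`split_entries`, `problem_entries`) are made up for illustration, and it assumes every entry starts with a `MM/DD/YYYY HH:MM:SS - ` timestamp, as the entries in this job do.

```python
import re
from datetime import datetime

# Assumption: each entry begins with "MM/DD/YYYY HH:MM:SS - ".
ENTRY_START = re.compile(r"(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) - ")


def split_entries(text):
    """Split pasted job details into (timestamp, message) tuples."""
    # With one capturing group, re.split returns
    # [prefix, stamp1, message1, stamp2, message2, ...].
    parts = ENTRY_START.split(text)
    entries = []
    for i in range(1, len(parts) - 1, 2):
        stamp = datetime.strptime(parts[i], "%m/%d/%Y %H:%M:%S")
        # Collapse line breaks and repeated spaces inside one message.
        message = " ".join(parts[i + 1].split())
        entries.append((stamp, message))
    return entries


def problem_entries(entries):
    """Keep only the entries whose message starts with Error or Warning."""
    return [e for e in entries if e[1].startswith(("Error ", "Warning "))]


if __name__ == "__main__":
    sample = (
        "11/09/2019 15:41:50 - begin reading "
        "11/09/2019 15:50:28 - Error bptm (pid=166565) error requesting media, "
        "TpErrno = Robot operation failed "
        "11/09/2019 15:50:28 - Warning bptm (pid=166565) media id 0938L4 "
        "load operation reported an error"
    )
    for stamp, message in problem_entries(split_entries(sample)):
        print(stamp.isoformat(), message)
```

Filtering this way makes it easier to line up the time of each `Robot operation failed` entry with whatever the library or the drives logged at the same moment.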
We also tried deleting all 6 drives and then reconfiguring the storage devices, but when the duplication job ran, the same thing happened again.
We are using:
Veritas NetBackup Appliance 5230
Version 2.7.3 (upgrade in progress)
IBM TS3310 with LTO4 drives
Thank you
mizi