TpErrno = Robot operation failed

Mehul_Vyas
Level 4

Hello Team,

I am getting the error below in the Activity Monitor job details.

Nov 30, 2018 8:06:30 PM - Error bptm (pid=45384) error requesting media, TpErrno = Robot operation failed
Nov 30, 2018 8:06:30 PM - Warning bptm (pid=45384) media id CS1922 load operation reported an error
Nov 30, 2018 8:06:30 PM - current media CS1922 complete, requesting next media Any

 

The NetBackup master and media servers are both at version 8.1, and the library is an IBM tape library.

Also, moving the tape from the drive to a slot via robtest throws the error below:

READ_ELEMENT_STATUS complete
m d2 s3
Initiating MOVE_MEDIUM from address 257 to 4098
move_medium ioctl() failed: Success 

21 Replies

mph999
Level 6
Employee Accredited

If robtest is failing, generally speaking you have a hardware issue of some sort.

m sx dy simply sends a SCSI command (MOVE_MEDIUM, as your output shows) that asks the library to move a tape from a slot to a drive, or from a drive to a slot.

If the robot doesn't do this, it's down to the library.

It could be as simple as a tape stuck in a drive. Can you move other tapes to/from other slots/drives?

You could try :

unload 2

m d2 s3

If that still fails, and other drives are working, can you manually access the robot to remove the tape?

I tried unload, but it throws an input/output error:

Robot Selection
---------------
1) TLD 1
2) none/quit
Enter choice: 1

Robot selected: TLD(1) robotic path = /dev/sg6

Invoking robotic test utility:
/usr/openv/volmgr/bin/tldtest -rn 1 -r /dev/sg6

Opening /dev/sg6
MODE_SENSE complete
Enter tld commands (? returns help information)
s d
drive 1 (addr 256) access = 1 Contains Cartridge = no
drive 2 (addr 257) access = 1 Contains Cartridge = yes
Source address = 4126 (slot 31)
Barcode = CS1922
READ_ELEMENT_STATUS complete
unload 2
Opening /dev/nst0, on the local host, please wait...
Error - cannot open /dev/nst0 (Input/output error)
q
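
The robtest output in this thread shows how drive and slot numbers relate to the element addresses in the MOVE_MEDIUM message: drive 1 is at address 256, drive 2 at 257, slot 31 at 4126, and m d2 s3 was issued as a move from 257 to 4098. Below is a minimal Python sketch of that mapping. The base addresses are inferred only from these excerpts and are specific to this library, and the function names are hypothetical, not part of NetBackup.

```python
import re

# Base addresses inferred from the robtest output in this thread
# (drive 1 = 256, slot 31 = 4126). Other libraries may use different values.
DRIVE_BASE = 256
SLOT_BASE = 4096


def element_address(kind: str, number: int) -> int:
    """Map a robtest drive ('d') or slot ('s') number to an element address."""
    if number < 1:
        raise ValueError("drive and slot numbers start at 1")
    if kind == "d":
        return DRIVE_BASE + number - 1
    if kind == "s":
        return SLOT_BASE + number - 1
    raise ValueError(f"unknown element kind: {kind!r}")


def move_addresses(command: str) -> tuple:
    """Translate a robtest move such as 'm d2 s3' into (source, destination)."""
    m = re.fullmatch(r"m\s+([ds])(\d+)\s+([ds])(\d+)", command.strip())
    if m is None:
        raise ValueError(f"not a move command: {command!r}")
    return (
        element_address(m.group(1), int(m.group(2))),
        element_address(m.group(3), int(m.group(4))),
    )


print(move_addresses("m d2 s3"))  # (257, 4098), matching the MOVE_MEDIUM line
```

This only helps when reading robtest output; it does not talk to the robot.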

Marianne
Moderator
Partner    VIP    Accredited Certified

It seems that the tape is stuck. That is why the software eject command is not working. 

You will need to physically inspect the robot/drive and press the eject button on the drive.

If that does not work, or if there is an error light on the drive, you will need to contact your hardware support team.

After physically checking the drive, I unloaded the tape. Since then I am getting the error below:

Dec  4 16:13:12 crms0101csc tldd[4034]: TLD(1) MountTape CS1960 on drive 2, from slot 12

Dec  4 16:13:13 crms0101csc tldcd[4092]: Processing MOUNT, TLD(1) drive 2, slot 12, barcode CS1960          , vsn CS1960

Dec  4 16:13:51 crms0101csc tldcd[40508]: TLD(1) expected barcode (CS1960) in slot 12, found barcode (C71960         

Marianne
Moderator
Partner    VIP    Accredited Certified
It seems tapes were manually moved in the robot.

Please run Inventory to update media locations.
Check the screen for errors while Inventory is running.
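
The tldcd message quoted earlier in the thread states which barcode NetBackup expected in a slot and which one the robot reported. If there are many such messages in syslog, they can be collected before running Inventory. Below is a rough Python sketch. The regular expression is based only on the single message shown in this thread (which is cut off after the found barcode), so other versions may word it differently, and the function name is hypothetical.

```python
import re

# Pattern built from the one tldcd message quoted in this thread (assumption:
# other NetBackup versions may format it differently). The closing parenthesis
# after the found barcode is optional because the quoted line is truncated.
MISMATCH = re.compile(
    r"TLD\((?P<robot>\d+)\) expected barcode \((?P<expected>[^)]*)\) "
    r"in slot (?P<slot>\d+), found barcode \((?P<found>[^)\s]*)"
)


def find_barcode_mismatches(lines):
    """Return (robot, slot, expected, found) for each mismatch message."""
    results = []
    for line in lines:
        m = MISMATCH.search(line)
        if m:
            results.append(
                (
                    int(m.group("robot")),
                    int(m.group("slot")),
                    m.group("expected").strip(),
                    m.group("found").strip(),
                )
            )
    return results


sample = [
    "Dec  4 16:13:13 crms0101csc tldcd[4092]: Processing MOUNT, TLD(1) drive 2, "
    "slot 12, barcode CS1960          , vsn CS1960",
    "Dec  4 16:13:51 crms0101csc tldcd[40508]: TLD(1) expected barcode (CS1960) "
    "in slot 12, found barcode (C71960         ",
]

for robot, slot, expected, found in find_barcode_mismatches(sample):
    print(f"TLD({robot}) slot {slot}: expected {expected}, found {found}")
```

A list like this shows at a glance which slots hold a different tape than the database expects.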

Hello Marianne,

I ran the inventory and it completed successfully.

However, running a backup throws the error below:

Dec 5, 2018 10:39:51 PM - Error bptm (pid=50926) FREEZING media id CS1927, Medium identifiers do not match
Dec 5, 2018 10:39:51 PM - Warning bptm (pid=50926) media id CS1927 load operation reported an error
Dec 5, 2018 10:39:51 PM - current media CS1927 complete, requesting next media Any
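
When more than one tape is affected, it can help to pull every FREEZING message out of the job details and group the media IDs by the stated reason. Below is a small Python sketch that assumes the message format shown above ("FREEZING media id <id>, <reason>"). The function name is hypothetical, and it only reads text that has been exported from the job details; it does not query NetBackup.

```python
import re
from collections import defaultdict

# Format taken from the job detail lines quoted in this thread.
FREEZE = re.compile(r"FREEZING media id (?P<media>\w+), (?P<reason>.+)$")


def frozen_media(lines):
    """Group frozen media IDs by the reason given in the FREEZING message."""
    by_reason = defaultdict(set)
    for line in lines:
        m = FREEZE.search(line.rstrip())
        if m:
            by_reason[m.group("reason")].add(m.group("media"))
    return {reason: sorted(ids) for reason, ids in by_reason.items()}


sample = [
    "Dec 5, 2018 10:39:51 PM - Error bptm (pid=50926) FREEZING media id CS1927, "
    "Medium identifiers do not match",
    "Dec 5, 2018 10:39:51 PM - Warning bptm (pid=50926) media id CS1927 load "
    "operation reported an error",
    "Dec 6, 2018 8:18:19 PM - Error bptm (pid=43893) FREEZING media id CS1927, "
    "Medium identifiers do not match",
]

for reason, ids in frozen_media(sample).items():
    print(f"{reason}: {', '.join(ids)}")
```

Repeated freezes of the same tape collapse into one entry, so the result is a list of distinct media IDs per reason.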

Marianne
Moderator
Partner    VIP    Accredited Certified
Did anyone change barcodes on tapes?

Yes, it seems so

Marianne
Moderator
Partner    VIP    Accredited Certified

You will need to re-label CS1927 so that the internal label matches the external label.

You can only re-label the tape if there are no unexpired images associated with CS1927 or the previous internal media-id.

To re-label, unfreeze the tape, then select media-id, right-click, Label.
De-select the tick-box to verify label.  

 

I performed the steps mentioned below:

To re-label, unfreeze the tape, then select media-id, right-click, Label.
De-select the tick-box to verify label.  

It is still failing with the same error.

Dec 6, 2018 8:18:19 PM - Error bptm (pid=43893) FREEZING media id CS1927, Medium identifiers do not match
Dec 6, 2018 8:18:19 PM - Warning bptm (pid=43893) media id CS1927 load operation reported an error
Dec 6, 2018 8:18:19 PM - current media CS1927 complete, requesting next media Any

 

Marianne
Moderator
Partner    VIP    Accredited Certified

Are you sure the tick box "Verify media label before performing operation" was blank? (without tick-mark)

Please post bptm log if still getting error.

Copy bptm log on the media server to bptm.txt and upload as attachment.

If bptm log folder does not exist, please create the folder and increase logging level to 3 on this media server for bptm.
Retry the steps, then copy the log and upload please. 

TN describing the error: 
https://www.veritas.com/support/en_US/article.000085936

(There are more TNs and forum posts if you want to Google the error.)

mph999
Level 6
Employee Accredited

I'm not sure that the issue is the media ID written to the tape not matching the 'barcode' (more accurately, not matching the 'media ID' derived from the barcode).

Until the tape is loaded and the tape header is read, we wouldn't know whether it matches or not. If it did not match, the error would be

... expected media ID AA1234 found BB4321

What is seen here is 'Medium identifiers do not match', which is a different issue.

NBU is aware of the media serial number and the manufacturer which are read from the RFID chip in the tape, and are written into the NB database nbdb - I think the issue is more around this area.

This data is written to nbdb when the tape is first used. The solution may turn out to be a re-label, but first I would try the following.

1. Move the tape to 'standalone' / non-robotic (right-click tape > move > untick tape is robotic)

2.  Re-inventory library

There are 30 media that got frozen because of this.

Shall I try to move all of them to standalone and do inventory?

 

 

Hello Marianne,

Yes, "Verify media label before performing operation" was blank while doing re-label.

Marianne
Moderator
Partner    VIP    Accredited Certified
You can try to move the tapes to Standalone. No harm in that.
Best would be to understand what happened here to cause this. Something like this doesn't just happen for no reason.
Something happened or someone did something. Any idea?
Have you asked your tape management team if anything was done differently recently?

Do you have bptm log that we could look at?

Here is the bptm log:

00:02:33.717 [10259] <2> logconnections: BPDBM CONNECT FROM 127.0.0.1.58853 TO 127.0.0.1.47138 fd = 0

00:02:33.724 [10259] <2> db_end: Need to collect reply

00:02:33.735 [10259] <16> mount_open_media: error requesting media, TpErrno = Robot operation failed

00:02:33.735 [10259] <4> create_tpreq_file: symlink to path /dev/nst0

00:02:33.736 [10259] <2> manage_drive_before_load: SCSI RELEASE

00:02:33.739 [10259] <2> manage_drive_before_load: report_attr, fl1 0x00100001, fl2 0x00000000

00:02:33.742 [10259] <2> manage_drive_before_load: write mode set to 0

00:02:33.742 [10259] <2> drivename_write: Called with mode 1

00:02:33.742 [10259] <2> drivename_unlock: unlocked

00:02:33.742 [10259] <2> drivename_close: Called for file IBM.ULT3580-HH6.000

00:02:33.742 [10259] <8> vnet_get_user_credential_path: [vnet_vxss.c:1621] status 35 0x23

00:02:33.742 [10259] <8> vnet_check_user_certificate: [vnet_vxss_helper.c:4096] vnet_get_user_credential_path failed 35 0x23

00:02:33.742 [10259] <2> ConnectionCache::connectAndCache: Acquiring new connection for host crbm0100csc, query type 1

00:02:33.746 [10259] <2> vnet_pbxConnect_ex: pbxConnectExEx Succeeded

00:02:33.770 [10259] <2> logconnections: PROXY CONNECT FROM 10.233.167.41.50778 TO 10.235.102.38.1556 fd = 0

00:02:33.770 [10259] <2> logconnections: BPDBM CONNECT FROM 127.0.0.1.41908 TO 127.0.0.1.50062 fd = 0

00:02:33.774 [10259] <2> db_end: Need to collect reply

00:02:33.784 [10259] <8> write_backup: media id CS1960 load operation reported an error

00:02:33.784 [10259] <2> nbjm_media_request: Passing job control to NBJM, type FAIL/9

00:02:33.784 [10259] <2> nbjm_media_request: old_media_id = CS1960, media_id = NULL

00:02:34.675 [10266] <2> send_MDS_msg: OP_STATUS 0 21965580 crms0101csc 26 1 0 0 0 0 0 0 *NULL* 0

00:02:34.688 [10266] <2> send_operation_error: Decoded status = 26 from 1

00:02:34.688 [10266] <8> vnet_get_user_credential_path: [vnet_vxss.c:1621] status 35 0x23

00:02:34.688 [10266] <8> vnet_check_user_certificate: [vnet_vxss_helper.c:4096] vnet_get_user_credential_path failed 35 0x23

00:02:34.688 [10266] <2> ConnectionCache::connectAndCache: Acquiring new connection for host crbm0100csc, query type 1

00:02:34.692 [10266] <2> vnet_pbxConnect_ex: pbxConnectExEx Succeeded

00:02:34.716 [10266] <2> logconnections: PROXY CONNECT FROM 10.233.167.41.49964 TO 10.235.102.38.1556 fd = 0

00:02:34.716 [10266] <2> logconnections: BPDBM CONNECT FROM 127.0.0.1.51552 TO 127.0.0.1.54708 fd = 0

00:02:34.723 [10266] <2> db_end: Need to collect reply

00:02:34.734 [10266] <16> mount_open_media: error requesting media, TpErrno = Robot operation failed

00:02:34.734 [10266] <4> create_tpreq_file: symlink to path /dev/nst1

00:02:34.735 [10266] <2> manage_drive_before_load: SCSI RELEASE

00:02:34.737 [10266] <2> manage_drive_before_load: report_attr, fl1 0x00100001, fl2 0x00000000

00:02:34.740 [10266] <2> manage_drive_before_load: write mode set to 0

00:02:34.740 [10266] <2> drivename_write: Called with mode 1

00:02:34.740 [10266] <2> drivename_unlock: unlocked

00:02:34.741 [10266] <2> drivename_close: Called for file IBM.ULT3580-HH6.002

00:02:34.741 [10266] <8> vnet_get_user_credential_path: [vnet_vxss.c:1621] status 35 0x23

00:02:34.741 [10266] <8> vnet_check_user_certificate: [vnet_vxss_helper.c:4096] vnet_get_user_credential_path failed 35 0x23

00:02:34.741 [10266] <2> ConnectionCache::connectAndCache: Acquiring new connection for host crbm0100csc, query type 1

00:02:34.744 [10266] <2> vnet_pbxConnect_ex: pbxConnectExEx Succeeded

00:02:34.768 [10266] <2> logconnections: PROXY CONNECT FROM 10.233.167.41.48036 TO 10.235.102.38.1556 fd = 0

00:02:34.768 [10266] <2> logconnections: BPDBM CONNECT FROM 127.0.0.1.49856 TO 127.0.0.1.47586 fd = 0

00:02:34.774 [10266] <2> db_end: Need to collect reply

00:02:34.784 [10266] <8> write_backup: media id CS1968 load operation reported an error

00:02:34.785 [10266] <2> nbjm_media_request: Passing job control to NBJM, type FAIL/9

00:02:34.785 [10266] <2> nbjm_media_request: old_media_id = CS1968, media_id = NULL

00:02:35.029 [10283] <2> bptm: INITIATING (VERBOSE = 0): -unload -dn IBM.ULT3580-HH6.000 -dp /dev/nst0 -dk 2000105 -m CS1960 -mk 4000468 -mds 8 -alocid 21965569 -jobid -1543955948 -jm

00:02:35.030 [10283] <2> read_legacy_touch_file: Found /usr/openv/volmgr/database/NO_TAPEALERT; requested from (../bptm.c.1085).

00:02:35.030 [10283] <4> bptm: emmserver_name = crbm0100csc

00:02:35.030 [10283] <4> bptm: emmserver_port = 1556

00:02:35.044 [10283] <2> Orb::init: Enabling ORBNativeCharCodeSet UTF-8(Orb.cpp:953)

00:02:35.045 [10283] <2> Orb::init: initializing ORB EMMlib_Orb with: [bptm] [-ORBSvcConfDirective] [-ORBDottedDecimalAddresses 0] [-ORBSvcConfDirective] [static Resource_Factory '-ORBNativeCharCodeSet UTF-8'] [-ORBSvcConfDirective] [static PBXIOP_Factory '-enable_keepalive -reply_expected 25'] [-ORBSvcConfDirective] [static EndpointSelectorFactory ''] [-ORBSvcConfDirective] [static Resource_Factory '-ORBProtocolFactory PBXIOP_Factory'] [-ORBSvcConfDirective] [static Resource_Factory '-ORBProtocolFactory IIOP_Factory'] [-ORBDefaultInitRef] [] [-ORBSvcConfDirective] [static PBXIOP_Evaluator_Factory '-orb EMMlib_Orb'] [-ORBSvcConfDirective] [static Resource_Factory '-ORBConnectionCacheMax 1024'] [-ORBSvcConf] [/dev/null] [-ORBSvcConfDirective] [static Server_Strategy_Factory '-ORBMaxRecvGIOPPayloadSize 268435456'](Orb.cpp:1165)
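
Each bptm log line above has the shape "time [pid] <number> function: message". In these excerpts the <16> line carries the same text as the "Error bptm" entry in the job details and the <8> line matches the "Warning bptm" entry, so a quick way to find the interesting entries is to filter on that number. Below is a minimal Python sketch under that assumption. The class and function names are hypothetical, and treating 8 and above as "warning or worse" is inferred from these excerpts only.

```python
import re
from typing import List, NamedTuple


class Entry(NamedTuple):
    time: str
    pid: int
    severity: int
    function: str
    message: str


# Line shape taken from the bptm excerpts in this thread.
LINE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\] <(\d+)> (\S+?): (.*)$"
)


def parse(lines) -> List[Entry]:
    """Parse bptm log lines, skipping blanks and wrapped continuation lines."""
    entries = []
    for line in lines:
        m = LINE.match(line.strip())
        if m:
            time, pid, severity, function, message = m.groups()
            entries.append(Entry(time, int(pid), int(severity), function, message))
    return entries


def at_least(entries, min_severity=8) -> List[Entry]:
    """Keep entries whose bracketed number is min_severity or higher."""
    return [e for e in entries if e.severity >= min_severity]


sample = [
    "00:02:33.724 [10259] <2> db_end: Need to collect reply",
    "00:02:33.735 [10259] <16> mount_open_media: error requesting media, "
    "TpErrno = Robot operation failed",
    "00:02:33.784 [10259] <8> write_backup: media id CS1960 load operation "
    "reported an error",
    "00:02:34.741 [10266] <2> ConnectionCache::connectAndCache: Acquiring new "
    "connection for host crbm0100csc, query type 1",
]

for e in at_least(parse(sample)):
    print(e.time, e.pid, e.severity, e.function, e.message)
```

Filtering on the pid field as well keeps the entries of one bptm process together, since several processes write to the same log.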

Marianne
Moderator
Partner    VIP    Accredited Certified
These few lines at logging level 0 are not helping.

On a side note -
Why do you have this touch file?
/usr/openv/volmgr/database/NO_TAPEALERT

It seems it was created to disable TapeAlert.

Marianne
Moderator
Partner    VIP    Accredited Certified
The only time that NO_TAPEALERT is recommended is for ACSLS robot control.
The file should not exist for TLD robots.

We are really trying to help you here, but you are giving us very little to work with.

We need full bptm log as attachment at logging level 3.
Important to have all the log entries around the time that errors are seen.

Please have a look at what I asked last week:

Best would be to understand what happened here to cause this. Something like this doesn't just happen for no reason.
Something happened or someone did something. Any idea?
Have you asked your tape management team if anything was done differently recently?

.... increase logging level to 3 on this media server for bptm.
Copy bptm log on the media server to bptm.txt and upload as attachment.