08-21-2014 09:31 AM
Hello there,
We have
Issue:"Warning bptm(pid=) cannot locate on drive index 5, No more data is on the tape. "
When NB tries to write to the tape, we get the above error.
This is what we get when SLPs are running and doing duplications.
I found the following in the bptm log, within the time the job was running:
16:20:06.824 [12680.8964] <2> Orb::setOrbConnectTimeout: timeout seconds: 60(Orb.cpp:1567)
16:20:06.824 [12680.8964] <2> Orb::setOrbRequestTimeout: timeout seconds: 1800(Orb.cpp:1576)
16:20:07.791 [10424.13532] <2> tapelib: wait_for_ltid, Mount, timeout 0
16:20:29.475 [10424.13532] <2> io_ioctl: command (5)MTREW 1 0x0 from (bptm.c.8312) on drive index 4
16:20:29.506 [10424.13532] <2> io_ioctl: command (1)MTFSF 1 0x0 from (bptm.c.8564) on drive index 4
16:25:48.761 [10424.13532] <8> io_position_for_write: cannot locate on drive index 4, No more data is on the tape.
16:25:48.761 [10424.13532] <2> io_ioctl: command (5)MTREW 1 0x0 from (bptm.c.7146) on drive index 4
16:26:14.719 [10424.13532] <2> io_position_for_write: If header checks out, then overwrite it at LBA 0
16:26:14.813 [10424.13532] <2> io_position_for_write: Media at BOT
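For anyone scanning a larger bptm log for the same pattern, here is a minimal sketch in Python. It assumes every line has the same shape as the excerpt above (time, [pid.tid], <severity>, function, message). The helper names `parse_line` and `find_locate_failures` are made up for illustration and are not part of NetBackup.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the excerpt above:
# HH:MM:SS.mmm [pid.tid] <severity> function: message
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<pid>\d+)\.(?P<tid>\d+)\] "
    r"<(?P<sev>\d+)> "
    r"(?P<func>\S+?): (?P<msg>.*)$"
)
DRIVE_RE = re.compile(r"on drive index (\d+)")


def parse_line(line):
    """Return a dict for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["ts"] = datetime.strptime(rec["time"], "%H:%M:%S.%f")
    return rec


def find_locate_failures(lines):
    """List 'cannot locate' warnings with seconds since the last MTFSF.

    The MTFSF is matched on the same pid and drive index. Logs that
    cross midnight are not handled in this sketch.
    """
    last_fsf = {}
    failures = []
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue
        drive = DRIVE_RE.search(rec["msg"])
        if drive is None:
            continue
        key = (rec["pid"], int(drive.group(1)))
        if rec["func"] == "io_ioctl" and "MTFSF" in rec["msg"]:
            last_fsf[key] = rec["ts"]
        elif "cannot locate" in rec["msg"]:
            started = last_fsf.get(key)
            waited = None
            if started is not None:
                waited = (rec["ts"] - started).total_seconds()
            failures.append({
                "time": rec["time"],
                "pid": rec["pid"],
                "drive_index": key[1],
                "seconds_since_mtfsf": waited,
            })
    return failures
```

Run against the lines above, this should report one failure on drive index 4, a little over five minutes after the MTFSF was issued.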
Please ignore the LTO3 handling; I only use LTO5, and the focus is on tape XO0354.
I attached the logs; please ask if something more is needed in order to gather more data for troubleshooting.
Thank You
//George
08-29-2014 01:04 PM
08-29-2014 01:23 PM
Hi mph999,
Yes, we are able to update to 7.6.0.3.
I haven't read the release notes yet, to make sure we don't encounter other challenges, at least not right now.
//George
08-29-2014 02:47 PM
OK ...
So ... in summary ...
We have multiple tapes that are NOT full and that have no empty header at the end, at least according to scsi_command -map.
The issue is seen when attempting to position to the end of data, which we cannot do because we can't find an empty header, which is what we look for.
The issue actually happens when the last backup to that tape was written, so one thing we will have to do is look at the backup logs and see what they say: do we try to write the header and the tape drive 'ignores' the request, or do we just not even try?
Also, once I work out the syntax, we'll try nt_ttu to position 'outside' of NBU.
Regarding the logs, if the issue is happening very frequently as you suggest, it could be worth making some very small backups (just a few files) and then checking the tape to see if the header is missing. If this works, we could get the logs and new scsi_command -map output within a short time.
It's not impossible that the issue happens only on 'big' backups. OK, I can't see how, but when troubleshooting you need to consider everything, no matter how unlikely. This would make it a bit more time consuming to reproduce.
The other way is to just turn on bptm at verbose 5, wait for a failure on a duplication, then look back at the logs. If you are a bit short on disk space, we might have to back the logs up and then delete them, then restore the log we need when we get a failure.
That's all I can think of at the moment. Very odd, as this issue hasn't been reported before, so I suspect it will come down to a combination of factors that are unique to your environment, e.g. driver version + firmware version + NBU version + maybe some OS patch or something ...
I'll also look to see if there is some SCSI logging we can turn on. I'm not sure about Windows; you certainly can in Linux/Solaris. If it's available, Engineering would ask for it, so we might as well do it now.
So in summary, the next steps as I see them at the moment:
Reproduce issue and get bptm logs + scsi logs (if available) + event logs + mm logs
Get scsi_command map output for new failure (to clearly demonstrate the header is missing)
nt_ttu test (once I've worked out how; sorry, the manual page is useless)
We'll also have a chat about the exact history (I know you have mentioned it but we'll confirm the exact details) and maybe look at trying a different tape driver.
Once we have this, if no solution can be found (I'll have a chat with my colleague Pete who is also BL and very very good at tape drive issues), there are also a couple of BL guys in the US who are very good at this sort of thing as well.
If that provides no solution, then I'll be in a position to formally escalate to Engineering.
Hopefully this is acceptable for you,
Regards,
Martin
11-19-2014 10:14 AM
Hi
We are still in progress, but the issue was found: a drive had hardware issues and was deleting the empty header from tapes.
When the drive was writing to a tape, a CRC error was usually encountered during the write, and the tape was ejected without the empty header (EH) at the end of it.
In this way, all scratch tapes were taken and used, and after a while they were left without an EH because of this hardware issue.
We ran a "simulation" of this issue and collected logs from the IBM drives, and IBM has now replaced our drive.
Now we need to duplicate the tapes that lack the EH in order to get rid of this issue.
Messages received as a result of this "story" on every tape that goes into a frozen state:
Info bptm(pid=4996) media id XO0255 mounted on drive index 7, drivepath {3,0,12,0}, drivename IBM.ULT3580-TD5.Drive3, copy 2
Info bptm(pid=4996) INF - Waiting for positioning of media id XO0255 on server xxxx for writing.
Warning bptm(pid=4996) cannot locate on drive index 7, No more data is on the tape.
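To build the list of tapes that need duplicating, one option is to pull the affected media IDs out of saved job details. Below is a rough sketch in Python, assuming the job detail text contains messages worded exactly like the ones above. The function name `media_missing_empty_header` is hypothetical and is not a NetBackup command.

```python
import re

# Matches either a mount message or a "cannot locate" warning from bptm.
# The wording is assumed to be exactly as in the messages quoted above.
EVENT_RE = re.compile(
    r"bptm\(pid=(?P<pid>\d+)\) "
    r"(?:media id (?P<media>\w+) mounted on drive index (?P<mount_drive>\d+)"
    r"|cannot locate on drive index (?P<fail_drive>\d+))"
)


def media_missing_empty_header(job_details):
    """Return media IDs whose mount was followed by 'cannot locate'.

    A warning is tied to the most recent mount seen for the same bptm
    pid and drive index. Order of first appearance is preserved.
    """
    mounted = {}   # (pid, drive index) -> media id
    affected = []
    for m in EVENT_RE.finditer(job_details):
        pid = m.group("pid")
        if m.group("media"):
            mounted[(pid, m.group("mount_drive"))] = m.group("media")
        else:
            media = mounted.get((pid, m.group("fail_drive")))
            if media and media not in affected:
                affected.append(media)
    return affected
```

The sketch only flags candidates; each tape should still be confirmed with scsi_command -map before it is treated as missing the EH.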
The tape migration is in progress; I will let you know what happens next.
//George
11-19-2014 01:06 PM