06-27-2012 05:24 AM
Dear All
Following is the issue that we are facing. When we start the backup process, data gets written to the cartridges, but before all the data is written an end-of-tape error occurs and the tape cartridge goes into a frozen state.
Following is troubleshooting done
1) Checked the media IDs on the management page of the autoloader and they are correct; that is, the tape drive is not misreading the media IDs
2) Checked the write protect notch and data can be written to the cartridge
3) Cleaned the head of the tape drive by using cleaning cartridge
4) The data on the tape is in a format that NetBackup can read
5) Also checked no data from the netbackup catalog (catalog backup) is written on the tape cartridge
Kindly provide a solution to the issue
nitin
06-27-2012 05:40 AM
Please provide following info:
NBU version on media server
OS (including version number and bit level)
Drivers used for tape drives (especially when Windows media server is used)
OS Event and/or System log (different for all supported OS's).
Please bear the following in mind when troubleshooting media/devices:
As an application, NetBackup has no direct access to a device, instead relying on the operating system (OS) to handle any communication with the device. This means that during a write operation NetBackup asks the OS to write to the device and report back the success or failure of that operation. If there is a failure, NetBackup will merely report that a failure occurred, and any troubleshooting should start at the OS level. If the OS is unable to perform the write, there are three likely causes; OS configuration, a problem on the SCSI path, or a problem with the device.
06-27-2012 05:49 AM
What is the exact error - is it "Physical end of media has been reached" (or similar ) ?
If so, this is the Technote:
http://www.symantec.com/docs/TECH139306
Note, EEB 2182228 is an update binary which is the one that should be requested.
Martin
06-28-2012 01:48 AM
Dear Marianne madam
Thanks for the response
NBU version - Symantec Netbackup 7.1
OS - RHEL 5.3 Server 64 bit edition
For tape drive - It is a Dell autoloader (Powervault T4000) . It is connected to the EMC VNX 5300(NDMP Host)
Thanks and Regards
Nitin
06-28-2012 02:35 AM
You have completely ignored my post, and not provided all the details Marianne mentioned ...
We still cannot answer the question, as there are no details of the issue.
(1)
What is the exact error message you see - does it match the TN / details I posted.
(2)
Please supply the system logs also.
Do part (1) first, then (2).
Then ...
(3) Goto https://www-secure.symantec.com/connect/forums/netbackup-basics-and-how-make-your-life-easier
Read parts (B) and (V)
Many thanks,
Martin
06-28-2012 05:29 AM
Dear Martin,
As stated the error message that we get is a
MEDIA WRITE ERROR MESSAGE
I am attaching the bptm logs as stated by you.
Thanks and regards
Nitin
06-28-2012 11:22 PM
OK, thank you for the logs. It is always important to give the exact error message as you have now done.
I see in the log, as you explained, the backup is written successfully to the previous media, and the issue happens when another media is loaded.
03:23:01.926 [23918] <2> signal_parent: sending SIGUSR1 to bpbrm (pid = 23915)
which I guess rules out a previous known issue where the wrong path was used after EOM. That was when the drives were attached I think directly to the filer, but after a new tape was loaded an OS path was used in error ... we do not see that .
But ...
There is a bug at 7.1 where NDMP backups end in status 84 after EOM has occurred, this matches exactly.
S2132186 2131806 NDMP backups would end with a Status 84, when EOM was invoked.
It is ETrack 2131806
So first thing is to check out the details of this eTrack.
Martin
06-28-2012 11:29 PM
OK, would need the ndmpagent log ...
Set it up like this :
vxlogcfg -a -p 51216 -o 134 -s DebugLevel=6 -s DiagnosticLevel=6
BUT ...
This eTrack / Bug appears to ONLY affect BlueArc filers - you have EMC VNX 5300.
Now, is it possible the EMC device is a rebadged BlueArc - I do not know ? If it is, it would mean it is quite likely this issue is as per the etrack.
HDS bought BlueArc I believe - now HDS make disk arrays sold by HP (as the XP range) so it is quite possible that they sell kit to EMC ???
Anyhow, at 7.1 the log should be in /usr/openv/logs/ndmpagent - if you post up the log files you have that match the date of the error (06/27/12) maybe we will see something. If you do not have the logs, or they are not detailed enough, you'll have to increase the level of detail (command above) and wait for the error again. Then send in the log AND a new bptm log to match (logs must ALWAYS cover the same time period).
Having just said all that - I see in the etrack logs that bptm freezes the media with the "external event caused rewind" message, which I don't see in your log (but I have only looked in one so far, you posted two up).
This may not mean the bug doesn't match, on different systems the symptoms may be slightly different. In my view - 'it's close enough' is valid enough to continue to investigate a possible cause. Maybe you spend ages investigating it only to prove it is not the issue after all, but that's how it is with troubleshooting such a complex product, some you win, some you lose. If nothing else, you eliminate it from the list of possible causes.
If this is not a rebadged BlueArc filer, I guess you could still have the same / similar issue. So far it has only been seen on one BlueArc filer (only one customer) but who is to say you are not the second ...
From the eTrack, this is what we would be looking for in the ndmpagent log ...
The ndmpagent log shows that it receives the signal to stop, and then attempts to reposition media, presuming a rewind, with NDMP_TAPE_MTIO, followed by an ABORT message received
I can't put any more details than that up on the public forum, but it's easy enough to spot in the log.
Look for a line containing NDMP_TAPE_MTIO followed afterwards (poss a few lines down) by a line containing ABORT.
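If the log is large, the search described above can be scripted rather than done by eye. The following is only a rough sketch: it does a plain substring match for NDMP_TAPE_MTIO followed within a few lines by ABORT, and makes no assumption about the rest of the ndmpagent line format. The function names and the 20-line window are made up for illustration, not anything from NetBackup.

```python
from typing import Iterable, List, Tuple


def find_mtio_then_abort(lines: Iterable[str], window: int = 20) -> List[Tuple[int, int]]:
    """Return (mtio_line_no, abort_line_no) pairs, 1-based.

    A pair is reported when a line containing "ABORT" appears at most
    `window` lines after the most recent line containing "NDMP_TAPE_MTIO".
    The window size is an arbitrary assumption; widen it if needed.
    """
    hits: List[Tuple[int, int]] = []
    pending = None  # line number of the most recent unmatched NDMP_TAPE_MTIO
    for line_no, line in enumerate(lines, start=1):
        if "NDMP_TAPE_MTIO" in line:
            pending = line_no
        elif "ABORT" in line and pending is not None:
            if line_no - pending <= window:
                hits.append((pending, line_no))
                pending = None  # each MTIO line is matched at most once
    return hits


def scan_file(path: str, window: int = 20) -> List[Tuple[int, int]]:
    """Scan a text log file; `path` is whatever log file you exported."""
    with open(path, "r", errors="replace") as handle:
        return find_mtio_then_abort(handle, window)
```

Calling scan_file() on a text copy of the ndmpagent log returns the line numbers to go and read; an empty list only means the pattern was not found within the chosen window, not that the bug is ruled out.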
You can also consider this ...
What is the chance that you have an NDMP device showing the same symptoms as a known issue, at the same version of NBU (7.1) but has a different cause ???
I don't know - but I would certainly say that the possibility of this issue being the same should be investigated.
Apologies if it turns out not to be - but until you look you will never know. I have learnt the hard way not to say, yea, it's close but probably isn't the same problem, let's look elsewhere ... Only to come back some time later and find it is the cause ...
It's a gamble ... are you feelin lucky ... :0)
If it is the same, you will have to log a call, as the EEB that was released was written specifically for one customer, and cannot be given out via a technote, it has to be cleared by backline / engineering, or even a new EEB produced.
Martin
06-29-2012 02:37 AM
Dear Martin sir
Thanks for your detailed observation. As requested, I am attaching the ndmpagent logs so that the problem can be analyzed further
Thanks for your prompt help regarding this issue
Nitin
06-29-2012 03:26 AM
Not that much in here - I think the log level is not too high ...