11-02-2014 06:35 AM
Hi Everyone,
I recently started to see media write errors on some of my backup jobs. We currently write all of our backups to 10 IBM LTO4 tape drives. The tape library is a Quantum-I500. In the past, when I have seen these errors, they indicated a physical issue with the tape library. This time I do not see any physical issues with any of the tape drives.
Please let me know if anyone else has experienced this error.
Thanks,
Kyle
11/2/2014 8:13:17 AM - Info nbjm(pid=1120) starting backup job (jobid=635387) for client golden105, policy DB-GDB28, schedule Default-Application-Backup
11/2/2014 8:13:17 AM - Info nbjm(pid=1120) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=635387, request id:{E0BE9EF7-2FD5-483E-8CDE-EED495832159})
11/2/2014 8:13:17 AM - requesting resource nbmedia102-hcart2-robot-tld-2
11/2/2014 8:13:17 AM - requesting resource veritasarsenal.NBU_CLIENT.MAXJOBS.golden105
11/2/2014 8:13:17 AM - requesting resource veritasarsenal.NBU_POLICY.MAXJOBS.DB-GDB28
11/2/2014 8:13:18 AM - granted resource veritasarsenal.NBU_CLIENT.MAXJOBS.golden105
11/2/2014 8:13:18 AM - granted resource veritasarsenal.NBU_POLICY.MAXJOBS.DB-GDB28
11/2/2014 8:13:18 AM - granted resource 090376
11/2/2014 8:13:18 AM - granted resource IBM.ULTRIUM-TD4.000
11/2/2014 8:13:18 AM - granted resource nbmedia102-hcart2-robot-tld-2
11/2/2014 8:13:18 AM - estimated 0 Kbytes needed
11/2/2014 8:13:18 AM - Info nbjm(pid=1120) started backup (backupid=golden105_1414933998) job for client golden105, policy DB-GDB28, schedule Default-Application-Backup on storage unit nbmedia102-hcart2-robot-tld-2
11/2/2014 8:58:42 AM - end writing
media write error(84)
11-02-2014 12:04 PM
Status 84 is an I/O error.
It can be caused by anything on the I/O path: server HBA (including firmware and/or driver), tape driver, cable, GBIC, switch port, tape drive, or media.
You need logging to pinpoint the problem.
Add a VERBOSE entry to ...\volmgr\vm.conf, create a bptm folder under ...\netbackup\logs, then restart the NBU Device Manager service.
After the next error, check the bptm log as well as the System and Application logs in Windows Event Viewer.
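If you want to script the first two steps, here is a minimal Python sketch. The function name and both directory arguments are hypothetical; the paths above are abbreviated, so you pass in the real volmgr and logs directories for your install. It only adds the VERBOSE line if it is not already there, and it does not restart the service, so that step is still manual.

```python
from pathlib import Path

def enable_bptm_logging(volmgr_dir, logs_dir):
    """Add VERBOSE to vm.conf (if missing) and create the bptm log folder.

    volmgr_dir and logs_dir are supplied by the caller; this sketch does
    not guess the install path.
    """
    vm_conf = Path(volmgr_dir) / "vm.conf"
    lines = vm_conf.read_text().splitlines() if vm_conf.exists() else []

    # Append VERBOSE only once so repeated runs do not duplicate the entry.
    if not any(line.strip() == "VERBOSE" for line in lines):
        lines.append("VERBOSE")
        vm_conf.write_text("\n".join(lines) + "\n")

    # bptm only writes a legacy log if its folder exists.
    bptm_dir = Path(logs_dir) / "bptm"
    bptm_dir.mkdir(parents=True, exist_ok=True)
    return vm_conf, bptm_dir
```

Take a copy of vm.conf before running anything like this against a production media server.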
11-02-2014 01:50 PM
So the errors seem to have stopped, but I have now noticed that some of the drive paths randomly go down. Could that still be related to drive problems?
11-02-2014 02:01 PM
11-02-2014 08:19 PM
Error 84 may or may not be a physical tape issue. Look at it this way:
It is a "media write error", which means any component involved in the write can be the cause.
bptm is the NetBackup process that talks to the tape drive to write the already-read data to the physical or virtual tape.
So if the issue is not the physical/virtual tape, it can still be the tape drive or the NetBackup process. First test that the tape drive is working properly, and only then look at the NetBackup process. The process also depends on a connection, which can be LAN or SAN, so check that too. A bug in the bptm binary is still possible but rare; you would need to supply logs to NetBackup support to verify whether that is the case.
11-02-2014 09:14 PM
Status 84 errors that stop on their own could be an indication of bad media. That is why you need logs.
The <install-path>\veritas\netbackup\db\media\errors file on the media server is a good starting place.
11-03-2014 11:32 AM
Thanks for the replies everyone. I am looking through the logs now and will reply with what I find.
11-03-2014 11:47 AM
I took a look at <install-path>\veritas\netbackup\db\media\errors and I do see errors on both media servers that connect to these 10 LTO4 tape drives.
Does anyone know what these errors mean?
First Media server
11/02/14 17:11:24 090282 -1 OPEN_ERROR IBM.ULTRIUM-TD4.001
11/02/14 17:14:43 092492 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/02/14 17:29:43 092538 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/02/14 19:16:47 090282 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/02/14 19:16:49 092492 -1 OPEN_ERROR IBM.ULTRIUM-TD4.004
11/02/14 19:16:50 900022 6 OPEN_ERROR IBM.ULTRIUM-TD4.001
11/02/14 19:16:51 092538 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/02/14 20:17:44 092492 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/02/14 20:23:31 900022 6 OPEN_ERROR IBM.ULTRIUM-TD4.004
11/02/14 20:32:55 092538 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/02/14 21:44:27 090675 -1 OPEN_ERROR IBM.ULTRIUM-TD4.001
11/02/14 22:10:10 092664 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/02/14 22:34:15 090675 -1 OPEN_ERROR IBM.ULTRIUM-TD4.004
11/03/14 00:57:47 090675 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/03/14 06:03:27 092538 -1 OPEN_ERROR IBM.ULTRIUM-TD4.001
11/03/14 06:03:49 092373 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/03/14 06:04:07 090579 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 07:03:57 092538 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 07:05:21 092373 -1 OPEN_ERROR IBM.ULTRIUM-TD4.001
11/03/14 07:13:27 900003 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/03/14 08:06:16 092538 -1 OPEN_ERROR IBM.ULTRIUM-TD4.001
11/03/14 08:13:07 092373 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/03/14 08:35:41 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 08:58:09 092538 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/03/14 09:18:46 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.004
11/03/14 09:19:28 092377 7 OPEN_ERROR IBM.ULTRIUM-TD4.004
11/03/14 09:36:46 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 10:39:16 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 11:38:54 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 12:17:16 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/03/14 12:30:00 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/03/14 12:32:42 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 13:43:59 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
Second Media Server
10/31/14 20:42:45 092442 6 OPEN_ERROR IBM.ULTRIUM-TD4.006
10/31/14 20:42:45 092442 6 WRITE_ERROR IBM.ULTRIUM-TD4.006
10/31/14 20:42:48 092442 6 TAPE_ALERT IBM.ULTRIUM-TD4.006 0x24001000 0x02000000
11/01/14 14:23:01 092764 0 WRITE_ERROR IBM.ULTRIUM-TD4.000
11/01/14 18:18:27 092768 0 WRITE_ERROR IBM.ULTRIUM-TD4.000
11/02/14 07:31:08 092491 9 OPEN_ERROR IBM.ULTRIUM-TD4.007
11/02/14 07:35:19 092491 9 OPEN_ERROR IBM.ULTRIUM-TD4.007
11/02/14 07:39:30 092491 9 OPEN_ERROR IBM.ULTRIUM-TD4.007
11/02/14 07:43:40 092491 9 OPEN_ERROR IBM.ULTRIUM-TD4.007
11/02/14 07:43:40 092491 9 WRITE_ERROR IBM.ULTRIUM-TD4.007
11/02/14 08:58:42 090376 0 WRITE_ERROR IBM.ULTRIUM-TD4.000
11/02/14 11:03:24 090141 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/02/14 12:03:38 090141 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/02/14 14:02:54 090141 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/02/14 16:43:05 092681 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/02/14 19:47:12 090376 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/02/14 22:47:27 091189 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 08:05:36 092412 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 09:06:28 092665 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 10:07:55 090390 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 11:09:11 090390 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11-03-2014 11:48 AM
I took a look in <install-path>\veritas\netbackup\db\media\errors and I do see errors on both media servers that use these 10 LTO4 tape drives.
Does anyone know what these errors mean?
First Media server
11/03/14 06:03:49 092373 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/03/14 06:04:07 090579 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 07:03:57 092538 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 07:05:21 092373 -1 OPEN_ERROR IBM.ULTRIUM-TD4.001
11/03/14 07:13:27 900003 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/03/14 08:06:16 092538 -1 OPEN_ERROR IBM.ULTRIUM-TD4.001
11/03/14 08:13:07 092373 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/03/14 08:35:41 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 08:58:09 092538 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/03/14 09:18:46 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.004
11/03/14 09:19:28 092377 7 OPEN_ERROR IBM.ULTRIUM-TD4.004
11/03/14 09:36:46 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 10:39:16 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 11:38:54 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 12:17:16 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/03/14 12:30:00 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
11/03/14 12:32:42 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 13:43:59 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
Second Media Server
11/02/14 16:43:05 092681 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/02/14 19:47:12 090376 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/02/14 22:47:27 091189 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 08:05:36 092412 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 09:06:28 092665 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 10:07:55 090390 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 11:09:11 090390 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11-04-2014 07:52 AM
Wow, that's a fine selection of errors across a selection of drives: 0, 1, 3 and 4. I would hazard a guess that there's a mix of both drive issues and tape issues. 092750 is a repeater, for example, so that one is probably a media issue.
I would force-clean all the drives mentioned, then take a known brand-new tape and test it in each drive in turn: a write and a read. If that's all good, you know the drives are fine when the media is, and then I'd look at the media listed in the logs.
Anything more from TapeAlert and/or your robotic manager? You can look up the TapeAlert flags.
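To spot repeaters like that without eyeballing the file, a rough Python tally along these lines may help. It is only a sketch: it assumes each line is whitespace-separated with the media ID in the third field and the drive name in the sixth, which is how the entries posted above look, and the SAMPLE text is just five of those lines pasted in.

```python
from collections import Counter

# Five lines copied from the first media server's errors file above.
SAMPLE = """\
11/03/14 09:18:46 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.004
11/03/14 09:19:28 092377 7 OPEN_ERROR IBM.ULTRIUM-TD4.004
11/03/14 09:36:46 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 10:39:16 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.000
11/03/14 12:17:16 092750 -1 OPEN_ERROR IBM.ULTRIUM-TD4.003
"""

def tally(text):
    """Count error lines per media ID and per drive name."""
    by_media = Counter()
    by_drive = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        # Assumed layout: date, time, media ID, number, error type, drive name.
        media_id, drive = fields[2], fields[5]
        by_media[media_id] += 1
        by_drive[drive] += 1
    return by_media, by_drive

by_media, by_drive = tally(SAMPLE)
print("per media:", by_media.most_common())
print("per drive:", by_drive.most_common())
```

A tape that fails in several different drives points at the media; a drive that fails with many different tapes points at the drive or its path.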
Jim
11-05-2014 12:47 AM
The tape alert means ...
Flag 3: Hard error. Severity: Warning
Flag 6: Write failure. Severity: Critical
Flag 20: Cleaning required. Severity: Critical
Flag 39: Diagnostics required. Severity: Warning
Maybe a bad drive, or one that just needs cleaning ....
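For anyone who wants to decode the two hex words on the TAPE_ALERT line (0x24001000 0x02000000) themselves, here is a small Python sketch. It assumes flag 1 is the most significant bit of the first word and flag 33 the most significant bit of the second; that is the layout that reproduces the four flags listed above. The function name is made up, and the name table only covers the flags quoted in this thread.

```python
# Names for the four flags quoted in this thread; others print as numbers only.
KNOWN_FLAGS = {
    3: "Hard error",
    6: "Write failure",
    20: "Cleaning required",
    39: "Diagnostics required",
}

def decode_tape_alert(word1, word2):
    """Return the flag numbers (1-64) that are set in the two 32-bit words."""
    flags = []
    for word_index, word in enumerate((word1, word2)):
        for bit in range(32):
            # Assumption: bit 31 of word1 is flag 1, bit 31 of word2 is flag 33.
            if word & (1 << (31 - bit)):
                flags.append(word_index * 32 + bit + 1)
    return flags

for flag in decode_tape_alert(0x24001000, 0x02000000):
    print(flag, KNOWN_FLAGS.get(flag, "(not named in this thread)"))
```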