First off, I'd highly recommend that you upgrade to NBU 6.0 or NBU 6.5. NBU 5.1 is now an end-of-support product (as of March 31, 2008). Symantec will not issue any more patches for it and they will not offer any phone tech support for it.
Do the tape drives go down when you try to use them for a backup job, or do they go down even during idle periods (when there are no backups running)?
If they only go down when you try to run a backup to them, it's possible that they could be mapped incorrectly in NBU. In other words, NBU thinks that device c64t0l0 is drive 1 in the library, when in reality it's drive 2 in the library. When NBU tells the library to mount a tape in drive 1, the library happily mounts a tape in what it thinks is drive 1, but then NBU never sees a tape show up in device c64t0l0 (because the tape actually ended up in device c96t0l0). NBU will then down device c64t0l0, thinking that it's malfunctioning.
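One way to confirm or rule out this kind of mismapping is to compare drive serial numbers: the serial the library reports for each drive position against the serial seen at each device path on the host. Here is a minimal Python sketch of that cross-check. The serial numbers and the three dictionaries are made up for illustration, and the sketch does not show how to collect them from NBU, the library, or the Celerra; only the device names c64t0l0 and c96t0l0 come from the example above.

```python
def find_mismapped_drives(configured, robot_serials, device_serials):
    """Return (device_path, configured_drive, actual_drive) for each mismatch.

    configured:     device path -> drive number the backup software assumes
    robot_serials:  drive number -> serial number reported by the library
    device_serials: device path -> serial number seen at that device path
    actual_drive is None when the serial is unknown to the library.
    """
    # Invert the library's view so a serial number can be looked up directly.
    serial_to_drive = {serial: drive for drive, serial in robot_serials.items()}
    mismatches = []
    for device_path, configured_drive in sorted(configured.items()):
        serial = device_serials.get(device_path)
        actual_drive = serial_to_drive.get(serial)
        if actual_drive != configured_drive:
            mismatches.append((device_path, configured_drive, actual_drive))
    return mismatches


# Hypothetical data reproducing the swapped-drive scenario described above.
configured = {"c64t0l0": 1, "c96t0l0": 2}
robot_serials = {1: "SERIAL-A", 2: "SERIAL-B"}
device_serials = {"c64t0l0": "SERIAL-B", "c96t0l0": "SERIAL-A"}

mismatches = find_mismapped_drives(configured, robot_serials, device_serials)
for device_path, assumed, actual in mismatches:
    print(f"{device_path}: configured as drive {assumed}, library says drive {actual}")
```

If the list comes back empty, the mapping is consistent and the cause is probably something else.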
Are the tape drives which are attached to the Celerra also zoned to other hosts in the SAN, or is the Celerra the only device zoned to see the tape drives? These tape drives need to be zoned so that only the Celerra data mover has access to them. You cannot share SAN tape drives between an NDMP host and other hosts in NBU 5.1. (This feature was added in NBU 6.0 and NBU 6.5, though.)
Have these tape drives ever worked? If so, what changed around the time when they started going down?
When you defined the NDMP tape drives in the NBU Admin Console, did you use the fully-qualified hostname of the Celerra (sm3apnas05.ap.lilly.com) or did you use the short name of the Celerra (sm3apnas05)? NetBackup is picky about this when it comes to NDMP devices and NDMP backups. If you defined the NDMP tape drives with only the short host name, try changing it to use the fully-qualified host name so that it exactly matches the information that NBU has in its NDMP authorization database.
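To illustrate the kind of mismatch I mean, here is a small Python sketch that compares the host name used in a drive definition with the host name stored with the NDMP credentials. It is only an illustration: the function name is made up, it assumes a plain exact string comparison (I don't know exactly how NBU compares the names internally), and you would still have to read the two names out of your own configuration.

```python
def compare_ndmp_host_names(drive_host, credential_host):
    """Classify how the drive's host name relates to the credential host name.

    Returns "match" when the strings are identical, "short-name" when the
    drive uses only the first label of the credential host name, and
    "different" otherwise. Exact comparison is an assumption of this sketch.
    """
    if drive_host == credential_host:
        return "match"
    # The short name is everything before the first dot of the full name.
    if credential_host.split(".", 1)[0] == drive_host:
        return "short-name"
    return "different"


# The two names from the question above.
result = compare_ndmp_host_names("sm3apnas05", "sm3apnas05.ap.lilly.com")
print(result)
```

A "short-name" result is the case where I'd redefine the drives with the fully-qualified name.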
You might also want to upgrade to NetBackup 5.1 MP7 (if you don't want to upgrade to NBU 6.0 or NBU 6.5). There were a few NDMP-related bugs in NBU 5.1 that were fixed in NBU 5.1 MP7:
--------------------------------------------------------------------------------
Etrack Incident = ET830908
Description:
Using long hostnames for NDMP hosts may result in the credentials being added incorrectly.
--------------------------------------------------------------------------------
Etrack Incident = ET968678
Associated Primary Etrack = ET805849
Titan cases: 290-374-573
Description:
NDMP environment variables passed to the filer using a file list are only honored for the first volume in the file list.
Workaround:
To resolve this issue, set the environment variable after each entry in the file list.
Additional Notes:
This problem also existed in NetBackup 5.1MP5 and 5.1MP6.
--------------------------------------------------------------------------------
Etrack Incident = ET1020130
Associated Primary Etrack = ET1003895
Titan cases: 220-088-034
Description:
A successful NDMP restore may produce a status code 83 when the media is write protected and the restore spans media. The following error is logged in the bptm log file:
15:51:14.479 [6272.8404] <16> open_ndmp_device: cannot open ndmp device nrst07a, error code 11 (NDMP_WRITE_PROTECT_ERR)
Workaround:
If you receive this error, you can ignore the error status because the restore completes successfully, or you can remove the write protection from the tape.
--------------------------------------------------------------------------------
Etrack Incident = ET1026333
Associated Primary Etrack = ET1018020
Titan cases: 230-335-679
Description:
A restore attempt would fail when using a duplicated copy of a standard backup image that was duplicated using an NDMP attached tape drive. An exit status 25 (cannot connect on socket) would be the result. The root cause of this problem was that bpdm was started instead of bptm.
--------------------------------------------------------------------------------
Etrack Incident = ET1051661
Description:
Restoring files from the second tape of an NDMP backup would hang and fail.
--------------------------------------------------------------------------------
Etrack Incident = ET1094378
Associated Primary Etrack = ET1086614
Titan cases: 311-593-827
Description:
If a directory name contained Japanese characters then an NDMP restore of an item in that directory would fail. This issue only applied to Microsoft Windows platforms.
--------------------------------------------------------------------------------
Etrack Incident = ET647774
Description:
An NDMP restore would mount the wrong fragment after mover_paused, reason EOF. This problem occurred with Overland Storage, but might occur with other NDMP servers as well.
The restore would fail and the bptm log showed that either a fragment was used twice in a row, or a fragment that should not have been skipped was skipped.
--------------------------------------------------------------------------------
Etrack Incident = ET848039
Associated Primary Etrack = ET841853
Titan cases: 280-687-557 290-397-014
Description:
Incremental NDMP backups would fail with status code 99 when no data has changed. The problem occurred because of an incremental backup that had no changed data, and thus, was a 0-byte backup.
A change was made to add support for backup, restore, duplicate, verify, and import of 0-byte backups.
--------------------------------------------------------------------------------
Etrack Incident = ET1204221
Associated Primary Etrack = ET1114189
Titan cases: 220-091-399
Description:
A change was made to correct a TLDCD handle leak that would occur when using an NDMP control path for robotic control and the robot had an error that prevented the open from succeeding.