Strange "hang" with Unix backups

Pamela_S
Level 3
Hi all,

We have:

Master & Media Server: Windows Server 2003 Standard, SP2, x86, NetBackup 6.5.3 (recently updated with no problems)
Two LTO3 drives, working perfectly, in a Dell ML6000 library.

And many clients, some on Unix (6.5) and others on Windows (6.5 or 6.5.1).

A few days ago, backups went crazy. Services went down for no reason, with no errors recorded in the Windows Event Viewer. After a lot of restarts and testing, we found that one policy (let's call it XPolicy) is the cause. It backs up a Unix client using many streams, each stream covering a different file system. For 3 of XPolicy's file systems, every kind of backup we try just hangs. For example:

XPolicy, during work hours, with tape A, full backup: hangs
XPolicy, after work hours, with tape A, full backup: hangs
XPolicy, during work hours, with tape A, incremental backup: hangs
XPolicy, after work hours, with tape A, incremental backup: hangs

Changing tapes makes no difference: it still hangs. Worse, it hangs BOTH drives, the one XPolicy is actually using and the idle one. In the NetBackup Administration Console both drives appear to be working, but if you look at the robot, only one drive is working and has a tape inside. To make things even harder, if you start another policy (for example a Windows one) to force the idle drive to work, nothing happens, and the only thing left to do is restart the server (restarting the Veritas services alone is not enough). That's why all the other backups started failing: at some point in our backup window, XPolicy caused an error that affected the subsequent backups.
 
This is not a problem with the drives or the robot, because we tested all the other policies with full and incremental backups and everything works fine.

One more thing: this only happens with 3 of XPolicy's file systems. XPolicy has about 10 streams configured for other file systems, and those back up perfectly. So if we remove these 3 file systems from XPolicy, there are no problems at all!

Any ideas why this is happening? We think it may be a problem between 6.5.3 on Windows and 6.5 on Unix, but if that were the case, all the other Unix clients would fail too, right?

Another idea was that these file systems are huge, too big to back up, but that sounds crazy. And finally, the Unix server shows nothing...

Any ideas would be really nice...

Pamela.






4 REPLIES

Yasuhisa_Ishika
Level 6
Partner Accredited Certified

Please provide the following:

- details of the hanging job from the Activity Monitor
- output of install_path\volmgr\bin\vmoprcmd on all NetBackup servers while in the hung state
- output of "install_path\NetBackup\bin\admincmd\nbrbutil -dump" on the master server while in the hung state
- the file install_path\Netbackup\db\jobs\trylogs\jobid.t on the master server
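
If it helps, the two command outputs in the list above could be captured with a small script while the drives are hung. This is only a minimal sketch: the function names are made up for illustration, the install path must be supplied by you (it is left as a parameter because it differs per installation), and the 120-second timeout is an arbitrary assumption so the script does not block if a command never returns.

```python
import subprocess
from pathlib import PureWindowsPath


def diagnostic_commands(install_path):
    """Build the two commands from the list above as argv lists.

    install_path is whatever directory NetBackup was installed into;
    it is not guessed here.
    """
    root = PureWindowsPath(install_path)
    return {
        "vmoprcmd": [str(root / "volmgr" / "bin" / "vmoprcmd")],
        "nbrbutil_dump": [
            str(root / "NetBackup" / "bin" / "admincmd" / "nbrbutil"),
            "-dump",
        ],
    }


def collect(install_path, run=subprocess.run):
    """Run each command and return its captured output keyed by name."""
    results = {}
    for name, argv in diagnostic_commands(install_path).items():
        try:
            # Timeout value is an assumption; adjust as needed.
            proc = run(argv, capture_output=True, text=True, timeout=120)
            results[name] = proc.stdout + proc.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Record the failure instead of stopping, so the other
            # command still gets a chance to run.
            results[name] = f"failed: {exc}"
    return results
```

Calling collect() with your own install path on the master server returns a dictionary whose values can be pasted into a reply or saved to a file. The Activity Monitor details and the trylogs file still have to be gathered by hand.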

rj_nbu
Level 6
Employee Accredited Certified
 Pamela,

Verbose bpbkar logs would show you exactly what is happening on the client. It is also possible that this policy got corrupted.

Try creating a new policy with the same backup selection and see if it makes a difference.

-Rajeev 

Pamela_S
Level 3
Well, it seems the easiest solution is the one that works (after 4 days of no backups, an open case with support, and almost consulting a medium).
I created the new policy with the old backup selection and it's working as if there had never been a problem!!!! I never imagined that a corrupted policy could cause such a mess...
Thanks a lot!!!! :)


Douglas_A
Level 6
Partner Accredited Certified
Sounds like the Unix file system may be screwed up.

Have the Unix SA run fsck on the file system on the box and see if that clears up your issue.