How to kill hung jobs

PARDHU
Level 2
Hi,
 
I am not able to cancel or kill hung jobs in NetBackup 5.0 MP7 running on Solaris 9. I have to restart the NetBackup services in order to clear them. The hung jobs don't show a PID that I could kill. Is there a better way to kill these hung jobs without restarting the NetBackup services?
 
Folks, could anyone please help me with this?
 
Thanks in advance.
6 REPLIES

Omar_Villa
Level 6
Employee
Check that bpjobd is up, and restart it if it is not: nohup bpjobd &
If that doesn't work, restart the services or daemons:
netbackup stop
ipcs -qa: lists the message queues that are still present
ipcrm -q <id number>: removes a leftover message queue
 
Once all message queues are cleared you will want to delete the worklist files and the *lock files from /usr/openv/netbackup/bin/bpsched.d. Do the following in that directory:
 
rm /usr/openv/netbackup/bin/bpsched.d/worklist.[0-9]*
rm /usr/openv/netbackup/bin/bpsched.d/*.lock
 
On NetBackup 6:
 
rm /usr/openv/netbackup/bin/bpsched.d/pempersist
netbackup start
 
This is the only way I have found that really works.
Regards
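
For anyone who wants to script it, here is a rough sketch of the sequence above. It is only a sketch under a few assumptions: the function and variable names (run, queue_ids, cleanup_hung_jobs, DRY_RUN, BPSCHED_DIR) are made up for illustration, the ipcs output is assumed to have the queue ID in the second column, and rm -f is used instead of plain rm so missing files do not stop the run. Note that it walks every message queue on the box, not only NetBackup's, so review the dry-run output before letting it remove anything.

```shell
#!/bin/sh
# Sketch of the cleanup sequence from the post. DRY_RUN defaults to 1, so
# commands are only printed; set DRY_RUN=0 to really run them (as root).
DRY_RUN="${DRY_RUN:-1}"
# Directory named in the post; can be overridden from the environment.
BPSCHED_DIR="${BPSCHED_DIR:-/usr/openv/netbackup/bin/bpsched.d}"

# run CMD...: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# queue_ids: read `ipcs -q` output on stdin and print the message queue IDs.
# Assumption: data rows either start with "q" and carry the ID in column 2
# (Solaris style) or start with a 0x... key and carry the ID in column 2
# (Linux style). Header lines match neither pattern and are skipped.
queue_ids() {
    awk '$1 == "q" || $1 ~ /^0x/ { print $2 }'
}

cleanup_hung_jobs() {
    run netbackup stop

    # Remove every leftover message queue. This is NOT limited to
    # NetBackup's queues, so check the list on a shared host.
    for id in $(ipcs -q | queue_ids); do
        run ipcrm -q "$id"
    done

    # Worklist and lock files mentioned in the post.
    run rm -f "$BPSCHED_DIR"/worklist.[0-9]* "$BPSCHED_DIR"/*.lock

    # NetBackup 6 only: the persistence file mentioned in the post.
    if [ -f "$BPSCHED_DIR/pempersist" ]; then
        run rm -f "$BPSCHED_DIR/pempersist"
    fi

    run netbackup start
}
```

Source the file and call cleanup_hung_jobs to see what would be run; once the printed commands look right, call it again with DRY_RUN=0.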

Nicolai
Moderator
Partner    VIP   

I have found no way around the problem. It is a known issue, though: we are running 5.1 MP6 (HP-UX) with various EEBs to fix it.

Regards

Nicolai

dukbtr
Level 4
Do these hung jobs show up in the Activity Monitor as blank, with only a job ID? This is the kind we get from time to time. Right-clicking them offers nothing you can do with them.

What we did is:

Create a policy called "Hung_Job" or whatever, backing up whatever. We did this so we didn't have to speak to backup in the morning. Whenever a hung job comes in, manually kick off a backup from that policy. Highlight both the Hung_Job policy job and the hung job, right-click, and cancel. Problem solved.

TreyT
Level 3
If it's a job that you can click Cancel on and it just never cancels, then there is no way that I know of to clean it up except for a service restart. We're running 6.5.1 now, and I just had a job hung for 134 hours that would not cancel.
 
We have had some instances where a hung job had a bpbackup process running on the master and you could kill -9 that process and then the job aborts usually with a '50' status. But most of the time there's no bpbackup process and it's just an orphaned job.
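
A quick way to check for such a process is sketched below. It is only a sketch: pids_of and list_bpbackup are names made up for illustration, it assumes a ps that accepts the POSIX -o pid= -o args= options, and it matches by substring, so anything with "bpbackup" in its command line is listed. It only prints PIDs and leaves the kill to you.

```shell
#!/bin/sh
# pids_of NAME: read "PID COMMAND..." lines on stdin and print the PID of
# every line whose command part contains NAME (plain substring match).
pids_of() {
    awk -v name="$1" '{
        pid = $1
        $1 = ""                       # drop the PID, keep the command text
        if (index($0, name) > 0) print pid
    }'
}

# list_bpbackup: print the PIDs of running bpbackup processes.
# The ps output is captured first so that the awk process above, whose own
# arguments contain the search string, is not running yet and cannot match.
list_bpbackup() {
    ps_out=$(ps -e -o pid= -o args=)
    printf '%s\n' "$ps_out" | pids_of bpbackup
}
```

Run list_bpbackup on the master, confirm that a PID it prints really belongs to the hung job, and then kill -9 that PID by hand.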

J_H_Is_gone
Level 6
 
DOCUMENTATION: How to manually remove jobs from the NetBackup Activity Monitor which are in a queued state, and cannot be canceled or killed, or removed by cycling the NetBackup services/daemons.
 
 
I used to have to do this a LOT.

J_H_Is_gone
Level 6
I also did the above steps when I had jobs running that just did nothing:
no tapes were mounted and I could not kill the jobs.