06-13-2019 05:38 AM - edited 06-13-2019 05:40 AM
Hello, we are using version 8.0; the master servers are Linux. In the master admin console GUI, we are seeing many job IDs without policy names or status. If we click on one and look at Detailed Status, we see:
Mar 3, 2019 8:01:56 PM - Info bpbrm (pid=2947) starting bphdb on client
Mar 4, 2019 1:01:56 AM - Error bpbrm (pid=2947) cannot execute cmd on client .com
Mar 4, 2019 1:01:56 AM - Error bpbrm (pid=2947) socket close failed, Bad file descriptor (9)
Mar 4, 2019 1:01:56 AM - Error bpbrm (pid=2947) could not write EXIT STATUS to OUTSOCK
Mar 4, 2019 1:01:56 AM - Info bphdb (pid=0) done. status: 26: client/server handshaking failed
They're from different clients, but all show that same basic data. They're all from early March 2019. They're not clearing out, and we're not sure whether they're consuming resources. We cannot cancel them. Has anyone seen anything like this? I couldn't find this exact situation when I searched.
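For anyone comparing many of these entries, the Detailed Status lines above can be parsed mechanically. The following is only a minimal sketch in Python, assuming the lines always follow the "timestamp - level process (pid=N) message" layout shown in the excerpt; `parse_line` and `summarize` are hypothetical helper names, not NetBackup tools. On the excerpt it reports status 26 and a gap of exactly five hours between the first line and the first error, which may hint at a timeout rather than an immediate failure.

```python
import re
from datetime import datetime

# Layout assumed from the excerpt: "<timestamp> - <level> <process> (pid=<n>) <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) - "
    r"(?P<level>\w+) (?P<proc>\w+) \(pid=(?P<pid>\d+)\) (?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict for one Detailed Status line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "time": datetime.strptime(m.group("ts"), "%b %d, %Y %I:%M:%S %p"),
        "level": m.group("level"),
        "process": m.group("proc"),
        "pid": int(m.group("pid")),
        "message": m.group("msg"),
    }


def summarize(lines):
    """Collect the final status code, the error count, and the start-to-first-error gap."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    status = None
    for e in entries:
        found = re.search(r"status: (\d+)", e["message"])
        if found:
            status = int(found.group(1))
    first_error = next((e for e in entries if e["level"] == "Error"), None)
    gap = None
    if entries and first_error is not None:
        gap = first_error["time"] - entries[0]["time"]
    errors = sum(1 for e in entries if e["level"] == "Error")
    return {"status": status, "gap": gap, "errors": errors}


# The lines quoted in the post, copied as-is (including the elided client name).
SAMPLE = [
    "Mar 3, 2019 8:01:56 PM - Info bpbrm (pid=2947) starting bphdb on client",
    "Mar 4, 2019 1:01:56 AM - Error bpbrm (pid=2947) cannot execute cmd on client .com",
    "Mar 4, 2019 1:01:56 AM - Error bpbrm (pid=2947) socket close failed, Bad file descriptor (9)",
    "Mar 4, 2019 1:01:56 AM - Error bpbrm (pid=2947) could not write EXIT STATUS to OUTSOCK",
    "Mar 4, 2019 1:01:56 AM - Info bphdb (pid=0) done. status: 26: client/server handshaking failed",
]

summary = summarize(SAMPLE)
print(summary["status"], summary["gap"], summary["errors"])
```

The timestamp parsing relies on English month and AM/PM names, so it assumes the script runs under an English or C locale.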
06-13-2019 09:06 AM
Try stopping the NetBackup services on the master server: run bp.kill_all from the NetBackup installation path, then restart the services. Let me know the status.
06-13-2019 10:04 AM
@Arkya, won't that affect the jobs that are currently running, as opposed to just those stuck in limbo since March 4?
06-13-2019 10:27 AM
Yes, that will impact the current jobs as well. Try the steps outside backup hours.
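The stop/restart sequence discussed above could be sketched roughly as follows. This is an illustration only: it assumes the default UNIX install path /usr/openv/netbackup/bin (adjust it if your installation differs), `restart_plan` and `run_plan` are hypothetical helper names, and the `bpps -x` step is an added check that the processes are gone before starting them again. By default the script only prints the commands; nothing is executed unless `execute=True` is passed, and bp.kill_all may still ask for confirmation interactively.

```python
import subprocess
from pathlib import PurePosixPath

# Assumption: default install location on a Linux master server.
NB_BIN = PurePosixPath("/usr/openv/netbackup/bin")


def restart_plan(nb_bin=NB_BIN):
    """Return the commands, in order, for a full stop and start of NetBackup."""
    return [
        [str(nb_bin / "bp.kill_all")],   # stop all NetBackup processes
        [str(nb_bin / "bpps"), "-x"],    # list what is still running
        [str(nb_bin / "bp.start_all")],  # start the services again
    ]


def run_plan(plan, execute=False):
    """Print each command; run it only when execute=True. Returns the printed lines."""
    shown = []
    for cmd in plan:
        line = " ".join(cmd)
        print(line)
        shown.append(line)
        if execute:
            # Inherits the terminal, so interactive prompts remain visible.
            subprocess.run(cmd, check=False)
    return shown


# Dry run: shows the sequence without touching any running jobs.
planned = run_plan(restart_plan())
```

Because this stops every running job, not only the stuck ones, keeping the default dry-run mode until a maintenance window is the safer choice.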
06-13-2019 10:36 AM
@Arkya,
Good idea! I'll try it during our regularly scheduled maintenance window on Monday afternoon.
06-14-2019 12:48 AM
There was a similar post over here: https://vox.veritas.com/t5/NetBackup/Ghost-unknown-jobs-in-the-activity-monitor/m-p/832864
Please have a look at the replies and see if anything helps.
Unfortunately the original poster never replied.
06-14-2019 05:24 AM
Good morning. When I left work last night, my coworker was on the phone with Veritas support for an unrelated issue (but on the same master server). He left me an email telling me that they fixed his issue. I checked the master GUI and noticed that all those limboed jobs were now showing "Waiting for retry" and that "Cancel" was no longer grayed out. I was able to cancel all the jobs that had been there since March. When my coworker is back on Monday, I'll ask him what was done. Maybe they used the Veritas version of penicillin.
06-20-2019 01:07 PM
Unfortunately, my "solution" is not a great discovery. Our guess is that when my coworker stopped and restarted the NetBackup services on the master server while working on the unrelated issue, it freed up those hung jobs and let me cancel them.