Forum Discussion

DoubleP
Level 5
6 years ago

Multiple job IDs showing in NB master admin console GUI: Info bphdb (pid=0) done. status: 26: client/

Hello, we are using version 8.0; the master servers are Linux. In the Master admin console GUI, we are seeing many job IDs without policy names or status. If we click on one and look at Detailed Status, we see:

Mar 3, 2019 8:01:56 PM - Info bpbrm (pid=2947) starting bphdb on client
Mar 4, 2019 1:01:56 AM - Error bpbrm (pid=2947) cannot execute cmd on client .com
Mar 4, 2019 1:01:56 AM - Error bpbrm (pid=2947) socket close failed, Bad file descriptor (9)
Mar 4, 2019 1:01:56 AM - Error bpbrm (pid=2947) could not write EXIT STATUS to OUTSOCK
Mar 4, 2019 1:01:56 AM - Info bphdb (pid=0) done. status: 26: client/server handshaking failed

They're from different clients, but all show the same basic data, and all date from early March 2019. They're not clearing out, and we're not sure whether they're consuming resources. We cannot cancel them. Has anyone seen anything like this? I couldn't find this exact situation when I searched.

  • Try stopping the NetBackup services on the master server. Run bp.kill_all from the NetBackup installation path, then restart the services, and let me know the status.

    • DoubleP
      Level 5

      Arkya, won't that affect the jobs currently running, as opposed to those stuck in limbo since March 4?

      • Arkya
        Level 4

        Yes, that will impact the current jobs as well. Try the steps outside backup hours.

    • DoubleP
      Level 5

      Good morning, when I left work last night, my coworker was on the phone with Veritas support for an unrelated issue (but on the same master server). He left me an email telling me that they fixed his issue. I checked the master gui, and noticed that all those limboed jobs were now showing "Waiting for retry" and "cancel" was no longer grayed out. I was able to cancel all those jobs that were there since March. When my coworker is back on Monday, I'll ask him what was done. Maybe they used the Veritas version of penicillin.

      • DoubleP
        Level 5

        Unfortunately, my "solution" is not a great discovery. Our guess is that when my coworker stopped/started nb services on the master server while working on the unrelated issue, it freed up those hung jobs and let me cancel them.
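
For reference, the stop/start sequence discussed in this thread could be sketched roughly as below. This is only an illustration, not what Veritas support actually ran. It assumes the default Linux install path /usr/openv/netbackup/bin and the stock bp.kill_all, bpps, and bp.start_all scripts; NB_BIN, DRY_RUN, run, and restart_netbackup are hypothetical names made up for this example. It defaults to a dry run that only prints the commands, since a real restart interrupts running backups.

```shell
#!/bin/sh
# Sketch only: restart NetBackup services on a Linux master server.
# Assumption: default install path; override NB_BIN if yours differs.
NB_BIN="${NB_BIN:-/usr/openv/netbackup/bin}"
# Assumption: DRY_RUN is a hypothetical safety switch for this example.
# 1 = print the commands only, 0 = actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

restart_netbackup() {
    # Stop all NetBackup daemons. This also interrupts active jobs,
    # and the script may ask for confirmation before killing processes.
    run "$NB_BIN/bp.kill_all"
    # List NetBackup processes to check that nothing is left running.
    run "$NB_BIN/bpps" -x
    # Start the daemons again.
    run "$NB_BIN/bp.start_all"
}

restart_netbackup
```

Run as-is, this prints the three commands without touching the services. Setting DRY_RUN=0 would execute them, which, as noted above, is best done outside backup hours.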