bprd daemon not running on the master server - Solaris 10

Ram_EUHR
Level 4

Our backup master server became unresponsive due to resource constraints, and we had to restart the server to bring it back to a working state.

After the reboot, the NetBackup services aren't running properly on the server.

Output of /usr/openv/netbackup/bin/bpps -a:

 

NB Processes
------------
    root 16775     1   0 07:12:27 ?           0:00 /usr/openv/netbackup/bin/nbsvcmon
    root 18128 18127   0 07:27:32 ?           0:00 /usr/openv/netbackup/bin/nbproxy dblib -mgrIORFile -PolicyManager-2-1-145346565
    root 19173 19172   0 07:40:28 ?           0:00 /usr/openv/netbackup/bin/nbproxy dblib -mgrIORFile -StorageService-1-2-14534664
    root  4024  4022   0 05:23:45 ?           0:00 /usr/openv/netbackup/bin/bpjava-susvc easwaraa -1 -1 C /usr/openv/java/auth.con
    root 19319 19318   0 07:42:04 ?           0:00 /usr/openv/netbackup/bin/nbproxy dblib -mgrIORFile -CatalogManager-1-4-14534665
    root 19318 16739   0 07:42:04 ?           0:00 sh -c "/usr/openv/netbackup/bin/nbproxy" dblib -mgrIORFile -CatalogManager-1-4-
    root 19172 16739   0 07:40:28 ?           0:00 sh -c "/usr/openv/netbackup/bin/nbproxy" dblib -mgrIORFile -StorageService-1-2-
    root 16655     1   0 07:12:02 ?           0:00 /usr/openv/pdde/pdag/bin/mtstrmd
    root 19208 16739   0 07:40:44 ?           0:00 sh -c "/usr/openv/netbackup/bin/nbproxy" dblib -mgrIORFile -PolicyManager-1-3-1
    root 19621 16739   0 07:46:01 ?           0:00 sh -c "/usr/openv/netbackup/bin/nbproxy" dblib -mgrIORFile -ClientManager-1-5-1
    root 19622 19621   0 07:46:01 ?           0:00 /usr/openv/netbackup/bin/nbproxy dblib -mgrIORFile -ClientManager-1-5-145346676
    root  4022     1   0 05:23:45 ?           0:00 /usr/openv/netbackup/bin/bpjava-susvc easwaraa -1 -1 C /usr/openv/java/auth.con
easwaraa  4016  3979   9 05:23:32 pts/3     138:05 /fs/local-1/openv/java/jre/bin/amd64/java -Dvrts.NBJAVA_CONF=/usr/openv/java/nb
    root  4026  4022   0 05:23:46 ?           0:00 /usr/openv/netbackup/bin/bpjava-susvc easwaraa -1 -1 C /usr/openv/java/auth.con
    root 19209 19208   0 07:40:44 ?           0:00 /usr/openv/netbackup/bin/nbproxy dblib -mgrIORFile -PolicyManager-1-3-145346644
    root 16464     1   0 07:11:58 ?           0:00 /usr/openv/netbackup/bin/private/nbatd -c /usr/openv/var/global/vxss/eab/data
    root 16470     1   0 07:11:59 ?           0:00 /usr/openv/netbackup/bin/bpcd -standalone
    root 19676 19675   0 07:46:20 ?           0:00 /usr/openv/netbackup/bin/nbproxy dblib -mgrIORFile -HPManager-1-6-1453466780.io
    root  4658  4022   0 05:29:39 ?           0:00 /usr/openv/netbackup/bin/bpjava-susvc easwaraa -1 -1 C /usr/openv/java/auth.con
    root 19675 16739   0 07:46:20 ?           0:00 sh -c "/usr/openv/netbackup/bin/nbproxy" dblib -mgrIORFile -HPManager-1-6-14534
    root 16467     1   0 07:11:59 ?           0:00 /usr/openv/netbackup/bin/vnetd -standalone
    root 16716     1   0 07:12:05 ?           0:01 /usr/openv/netbackup/bin/nbrmms
    root 18127 16739   0 07:27:32 ?           0:00 sh -c "/usr/openv/netbackup/bin/nbproxy" dblib -mgrIORFile -PolicyManager-2-1-1
    root 16768     1   0 07:12:26 ?           0:00 /usr/openv/netbackup/bin/nbvault
    root 16739     1   0 07:12:05 ?           0:03 /usr/openv/netbackup/bin/nbsl


MM Processes
------------
    root 16683     1   0 07:12:03 ?           0:01 /usr/openv/volmgr/bin/ltid
    root 16689     1   0 07:12:03 ?           0:01 vmd -v
 

I found that the bprd daemon (along with bpdbm and bpdbjobs) isn't running, and I tried starting it via the startup script '/usr/openv/netbackup/bin/initbprd' with no luck.

The bprd logs show the following message:
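A quick way to confirm which daemons are absent is to save the bpps output to a file and scan it for the expected names. The sketch below is only an illustration: the function name `missing_daemons` and the saved-output file are assumptions, not part of NetBackup.

```shell
#!/bin/sh
# Report which of the named daemons do not appear in saved `bpps -a` output.
# $1 is a file holding the bpps output; the remaining arguments are daemon names.
# (missing_daemons is a hypothetical helper, not a NetBackup command.)
missing_daemons() {
    bpps_file=$1
    shift
    for daemon in "$@"; do
        # Append a space to every line so that "/bprd" at the end of a line
        # still matches the pattern "/bprd ".
        sed 's/$/ /' "$bpps_file" | grep "/$daemon " >/dev/null || echo "$daemon"
    done
}

# Example (run as root on the master server):
#   /usr/openv/netbackup/bin/bpps -a > /tmp/bpps.out
#   missing_daemons /tmp/bpps.out bprd bpdbm
```

Each name that is printed has no matching process line in the saved output.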

log.012216:05:03:40.924 [1245] <2> bprd: <master server> is not the primary server <master server>...exiting

log.012216:06:10:24.152 [9473] <2> bprd: <master server> is not the primary server <master server>...exiting

log.012216:06:10:57.906 [9855] <2> bprd: <master server> is not the primary server <master server>...exiting

log.012216:06:11:03.931 [9868] <2> bprd: <master server> is not the primary server <master server>...exiting

log.012216:06:11:05.394 [9875] <2> bprd: <master server> is not the primary server <master server>...exiting
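Lines like these can be pulled out of the bprd legacy logs with grep. The following is a minimal sketch; it assumes the default log directory /usr/openv/netbackup/logs/bprd, and the function name `primary_server_errors` is made up for illustration.

```shell
#!/bin/sh
# Assumed default legacy log location for bprd on a UNIX master server.
BPRD_LOG_DIR=${BPRD_LOG_DIR:-/usr/openv/netbackup/logs/bprd}

# Print every "is not the primary server" line found in the log.* files of
# the given directory (or the default one), prefixed with the file name.
# /dev/null is added so grep prints the file name even with a single log file.
# (primary_server_errors is a hypothetical helper name.)
primary_server_errors() {
    grep 'is not the primary server' "${1:-$BPRD_LOG_DIR}"/log.* /dev/null
}

# Example:
#   primary_server_errors | wc -l     # how many times bprd exited this way
```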

 

I tried searching through the community articles but couldn't find anything relevant.

Please let us know if any of you have faced this issue before.

 

Our OS version - Solaris 10

NetBackup version - 7.6.2

 

Thanks,

Ram Kumar

4 REPLIES

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

Check name resolution: forward and reverse nslookup, forward and reverse bpclntcmd (-hn and -ip), and finally (which usually fixes it) bpclntcmd -clear_host_cache.
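Laid out as commands, those checks might look like the sketch below. It only prints the commands, in the order given above, for one host name and one IP address; nothing is executed. The function name `resolution_checks` and the `NB_BIN` variable are assumptions for illustration, and the path is the usual UNIX install location.

```shell
#!/bin/sh
# Directory where bpclntcmd normally lives on a UNIX master server (assumed default).
NB_BIN=${NB_BIN:-/usr/openv/netbackup/bin}

# Print the name-resolution checks for one host name ($1) and one IP ($2).
# (resolution_checks is a hypothetical helper; pipe its output to sh to run it.)
resolution_checks() {
    name=$1
    ip=$2
    echo "nslookup $name"                      # forward DNS lookup
    echo "nslookup $ip"                        # reverse DNS lookup
    echo "$NB_BIN/bpclntcmd -hn $name"         # forward lookup as NetBackup sees it
    echo "$NB_BIN/bpclntcmd -ip $ip"           # reverse lookup as NetBackup sees it
    echo "$NB_BIN/bpclntcmd -clear_host_cache" # flush NetBackup's host cache
}

# Example:
#   resolution_checks master-hostname 192.0.2.10        # review the commands
#   resolution_checks master-hostname 192.0.2.10 | sh   # run them as root
```

The forward and reverse answers should agree with each other and with the name used in bp.conf.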

Marianne
Moderator
Partner    VIP    Accredited Certified
Check the output of 'hostname' and compare it with the first entry in bp.conf; they need to be the same. The first entry MUST be SERVER = master-hostname, and the same goes for the EMMSERVER entry. I have seen things get changed on a master server that only cause an issue once the server is rebooted. You need to find out what has changed: the server hostname or bp.conf?
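That comparison can be scripted. The sketch below assumes bp.conf uses the usual "KEY = value" layout; the helper names `bp_conf_first` and `check_bp_conf` are made up for illustration and are not NetBackup commands.

```shell
#!/bin/sh
# Print the value of the first "KEY = value" line for KEY in a bp.conf-style file.
# $1: key (for example SERVER or EMMSERVER), $2: path to the file.
bp_conf_first() {
    sed -n "s/^$1 *= *//p" "$2" | sed 's/ *$//' | head -n 1
}

# Compare the first SERVER entry and the EMMSERVER entry with a host name.
# Prints one line per mismatch and returns non-zero if any mismatch is found.
check_bp_conf() {
    conf=$1
    host=$2
    status=0
    for key in SERVER EMMSERVER; do
        value=`bp_conf_first "$key" "$conf"`
        if [ "$value" != "$host" ]; then
            echo "$key is '$value' but hostname is '$host'"
            status=1
        fi
    done
    return $status
}

# Example (run on the master server):
#   check_bp_conf /usr/openv/netbackup/bp.conf "`hostname`"
```

No output and a zero exit status mean both entries match the host name exactly; a short name versus a fully qualified name counts as a mismatch here.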

Ram_EUHR
Level 4

This turned out to be an issue with a corrupted VXDBMS.conf file. I logged a support case to get this fixed.

FWIW, the fix is documented in https://www.veritas.com/support/en_US/article.TECH71702. Just follow the steps and restart the NetBackup services from a root shell.

Marianne
Moderator
Partner    VIP    Accredited Certified
Thanks for the feedback. The solution does not explain the error in the bprd log, though?