
Issue with Device configuration wizard after upgrade 7.7.1

Sagar_Kolhe
Level 6

Hello,

I am facing an issue with device configuration after upgrading from 7.6.1.2 to 7.7.1. When we start the Device Configuration Wizard, we get the error below.

unable to retreive the list of devices hosts for Media Manager host Master Server Name.

Were there any changes in 7.7 that could cause this?

We are working in a clustered environment. Please share your views on how to resolve this issue.

We have upgraded only the master server, not the media servers (robot control hosts).

 

Thanks,

Sagar



Marianne
Moderator
Partner    VIP    Accredited Certified
Why do you need to run Device Config wizard after NBU upgrade?

What is the output of these commands:

nbemmcmd -listhosts -verbose

nbemmcmd -getemmserver

nbutech
Level 6
Accredited Certified

 

Are you seeing any connectivity issues between the master and media servers?

Check whether communication between them looks good.

 

Sagar_Kolhe
Level 6

Hello,

Communication with the media servers (robot hosts) is working properly, but for some servers we are unable to telnet to port 13701. For those same media servers, however, the vmoprcmd output is fine.

abc1
        MachineName = "abc1"
        FQName = "abc1"
        MachineDescription = ""
        MachineNbuType = server (6)
efgh
        ClusterName = "abc1"
        MachineName = "efgh"
        FQName = "efgh"
        GlobalDriveSeed = "VEND:#.:PROD:#.:IDX"
        LocalDriveSeed = ""
        MachineDescription = ""
        MachineFlags = 0xe7
        MachineNbuType = master (3)
        MachineState = active for tape and disk jobs (14)
        NetBackupVersion = 7.7.1.0 (771000)
        OperatingSystem = solaris (2)
        ScanAbility = 5
abcd
        ClusterName = "abc1"
        MachineName = "abcd"
        FQName = "abcd"
        GlobalDriveSeed = "VEND:#.:PROD:#.:IDX"
        LocalDriveSeed = ""
        MachineDescription = ""
        MachineFlags = 0xe7
        MachineNbuType = master (3)
        MachineState = active for disk jobs (12)
        NetBackupVersion = 7.7.1.0 (771000)
        OperatingSystem = solaris (2)
        ScanAbility = 5
Command completed successfully.

 

nbemmcmd -getemmserver === this command takes a long time to complete.

 

sdo
Moderator
Partner    VIP    Certified

Without some detail around actual master server name and media server names and any description of which server you executed those commands upon, then well, we remain blind.  It's impossible to tell what the above is actually about.  :(

Sagar_Kolhe
Level 6

oops !!!

 

dakcmstnetbkp
        MachineName = "dakcmstnetbkp"
        FQName = "dakcmstnetbkp"
        MachineDescription = ""
        MachineNbuType = server (6)
hbdakcmstn2
        ClusterName = "dakcmstnetbkp"
        MachineName = "hbdakcmstn2"
        FQName = "hbdakcmstn2"
        GlobalDriveSeed = "VEND:#.:PROD:#.:IDX"
        LocalDriveSeed = ""
        MachineDescription = ""
        MachineFlags = 0xe7
        MachineNbuType = master (3)
        MachineState = active for tape and disk jobs (14)
        NetBackupVersion = 7.7.1.0 (771000)
        OperatingSystem = solaris (2)
        ScanAbility = 5
hbdakcmstn1
        ClusterName = "dakcmstnetbkp"
        MachineName = "hbdakcmstn1"
        FQName = "hbdakcmstn1"
        GlobalDriveSeed = "VEND:#.:PROD:#.:IDX"
        LocalDriveSeed = ""
        MachineDescription = ""
        MachineFlags = 0xe7
        MachineNbuType = master (3)
        MachineState = active for disk jobs (12)
        NetBackupVersion = 7.7.1.0 (771000)
        OperatingSystem = solaris (2)
        ScanAbility = 5
Command completed successfully.
 

Sagar_Kolhe
Level 6

Now we are facing another issue with robots. Please find the attached snapshot.

Sagar_Kolhe
Level 6

Please find the bpjava-msvc logs below.

 

00:01:33.428 [8272] <16> session_dispatch: Request count = 0 tag = 510
00:01:33.444 [8272] <2> populateCertificatePath: Certificate to be used for SSL [/usr/openv/var/vxss/credentials/dakcmstnetbkp]
00:01:33.445 [8272] <4> command_SECURE_CHANNEL_INIT: Using certificate [/usr/openv/var/vxss/credentials/dakcmstnetbkp] and Responding SECURE_CHANNEL_PROCEED.
00:01:33.707 [8272] <4> session_secure_lookup: Initiating SSL Accept
00:01:33.708 [8272] <2> populateCertificatePath: Certificate to be used for SSL [/usr/openv/var/vxss/credentials/dakcmstnetbkp]
00:01:33.756 [8272] <4> tls_accept: io.c.3330: SSL Channel established for fd[0]
00:01:33.756 [8272] <4> session_secure_lookup: SSL Connection Accepted!
00:01:33.766 [8272] <16> session_dispatch: Request count = 1 tag = 118
00:01:34.031 [8272] <16> session_dispatch: Request count = 2 tag = 101
00:01:34.148 [8272] <8> retry_getnameinfo: [vnet_addrinfo.c:1146] getnameinfo() failed 8 0x8
00:01:34.148 [8272] <8> vnet_cached_getnameinfo: [vnet_addrinfo.c:1917] retry_getnameinfo() failed RV=8 IPSTR=10.226.34.166 PORT=56081
00:01:34.177 [8272] <16> newAuthenticate: bad pam_status 3, Error in underlying service module
00:01:34.179 [8272] <16> command_LOGON_TO_MSERVER: putenv(BPJAVA_MASTER_IPC_STRING=) failed
00:01:34.210 [8277] <16> isVxssActive: authentication determination failed, assume none required: (193) VxSS authentication is requested but not allowed
00:01:34.211 [8277] <2> establishAuthorization: CORBA_CONTEXT_ID_AUTH is enabled: 1
00:01:34.211 [8277] <4> establishAuthorization: NON-NBAC user cert generation.
00:01:34.223 [8277] <2> userCertLogin: Trying to generate user cert in non-NBAC host
00:01:34.224 [8277] <8> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1601] in failed file cache ERR=8 NAME=dakcmstnetbkp.hdfcsec.com SVC=NULL
00:01:34.224 [8277] <8> vnet_cached_getaddrinfo: [vnet_addrinfo.c:1270] vnet_cached_getaddrinfo_and_update() failed 6 0x6
00:01:34.224 [8277] <8> vnet_same_host_and_update: [vnet_addrinfo.c:2867] vnet_cached_getaddrinfo() failed STAT=6 RV=8 NAME2=dakcmstnetbkp.hdfcsec.com
00:01:34.229 [8277] <8> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1547] in failed cache ERR=8 NAME=dakcmstnetbkp.hdfcsec.com SVC=NULL
00:01:34.229 [8277] <8> vnet_cached_getaddrinfo: [vnet_addrinfo.c:1270] vnet_cached_getaddrinfo_and_update() failed 6 0x6
00:01:34.229 [8277] <8> vnet_same_host_and_update: [vnet_addrinfo.c:2867] vnet_cached_getaddrinfo() failed STAT=6 RV=8 NAME2=dakcmstnetbkp.hdfcsec.com
00:01:34.229 [8277] <8> vnet_getfqdn: [vnet_fqdn.c:163] vnet_cached_get_aliases returned no valid aliases dakcmstnetbkp.HBCTXDOM.COM
00:01:34.229 [8277] <8> vnet_getfqdn: [vnet_fqdn.c:168] hostptr appears good - using it dakcmstnetbkp.HBCTXDOM.COM
00:01:34.847 [8277] <2> userCertLogin: VssExportCredential() returned [0]
00:01:34.847 [8277] <4> userCertLogin: Login PASSED and user cert [PASSED]
00:01:34.847 [8277] <2> establishAuthorization: User Certificate generated at //.vxss/credentials
00:01:43.107 [8272] <16> readCharByChar: read failed(?), sockFd = 0, count = 0 (want = 3072),  errno = 131 = Connection reset by peer
00:01:43.107 [8272] <16> session_dispatch: recv failed, sockFd = 0, errno = 131 = Connection reset by peer
00:01:44.111 [8270] <16> poll_listen: can't find file descriptor in polling table
00:54:11.633 [26809] <16> session_dispatch: Request count = 0 tag = 510
00:54:11.673 [26809] <2> populateCertificatePath: Certificate to be used for SSL [/usr/openv/var/vxss/credentials/dakcmstnetbkp]
00:54:11.673 [26809] <4> command_SECURE_CHANNEL_INIT: Using certificate [/usr/openv/var/vxss/credentials/dakcmstnetbkp] and Responding SECURE_CHANNEL_PROCEED.
00:54:11.943 [26809] <4> session_secure_lookup: Initiating SSL Accept
00:54:11.944 [26809] <2> populateCertificatePath: Certificate to be used for SSL [/usr/openv/var/vxss/credentials/dakcmstnetbkp]
00:54:12.065 [26809] <4> tls_accept: io.c.3330: SSL Channel established for fd[0]
00:54:12.065 [26809] <4> session_secure_lookup: SSL Connection Accepted!
00:54:12.081 [26809] <16> session_dispatch: Request count = 1 tag = 118
00:54:12.389 [26809] <16> session_dispatch: Request count = 2 tag = 101
00:54:12.507 [26809] <8> retry_getnameinfo: [vnet_addrinfo.c:1146] getnameinfo() failed 8 0x8
00:54:12.507 [26809] <8> vnet_cached_getnameinfo: [vnet_addrinfo.c:1917] retry_getnameinfo() failed RV=8 IPSTR=10.226.34.166 PORT=56302
00:54:12.592 [26809] <16> newAuthenticate: bad pam_status 3, Error in underlying service module
00:54:12.595 [26809] <16> command_LOGON_TO_MSERVER: putenv(BPJAVA_MASTER_IPC_STRING=) failed
00:54:12.649 [26813] <16> isVxssActive: authentication determination failed, assume none required: (193) VxSS authentication is requested but not allowed
00:54:12.650 [26813] <2> establishAuthorization: CORBA_CONTEXT_ID_AUTH is enabled: 1
00:54:12.651 [26813] <4> establishAuthorization: NON-NBAC user cert generation.
00:54:12.663 [26813] <2> userCertLogin: Trying to generate user cert in non-NBAC host
00:54:12.676 [26813] <8> retry_getaddrinfo_for_real: [vnet_addrinfo.c:1053] getaddrinfo() failed RV=8 NAME=dakcmstnetbkp.hdfcsec.com SVC=0 errno=0
00:54:12.676 [26813] <8> retry_getaddrinfo: [vnet_addrinfo.c:893] retry_getaddrinfo_for_real failed RV=8 NAME=dakcmstnetbkp.hdfcsec.com SVC=0
00:54:12.676 [26813] <8> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1642] retry_getaddrinfo() failed RV=8 NAME=dakcmstnetbkp.hdfcsec.com SVC=NULL
00:54:12.681 [26813] <8> vnet_cached_getaddrinfo: [vnet_addrinfo.c:1270] vnet_cached_getaddrinfo_and_update() failed 6 0x6
00:54:12.681 [26813] <8> vnet_same_host_and_update: [vnet_addrinfo.c:2867] vnet_cached_getaddrinfo() failed STAT=6 RV=8 NAME2=dakcmstnetbkp.hdfcsec.com
00:54:12.700 [26813] <8> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1547] in failed cache ERR=8 NAME=dakcmstnetbkp.hdfcsec.com SVC=NULL
00:54:12.700 [26813] <8> vnet_cached_getaddrinfo: [vnet_addrinfo.c:1270] vnet_cached_getaddrinfo_and_update() failed 6 0x6
00:54:12.700 [26813] <8> vnet_same_host_and_update: [vnet_addrinfo.c:2867] vnet_cached_getaddrinfo() failed STAT=6 RV=8 NAME2=dakcmstnetbkp.hdfcsec.com
00:54:12.700 [26813] <8> vnet_getfqdn: [vnet_fqdn.c:163] vnet_cached_get_aliases returned no valid aliases dakcmstnetbkp.HBCTXDOM.COM
00:54:12.700 [26813] <8> vnet_getfqdn: [vnet_fqdn.c:168] hostptr appears good - using it dakcmstnetbkp.HBCTXDOM.COM
00:54:13.486 [26813] <2> userCertLogin: VssExportCredential() returned [0]
00:54:13.486 [26813] <4> userCertLogin: Login PASSED and user cert [PASSED]
00:54:13.486 [26813] <2> establishAuthorization: User Certificate generated at //.vxss/credentials
00:54:22.660 [26809] <16> readCharByChar: read failed(?), sockFd = 0, count = 0 (want = 3072),  errno = 131 = Connection reset by peer
00:54:22.660 [26809] <16> session_dispatch: recv failed, sockFd = 0, errno = 131 = Connection reset by peer
00:54:23.664 [26805] <16> poll_listen: can't find file descriptor in polling table
 

sdo
Moderator
Partner    VIP    Certified

On both master server cluster member nodes try this to clear out corrupt NetBackup name cache:

bpclntcmd -clear_host_cache

...then on the active master server member node try these query commands:

nbemmcmd -getemmserver

nbemmcmd -listhosts

vmoprcmd -devmon hs

Then try restarting the Java Admin Console via the active master server member node.

Sagar_Kolhe
Level 6

Hi,

 

Every command is working fine, and I have already tried the suggestions above.

 

Have you checked the logs above? We have 2 cluster environments, and 1 of them is working fine.

sdo
Moderator
Partner    VIP    Certified

Wondering whether we have a naming issue:

00:01:34.229 [8277] <8> vnet_same_host_and_update: [vnet_addrinfo.c:2867] vnet_cached_getaddrinfo() failed STAT=6 RV=8 NAME2=dakcmstnetbkp.hdfcsec.com
00:01:34.229 [8277] <8> vnet_getfqdn: [vnet_fqdn.c:163] vnet_cached_get_aliases returned no valid aliases dakcmstnetbkp.HBCTXDOM.COM

The same hostname has two different suffixes.

I don't understand how they are related to each other, so I don't know if it's a problem or not.

The status <8> above might just be a red herring.
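To see every domain suffix that a given short hostname appears with in a log, something like the following could help. It is only a sketch: the function name list_suffixes is made up for illustration, and it assumes a POSIX awk (on Solaris that probably means nawk rather than the old /usr/bin/awk).

```shell
# List the distinct domain suffixes that follow one short hostname in a log.
# Reads log text on stdin; $1 is the short hostname (for example dakcmstnetbkp).
# The function name is hypothetical, not a NetBackup tool.
list_suffixes() {
    awk -v short="$1" '
    {
        line = tolower($0)
        pat = tolower(short) "\\.[a-z0-9.-]+"
        # Walk along the line and collect every "short.domain" token.
        while (match(line, pat)) {
            fq = substr(line, RSTART, RLENGTH)
            sub(/\.$/, "", fq)                      # drop a trailing dot, if any
            seen[substr(fq, length(short) + 2)] = 1 # keep only the suffix
            line = substr(line, RSTART + RLENGTH)
        }
    }
    END { for (s in seen) print s }' | sort
}
```

For example, feeding the bpjava-msvc log to "list_suffixes dakcmstnetbkp" should print one line per distinct suffix (lower-cased), which makes a mismatch like the one above easy to spot.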

.

The status <16> entries, however, are a problem... if we filter out just the entries for thread 26809:

00:54:11.633 [26809] <16> session_dispatch: Request count = 0 tag = 510
00:54:11.673 [26809] <2> populateCertificatePath: Certificate to be used for SSL [/usr/openv/var/vxss/credentials/dakcmstnetbkp]
00:54:11.673 [26809] <4> command_SECURE_CHANNEL_INIT: Using certificate [/usr/openv/var/vxss/credentials/dakcmstnetbkp] and Responding SECURE_CHANNEL_PROCEED.
00:54:11.943 [26809] <4> session_secure_lookup: Initiating SSL Accept
00:54:11.944 [26809] <2> populateCertificatePath: Certificate to be used for SSL [/usr/openv/var/vxss/credentials/dakcmstnetbkp]
00:54:12.065 [26809] <4> tls_accept: io.c.3330: SSL Channel established for fd[0]
00:54:12.065 [26809] <4> session_secure_lookup: SSL Connection Accepted!
00:54:12.081 [26809] <16> session_dispatch: Request count = 1 tag = 118
00:54:12.389 [26809] <16> session_dispatch: Request count = 2 tag = 101
00:54:12.507 [26809] <8> retry_getnameinfo: [vnet_addrinfo.c:1146] getnameinfo() failed 8 0x8
00:54:12.507 [26809] <8> vnet_cached_getnameinfo: [vnet_addrinfo.c:1917] retry_getnameinfo() failed RV=8 IPSTR=10.226.34.166 PORT=56302
00:54:12.592 [26809] <16> newAuthenticate: bad pam_status 3, Error in underlying service module
00:54:12.595 [26809] <16> command_LOGON_TO_MSERVER: putenv(BPJAVA_MASTER_IPC_STRING=) failed
00:54:22.660 [26809] <16> readCharByChar: read failed(?), sockFd = 0, count = 0 (want = 3072),  errno = 131 = Connection reset by peer
00:54:22.660 [26809] <16> session_dispatch: recv failed, sockFd = 0, errno = 131 = Connection reset by peer

We see a status <8> getnameinfo failed immediately before some status <16> errors.
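The filtering above can be reproduced with a small awk wrapper. This is a sketch that assumes the log layout shown in this thread ("time [number] <level> function: message"); the function name filter_log is hypothetical, and the bracketed number is simply matched as text.

```shell
# Print log lines for one bracketed ID whose <level> is at least a minimum.
# Usage: filter_log ID [MIN_LEVEL] < logfile
# Assumes fields: time, [ID], <LEVEL>, then the message.
filter_log() {
    awk -v id="$1" -v min="${2:-0}" '
    $2 == "[" id "]" {
        lvl = $3
        gsub(/[<>]/, "", lvl)          # "<16>" becomes "16"
        if (lvl + 0 >= min + 0) print
    }'
}
```

Running "filter_log 26809 < logfile" gives every entry for that ID, and "filter_log 26809 16 < logfile" narrows it to the <16> entries only.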

.

Are you sure that your NetBackup server naming is robust/good?

sdo
Moderator
Partner    VIP    Certified

This poster:

https://www.veritas.com/community/forums/admin-console-opening-backup-archive-and-restore

...who also had:

newAuthenticate: bad pam_status 3, Error in underlying service module

...found that they had been attempting to logon to the wrong server name.

Sagar_Kolhe
Level 6

Yeah... In my view the naming is good. Backups are also running successfully through the master. However, if I want to check for a naming issue, how can I proceed?

 

Thanks,

Sagar

sdo
Moderator (accepted solution)
Partner    VIP    Certified

Now, I know you know NetBackup... and so I don't mean to teach you to suck eggs...

...but sometimes we can't see the wood for the trees, and yes it has happened to me more than once too...

...anyway, here's what I would do in an attempt to completely rule out naming issues...

...and yes, the commands which are listed twice need to be run twice, in order to spot rotating DNS and/or name entries...

...so, on each NetBackup Server node (i.e. all masters and all medias, whether clustered or not, and whether active or passive), do the following:

bpclntcmd -clear_host_cache

cat /etc/hosts

cat /etc/resolv.conf

egrep -v "^#|^$" /etc/nsswitch.conf

egrep -i "client|server" /usr/openv/netbackup/bp.conf

cat /usr/openv/volmgr/vm.conf

cat /usr/openv/var/global/server.conf

hostname

nslookup $HOSTNAME

nslookup $HOSTNAME

bptestnetconn -v -a -s

bpclntcmd -self

bpclntcmd -self

bpclntcmd -hn $HOSTNAME

bpclntcmd -hn $HOSTNAME

bpclntcmd -pn

nbemmcmd -getemmserver

nbemmcmd -listhosts

# ...then for the IP address that you think is the NetBackup IP for the host that you are on (and yes, do it twice), do:

nslookup x.x.x.x

nslookup x.x.x.x

bpclntcmd -ip x.x.x.x

bpclntcmd -ip x.x.x.x

# [end]

...so do all the above on every NetBackup "Server"  (yes I know the cat of the server.conf should fail on media servers, but think about it for a sec ;)
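If comparing the paired runs by eye gets tedious, a helper along these lines could do the comparison. It is a rough sketch only: run_twice is a made-up name, and it simply treats any difference in output between two runs of the same command as worth a look.

```shell
# Run the same command twice and report whether its output changed.
# Usage: run_twice nslookup "$HOSTNAME"
# The function name is hypothetical; it wraps whatever command you pass in.
run_twice() {
    first=$("$@" 2>&1)
    second=$("$@" 2>&1)
    if [ "$first" = "$second" ]; then
        echo "STABLE: $*"
    else
        echo "CHANGED: $*"
        printf '%s\n' "--- first run" "$first" "--- second run" "$second"
        return 1
    fi
}
```

For example, "run_twice bpclntcmd -hn $HOSTNAME" prints STABLE when both runs agree, and CHANGED plus both outputs when they do not, which is the kind of rotation the doubled commands are meant to expose.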

.

Finally, please can you describe how and where you are running the Java Admin console from?

1) OS version?

2) Are you running jnbSA with X11 forwarded back over ssh?

3) If using the Java Admin console on Windows, was it definitely started using "Run as administrator"?

4) On the admin console host (desktop/server) can you run the following:

...if the java admin host is Unix/Linux:

uname -a

hostname

id

cat /etc/hosts

cat /etc/resolv.conf

egrep -v "^#|^$" /etc/nsswitch.conf

ifconfig

nslookup $HOSTNAME

nslookup $HOSTNAME

nslookup x.x.x.x

nslookup x.x.x.x

...

...if the java admin host is Windows:

ver

hostname

whoami

type C:\Windows\System32\drivers\etc\hosts

ipconfig /all

nslookup %computername%

nslookup %computername%

nslookup x.x.x.x

nslookup x.x.x.x

.

And after all this there may still be routing issues to sort out.

Marianne
Moderator
Partner    VIP    Accredited Certified
Have another look at nbemmcmd -listhosts -verbose output. I see only master server nodes - no media servers. Why is that? What does nbemmcmd -getemmserver show?

If nbemmcmd does not show all media servers, I am wondering about your upgrade method...

PS: It was many, many versions ago that NBU needed port 13701 for vmd comms. All that is needed now is PBX (1556) between master and media servers.
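Instead of telnet to 13701, reachability of the PBX port could be tested with something like the sketch below. It assumes bash (for its /dev/tcp feature) and the coreutils timeout command are installed, which may not be the case on every Solaris host; port_open is a hypothetical helper name and media1 is a placeholder hostname.

```shell
# Return 0 if a TCP connection to HOST:PORT succeeds within 5 seconds.
# Relies on bash's /dev/tcp pseudo-device and the coreutils "timeout" command.
port_open() {
    if [ "$#" -ne 2 ]; then
        echo "usage: port_open HOST PORT" >&2
        return 2
    fi
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Hypothetical usage from the master server ("media1" is a placeholder):
# port_open media1 1556 && echo "PBX reachable" || echo "PBX not reachable"
```

A successful connection only shows that something is listening on that port; it does not prove that NetBackup name resolution between the hosts is consistent.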