08-02-2020 06:16 AM - edited 08-02-2020 06:45 AM
Hello
Following a fresh NetBackup installation, I'm getting the error message below:
[root@nbo ~]# /usr/openv/netbackup/bin/admincmd/nbemmcmd -getemmserver
NBEMMCMD, Version: 8.1.1
The function returned the following failure status:
invalid host name (136)
Command did not complete successfully.
Below is some information and related logs:
NetBackup: 8.1.1
RHEL: 7.6
HW: VM with 4 vCPU ; 16GB RAM
uname -a: Linux nbo 3.10.0-957.el7.x86_64 GNU/Linux
NBEMM logs attached.
Any help would be appreciated.
08-02-2020 12:11 PM
Ensure that the EMMSERVER entry in bp.conf is set to the EMM server/master server name. If you are running the above command on a media server, also ensure that the media server is registered in the EMM database. You can validate this by running the following command:
/usr/openv/netbackup/bin/admincmd/nbemmcmd -listhosts
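To check the EMMSERVER entry without reading the whole file, something like the following could be used. This is only a sketch: bpconf_get is a hypothetical helper name, and it assumes bp.conf uses plain "KEY = value" lines, as in typical installations.

```shell
# Hypothetical helper: print the value of the first matching bp.conf key.
# Usage: bpconf_get KEY [FILE]   (FILE defaults to the standard bp.conf path)
bpconf_get() {
    key="$1"
    file="${2:-/usr/openv/netbackup/bp.conf}"
    awk -F= -v k="$key" '
        { name = $1; gsub(/[ \t]/, "", name) }        # key without padding
        name == k {
            val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", val)          # trim the value
            print val
            exit
        }
    ' "$file"
}
```

Running bpconf_get EMMSERVER and comparing the result with the output of hostname shows quickly whether the entry points at the expected server.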
08-02-2020 11:08 PM
Hello
In fact, there is only one master server in my configuration, with no additional media server.
Here is the requested output.
[root@nbo ~]# cat /usr/openv/netbackup/bp.conf
SERVER = nbo
CLIENT_NAME = nbo
CONNECT_OPTIONS = localhost 1 0 2
USE_VXSS = PROHIBITED
EMMSERVER = nbo
HOST_CACHE_TTL = 3600
VXDBMS_NB_DATA = /usr/openv/db/data
CLIENT_CONNECT_TIMEOUT = 20000000
WEBSVC_GROUP = nbwebgrp
WEBSVC_USER = nbwebsvc
TELEMETRY_UPLOAD = YES
[root@nbo ~]# /usr/openv/netbackup/bin/admincmd/nbemmcmd -listhosts
NBEMMCMD, Version: 8.1.1
The following hosts were found:
server nbo
Command completed successfully.
08-03-2020 07:24 AM
Have you seen these lines in the Emm log?
Machine name nbo not found,33:ResourceEventMgr_i::resolveServer,1
Machine nbo could not be resolved;
Can you confirm that server networking is installed/enabled on this server?
What is in /etc/hosts on this server?
Output of 'ip link show' ?
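For the /etc/hosts question, a quick lookup along these lines might help. It is a rough sketch that only reads a hosts-format file; hosts_lookup is a hypothetical helper name, not a NetBackup or system tool.

```shell
# Hypothetical helper: print the IP of the first hosts-file line naming NAME.
# Usage: hosts_lookup NAME [FILE]   (FILE defaults to /etc/hosts)
hosts_lookup() {
    name="$1"
    file="${2:-/etc/hosts}"
    awk -v n="$name" '
        { sub(/#.*/, "") }                            # drop comments
        { for (i = 2; i <= NF; i++)
              if ($i == n) { print $1; found = 1; exit } }
        END { exit (found ? 0 : 1) }                  # non-zero if not found
    ' "$file"
}
```

Note that this checks the file only. Running getent hosts with the server name goes through the full resolver order configured in nsswitch.conf, which is closer to what the daemons actually see.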
08-03-2020 08:14 AM
Hello
Yes, I've noticed this message.
Here is the requested output.
[root@nbo ~]# more /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.1.10.100 nbo nbo.int.com
192.168.17.7 nbomgt
[root@nbo ~]# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 00:50:56:a9:fb:ae brd ff:ff:ff:ff:ff:ff
3: ens224: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 00:50:56:a9:af:22 brd ff:ff:ff:ff:ff:ff
However, I noticed that the prerequisites for installing NetBackup on RHEL include "server networking", but I couldn't work out what exactly that means (a specific package? A package group? Something else?).
08-03-2020 02:21 PM
Can you give the output of:
bpclntcmd -pn
bpclntcmd -self
bpclntcmd -ip 10.1.10.100
bpclntcmd -hn nbo
Also, what is the CA certificate status? You can get that from the command below:
Windows: <install_path>\netbackup\bin\nbcertcmd -displayCACertDetail -server mymaster
UNIX: <install_path>/netbackup/bin/nbcertcmd -displayCACertDetail -server mymaster
08-03-2020 02:54 PM
Please check below:
[root@nbo bin]# ./bpclntcmd -pn
expecting response from server nbo
nbo *NULL* 10.1.10.100 54586
[root@nbo bin]# ./bpclntcmd -self
yp_get_default_domain failed: (12) Local domain name not set
NIS does not seem to be running: (1) Request arguments bad
gethostname() returned: nbo
host nbo: nbo at 10.1.10.100
aliases: nbo 10.1.10.100
getfqdn(nbo) returned: nbo.int.com
[root@nbo bin]# ./bpclntcmd -ip 10.1.10.100
host 10.1.10.100: nbo at 10.1.10.100
aliases: nbo 10.1.10.100
[root@nbo bin]# ./bpclntcmd -hn nbo
host nbo: nbo at 10.1.10.100
aliases: nbo 10.1.10.100
[root@nbo bin]# ./nbcertcmd -displayCACertDetail -server nbo
CA Certificate received successfully from server nbo.
Subject Name : /CN=nbatd/OU=root@nbo.int.com/O=vx
Start Date : Aug 03 20:30:05 2020 GMT
Expiry Date : Jul 29 21:45:05 2040 GMT
SHA1 Fingerprint : 5F:1C:C6:EE:F6:54:F2:50:83:E5:0B:54:05:FC:E2:3C:C8:3F:DE:CD
CA Certificate State : Trusted
08-05-2020 05:46 AM
Hello
I've done two new installations to test the behavior on RHEL 7.8 and RHEL 6.8, and found that it behaves just fine on 6.8.
Please see below:
--------------- RHEL 6.8
[root@rhel68 ~]# /usr/openv/netbackup/bin/admincmd/nbemmcmd -getemmserver
NBEMMCMD, Version: 8.1.1
These hosts were found in this domain: rhel68.int.com
Checking with the host "rhel68.int.com"...
Server Type Host Version Host Name EMM Server
MASTER 8.1 rhel68.int.com rhel68.int.com
Command completed successfully.
--------------- RHEL 7.8
[root@rhel78 ~]# /usr/openv/netbackup/bin/admincmd/nbemmcmd -getemmserver
NBEMMCMD, Version: 8.1.1
The function returned the following failure status:
invalid host name (136)
Command did not complete successfully.
Any idea?
08-05-2020 06:33 AM
Have you checked if all NBU processes are running on the 7.x OS version?
Including PBX?
I just remembered something about PBX not automatically starting on RHEL 7.
Please see this article:
https://www.veritas.com/support/en_US/article.100031908
08-05-2020 07:46 AM
That was a good idea to check the running processes.
Found that the following 3 processes are not started on RHEL 7 compared to RHEL 6:
/usr/openv/netbackup/bin/nbpem
/usr/openv/netbackup/bin/nbproxy dblib nbpem
MM Processes
------------
root 13868 1 0 15:04 pts/0 00:00:00 vmd
PBX is starting correctly.
nbpem seems to be trying to start every minute but fails to do so. Below is a related log entry from nbpem:
2,51216,116,116,32,1596552132262,6688,140221780092672,0:,0:,0:,2,(363|i32:254|S35:Master server is not defined in EMM|)
08-05-2020 08:05 AM - edited 08-05-2020 08:14 AM
nbpem lists the same error as EMM log.
So, something very weird here.
What is the output of
nbemmcmd -getemmserver ?
Any clues in /usr/openv/db/log/server.log ?
**** One more question :
Why such a large timeout value in bp.conf?
CLIENT_CONNECT_TIMEOUT = 20000000
In my 21 years with NetBackup, I have NEVER seen a need for such a long timeout value.
I have seen excessive timeouts causing massive problems.
* EDIT *
I see you have already given us the output of nbemmcmd -getemmserver. Something is very wrong here:
[root@rhel78 ~]# /usr/openv/netbackup/bin/admincmd/nbemmcmd -getemmserver
NBEMMCMD, Version: 8.1.1
The function returned the following failure status:
invalid host name (136)
Command did not complete successfully.
It may be a good idea to log a call with Veritas Support.
08-05-2020 08:18 AM
nbemmcmd shows the same error as before:
[root@rhel78 log]# /usr/openv/netbackup/bin/admincmd/nbemmcmd -getemmserver
NBEMMCMD, Version: 8.1.1
The function returned the following failure status:
invalid host name (136)
Command did not complete successfully.
Nothing abnormal in server.log.
08-05-2020 08:24 AM
CLIENT_CONNECT_TIMEOUT = 20000000
This is just a test server for troubleshooting my EMM issue.
08-05-2020 12:30 PM
I see you are trying to install NetBackup 8.1.1 (released in 2018) on RHEL 7.8 (released in March 2020).
There may or may not be a compatibility issue, as it's not a major release; however, I would suggest trying 8.3 on the 7.8 release.
08-10-2020 12:13 PM
The shared memory configuration (kernel.shm*) in sysctl.conf was not suitable for the vmd process.
Leaving the default values fixed the issue.
08-11-2020 04:13 AM
If the issue was with shared memory, then I find it strange that nothing was logged in server.log....
I'm curious to know how you found the issue?
What were the incorrect entries, and what is working fine now?
Care to share with this community?
08-11-2020 06:22 AM
As reported by bpps, vmd was always down, so I tried running it manually and got the following output:
[root@rhel78 ~]# /usr/openv/volmgr/bin/vmd
malloc: Invalid argument (22)
A malloc error, so it certainly had something to do with a memory limitation. I checked my sysctl.conf and found the following:
kernel.shmmax = 687 19476736
Notice the space (introduced by mistake) splitting the number into two parts. This sets shmmax to 687, which is far too low. Correcting the number fixed the issue.
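A typo like this is easy to miss by eye, so a small check along the following lines may help. It is a minimal sketch, not an official tool: check_shm_sysctl is a hypothetical helper name, and it assumes that kernel.shmmax, kernel.shmall and kernel.shmmni should each hold a single unsigned integer. Other sysctl keys can legitimately take several space-separated values, so they are deliberately left alone.

```shell
# Hypothetical helper: flag kernel.shm* entries whose value is not one integer.
# Usage: check_shm_sysctl [FILE]   (FILE defaults to /etc/sysctl.conf)
check_shm_sysctl() {
    file="${1:-/etc/sysctl.conf}"
    awk '
        /^[ \t]*[#;]/ { next }                        # skip comment lines
        /^[ \t]*kernel\.shm(max|all|mni)[ \t]*=/ {
            key = $0
            sub(/[ \t]*=.*/, "", key)                 # keep text before "="
            sub(/^[ \t]+/, "", key)
            val = $0
            sub(/^[^=]*=[ \t]*/, "", val)             # keep text after "="
            sub(/[ \t]+$/, "", val)
            if (val !~ /^[0-9]+$/) {
                printf "%s:%d: suspicious value for %s: \"%s\"\n", FILENAME, NR, key, val
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }                    # non-zero if any flagged
    ' "$file"
}
```

Called with no argument it reads /etc/sysctl.conf; on the line shown above it would report the kernel.shmmax entry and return a non-zero status.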
08-11-2020 06:44 AM
Awesome! Thanks for the feedback, as this might help someone else with similar issues.