12-18-2012 07:19 AM
We have just successfully installed NB 7.5.0.3 on 18 clients running HP-UX 11i. However, despite following the same install procedure as before, the install on one particular client fails before it even gets to the NB portion:
x ./bp_servers, 33 bytes, 1 tape blocks
x ./bp_client_name, 23 bytes, 1 tape blocks
Installing PBX...
Please wait while installation is in progress...
Installation completed Successfully
Installation log located here: /var/tmp/installpbx-521-121712160139.log
The pbx_exchange daemon is not currently running. It
must be started before NetBackup installation can be continued.
Aborting ...
client_config_failed
When we try to start the PBX daemon by hand (/sbin/rc2.d/S450vxpbx_exchanged start), it simply exits with a return code of zero (i.e. success), but no PBX daemon (/opt/VRTSpbx/bin/pbx_exchange) is visible afterwards. I have looked in all the various log files, but there is nothing to indicate why the daemon fails to start. Can anyone help? Thanks.
12-18-2012 07:25 AM
Is PBX already installed?
Maybe for other Symantec/Veritas products such as Storage Foundation?
12-18-2012 07:41 AM
Thanks for the quick reply... Apart from the following output from "swlist" (as the result of the PBX pre-install part of the unsuccessful HPUX NB 7.5.0.3 client install) what else should I be looking for?
VRTSpbx 1.5.1.5 Symantec Private Branch Exchange
I can't see anything much, for instance on another HPUX client...
swlist | egrep -i "vrt|pbx|sym" yields...
SYMCnbclt 7.5.0.3 NetBackup Client
SYMCnbjava 7.5.0.3 NetBackup Java Console
SYMCnbjre 7.5.0.0 NetBackup Java JRE
VRTSpbx 1.5.1.5 Symantec Private Branch Exchange
And none of this occurs on the problem box.
I have just removed "VRTSpbx" using "swremove VRTSpbx" and attempted another re-install. But again it fails with the same error message as before and PBX will not start manually.
12-18-2012 08:04 AM
What happens if you run: /opt/VRTSpbx/bin/vxpbx_exchanged start ?
Is anything else listening on port 1556 ?
12-18-2012 08:36 AM
I had a look at that after Googling, and not as far as I can tell. For instance, if I telnet from another client to the bad host on port 1556, it fails (even with PBX allegedly installed), whilst another HPUX client seems OK...
root>telnet bad_client 1556
Trying...
telnet: Unable to connect to remote host: Connection refused
root>telnet good_client 1556
Trying...
Connected to maine.NTT1.lattice-group.com.
Escape character is '^]'.
telnet>
Is there another way of confirming that nothing is already using port 1556 (HP-UX 11i v1)?
12-18-2012 08:53 AM
I also tested the port 1556 hypothesis after a great deal of Googling on this.
Another HPUX client recently installed successfully tested as follows (as expected)...
root> telnet angel 1556
Trying... Connected to angel.
Escape character is '^]'.
telnet> Connection closed.
Whereas the client we have problems on got this...
root >telnet rogue 1556
Trying... telnet: Unable to connect to remote host: Connection refused
The above occurred with PBX both installed and de-installed. Is there another (better) way of testing for a pre-existing 1556 port utilisation on HPUX 11i?
12-18-2012 09:38 AM
Try: netstat -a | grep 1556
Also check /etc/services: grep 1556 /etc/services
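A plain grep for 1556 can match unrelated lines, and netstat -a may print a service name instead of the number when the port is listed in /etc/services. As a rough sketch only, the filter below reads "netstat -an" output and succeeds only when a socket in LISTEN state is bound to the given port. The function name port_is_listening is made up for illustration, and the script assumes the local address is the fourth column and the state is the last column, which may not hold for every netstat variant.

```shell
# Exit 0 if the netstat output on stdin shows a listener on port $1, else exit 1.
# Assumption: field 4 is the local address (e.g. "*.1556" or "0.0.0.0:1556")
# and the last field is the socket state.
port_is_listening() {
    awk -v port="$1" '
        $NF == "LISTEN" {
            # The port is whatever follows the last "." or ":" in the address.
            n = split($4, parts, /[.:]/)
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on the client (not run here):
#   netstat -an | port_is_listening 1556 && echo "port 1556 has a listener"
```

Matching on the state column avoids counting established connections to port 1556 on other hosts as a local listener.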
12-18-2012 09:36 PM
What does "ping localhost 64 5" show?
If I remember correctly, PBX cannot start when the host cannot resolve "localhost".
12-19-2012 02:05 AM
Thanks for the help, but no joy I'm afraid. There doesn't appear to be much sign of 1556 being locked by another service. I'd already tried a few of the tests and the extra ones suggested confirmed what I'd seen.
root:>netstat -a | grep 1556 # Nothing found
root:>grep 1556 /etc/services # Nothing found
root:>ping localhost 64 5 # Ping test appears Ok
PING localhost: 64 byte packets
64 bytes from 127.0.0.1: icmp_seq=0. time=0. ms
64 bytes from 127.0.0.1: icmp_seq=1. time=0. ms
64 bytes from 127.0.0.1: icmp_seq=2. time=0. ms
64 bytes from 127.0.0.1: icmp_seq=3. time=0. ms
64 bytes from 127.0.0.1: icmp_seq=4. time=0. ms
----localhost PING Statistics----
5 packets transmitted, 5 packets received, 0% packet loss
round-trip (ms) min/avg/max = 0/0/0
Is there any way of increasing the logging on the PBX to see what it might be doing?
root:> /sbin/rc2.d/S450vxpbx_exchanged start;echo $?;ps -ef|grep -i pbx;sleep 1;ps -ef|grep -i pbx
0
root 20444 18482 1 10:06:36 pts/1 0:00 grep -i pbx
root 20442 1 3 10:06:36 ? 0:00 /opt/VRTSpbx/bin/pbx_exchange
root 20447 18482 0 10:06:37 pts/1 0:00 grep -i pbx
As can be seen, the start simply returns zero (i.e. success), and although the daemon exists momentarily, it's not there a second later.
The logging is currently set as below...
root:>/opt/VRTSpbx/bin/pbxcfg -p
Auth User:0 : root
Secure Mode: false
Debug Level: 10
Port Number: 1556
PBX service is not cluster configured
The contents of the file "/opt/VRTSpbx/log/vrtspbx.log" after the re-install and attempted restart is...
****************
Preinstall log
****************
Wed Dec 19 09:51:05 UTC 2012
PBX process not found in running state.
Clean up the old "S65*pbx*" and "K65*pbx*"
****************
Postinstall log
****************
Wed Dec 19 09:51:11 UTC 2012
Make sym link to vxpbxcfg from pbxcfg.
Clean up the old "S65*pbx*" and "K65*pbx*"
Create start links in run levels 2-5.
Create stop links in run level 0 and 6.
Create link for man pages.
Create link for man pages.
Create a soft link for log dir.
Post install configuration
Removing the old entry50936
Making new entry50936
pbx specific logging configuration.
Parameter : DebugLevel Default value : 1 Found value :
Adding default value DebugLevel=1
Parameter : LogDirectory Default value : /var/log/VRTSpbx Found value :
Adding default value LogDirectory=/var/log/VRTSpbx
Parameter : L10nResourceDir Default value : /opt/VRTSpbx/resources Found value :
Adding default value L10nResourceDir=/opt/VRTSpbx/resources
Parameter : L10nResource Default value : VxPBX Found value :
Adding default value L10nResource=VxPBX
Parameter : MaxLogFileSizeKB Default value : 1024 Found value :
Adding default value MaxLogFileSizeKB=1024
Parameter : RolloverMode Default value : FileSize Found value :
Adding default value RolloverMode=FileSize
Parameter : NumberOfLogFiles Default value : 5 Found value :
Adding default value NumberOfLogFiles=5
Parameter : LogRecycle Default value : true Found value :
Adding default value LogRecycle=true
Parameter : LogFilePermissions Default value : 640 Found value : 640
updating VxPBX.cfg with older values
Wed Dec 19 09:51:20 UTC 2012
I have attached the PBX part of the client install to this post.
12-19-2012 02:10 AM
What does the PBX log show?
Try to start PBX, then run the following (it shows the last 10 minutes of the log, relative to when you run the command):
vxlogview -p 50936 -o 103 -d all -t 00:10:00 >somefile.txt
M
12-19-2012 02:23 AM
Can we take a step back - you are referring to 7.5.0.3 installation, which is a patch that is installed on top of 7.5.
Was 7.5 base software installed?
12-19-2012 03:18 AM
Nothing in the log I'm afraid...
root:>/sbin/rc2.d/S450vxpbx_exchanged start ; sleep 5
root:>/opt/VRTSpbx/bin/vxlogview -p 50936 -o 103 -d all -t 00:10:00
V-1-45 No log files found.
12-19-2012 03:31 AM
OK, let's try this from a different direction.
What is different about this client compared with the ones that are working, from an OS perspective?
For example, does it have different software or OS patches?
Martin
12-19-2012 07:34 AM
"Can we take a step back - you are referring to 7.5.0.3 installation, which is a patch that is installed on top of 7.5.
Was 7.5 base software installed?"
No, we installed 7.5.0.3 exactly as on all the other HPUX clients, and those all worked fine. The installs were done from a 7.5.0.3 Media Server as follows...
cd /usr/openv/netbackup/client/HP9000-800/HP-UX11.11
./ftp_to_client <server> phs01
If the basic install approach above was fundamentally flawed I'd guess that the 20 other HPUX clients we have installed and got working would have also had issues?
12-19-2012 11:21 PM
Ok, so this is a push-install from server.
Nothing wrong with that - we just don't see it too often in this forum.
As per Martin's post above - there is clearly something different about this client (OS level) or with the ftp'ed file that landed on the client.
12-20-2012 01:54 AM
I'm currently creating an Excel spreadsheet to identify the differences between the problem host and the 20 or so others with regard to s/w, patches etc.
To my mind it's got to be something "non-standard" on the problem host, as the ftp'd set of files is identical each time and the push install is as per the Symantec documentation. There's nothing I can see (?) in the install log that's failing from a PBX install viewpoint, and the same install methodology works on all the other hosts.
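As an alternative to building the comparison by hand, a short script can list what one host has that the other lacks. This is only a sketch: it assumes the "swlist" output of each host has been saved to a text file, that each product line starts with the product name followed by its revision, and that header lines start with "#". The function name missing_from_second and the file names in the usage comment are hypothetical.

```shell
# Print "product revision" pairs found in the first swlist capture ($1)
# but not in the second ($2).
missing_from_second() {
    awk '
        /^#/ || NF < 2 { next }               # skip headers and blank lines
        { key = $1 " " $2 }                   # product name plus revision
        NR == FNR { seen[key] = 1; next }     # first file read: second capture
        !(key in seen) { print key }          # second file read: first capture
    ' "$2" "$1" | sort -u
}

# Usage (not run here):
#   missing_from_second good_host.swlist bad_host.swlist
#   missing_from_second bad_host.swlist good_host.swlist
```

Running it in both directions shows products or revisions unique to either host.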
12-20-2012 02:28 AM
Maybe check the list of required operating system patches and updates in the NBU 7.5 Release Notes.
There is quite a long list for HP-UX 11.11...
12-20-2012 02:33 AM
You could try a 'tusc' on the pbx process - like this (apologies, I have given the example using truss)
12-20-2012 07:05 AM
I installed "tusc" on two clients which were working OK and the one which wasn't and the only differences in the output (give or take a few Hex / memory offset changes) are the following...
Good:
open("/opt/CA/CAlib/libvxPBXUtility.sl", O_RDONLY|O_LARGEFILE, 0) ............................................ ERR#2 ENOENT
open("/opt/CA/SharedComponents/lib/libvxPBXUtility.sl", O_RDONLY|O_LARGEFILE, 0) ............................. ERR#2 ENOENT
open("/opt/VRTSpbx/bin//libvxPBXUtility.sl", O_RDONLY|O_LARGEFILE, 0) ........................................ ERR#2 ENOENT
open("/opt/VRTSpbx/bin/../lib/libvxPBXUtility.sl", O_RDONLY|O_LARGEFILE, 0) .................................. = 4
Bad:
open("/opt/VRTSpbx/bin//libvxPBXUtility.sl", O_RDONLY|O_LARGEFILE, 0) ........................................ ERR#2 ENOENT
open("/opt/VRTSpbx/bin/../lib/libvxPBXUtility.sl", O_RDONLY|O_LARGEFILE, 0) .................................. = 4
The "Good" clients first try to open the library under two "/opt/CA/..." directories, both of which fail with ENOENT (file not found), before succeeding with a call to "/opt/VRTSpbx/bin/libvx/..." . I'm guessing this might be a red herring due to the s/w differences between the hosts... I have yet to create the s/w difference spreadsheet.
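Since the traces differ mostly in hex addresses and padding, one way to make the comparison less noisy is to mask those before running diff. A small sketch, assuming the tusc output from each host was saved to a file. The function name normalize_trace and the file names in the usage comment are hypothetical, and the sed patterns are based only on the trace lines quoted above.

```shell
# Mask hex addresses and collapse the dot padding so that two traces
# can be compared line by line with diff.
normalize_trace() {
    sed -e 's/0x[0-9a-fA-F][0-9a-fA-F]*/0xADDR/g' \
        -e 's/ \.\.\.\.*  */ ... /'
}

# Usage (not run here):
#   normalize_trace < good.tusc > good.norm
#   normalize_trace < bad.tusc  > bad.norm
#   diff good.norm bad.norm
```

The second pattern only matches a run of three or more dots with a space on each side, so path components such as "/../" are left alone.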
However, near the end of the tusc log files there is an indication that the kernel setup on the two hosts may not be identical.
Bad:
sysconf(_SC_THREAD_THREADS_MAX) .............................................................................. = 1024
sysconf(_SC_THREAD_STACK_MIN) ................................................................................ = 4096
mpctl(MPC_GETNUMSPUS_SYS, 0, 0) .............................................................................. = 3
Good:
sysconf(_SC_THREAD_THREADS_MAX) .............................................................................. = 3000
sysconf(_SC_THREAD_KEYS_MAX) ................................................................................. = 256
sysconf(_SC_THREAD_STACK_MIN) ................................................................................ = 4096
mpctl(MPC_GETNUMSPUS_SYS, 0, 0) .............................................................................. = 4
However another "Good" host has...
sysconf(_SC_THREAD_THREADS_MAX) .............................................................................. = 1024
sysconf(_SC_THREAD_KEYS_MAX) ................................................................................. = 256
sysconf(_SC_THREAD_STACK_MIN) ................................................................................ = 4096
mpctl(MPC_GETNUMSPUS_SYS, 0, 0) .............................................................................. = 4
So the significant difference remains...
sysconf(_SC_THREAD_KEYS_MAX) ................................................................................. = 256
mpctl(MPC_GETNUMSPUS_SYS, 0, 0) .............................................................................. = 4
The "_SC_THREAD_KEYS_MAX" call (returning "256") does not appear at all in the trace from the "Bad" host, and the value returned for "MPC_GETNUMSPUS_SYS" on the "Bad" host is one less than on the "Good" ones. Could it be a resource issue dictated by kernel parameters? From a kernel perspective, the only differences appear to be these:
Parameter         Good          Bad
max_thread_proc   3000          1024
maxdsiz           2063833536    1073741824
maxusers          512           256
nclist            8292          4196
timeslice         20            10
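A comparison like the one above can also be generated rather than compiled by eye. The sketch below assumes each host's kernel parameters have already been reduced to two-column "name value" text files (for example, trimmed down from kmtune output). The function name param_diff and the file names in the usage comment are hypothetical.

```shell
# Print "name good_value bad_value" for every parameter whose value differs
# between two "name value" files; "-" marks a parameter missing on one side.
param_diff() {
    awk '
        NR == FNR { good[$1] = $2; next }     # first file: reference host
        { bad[$1] = $2 }                      # second file: problem host
        END {
            for (name in good) {
                other = (name in bad) ? bad[name] : "-"
                if (good[name] != other) print name, good[name], other
            }
            for (name in bad)
                if (!(name in good)) print name, "-", bad[name]
        }
    ' "$1" "$2" | sort
}

# Usage (not run here):
#   param_diff good_host.params bad_host.params
```

Parameters with identical values on both hosts are omitted, so the output contains only the candidates worth investigating.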
We also tried a reboot this morning (out of desperation) as the "bad" host had been up for about one year, but that didn't fix it either.