Error During VCS configuration

Anish

13 years ago

Hi All,

Thanks for your responses. As suggested, I tried to configure VCS manually.

I performed the following steps:

On the dev node:

bash-3.00# cat /etc/llttab

set-node prod

set-cluster 101

link e1000g2 /dev/e1000g:2 - ether - -

link e1000g3 /dev/e1000g:3 - ether - -

bash-3.00# cat /etc/llthosts

1 dev

2 prod

bash-3.00# cat /etc/VRTSvcs/conf/config/main.cf

include "types.cf"

cluster rainbow (

)

system dev (

)

system prod (

)

bash-3.00# cat /etc/gabtab

/sbin/gabconfig -c -n2

bash-3.00# cat /etc/VRTSvcs/conf/sysname

prod

On the prod node:

bash-3.00# cat /etc/llttab

set-node dev

set-cluster 101

link e1000g2 /dev/e1000g:2 - ether - -

link e1000g3 /dev/e1000g:3 - ether - -

bash-3.00# cat /etc/llthosts

1 dev

2 prod

bash-3.00# cat /etc/gabtab

/sbin/gabconfig -c -n2

bash-3.00# cat /etc/VRTSvcs/conf/config/main.cf

include "types.cf"

I copied the types.cf file from /etc/VRTSvcs/conf to /etc/VRTSvcs/conf/config.

bash-3.00# cat /etc/VRTSvcs/conf/sysname

dev
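On each node, the set-node name in /etc/llttab, the name in /etc/VRTSvcs/conf/sysname, and an entry in /etc/llthosts should normally agree. A small check like the one below can confirm that before LLT is started. This is only a sketch: check_llt_names is a made-up helper, not a VCS command, and it assumes set-node holds a name (not a numeric ID) and that the files have the simple layout shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of VCS): compare the node name used by LLT
# with the VCS sysname and make sure llthosts lists it.
# Usage: check_llt_names [llttab] [llthosts] [sysname]
check_llt_names() {
    llttab=${1:-/etc/llttab}
    llthosts=${2:-/etc/llthosts}
    sysname=${3:-/etc/VRTSvcs/conf/sysname}

    # Name given on the "set-node" line of llttab
    llt_node=$(awk '$1 == "set-node" { print $2; exit }' "$llttab")
    # First non-empty line of the sysname file
    vcs_node=$(awk 'NF { print $1; exit }' "$sysname")

    if [ "$llt_node" != "$vcs_node" ]; then
        echo "MISMATCH: llttab set-node=$llt_node, sysname=$vcs_node"
        return 1
    fi
    # llthosts lines are assumed to look like "<id> <name>"
    if ! awk -v n="$llt_node" '$2 == n { found = 1 } END { exit !found }' "$llthosts"; then
        echo "MISSING: $llt_node not listed in $llthosts"
        return 1
    fi
    echo "OK: $llt_node"
}
```

Running check_llt_names with no arguments on each node uses the default paths. On Solaris 10 the default awk may reject -v, in which case nawk or /usr/xpg4/bin/awk can be substituted.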

After doing this, I tried to start LLT and GAB on both nodes using the commands

lltconfig -c

and

sh /etc/gabtab

but this was not successful.

So I tried to start their SMF services (I think VCS 6.0 has removed /etc/rc2.d/S70llt and /etc/rc2.d/S92gab):

svcadm enable svc:/system/llt:default

svcadm enable svc:/system/gab:default

But the services still went into maintenance. After analyzing the logs, I found the following error messages:

[ Feb 18 19:14:15 Executing start method ("/lib/svc/method/llt start") ]

This script is not allowed to start LLT. LLT_START is not 1

To fix this, I changed the values in the following file:

bash-3.00# cat /etc/default/llt

# This file is sourced :

# from /etc/init.d/llt for Solaris < 2.10

# from /lib/svc/method/llt for Solaris 2.10

# Set the two environment variables below as follows:

# 1 = start or stop llt

# 0 = do not start or stop llt

LLT_START=1    <-- by default it was set to 0

LLT_STOP=1     <-- by default it was set to 0

I did the same for GAB:

bash-3.00# cat /etc/default/gab

# This file is sourced :

# from /etc/init.d/gab for Solaris < 2.10

# from /lib/svc/method/gab for Solaris 2.10

# Set the two environment variables below as follows:

# 1 = start or stop gab

# 0 = do not start or stop gab

GAB_START=1    <-- by default it was set to 0

GAB_STOP=1     <-- by default it was set to 0

After that, both services were up and running on both nodes:

bash-3.00# svcs -a|grep llt

online 9:07:25 svc:/system/llt:default

bash-3.00# svcs -a|grep gab

online 9:07:28 svc:/system/gab:default

Then I tried to bring the VCS service online,

but again it went into maintenance due to the following error:

Feb 18 23:05:26 dev Had[510]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10614 Cluster UUID is not configured or it is empty, on system dev - VCS Stopping. Manually Restart VCS after configuring Cluster UUID.

To configure it, I ran the following command on both nodes:

/opt/VRTSvcs/bin/uuidconfig.pl -clus -configure

I also changed the following values in the /etc/default/vcs file:

VCS_START=1

VCS_STOP=1
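To double-check the start and stop flags in all three defaults files in one go, something like the following could be used. It is a sketch only: get_flag is a hypothetical helper, and it assumes each flag is a plain VAR=value line as in the files shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the value assigned to a variable in a defaults file.
# Comment lines are skipped because their first "="-separated field never
# equals the bare variable name. The last assignment wins.
# Usage: get_flag /etc/default/llt LLT_START
get_flag() {
    awk -F= -v k="$2" '$1 == k { v = $2 } END { print v }' "$1"
}

# Report the flags for LLT, GAB and VCS; files that do not exist are skipped.
for pair in llt:LLT gab:GAB vcs:VCS; do
    file=/etc/default/${pair%%:*}
    prefix=${pair##*:}
    [ -r "$file" ] || continue
    echo "$file: ${prefix}_START=$(get_flag "$file" "${prefix}_START")" \
         "${prefix}_STOP=$(get_flag "$file" "${prefix}_STOP")"
done
```

A value of 1 for every flag matches the settings described above. The script only reads the files and changes nothing.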

Now all my services are running, but I still have the following problem:

GAB is working properly on both nodes, but LLT is not communicating with the other node.

The output of gabconfig after starting the cluster with hastart on both nodes is as follows:

bash-3.00# hastart

bash-3.00# gabconfig -a

GAB Port Memberships

===============================================================

Port a gen 12f3f02 membership ;1

Port h gen 12f3f09 membership ;1

bash-3.00# uname -n

dev

bash-3.00# hastart

bash-3.00# hastatus -sum

-- SYSTEM STATE

-- System State Frozen

A dev UNKNOWN 0

A prod RUNNING 0

bash-3.00# gabconfig -a

GAB Port Memberships

===============================================================

Port a gen 12f3f02 membership ; 2

Port h gen 12f3f0b membership ; 2

bash-3.00# uname -n

prod
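The membership strings can be turned into node IDs with a small filter. This is a sketch under an assumption: it treats the string after "membership" as positional (the character at offset N is a digit when node N is a member), which is what the three samples in this post (";1", "; 2" and ";12" for nodes 1 and 2) suggest. gab_members is a made-up name, not a VCS tool.

```shell
#!/bin/sh
# Hypothetical helper: list the node IDs in a GAB port membership.
# Reads "gabconfig -a" style output on stdin.
# Usage: gabconfig -a | gab_members a
gab_members() {
    awk -v port="$1" '
        $1 == "Port" && $2 == port {
            s = $0
            sub(/^.*membership /, "", s)   # keep only the membership string
            out = ""
            for (i = 1; i <= length(s); i++) {
                # Assumption: a digit at offset i-1 means node i-1 is a member
                if (substr(s, i, 1) ~ /[0-9]/) {
                    if (out != "") out = out " "
                    out = out (i - 1)
                }
            }
            print out
        }'
}
```

If both nodes had joined the same membership, both systems would be expected to print the same list for port a and for port h.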

But the output of LLT on the dev node is:

Port h gen 12f3f09 membership ;1

bash-3.00# uname -n

dev

bash-3.00# lltstat -nl

LLT node information:

Node State Links

* 1 dev OPEN 2

LLT link information:

link 0 e1000g2 on etherfp hipri

mtu 1500, sap 0xcafe, broadcast FF:FF:FF:FF:FF:FF, addrlen 6

txpkts 3514 txbytes 211939

rxpkts 937 rxbytes 68662

latehb 0 badcksum 0 errors 0

link 1 e1000g3 on etherfp hipri

mtu 1500, sap 0xcafe, broadcast FF:FF:FF:FF:FF:FF, addrlen 6

txpkts 347 txbytes 24504

rxpkts 281 rxbytes 19328

latehb 0 badcksum 0 errors 0

And on prod:

bash-3.00# lltstat -nl

LLT node information:

Node State Links

* 2 prod OPEN 2

LLT link information:

link 0 e1000g2 on etherfp hipri

mtu 1500, sap 0xcafe, broadcast FF:FF:FF:FF:FF:FF, addrlen 6

txpkts 3390 txbytes 180168

rxpkts 713 rxbytes 52320

latehb 0 badcksum 0 errors 0

link 1 e1000g3 on etherfp hipri

mtu 1500, sap 0xcafe, broadcast FF:FF:FF:FF:FF:FF, addrlen 6

txpkts 444 txbytes 31270

rxpkts 257 rxbytes 15827

latehb 0 badcksum 0 errors 0

whereas the two nodes should see each other.
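To compare the node tables on both systems quickly, the node lines can be pulled out of the lltstat output. This is a rough sketch based only on the output format pasted above; llt_nodes is a hypothetical helper and other lltstat versions may format the table differently.

```shell
#!/bin/sh
# Hypothetical helper: print "id name state" for each node line in
# "lltstat -nl" style output read from stdin.
# Usage: lltstat -nl | llt_nodes
llt_nodes() {
    awk '{
        # The local node is marked with a leading "*"
        off = ($1 == "*") ? 1 : 0
        # A node line is: [*] <numeric id> <name> <STATE> <link count>
        if (NF == 4 + off && $(1 + off) ~ /^[0-9]+$/ && $(3 + off) ~ /^[A-Z]+$/)
            print $(1 + off), $(2 + off), $(3 + off)
    }'
}
```

With the output above, each node prints only its own entry, which is the symptom described here.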

Perhaps because of this, I am getting the following output from hastatus -sum on prod:

bash-3.00# hastatus -sum

-- SYSTEM STATE

-- System State Frozen

A dev UNKNOWN 0

A prod RUNNING 0

bash-3.00# uname -n

prod

And on the dev node the output is:

bash-3.00# hastatus -sum

-- SYSTEM STATE

-- System State Frozen

A dev RUNNING 0

One more observation: after starting the cluster, the main.cf file on the dev node was automatically modified to:

bash-3.00# cat /etc/VRTSvcs/conf/config/main.cf

include "types.cf"

cluster vcs (

)

system dev (

)

whereas previously only the include "types.cf" line was present there, and we had added the actual configuration on the prod node.

In the messages file I can see the following messages related to the LLT interfaces for the other node:

Feb 21 09:07:17 dev e1000g: [ID 801725 kern.info] NOTICE: pci8086,100f - e1000g[3] : link up, 1000 Mbps, full duplex

Feb 21 09:07:17 dev e1000g: [ID 801725 kern.info] NOTICE: pci8086,100f - e1000g[2] : link up, 1000 Mbps, full duplex

Feb 21 09:07:27 dev genunix: [ID 644314 kern.notice] GAB INFO V-15-1-20026 Port a[GAB_Control (refcount 2)] registration waiting for seed port membership

Feb 21 09:07:41 dev syslog[542]: [ID 702911 daemon.notice] VCS INFO V-16-1-11240 Command Server: running with security OFF

Feb 21 09:07:42 dev Had[497]: [ID 702911 daemon.notice] VCS NOTICE V-16-1-10619 'HAD' starting on: dev

Feb 21 09:07:42 dev Had[497]: [ID 702911 daemon.notice] VCS NOTICE V-16-1-10620 Waiting for local cluster configuration status

Feb 21 09:07:42 dev genunix: [ID 122464 kern.notice] LLT INFO V-14-1-10499 recvarpreq link 1 for node 2 addr change from 00:00:00:00:00:00 to 00:0C:29:E2:CE:CB

Feb 21 09:07:42 dev genunix: [ID 122464 kern.notice] LLT INFO V-14-1-10499 recvarpreq link 0 for node 2 addr change from 00:00:00:00:00:00 to 00:0C:29:E2:CE:D5

Feb 21 09:07:42 dev genunix: [ID 860062 kern.notice] LLT INFO V-14-1-10024 link 0 (e1000g2) node 2 active

Feb 21 09:07:44 dev genunix: [ID 860062 kern.notice] LLT INFO V-14-1-10024 link 1 (e1000g3) node 2 active

Feb 21 09:07:49 dev syslog[542]: [ID 702911 daemon.warning] WARNING V-365-1-1 This host is not entitled to run Veritas Storage Foundation/Veritas Cluster Server.

Feb 21 09:07:49 dev As set forth in the End User License Agreement (EULA) you must complete one of the two options set forth below. To comply with this condition of the EULA and stop logging of this message, you have 56 days to either:

Feb 21 09:07:49 dev - make this host managed by a Management Server (see http://go.symantec.com/sfhakeyless for details and free download), or

Feb 21 09:07:49 dev - add a valid license key matching the functionality in use on this host using the command 'vxlicinst' and validate using the command 'vxkeyless set NONE'.

Feb 21 09:07:49 dev genunix: [ID 272960 kern.notice] GAB INFO V-15-1-20036 Port a[GAB_Control (refcount 1)] gen 12f3f01 membership ;12

Feb 21 09:08:04 dev genunix: [ID 773945 kern.info] UltraDMA mode 2 selected

Feb 21 09:08:04 dev genunix: [ID 935449 kern.info] ATA DMA off: disabled. Control with "atapi-cd-dma-enabled" property

Feb 21 09:08:04 dev genunix: [ID 882269 kern.info] PIO mode 4 selected

Feb 21 09:08:04 dev genunix: [ID 935449 kern.info] ATA DMA off: disabled. Control with "atapi-cd-dma-enabled" property

Feb 21 09:08:04 dev genunix: [ID 882269 kern.info] PIO mode 4 selected

Feb 21 09:08:04 dev genunix: [ID 935449 kern.info] ATA DMA off: disabled. Control with "atapi-cd-dma-enabled" property

Feb 21 09:08:04 dev genunix: [ID 882269 kern.info] PIO mode 4 selected

Feb 21 09:08:04 dev genunix: [ID 935449 kern.info] ATA DMA off: disabled. Control with "atapi-cd-dma-enabled" property

Feb 21 09:08:04 dev genunix: [ID 882269 kern.info] PIO mode 4 selected

Feb 21 09:08:13 dev svc.startd[7]: [ID 122153 daemon.warning] svc:/application/stosreg:default: Method or service exit timed out. Killing contract 95.

Feb 21 09:08:13 dev svc.startd[7]: [ID 636263 daemon.warning] svc:/application/stosreg:default: Method "/lib/svc/method/svc-stosreg" failed due to signal KILL.

Feb 21 09:08:14 dev sendmail[584]: [ID 702911 mail.crit] My unqualified host name (dev) unknown; sleeping for retry

Feb 21 09:08:17 dev Had[497]: [ID 702911 daemon.notice] VCS NOTICE V-16-1-10625 Local cluster configuration valid

Feb 21 09:08:17 dev Had[497]: [ID 702911 daemon.notice] VCS NOTICE V-16-1-11034 Registering for cluster membership

Feb 21 09:08:17 dev Had[497]: [ID 702911 daemon.notice] VCS NOTICE V-16-1-11035 Waiting for cluster membership

Feb 21 09:08:22 dev genunix: [ID 272960 kern.notice] GAB INFO V-15-1-20036 Port h[GAB_USER_CLIENT (refcount 0)] gen 12f3f04 membership ;12

Feb 21 09:08:22 dev Had[497]: [ID 702911 daemon.notice] VCS INFO V-16-1-10077 Received new cluster membership

Feb 21 09:08:23 dev Had[497]: [ID 702911 daemon.notice] VCS NOTICE V-16-1-10086 System dev (Node '1') is in Regular Membership - Membership: 0x6

Feb 21 09:08:23 dev Had[497]: [ID 702911 daemon.notice] VCS NOTICE V-16-1-10086 System (Node '2') is in Regular Membership - Membership: 0x6

Feb 21 09:08:26 dev Had[497]: [ID 702911 daemon.notice] VCS NOTICE V-16-1-10073 Building from local configuration

Feb 21 09:08:26 dev genunix: [ID 577146 kern.notice] NOTICE: VXFEN INFO V-11-1-VxFEN unloaded

Feb 21 09:08:27 dev genunix: [ID 140958 kern.notice] LLT INFO V-14-1-10205 link 1 (e1000g3) node 2 in trouble

Feb 21 09:08:27 dev rootnex: [ID 349649 kern.info] xsvc0 at root

Feb 21 09:08:27 dev genunix: [ID 936769 kern.info] xsvc0 is /xsvc

Feb 21 09:08:31 dev pseudo: [ID 129642 kern.info] pseudo-device: devinfo0

Feb 21 09:08:31 dev genunix: [ID 936769 kern.info] devinfo0 is /pseudo/devinfo@0

Feb 21 09:08:31 dev unix: [ID 954099 kern.info] NOTICE: IRQ19 is being shared by drivers with different interrupt levels.

Feb 21 09:08:31 dev This may result in reduced system performance.

Feb 21 09:08:31 dev pci_pci: [ID 370704 kern.info] PCI-device: pci1274,1371@1, audioens0

Feb 21 09:08:31 dev genunix: [ID 936769 kern.info] audioens0 is /pci@0,0/pci15ad,790@11/pci1274,1371@1

Feb 21 09:08:33 dev genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 1 (e1000g3) node 2 inactive 8 sec (281)

Feb 21 09:08:34 dev genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 1 (e1000g3) node 2 inactive 9 sec (281)

Feb 21 09:08:35 dev genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 1 (e1000g3) node 2 inactive 10 sec (281)

Feb 21 09:08:36 dev genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 1 (e1000g3) node 2 inactive 11 sec (281)

Feb 21 09:08:37 dev genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 1 (e1000g3) node 2 inactive 12 sec (281)

Feb 21 09:08:38 dev genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 1 (e1000g3) node 2 inactive 13 sec (281)

Feb 21 09:08:39 dev genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 1 (e1000g3) node 2. 4 more to go.

Feb 21 09:08:39 dev genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 1 (e1000g3) node 2 inactive 14 sec (281)

Feb 21 09:08:39 dev genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 1 (e1000g3) node 2. 3 more to go.

Feb 21 09:08:40 dev genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 1 (e1000g3) node 2. 2 more to go.

Feb 21 09:08:40 dev genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 1 (e1000g3) node 2 inactive 15 sec (281)

Feb 21 09:08:40 dev genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 1 (e1000g3) node 2. 1 more to go.

Feb 21 09:08:41 dev genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 1 (e1000g3) node 2. 0 more to go.

Feb 21 09:08:41 dev genunix: [ID 205468 kern.notice] LLT INFO V-14-1-10509 link 1 (e1000g3) node 2 expired

Feb 21 09:08:41 dev Had[497]: [ID 702911 daemon.notice] VCS NOTICE V-16-1-10066 Entering RUNNING state

Feb 21 09:08:47 dev genunix: [ID 140958 kern.notice] LLT INFO V-14-1-10205 link 0 (e1000g2) node 2 in trouble

Feb 21 09:08:49 dev Had[497]: [ID 702911 daemon.notice] VCS NOTICE V-16-1-50311 VCS Engine: running with security OFF

Feb 21 09:08:54 dev genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 0 (e1000g2) node 2 inactive 8 sec (410)

Feb 21 09:08:55 dev genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 0 (e1000g2) node 2 inactive 10 sec (411)

Feb 21 09:08:56 dev genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 0 (e1000g2) node 2 inactive 11 sec (412)

Feb 21 09:08:51 dev Had[497]: [ID 702911 daemon.alert] VCS WARNING V-16-1-40184 HAD Self Check: Excessive delay in the HAD heartbeat to GAB

Feb 21 09:08:57 dev genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 0 (e1000g2) node 2 inactive 12 sec (412)

Feb 21 09:08:58 dev genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 0 (e1000g2) node 2 inactive 13 sec (413)

Feb 21 09:08:59 dev genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 0 (e1000g2) node 2 inactive 14 sec (413)
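The link state changes are easier to follow with the per-second "inactive" and hbreq lines filtered out. The sketch below keeps only the three message IDs seen in this log (V-14-1-10024 active, V-14-1-10205 in trouble, V-14-1-10509 expired). llt_link_events is a made-up helper, and the default path /var/adm/messages is an assumption about where syslog writes on this system.

```shell
#!/bin/sh
# Hypothetical helper: summarise LLT link state changes from a syslog file.
# Usage: llt_link_events [messages-file]
llt_link_events() {
    awk '
        { ev = "" }
        /LLT INFO V-14-1-10024/ { ev = "active" }
        /LLT INFO V-14-1-10205/ { ev = "in trouble" }
        /LLT INFO V-14-1-10509/ { ev = "expired" }
        ev != "" {
            l = ""; n = ""
            # Find "link <N> (<dev>)" and "node <N>" wherever they appear
            for (i = 1; i < NF; i++) {
                if ($i == "link") l = $(i + 1) " " $(i + 2)
                if ($i == "node") n = $(i + 1)
            }
            # Field 3 is the time of day in the syslog timestamp
            print $3, "link", l, "node", n, ev
        }
    ' "${1:-/var/adm/messages}"
}
```

On the log above this reduces each link to an active, in trouble, expired sequence, so the interval between the peer becoming active and the link expiring is easy to read off.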

The same messages are observed on the other node, whereas the output of dladm show-dev is as follows:

bash-3.00# uname -n

prod

bash-3.00# dladm show-dev

e1000g0 link: up speed: 1000 Mbps duplex: full

e1000g1 link: up speed: 1000 Mbps duplex: full

e1000g2 link: up speed: 1000 Mbps duplex: full----link used for llt

e1000g3 link: up speed: 1000 Mbps duplex: full---link used for llt

bash-3.00# uname -n

dev

bash-3.00# dladm show-dev

e1000g0 link: up speed: 1000 Mbps duplex: full

e1000g1 link: up speed: 1000 Mbps duplex: full

e1000g2 link: up speed: 1000 Mbps duplex: full----link used for llt

e1000g3 link: up speed: 1000 Mbps duplex: full----link used for llt

If anybody knows the solution to the above problem, please guide me. I think I am one step away from completing my cluster configuration. Thanks for your support.

Anish
