11-09-2010 08:01 AM
Hello everyone - I actually have two questions. The first centers around installing VCS in the Solaris 10 u8 global zone but only monitoring applications within a non-global zone (e.g. Apache). The zone root is installed on local storage and the application is also installed on local storage. We only want to have the Apache application shut down and started up on the other node. How would we do this? Every resource dependency example I see shows DG, Mount, etc.
The second question is: can VCS be installed within a sparse non-global zone? If so, how do we accomplish the same scenario?
thanks
Julie julie.lamothe@sita.aero
11-09-2010 10:02 PM
Hi
Please check the attached diagrams; your configuration should fit one of the two. If not, please let me know how the file systems used by the zones are mounted and where they are located (on a diskgroup, or simply on the O/S disks). If they are on the O/S disks and always available, you should be able to simply remove the DiskGroup and Mount resources.
12-10-2010 09:13 AM
Riaan -
I actually have the non-global sparse zone configured using zpools, so that takes the place of the Mount and DiskGroup resources. I have the application zfs filesystems set as datasets within the zone, with the zfs mountpoints set to /www and /app within the zone.
We decided to make it shared storage and fail the zone (with Apache) from one node to the other. However, when I now try to bring up the Apache resource, it states that it cannot find the ConfigFile /zones/z2/root/www/current/conf/httpd.conf.
I have the ContainerOpts attribute set to RunInContainer 1, PassCInfo 0.
Here is the zone config:
# zonecfg -z z2 info
zonename: z2
zonepath: /zones/z2
brand: native
autoboot: true
bootargs:
pool: UASLQYY001Q_zpool_vcs
limitpriv:
scheduling-class:
ip-type: shared
hostid:
[cpu-shares: 1000]
inherit-pkg-dir:
dir: /lib
inherit-pkg-dir:
dir: /platform
inherit-pkg-dir:
dir: /sbin
inherit-pkg-dir:
dir: /usr
fs:
dir: /export/home
special: /export/home/z2
raw not specified
type: lofs
options: [rw]
net:
address: 10.128.16.141/22
physical: bge0
defrouter not specified
dataset:
name: UASLQYY001Q_apppool_vcs/app
dataset:
name: UASLQYY001Q_apppool_vcs/www
rctl:
name: zone.cpu-shares
value: (priv=privileged,limit=1000,action=none)
And here is the Apache agent resource config:
#Resource Attribute System Value
sliq_apache Group global sliq_sg
sliq_apache Type global Apache
sliq_apache AutoStart global 1
sliq_apache Critical global 0
sliq_apache Enabled global 1
sliq_apache LastOnline global
sliq_apache MonitorOnly global 0
sliq_apache ResourceOwner global unknown
sliq_apache TriggerEvent global 0
sliq_apache ArgListValues UASLQYY001Q-GLZ ResLogLevel 1 INFO State 1 1 IState 1 0 httpdDir 1 /zones/z2/root/www/current/conf/ SharedObjDir 1 "" EnvFile 1 "" PidFile 1 /zones/z2/root/www/logs/httpd.pid HostName 1 "" Port 1 80 User 1 www SecondLevelMonitor 1 0 SecondLevelTimeout 1 30 ConfigFile 1 /zones/z2/root/www/current/conf/httpd.conf EnableSSL 1 0 DirectiveAfter 0 DirectiveBefore 0
sliq_apache ArgListValues UASLQYY002Q-GLZ ResLogLevel 1 INFO State 1 0 IState 1 0 httpdDir 1 /zones/z2/root/www/current/conf/ SharedObjDir 1 "" EnvFile 1 "" PidFile 1 /zones/z2/root/www/logs/httpd.pid HostName 1 "" Port 1 80 User 1 www SecondLevelMonitor 1 0 SecondLevelTimeout 1 30 ConfigFile 1 /zones/z2/root/www/current/conf/httpd.conf EnableSSL 1 0 DirectiveAfter 0 DirectiveBefore 0
sliq_apache ConfidenceLevel UASLQYY001Q-GLZ 0
sliq_apache ConfidenceLevel UASLQYY002Q-GLZ 0
sliq_apache ConfidenceMsg UASLQYY001Q-GLZ
sliq_apache ConfidenceMsg UASLQYY002Q-GLZ
sliq_apache Flags UASLQYY001Q-GLZ
sliq_apache Flags UASLQYY002Q-GLZ
sliq_apache IState UASLQYY001Q-GLZ not waiting
sliq_apache IState UASLQYY002Q-GLZ not waiting
sliq_apache MonitorMethod UASLQYY001Q-GLZ Traditional
sliq_apache MonitorMethod UASLQYY002Q-GLZ Traditional
sliq_apache Probed UASLQYY001Q-GLZ 1
sliq_apache Probed UASLQYY002Q-GLZ 1
sliq_apache Start UASLQYY001Q-GLZ 1
sliq_apache Start UASLQYY002Q-GLZ 0
sliq_apache State UASLQYY001Q-GLZ FAULTED
sliq_apache State UASLQYY002Q-GLZ OFFLINE
sliq_apache ComputeStats global 0
sliq_apache ConfigFile global /zones/z2/root/www/current/conf/httpd.conf
sliq_apache ContainerInfo global Type Zone Name z2 Enabled 1
sliq_apache ContainerOpts global RunInContainer 1 PassCInfo 0
sliq_apache DirectiveAfter global
sliq_apache DirectiveBefore global
sliq_apache EnableSSL global 0
sliq_apache EnvFile global
sliq_apache HostName global
sliq_apache PidFile global /zones/z2/root/www/logs/httpd.pid
sliq_apache Port global 80
sliq_apache ResLogLevel global INFO
sliq_apache ResourceInfo global State Stale Msg TS
sliq_apache SecondLevelMonitor global 0
sliq_apache SecondLevelTimeout global 30
sliq_apache SharedObjDir global
sliq_apache User global www
sliq_apache httpdDir global /zones/z2/root/www/current/conf/
sliq_apache MonitorTimeStats UASLQYY001Q-GLZ Avg 0 TS
sliq_apache MonitorTimeStats UASLQYY002Q-GLZ Avg 0 TS
Any help would be greatly appreciated.
thanks
Julie
12-11-2010 08:37 PM
Hi Julie,
Ok so I'm not an expert on zones but you need to look at each component.
The first thing I want to know about is the storage. You said you've decided to make it shared storage, and you're using ZFS. So my question is: how do you intend to fail over/move the data from one node to the other? From what I know, we can't do that with VCS. Usually a web application can be configured with local disk which the cluster would mount/unmount as required.
So assuming you've got some method to do this, the next thing I want to know is whether the cluster can start your zone (resource), and whether all the mounts required to start Apache are present.
Can you also post your main.cf please.
12-13-2010 04:16 AM
Julie,
What version (and patch level) of VCS are you using? (eg: VCS 5.0 MP3? VCS 5.1?)
Although most dependency examples show DiskGroup and Mount dependencies rather than Zpool, you should be able to configure dependencies using the same logic, provided your VCS version has the Zpool bundled agent. That is, it sounds like you need to configure a Zpool (and possibly also a Mount) resource, and configure the dependencies with the Apache resource accordingly.
eg: VCS 5.0 MP3 Bundled Agents Reference Guide (Solaris) -> Storage Agents -> Zpool Agent
http://sfdoccentral.symantec.com/sf/5.0MP3/solaris/html/vcs_bundled_agents/ch_sol_storage_agents47.html
Note the following limitations for Zpool agent:
----------
Limitations
The agent does not support the use of logical volumes in ZFS. If ZFS logical volumes are in use in the pool, the pool cannot be exported, even with the -f option. Sun does not recommend the use of logical volumes in ZFS due to performance and reliability issues.
----------
Dependencies example:
http://sfdoccentral.symantec.com/sf/5.0MP3/solaris/html/vcs_bundled_agents/ch_sol_storage_agents50.html
From same document: Mount agent notes -> ZFS file system and pool creation example
http://sfdoccentral.symantec.com/sf/5.0MP3/solaris/html/vcs_bundled_agents/ch_sol_storage_agents45.html#1184901
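To make the dependency logic concrete, here is a rough Python sketch (not VCS code) that models a chain where Apache requires the zone and the zone requires the zpool, and works out the order in which the resources would have to come online. Only sliq_apache is a name from this thread; sliq_zone and sliq_zpool are hypothetical resource names, and the exact chain in your cluster may differ (for example if you add a Mount resource).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Parent resource -> set of child resources it requires.
# sliq_zone and sliq_zpool are hypothetical names used for illustration.
REQUIRES = {
    "sliq_apache": {"sliq_zone"},
    "sliq_zone": {"sliq_zpool"},
}


def online_order(requires):
    """Children must be online before their parents: zpool, zone, Apache."""
    return list(TopologicalSorter(requires).static_order())


def offline_order(requires):
    """Taking resources offline walks the same chain in reverse."""
    return list(reversed(online_order(requires)))


print(online_order(REQUIRES))   # ['sliq_zpool', 'sliq_zone', 'sliq_apache']
print(offline_order(REQUIRES))  # ['sliq_apache', 'sliq_zone', 'sliq_zpool']
```

The same parent/child pairs are what you would express as resource links in the service group.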
Hope that helps,
Grace
12-14-2010 05:47 AM
Sheesh,
I seem to be out of date with the world. Can you let us know if you have the Zpool agent available in your version? If you do, you'd need to add it as per the dependency diagram listed, and possibly add a Mount resource as well, depending on who you want to be responsible for mounting /app and /www (VCS or the zone).
12-14-2010 06:24 AM
Hi Riaan -
I do have the zpool configured and now all is working as far as failing over the zone, zpool and Apache (within the zone).
Since I am using datasets for the separate zpool/zfs filesystems configured within the zone, I had to point Apache to the filesystem as it resides within the zone (/www/current/etc) as opposed to how it is physically mounted in the global zone as /zones/z2/root/www/etc....
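For anyone hitting the same "cannot find ConfigFile" error, the path difference can be illustrated with a small Python sketch. It assumes the zone root sits at <zonepath>/root (here /zones/z2/root, as in the zonecfg output earlier in the thread); the function name to_zone_path is made up for this example and is not part of VCS.

```python
from pathlib import PurePosixPath


def to_zone_path(global_path, zonepath="/zones/z2"):
    """Map a path as seen from the global zone to the path seen inside the zone.

    Assumes the zone's root filesystem is <zonepath>/root.
    """
    zone_root = PurePosixPath(zonepath) / "root"
    try:
        relative = PurePosixPath(global_path).relative_to(zone_root)
    except ValueError:
        raise ValueError(f"{global_path} is not under {zone_root}") from None
    return str(PurePosixPath("/") / relative)


print(to_zone_path("/zones/z2/root/www/current/conf/httpd.conf"))
# /www/current/conf/httpd.conf
print(to_zone_path("/zones/z2/root/www/logs/httpd.pid"))
# /www/logs/httpd.pid
```

With RunInContainer set to 1 the agent runs inside the zone, so attributes such as ConfigFile, PidFile and httpdDir would take the in-zone form of the path.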
The only problem I seem to have now is that the bundled Apache agent is not able to effectively monitor or restart the Apache procs if they die (killed forcefully by me as a test). It reports that it is cleaning, but never actually restarts the procs.
Any ideas?
thanks
Julie
12-14-2010 07:24 AM
Hi,
Well, that's good news so far.
Does it fail over if you kill them? Have you set the RestartLimit attribute to something other than 0?
Can you post your engineA and Apache logs?
12-14-2010 10:53 AM
I turned the debug level up to:
Apache LogDbg DBG_1 DBG_2 DBG_3 DBG_4 DBG_AGINFO DBG_AGDEBUG
Here is the Apache_A.log:
2010/12/14 18:39:12 VCS DBG_AGDEBUG V-16-50-0 Thread(2) Adding timer for Agent with tmo 522366
VCSAgTimer.C:_add[724]
2010/12/14 18:39:12 VCS DBG_AGDEBUG V-16-50-0 Thread(2) Timer id is 1
VCSAgTimer.C:_add[740]
2010/12/14 18:39:55 VCS DBG_AGDEBUG V-16-50-0 Thread(2) name(Agent) op(1606)
VCSAgTimer.C:check_timers[298]
2010/12/14 18:39:55 VCS DBG_AGDEBUG V-16-50-0 Thread(2) Sending I am alive message
VCSAgTimer.C:check_timers[479]
2010/12/14 18:39:55 VCS DBG_AGDEBUG V-16-50-0 Thread(2) Sending IAA at 522366 seconds
VCSAgNotifier.C:send_iamalive[910]
2010/12/14 18:39:55 VCS DBG_AGDEBUG V-16-50-0 Thread(2) Resetting periodic timer for resource Agent op 1606 to expire at 522409
VCSAgTimer.C:_reset_periodic_timer[1000]
2010/12/14 18:39:55 VCS DBG_AGDEBUG V-16-50-0 Thread(2) Adding timer for Agent with tmo 522409
VCSAgTimer.C:_add[724]
2010/12/14 18:39:55 VCS DBG_AGDEBUG V-16-50-0 Thread(2) Timer id is 1
VCSAgTimer.C:_add[740]
2010/12/14 18:40:06 VCS DBG_AGDEBUG V-16-50-0 Thread(2) name(sliq_apache) op(1619)
VCSAgTimer.C:check_timers[298]
2010/12/14 18:40:06 VCS DBG_AGDEBUG V-16-50-0 Thread(2) Resetting periodic timer for resource sliq_apache op 1619 to expire at 522677
VCSAgTimer.C:_reset_periodic_timer[1000]
2010/12/14 18:40:06 VCS DBG_AGDEBUG V-16-50-0 Thread(2) Adding timer for sliq_apache with tmo 522677
VCSAgTimer.C:_add[724]
2010/12/14 18:40:06 VCS DBG_AGDEBUG V-16-50-0 Thread(2) Timer id is 16
VCSAgTimer.C:_add[740]
2010/12/14 18:40:06 VCS DBG_AGDEBUG V-16-50-0 Thread(2) Appending command minor code 1607 for resource sliq_apache
VCSAgRes.C:append_cmd[466]
2010/12/14 18:40:06 VCS DBG_AGDEBUG V-16-50-0 Thread(2) Scheduled resource sliq_apache
VCSAgSched.C:put_req[173]
2010/12/14 18:40:06 VCS DBG_AGDEBUG V-16-50-0 Thread(4) Picked Res(sliq_apache) from Scheduler
VCSAgSched.C:_dequeue[64]
2010/12/14 18:40:06 VCS DBG_AGDEBUG V-16-50-0 Thread(4) Resource (sliq_apache) received cmd minor code (MSG_AGI_MONITOR_TIMER)
VCSAgRes.C:process_cmd[5594]
2010/12/14 18:40:06 VCS DBG_AGDEBUG V-16-50-0 Thread(4) Res(sliq_apache) - incremented _monitor_count to (1)
VCSAgRes.C:inc_and_compare_monitor_count[10404]
2010/12/14 18:40:06 VCS DBG_AGDEBUG V-16-50-0 Thread(4) Res(sliq_apache) - _leveltwo_monitor_freq : 1
VCSAgRes.C:inc_and_compare_monitor_count[10436]
2010/12/14 18:40:06 VCS DBG_AGDEBUG V-16-50-0 Thread(4) Res(sliq_apache) - _monitor_levels : 3
VCSAgRes.C:inc_and_compare_monitor_count[10449]
2010/12/14 18:40:06 VCS DBG_AGDEBUG V-16-50-0 Thread(4) _monitor_count reached _monitor_count_overflow; re-wound _monitor_count to 0
VCSAgRes.C:inc_and_compare_monitor_count[10456]
2010/12/14 18:40:06 VCS DBG_AGDEBUG V-16-50-0 Thread(4) Res(sliq_apache) - _monitor_count is (0), _monitor_count_overflow is (1)
VCSAgRes.C:inc_and_compare_monitor_count[10460]
2010/12/14 18:40:06 VCS DBG_AGINFO V-16-50-0 Thread(4) Resource sliq_apache transitioning from Offline to Monitoring
VCSAgRes.C:internal_state[4901]
2010/12/14 18:40:06 VCS DBG_AGINFO V-16-50-0 Thread(4) The values of Container Object attributes are given below
VCSAgRes.C:get_container_object[2308]
2010/12/14 18:40:06 VCS DBG_AGINFO V-16-50-0 Thread(4) The RIC for the Resource is 1 and PassCInfo is 0
VCSAgRes.C:get_container_object[2361]
2010/12/14 18:40:06 VCS DBG_AGINFO V-16-50-0 Thread(4) Container created with Name z2, Type Zone and Enabled 1
VCSAgRes.C:get_container_object[2376]
2010/12/14 18:40:06 VCS DBG_AGINFO V-16-50-0 Thread(4) Creating object of Zone container
VCSAgContainer.C:create_container[222]
2010/12/14 18:40:06 VCS DBG_AGINFO V-16-50-0 Thread(4) The values of ArgList attributes are given below
VCSAgRes.C:call_entry_point[1129]
2010/12/14 18:40:06 VCS DBG_AGINFO V-16-50-0 Thread(4) arg[0] is (ResLogLevel)
VCSAgRes.C:call_entry_point[1140]
2010/12/14 18:40:06 VCS DBG_AGINFO V-16-50-0 Thread(4) arg[1] is (1)
VCSAgRes.C:call_entry_point[1140]
2010/12/14 18:40:06 VCS DBG_AGINFO V-16-50-0 Thread(4) arg[2] is (TRACE)
VCSAgRes.C:call_entry_point[1140]
2010/12/14 18:40:06 VCS DBG_AGINFO V-16-50-0 Thread(4) arg[3] is (State)
VCSAgRes.C:call_entry_point[1140]
2010/12/14 18:40:06 VCS DBG_AGINFO V-16-50-0 Thread(4) arg[4] is (1)
VCSAgRes.C:call_entry_point[1140]
2010/12/14 18:40:06 VCS DBG_AGINFO V-16-50-0 Thread(4) arg[5] is (1)
VCSAgRes.C:call_entry_point[1140]
Here is output from engine_A.log:
2010/12/14 18:29:54 VCS ERROR V-16-2-13067 (UASLQYY001Q-GLZ) Agent is calling clean for resource(sliq_apache) because the resource became OFFLINE unexpectedly, on its own.
2010/12/14 18:29:54 VCS NOTICE V-16-10061-20143 (UASLQYY001Q-GLZ) Apache:sliq_apache:clean:VCSagentFW:SetupLogging:[clean] Entered by resource instance [sliq_apache] with clean reason [4][Unexpected Offline]
2010/12/14 18:30:05 VCS INFO V-16-2-13068 (UASLQYY001Q-GLZ) Resource(sliq_apache) - clean completed successfully.
2010/12/14 18:30:06 VCS INFO V-16-1-10307 Resource sliq_apache (Owner: unknown, Group: sliq_sg) is offline on UASLQYY001Q-GLZ (Not initiated by VCS)
I have just set the RestartLimit to 2.
12-14-2010 10:58 AM
That was it!!!!! Changed the RestartLimit to 2; restarted the apache process - then killed the two httpd procs from the global and it restarted!
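In case it helps someone else, here is a deliberately simplified Python model of why RestartLimit made the difference. It is only a sketch of the behaviour seen in the logs above (clean is called, then the resource is either restarted or left offline), not the agent framework's actual logic; it ignores ConfInterval and failover policy, and the function names are made up for illustration.

```python
from typing import List, Tuple


def respond_to_unexpected_offline(restart_limit: int, restarts_done: int) -> Tuple[str, int]:
    """Simplified model: clean always runs; a restart is attempted only
    while fewer than restart_limit restarts have been used."""
    if restarts_done < restart_limit:
        return "clean+restart", restarts_done + 1
    return "clean+fault", restarts_done


def simulate(restart_limit: int, kills: int) -> List[str]:
    """Kill the process `kills` times and record what the modelled agent does."""
    actions = []
    restarts_done = 0
    for _ in range(kills):
        action, restarts_done = respond_to_unexpected_offline(restart_limit, restarts_done)
        actions.append(action)
        if action == "clean+fault":
            break  # resource stays down once it is declared faulted
    return actions


print(simulate(restart_limit=0, kills=1))  # ['clean+fault']
print(simulate(restart_limit=2, kills=3))  # ['clean+restart', 'clean+restart', 'clean+fault']
```

With the limit at 0, the first unexpected offline goes straight to clean with no restart attempt, which matches what I was seeing before the change.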
thanks for all your help!
12-15-2010 07:48 AM
Glad it's working :)