Re: Creating Apache service group for VCS 5.0

Bashir_Gas · ‎02-15-2007

Hi,

First of all, we thank Gene Henriksen for helping us break the first barrier.
Now we have VCS 5.0 running on two Sun systems. We have not had any course on this,
and our project is about implementing application and database HA.

I checked all the documentation and could not find an Apache agent for 5.0.

Could someone give us a hint (maybe a nice tutorial) on how to do this?
I have prepared a virtual address for that service group.

Thanks
Bashir

Gene_Henriksen · ‎02-15-2007

Did you look at the Bundled Agents Reference Guide (BARG)? It is definitely in there in the version 5.0 we use in the classrooms. Also, you can run "haagent -list" to see all the agents.

Page 94 of the BARG shows the Apache agent. Page 100 shows a sample configuration with a DiskGroup, Mount, IP and Apache as you would see in the main.cf. The BARG is your best resource on all the resource types. You can do it with an NFS mount.

Look in /opt/VRTS/docs for vcs_bundled_agents.pdf. The docs should all be there if you chose to install all the optional packages. The docs are in the VRTSvcsdc package, in case you didn't install it.

Where are you taking this course?

Bashir_Gas · ‎02-15-2007

We would be happy if we had a course about this.

We are graduating students (BS) who must do a project with master's students.
The people responsible for the projects presented this one to us, so we jumped
into it without any previous knowledge or experience. I guess you could figure that out
already from my beginner questions!

So, I will install Apache on both systems and put the "htdocs" on a shared disk (in my case an NFS mount).
Is that correct? Or do you have a better way? Maybe even put the configuration file in the same place as "htdocs"?

Would I ignore the DiskGroup block in the sample configuration and adjust the Apache_mnt block for NFS?

How do I know that Apache will be started when a failure occurs in the cluster? I cannot see anything specifying that in the sample configuration. I am using Solaris 10, which means dealing with FMRI commands in order to start and stop daemons.

Thanks
Bashir

Gene_Henriksen · ‎02-15-2007

You said you were students, so I assumed you were in a course.

You jumped in without any knowledge or experience. That sounds like me. That is a very good way to learn. It is painful, but you remember painful lessons.

When you set up a failover service group and make resources Critical, you have told VCS that if a critical resource fails, it should attempt to bring up the service group on another system in the SystemList. Nothing in the resource attributes specifies failover; they simply give VCS the information necessary to online, monitor and offline the application.

If you are using static scripts for the web server, not data that will be changing constantly, you can put the scripts on each server, perhaps with the binaries. This is what VCS calls a "shared nothing" service group. When you fail over, the service group would only have the IP, perhaps a mount, and the Apache resource.

As far as it being Solaris 10, it doesn't matter. VCS 5.0 was written for Sol 10. Look in /opt/VRTSvcs/bin/Apache. There you will see the online script to start the Apache instance. DO NOT MODIFY THESE SCRIPTS. You can read thru them, they are in Perl, to see how it works.

Bashir_Gas · ‎02-15-2007

You have answers to all my questions so I dare to keep asking you more.

In the sample configuration you see "MountPoint -> /apache", and as you already know,
the Apache installation files in Solaris 10 are spread over different directories on different slices.

About Apache
You have the binaries (/usr/apache2/bin), configuration files (/etc/apache2), HTML documents (/var/apache2/htdocs), etc. on both sysA and sysB.

I set up an NFS share (on sysA) and mounted it on both systems as /apache. My plan for this is to present the same HTML documents regardless of which system is communicating with clients.

<-- pasted in from vcs_bundled_agents -->
Verify that the Apache server configuration files are identical on all cluster systems
<-- end -->
How can I accomplish this with the existing installation and at the same time reflect it in the parameter values in the main.cf for ApacheG1 below:

group ApacheG1(
SystemList = { sysA = 0, sysB = 1 }
)

Apache httpd_server (
Critical = 0
httpdDir = "/apache/bin"
HostName = www <-- Virtual hostname or Apache??
Port = 80
User = nobody
SecondLevelMonitor = 1
ConfigFile = "/apache/conf/httpd.conf"
)

// I'll remove this block
DiskGroup Apache_dg (
Critical = 0
DiskGroup = apc1
)

IP Apache_ip (
Critical = 0
Device = dmfe0:2 <-- I created this with ifconfig only on sysA
Address = "172.27.238.111"
NetMask = "255.255.255.128"
)

Mount Apache_mnt (
Critical = 0
MountPoint = "/apache"
BlockDevice = "" <-- what should I put here?
FSType = nfs
FsckOpt = "-y"
)

Apache_mnt requires Apache_dg <-- Replacing this?
httpd_server requires Apache_mnt
httpd_server requires Apache_ip

I WOULD RATHER SEE YOUR SOLUTION. :(

Please explain to me like I was 2 years old, because if I get this right then it will be easy for me
setting up other resources like Oracle in a simulated shared disk via nfs.

Thanks
Bash

Gene_Henriksen · ‎02-15-2007

1) Don't cross-mount with NFS. If the server that is sharing the NFS mount (sysA) dies, then sysB loses the ability to work. You should use the local directories, and they should be identical on both systems.

2) You shouldn't bring up the virtual IP yourself. VCS does that. Also, the device is dmfe0, not dmfe0:2. dmfe0:2 is an alias that will change whenever there are more or fewer IPs on the NIC. VCS uses the ifconfig addif function to assign the next available alias. It will not work with dmfe0:2; you will get an unknown status on the device.

3) The BARG describes the httpd directory attribute as follows:
Full path of the directory to the httpd binary file. Type and dimension: string-scalar. Example: "/apache/server1/bin"
So I assume you would use /usr/apache2/bin.

4) Unless you want to give Apache the user privileges of "nobody", leave User blank; it defaults to root. Does nobody have the right to run Apache?

5) You should be using local files, so you don't need a Mount or DiskGroup.

6) The HostName is the virtual name that Apache will use for the service. This name will need to be defined in the DNS server for anyone to find it. Talk to the network admins.

The only problem with local files is that someone will have to update both systems when they change.
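
Putting the six points above together, the service group might look roughly like the following main.cf fragment. This is only a sketch assembled from the advice in this post, not a tested configuration. The paths come from the Solaris 10 locations mentioned earlier in the thread; the file name "/etc/apache2/httpd.conf" is an assumption (the thread only names the /etc/apache2 directory). The Mount and DiskGroup resources and the User attribute are left out, and Critical is not set, so each resource keeps the default of Critical = 1.

```text
group ApacheG1 (
    SystemList = { sysA = 0, sysB = 1 }
    )

    // Local Apache installation, identical on both systems
    Apache httpd_server (
        httpdDir = "/usr/apache2/bin"
        HostName = www
        Port = 80
        SecondLevelMonitor = 1
        // Assumed file name under /etc/apache2
        ConfigFile = "/etc/apache2/httpd.conf"
        )

    // Base device name, not an alias such as dmfe0:2;
    // VCS plumbs the virtual address itself
    IP Apache_ip (
        Device = dmfe0
        Address = "172.27.238.111"
        NetMask = "255.255.255.128"
        )

    httpd_server requires Apache_ip
```

With no Mount or DiskGroup resource, the only remaining dependency is Apache on the IP, and "www" must still resolve to 172.27.238.111 in DNS.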

Bashir_Gas · ‎02-18-2007

I can now run Apache as a failover service group.
I killed the httpd daemon on sysA manually; VCS then detected the fault and started Apache on sysB.
If I also kill httpd on sysB, then both systems are faulted?

How can I make sure that, even though the previous system was faulted, VCS clears the faulted system and starts the group again on any system in that group? Or do I have to change from failover to parallel?

Thanks for your help

// Bash

Hywel_Mallett · ‎02-19-2007

VCS's default operation is that faults must be cleared manually.
The problem with what you are proposing is that if a fault is automatically cleared in VCS, who's to say that the fault has actually disappeared?
If faults were cleared in VCS automatically, but they were actually still present, you could end up with a service group faulting continually between two systems.

Changing from a failover service group to parallel is a different matter. Parallel service groups allow the service to be online on multiple nodes at the same time.

Gene_Henriksen · ‎02-19-2007

Hywel is correct. VCS is designed so that as long as a critical resource is faulted on a system, VCS will not restart the service group on that system until the admin clears the fault. This is to prevent failing back and forth continually.

With static information for the web server to hand out, as long as you are keeping all the data local, you could use a parallel service group. This brings up another problem with IP addresses: you cannot bring up the same IP twice. If you brought up different IPs on each system, then you would need some mechanism for directing web requests to alternate servers.

Another way to look at this is to set the RestartLimit for the Apache resource type to more than 0. If you set it to 1, then if Apache fails, the agent will restart it and log the event in the engine log. If the resource then stays online for the number of seconds specified by ConfInterval (600 seconds by default), the internal counter is reset to 0, and if the resource goes offline again it will be restarted on the same system. This could go on for weeks without a failover.

Gene_Henriksen · ‎02-19-2007

Bash, you never mentioned a NIC resource, but you should have one and you should link the IP to it (IP is dependent on the NIC). If the NIC resource were to fail, it would cause failover. If you do not have a NIC resource and the NIC fails, the IP will stay up.

The NIC agent will broadcast a ping and watch the packet count before and after the ping. If the packet count goes up, then the NIC is working.

The IP agent will check the output of ifconfig. If the IP address is there, the IP is working.

NIC is a persistent resource (no online or offline). If the NIC fails (you pull the cable from the server) and you then put the cable back in place, the next time the NIC agent monitors the NIC and finds it functional, it will clear the fault. This is only true with persistent resources.
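
As a rough illustration of the NIC advice above, the following fragment could be added to the ApacheG1 group in main.cf. It is a sketch only: the resource name Apache_nic is made up for this example, and the IP resource name Apache_ip and the device dmfe0 are taken from earlier in the thread.

```text
    // Persistent resource: monitored only, never onlined or offlined
    NIC Apache_nic (
        Device = dmfe0
        )

    // The IP depends on the NIC, so a NIC fault faults the group
    Apache_ip requires Apache_nic
```

With this dependency in place, the chain reads Apache requires IP requires NIC, so pulling the cable on the active system should trigger a failover rather than leaving the virtual IP up on a dead interface.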

Gene_Henriksen · ‎02-22-2007

Bash,
Did you get it working?

Bashir_Gas · ‎02-25-2007

I was sick and unable to post a reply to your input.

You are right about clearing faults.
I have to clear faults with the "hagrp" command or do it from the management console.

Yes, the failover works.

I consider this case closed.

Thanks for your help

Bashir
