startup/shutdown of Oracle inside non-global zone, Solaris 10, VCS 5.1

Rowley
Level 3

Hi there,

I'm looking for advice regarding configuring automated startup/shutdown of an Oracle 11.2 database within a zone on Solaris. The file systems for the Oracle db are on VxVM; dependencies are in place so that the DG and all volumes are mounted prior to starting the zone, which is configured to loopback mount the volumes from the global.

Question is - how do I gracefully start/stop the db when on/offlining the zone resource? I'm guessing some post-online and pre-offline scripts but this doesn't seem like it could be the most elegant solution. If there is no other way, if anyone has any links to examples, info, documented efforts etc, I'd appreciate it.

Thanks in advance.

13 REPLIES

mikebounds
Level 6
Partner Accredited

The Oracle agent is zone aware so the VCS Oracle agent can start and stop (and monitor) Oracle.  If you need an example of how to set this up, let me know the VCS version you are using.

Mike

g_lee
Level 6

As Mike pointed out, the Oracle agent is zone aware, so you can create an Oracle resource in the service group to monitor/start/stop.

From VCS 5.1 Agent for Oracle Installation and Configuration Guide ( https://sort.symantec.com/public/documents/sf/5.1/solaris/pdf/vcs_oracle_agent.pdf ) -> How the agent monitors Oracle instances running in Solaris zones (p14)
------------------------------
Solaris 10 provides a means of virtualizing operating system services, allowing one or more processes to run in isolation from other activity on the system. Such a "sandbox" is called a "non-global zone." Each zone can provide a rich and customized set of services. The processes that run in a “global zone” have the same set of privileges that are available on a Solaris system today.
VCS provides high availability to applications running in non-global zones by extending the failover capability to zones. VCS is installed in a global zone, and all the VCS agents and the engine components run in the global zone. For applications running within non-global zones, agents run script entry points inside the zones. If a zone configured under VCS control faults, VCS fails over the entire service group containing the zone.
See Veritas Cluster Server Administrator’s Guide.
The Veritas Cluster Server agent for Oracle is zone-aware and can monitor Oracle instances running in non-global zones.
------------------------------

HTML version: https://sort.symantec.com/public/documents/sf/5.1/solaris/html/vcs_oracle_agent/index.html

The guide also has configuration examples / how to configure the instance(s).

Rowley
Level 3

Thanks for the replies.

Lee - I read that but then it just leaves you with the catch-all of "see the admin guide". Then, aside from content regarding application and zone type resources, it doesn't really elaborate. Looking at the bundled agents resource guide on p118, there is an example of how to configure an application resource within a zone. The config example given is:

Application samba_app (
    StartProgram = "/usr/sbin/samba start"
    StopProgram = "/usr/sbin/samba stop"
    PidFiles = { "/localzone1/root/var/lock/samba/smbd.pid" }
    MonitorProcesses = { "nmbd" }
    ContainerName = "localzone1"
)

The restriction I have is that only the Global is VCS aware. There are restrictions that prevent us from having any VCS commands available to the host. So, does the ContainerName = "yada" then allow VCS to reach into the zone without having any VCS commands installed in the zone?

Mike - I'd be interested to see what the config would be to set up Oracle within a zone that was cluster-aware, but I think, given the restriction, I'd still need post-online/pre-offline scripts called from the global with commands like zlogin zone '/path/to/script' in them?

mikebounds
Level 6
Partner Accredited

For 5.0, an Oracle resource would look like:

 

Oracle ora_prod (
Sid = prod
Owner = oracle
Home = "/oracle/product/10.2.0"
ContainerName @sys1 = zone1
ContainerName @sys2 = zone2
)
 
For 5.1, an Oracle resource would look no different from one running in the global zone, i.e. it would have no ContainerName attribute, as zone info is specified in the service group for 5.1 - like:
 
group ora_prod_sg (
SystemList = { sys1 = 0, sys2 = 1 }
ContainerInfo @sys1 = { Name = zone1, Type = Zone, Enabled = 1 }
ContainerInfo @sys2 = { Name = zone2, Type = Zone, Enabled = 1 }
AutoStartList = { sys1, sys2 }
)
 
Regardless of version the Oracle agent works by using zlogin to start and stop Oracle.
This does require the Oracle agent to be installed in the local zone, but this is done automatically when you install the Oracle agent in the global zone. Full VCS is not installed in the local zone; however, some components, like the "halogin" command, will already be installed automatically. As far as I am aware, the only VCS commands run from the local zone are those that let the local zone write messages to the engine log in the global zone. If that communication does not work, the Oracle agent will still work (it will just be missing some information in the engine_log in the global zone, but you can look at errors in the Oracle agent log in the local zone). The local zone writes messages to the engine log in the global zone by using halogin with a VCS user, so you don't need rsh or ssh set up - all that is required is that the local zone can resolve the hostname of the global zone.
 
So no need to use trigger scripts as what you are intending to run in these scripts is what the Oracle Agent does.
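
Putting the two pieces above together, a minimal sketch of a 5.1 service group with Oracle running in a zone might look like the following. The Zone resource name (zone_res) and the dependency line are assumptions for illustration, not something taken from the guides, so check them against your own main.cf:

```
group ora_prod_sg (
    SystemList = { sys1 = 0, sys2 = 1 }
    ContainerInfo @sys1 = { Name = zone1, Type = Zone, Enabled = 1 }
    ContainerInfo @sys2 = { Name = zone2, Type = Zone, Enabled = 1 }
    AutoStartList = { sys1, sys2 }
    )

    // Zone resource (hypothetical name): in 5.1 the zone name
    // is taken from the group's ContainerInfo attribute
    Zone zone_res (
        )

    // Oracle resource: same as in a global zone, no ContainerName
    Oracle ora_prod (
        Sid = prod
        Owner = oracle
        Home = "/oracle/product/10.2.0"
        )

    // Assumed dependency: the zone must be online before Oracle starts
    ora_prod requires zone_res
```

With the dependency in place, onlining the group boots the zone first and then starts Oracle inside it, and offlining does the reverse.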
 
Mike

g_lee
Level 6

Rowley,

Mike is correct / has provided you with the solution.

In 5.1, "The ContainerName and ContainerType attributes are deprecated." (ref: VCS 5.1 Bundled Agents Guide p176 https://sort.symantec.com/public/documents/sf/5.1/solaris/pdf/vcs_bundled_agents.pdf )

I can't find the exact document you've referenced with that example on p118, however it appears to be pre-5.1 (eg: p169 of the 5.0MP3 bundled agents guide has a config example that matches yours) - if you're using 5.1 you should refer to the documents for that version so you're not trying to use changed/obsolete information. See SORT to select the documentation relevant to your version: https://sort.symantec.com/documents

The equivalent example from the 5.1 Bundled Agents guide (p200) shows:
------------------------------
Application resource in a non-global zone for Solaris 10
In this example, configure a resource in a non-global zone: localzone1. The ZonePath of localzone1 is /zone1/root. The ContainerInfo attribute for this service group is set to ContainerInfo @ sys1 = { Name = “localzone1”, Type = “Zone”, Enabled = 1}. Configure the executable samba as StartProgram and StopProgram, with start and stop specified as command line arguments respectively. Configure the agent to monitor two processes: a process specified by the pid smbd.pid, and the process nmbd.
Application samba_app (
    StartProgram = "/usr/sbin/samba start"
    StopProgram = "/usr/sbin/samba stop"
    PidFiles = { "/localzone1/root/var/lock/samba/smbd.pid" }
    MonitorProcesses = { "nmbd" }
)
------------------------------

The VCS 5.1 Agent for Oracle Installation and Configuration Guide has Sample configuration for Oracle instances in Solaris zones (pp138-146) - these are dependency charts rather than main.cf extracts; there are main.cf examples pp120-137, however as you mentioned there isn't a main.cf example showing a non-global zone configuration. As Mike points out, this is not required since the Oracle resource is configured no differently in a zone (ie: the zone config is specified in the sg configuration).

From Veritas Storage Foundation and High Availability Solutions Virtualization Guide - About VCS support for zones p18 (https://sort.symantec.com/public/documents/sf/5.1/solaris/pdf/sfha_virtualization.pdf )
------------------------------
Configuring the ContainerInfo attribute
The service group attribute ContainerInfo specifies information about the zone. When you have configured and enabled the ContainerInfo attribute, you have enabled the zone-aware resources in that service group to work in the zone environment.
VCS defines the zone information at the level of the service group so that you do not have to define it for each resource. You need to specify a per-system value for the ContainerInfo attribute.
------------------------------

The resource type/attribute definitions are in Appendix A of the Oracle Agent installation + configuration guide, so once the service group has been configured to specify the zone, you should have enough information to define the Oracle resource to start/stop/monitor in the zone.

regards,

Grace

Rowley
Level 3

Hi Mike. Thanks for sticking with me so far.

OK, so I've added the container attributes to the Oracle and Netlsnr resource types.

With the DB started, the global can do a Traditional check to see whether the db and netlsnrs are up, however, if I try and stop the database, it reaches timeout then terminates it. If I try and start the db, I just see the "VCS ERROR V-16-1-10600 Cannot connect to VCS engine" error.

It's as if the commands issued from the global are trying to reach into another running HAD process that just isn't there to control these resources. The config is as follows:

group SG-testdb-db (
        SystemList = { globalhost = 0 }
        ContainerInfo @globalhost = { Name = legolas, Type = Zone, Enabled = 1 }
        AutoStartList = { globalhost }
        Administrators = { z_legolas-zn_globalhost }
        Operators = { oracle }
        )

        Netlsnr testdb-db-lsnr (
                Critical = 0
                Home = "/u01/app/oracle/product/10204"
                Listener = listener_testdb
                TnsAdmin = "/u01/app/oracle/product/10204/network/admin/"
                Owner = oracle
                EnvFile = "/u01/testdb/oraadmin/etc/testdb.env"
                ContainerName = legolas
                )

        Oracle testdb-ora (
                Critical = 0
                Pfile = "/u01/app/oracle/product/10204/dbs/inittestdb.ora"
                Owner = oracle
                EnvFile = "/u01/testdb/oraadmin/etc/testdb.env"
                Home = "/u01/app/oracle/product/10204"
                StartUpOpt = CUSTOM
                Sid = testdb
                ContainerName = legolas
                )

        Zone legolas-zn (
                Critical = 0
                )


 

 

Am I missing something?

g_lee
Level 6

Rowley,

Are you using 5.0 or 5.1?

The ContainerName attribute was only for 5.0 (as Mike mentioned above). If you're using 5.1 the Oracle agent uses the zone specified in the group attribute ContainerInfo (ie: you should not have ContainerName resource attributes in 5.1)

regards,
Grace

mikebounds
Level 6
Partner Accredited

 

As your SG has the ContainerInfo attribute, you must have 5.1, but you are using a VCS 5.0 Oracle agent.  You need to use the VCS 5.1 Oracle agent.  Agent versions can be confusing, as I have a feeling that a 5.1 Oracle agent (5.1 referring to the agent version, not the VCS version) came out before VCS 5.1, so you need to download the VCS 5.1 Oracle agent from https://sort.symantec.com/agents.
 
Sounds like you have a second problem as well, as you have "Cannot connect to VCS engine".  zlogin into the local zone, which should take you to root's home directory, and run "ls -a".  You should have a file called ".vcspwd", but I suspect this is not there - this being the case, check you can ping the global VCS node by name, and if you can't, then add its IP to /etc/hosts.
Then shut down your cluster, remove the old Oracle agent and install the new one.
Then copy OracleTypes.cf from /etc/VRTSagents/ha/conf/Oracle (which should NOT contain a ContainerName attribute) to /etc/VRTSvcs/conf/config. If you have tuned any Oracle attributes, make a note of these first, as you will need to merge them into the new file that overwrites the old one - for example, if you have changed Netlsnr RestartLimit to 1, then you need to copy this line from the old types file into the new one.
Next remove the ContainerName attribute from main.cf for the Oracle and Netlsnr resources.
Then start VCS.  After the Zone has started in VCS, you should see the .vcspwd file.
 
Oracle resources should now online and offline ok.
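
Applied to the config you posted, the end result would be roughly as below: the same group and resources with the ContainerName lines removed. The two requires lines are an assumption on my part (your post did not show any dependencies), so adjust or drop them to match your real resource dependencies:

```
group SG-testdb-db (
        SystemList = { globalhost = 0 }
        ContainerInfo @globalhost = { Name = legolas, Type = Zone, Enabled = 1 }
        AutoStartList = { globalhost }
        Administrators = { z_legolas-zn_globalhost }
        Operators = { oracle }
        )

        Netlsnr testdb-db-lsnr (
                Critical = 0
                Home = "/u01/app/oracle/product/10204"
                Listener = listener_testdb
                TnsAdmin = "/u01/app/oracle/product/10204/network/admin/"
                Owner = oracle
                EnvFile = "/u01/testdb/oraadmin/etc/testdb.env"
                )

        Oracle testdb-ora (
                Critical = 0
                Pfile = "/u01/app/oracle/product/10204/dbs/inittestdb.ora"
                Owner = oracle
                EnvFile = "/u01/testdb/oraadmin/etc/testdb.env"
                Home = "/u01/app/oracle/product/10204"
                StartUpOpt = CUSTOM
                Sid = testdb
                )

        Zone legolas-zn (
                Critical = 0
                )

        // Assumed dependencies: the zone must be online before
        // the database and listener are started inside it
        testdb-ora requires legolas-zn
        testdb-db-lsnr requires legolas-zn
```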
 
Mike
 

Rowley
Level 3

Guys and gals, you are correct that it is 5.1, although I'm pretty sure there was a hint somewhere in the subject line. I'll look into the agent versions first thing in the morning. I was looking for .vcspwd in the zone (yes there was a login issue) but for the oracle user. Will check the root user as soon as.

Nearly 2 years I've been using this product but boy, is there more to it than meets the eye. Your help is invaluable, I thank you again.

Rowley
Level 3

Sorted.

All that was required was ContainerInfo @sunbed01 = { Name = legolas, Type = Zone, Enabled = 1 }
in the service group definition as everything was on 5.1.

Thanks for all your help, greatly appreciated.

Rowley
Level 3

OK, I still have a problem. I still have the requirement that the guest zone shouldn't be able to see anything in the global but with the current setup I can view service groups, resources, cluster info and status.

This cannot happen.

Is it possible to prevent the guest from seeing anything in the global?

mikebounds
Level 6
Partner Accredited

You can prevent the guest seeing the global zone if the guest is unable to resolve the host name of the global zone - for most agents this won't cause the agent to fail - the engine log will just complain that the local zone cannot log back in to the global zone.  Note, if you remove the .vcspwd file then it will get recreated the next time the Zone resource onlines.  Note also that the user created so that the local zone can communicate with the global zone is an administrator of ONLY the service groups containing the zone resource, so you cannot shut down the cluster or change any other SGs. You could bring down the service group containing the zone, but if you are root in the local zone then you could do this without VCS anyway.

You could also prevent root login to the local zone; this prevents the local zone seeing the global zone unless you zlogin from the global zone (which means you already had access to VCS from the global zone).

Mike

Rowley
Level 3

So, I've changed the ip-type of the guest zone to exclusive, on a completely different network with no route between the two, and enabled ip_restrict_interzone_loopback with ndd. I've removed the global's host entry from the guest and I cannot see any details about the cluster running in the global.

I think, for all intents and purposes, the guest zone is independent as much as can be from the global, yet I still appear to be able to control application resources within the guest zone from cluster in the global.