Oracle 10G Grid Agent interferes with Volume Manager disk group import

SST
Level 3
We are running Veritas Cluster Server (VCS) along with Veritas Volume Manager. Failovers of VCS Service Groups mean the disk groups are imported and deported quite often. We have run into a problem that Veritas and Oracle both agree exists, but apparently neither will fix. If the Oracle 10G Grid agent runs on a server (usually at boot) while no disk groups are imported on that server, a later attempt to import a disk group will fail because of some unusual I/O the Grid agent performs against the Veritas devices (vxio).
 
At the bottom of this post is the info Symantec sent me regarding this problem. I am curious whether others have run into it and, if so, whether you just used one of the workarounds or are pursuing a real solution with the vendors. I was hoping that if enough others have this problem, we could put more pressure on the vendors to come up with a fix rather than finger-pointing.
 
 

>We have a couple of incidents where disk group imports may fail with
>"VxVM vxdg ERROR V-5-1-587 Disk group xxxxx: import failed: Internal
>configuration daemon error" on Solaris systems running the Oracle
>10G Grid agent. It has been found that when started (normally at
>boot time) the Oracle 10G Grid agent runs the following command:
>
>nmhs get_solaris_disks
>
>This binary traverses the /devices directory on the machine and
>opens any device files present, including Veritas devices such as
>vxio. The way in which vxio is opened causes it to initiate a
>transaction against a nonexistent rootvol which cannot be
>completed. Any subsequent import requests cause vxconfigd to
>recognise that there is an outstanding transaction in vxio, causing
>the new import to fail. This continues until vxconfigd is restarted
>and reset as follows:
>
>vxconfigd -k -r reset
>
>Essentially the nmhs binary is attempting to open the vxio device in
>an unsupported manner which does not follow the APIs we publish. As
>such we believe that it is Oracle's responsibility to fix this issue
>by making changes to their binary to prevent this and would urge any
>affected customers to contact Oracle regarding a permanent fix. In
>the meantime the following workarounds are available to prevent the issue:
>
>1. Present extra LUNs to the system and use these to create a dummy
>disk group which will be auto-imported at boot time. As long as
>Volume Manager has one disk group imported before the nmhs binary is
>run it is not affected by the nmhs binary.
>
>2. Remove/rename /etc/rc3.d/S99gcstartup which is the script
>responsible for starting the Oracle 10G Grid agent at boot time.
>Note that this will lead to a complete loss in agent functionality.
>
>3. Remove/rename the
>~/agent10g/sysman/admin/scripts/storage/sRawmetrics.pm Perl module
>in the agent installation. This will prevent the agent from calling
>the nmhs binary when starting. Note that this will lead to some loss
>in agent functionality.
>
>Can you run # vxconfigd -k -r reset
>
>and let us know if this resolves the disk group import issue.
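
For anyone who wants to script the recovery step Symantec describes for a manual import, here is a minimal sketch. It assumes the error text matches the V-5-1-587 message quoted above; the function names (is_stuck_import_error, import_dg_with_reset) are my own and not part of any Veritas tooling. It resets vxconfigd only when that specific error shows up, and only once. I have not verified how a vxconfigd reset interacts with a running VCS setup, so treat it as an illustration rather than a tested procedure.

```shell
#!/bin/sh
# Sketch only: retry a VxVM disk group import after resetting vxconfigd,
# but only when the failure is the V-5-1-587 "Internal configuration
# daemon error" quoted in the Symantec note above.
# Function names are hypothetical helpers, not Veritas-provided commands.

# Returns 0 if the given text looks like the stuck-transaction import error.
is_stuck_import_error() {
    case "$1" in
        *V-5-1-587*"Internal configuration daemon error"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: import_dg_with_reset DISKGROUP
import_dg_with_reset() {
    dg=$1

    # First attempt; capture stdout and stderr so the error can be inspected.
    if out=$(vxdg import "$dg" 2>&1); then
        return 0
    fi

    if is_stuck_import_error "$out"; then
        echo "import of $dg failed with V-5-1-587; resetting vxconfigd" >&2
        # Reset command exactly as given in the Symantec note.
        vxconfigd -k -r reset || return 1
        # Single retry after the reset.
        vxdg import "$dg"
        return $?
    fi

    # Any other failure: show the original error and give up.
    printf '%s\n' "$out" >&2
    return 1
}

# Example call (the disk group name "oradg" is made up):
#   import_dg_with_reset oradg
```

This only clears the condition after the fact. It does not stop the Grid agent from triggering it again at the next boot, so one of the three workarounds above would still be needed.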
