Odd problem with Sun Cluster and Veritas Volume Manager
As folks who have been helping with my questions in the past week or two will know, I am using Veritas mirroring to move Oracle RAC databases off old SAN storage and onto new SAN storage. The systems are two-node Sun Clusters with Oracle RAC using raw volumes in VxVM disk groups.
I have been using "vxmirror -g <diskgroup> -a <newmedianame> &" which has worked well on 2 of the clusters. No problems were encountered. Now, on the final cluster it causes a very strange problem. The /dev/vx/rdsk/ocsrawdg/* and /dev/vx/dsk/ocsrawdg/* volume files disappear from one node or both when I do the mirror command or even a vxplex command. Oracle stops working, but vxdg list and vxprint -vg ocsrawdg both show the diskgroup and volumes to be enabled and active. The Veritas side shows nothing wrong. I can give more strange and contradictory symptoms.
Going into scsetup and using the "Synchronize volume information for a VxVM device group" option, which issues the command "scconf -c -D name=ocsrawdg,sync", fixes the problem. All the volume names then show up again in both /dev/vx directories.
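For anyone who wants to check a node for the same symptom, here is a rough sketch of the comparison I have in mind: take the volume names Veritas reports and print any that have no device node on this node. The function name missing_nodes and the DEVROOT override are my own inventions for illustration, and I'm assuming "vxprint -v -n" prints bare volume names on your VxVM version.

```shell
#!/bin/sh
# Sketch only: list VxVM volumes whose device nodes are missing on this node.
DG=${DG:-ocsrawdg}
DEVROOT=${DEVROOT:-/dev/vx}   # hypothetical override, handy for trying this out safely

# Reads volume names on stdin (one per line) and prints every volume
# that lacks a block or a raw device node under $DEVROOT.
missing_nodes() {
    while read -r vol; do
        [ -n "$vol" ] || continue
        if [ ! -e "$DEVROOT/dsk/$DG/$vol" ] || [ ! -e "$DEVROOT/rdsk/$DG/$vol" ]; then
            echo "$vol"
        fi
    done
}

# Intended use on a cluster node (assumes vxprint -v -n prints volume names only):
#   vxprint -g "$DG" -v -n | missing_nodes
# If anything is printed, resync the device group:
#   scconf -c -D name="$DG",sync
```

Nothing here touches the disk group; it only reads names and tests for files, so it should be safe to run on either node while the mirrors are building.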
I have not seen this on the other clusters. The only real difference is that the one I'm working on now has a larger disk group: it has 5 LUNs in the new SAN storage (all the others had single LUNs) and 254 volumes, as compared to about 64 in the next largest. I thought these mirror processes took place in the background and did not interfere with active and enabled volumes/groups. That is certainly how it worked on the other clusters. Oh, and I tried issuing a mirror for just a single volume with vxassist, but that caused the same problem.
Does anyone have any idea why this is happening, and more importantly, can I do a background mirroring process that won't kill the application?
I wrote a short script to accomplish this task without taking any downtime.
for volname in "$@"; do
    vxassist -g ocsrawdg -b mirror "$volname" ocsrawdisk1 ocsrawdisk2 ocsrawdisk3 ocsrawdisk4 ocsrawdisk5
    scconf -c -D name=ocsrawdg,sync
    vxtask -w 10 monitor
    scconf -c -D name=ocsrawdg,sync
done

Basically, it just takes a list of all the volumes in the disk group and mirrors each one onto the new storage one at a time. As soon as the mirror process goes into the background, the scconf command syncs the disk group across the cluster. The real saving grace in this situation is the vxtask monitor function, which keeps running for as long as the mirror is being built and exits when it finishes. That gives me the perfect trigger to resync the disk group again and go back for the next volume in the disk group. Rinse and repeat until they are all done. It's time-consuming, and the vxtask monitor is noisy, but the end users don't see any downtime.
I will probably have to use a similar script when removing the plexes from the old storage, but that won't require waiting for the mirror to build. So, no need for the vxtask monitor.
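Something along these lines is what I'm picturing for the removal pass. Treat it as an untested sketch: the function name remove_old_plexes and the RUN dry-run switch are made up for illustration, the plex names have to come from you (check vxprint to see which plexes sit on the old storage), and I'm assuming "vxplex -o rm dis" is the right way to dissociate and remove a plex in one step on this VxVM version.

```shell
#!/bin/sh
# Sketch only: remove old-storage plexes one at a time, resyncing the
# Sun Cluster device group after each removal.
DG=${DG:-ocsrawdg}
RUN=${RUN-echo}   # dry run by default: commands are printed, not executed

# Arguments: names of the plexes to remove (supplied by the operator).
remove_old_plexes() {
    for plex in "$@"; do
        # Dissociate the plex from its volume and remove it.
        $RUN vxplex -g "$DG" -o rm dis "$plex" || return 1
        # Resync so the /dev/vx device nodes stay present on both nodes.
        $RUN scconf -c -D "name=$DG,sync" || return 1
    done
}

remove_old_plexes "$@"
```

Run as is, it only prints the commands it would issue, so the list can be reviewed first. Setting RUN to an empty string (for example "RUN= ./rmplex.sh vol01-01", where the script name and plex name are placeholders) makes it execute them for real.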