
Existing disk is showing error in vxvm

Hi Team,

 

One of the disks is showing the error state in the vxdisk list output. The disk is already a member of a disk group and contains volumes.

How can I clear the error flag in the STATUS column?

13 Replies

...you've logged this in the wrong forum. What product is this and I will move it?

Hi,

This is related to a Veritas Volume Manager issue.

...should be in the correct forum now!

Please provide more info

What platform and what version of Storage Foundation are you using?
Also provide details of your OS and the output of the following commands:

# vxdisk list
# vxprint -ht

Need more details to answer further:

1. How did the issue happen? Do you see anything in the logs?

2. Is it an internal disk or a storage disk?

3. OS version and VxVM version?

4. Paste the following outputs:

# vxdisk list

# vxdisk -o alldgs -e list

# vxprint -qthg <diskgroup>

# modinfo | grep -i vx

 

Gaurav

Hi Kishore,

Have you had a chance to look into this?

The error state appears on a disk when Volume Manager is unable to read the label or disk geometry from the disk, so the likely reasons are:

1. The disk label has been lost.

2. The disk was replaced and a new disk is in place.

If the disk was in use and something happened to the private region so that VxVM can no longer read it, you should instead see the disk go into the "failed was" state in vxdisk list.

In most cases you need to do a bottom-up analysis and find out what happened to the disk underneath.

 

Gaurav
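
When many LUNs are involved, it can help to pull just the affected devices out of the listing. The following is a minimal sketch, not a Veritas tool: the function name error_disks is hypothetical, and it assumes the usual vxdisk list column layout (DEVICE, TYPE, DISK, GROUP, STATUS), so adjust the field number if your version prints something different.

```shell
# Hypothetical helper: reads `vxdisk list` output on stdin and prints the
# device names whose STATUS column (assumed to be field 5) is "error".
error_disks() {
    awk '$1 != "DEVICE" && $5 == "error" { print $1 }'
}

# Usage on a host with VxVM installed:
#   vxdisk list | error_disks
```

Each device it prints can then be examined individually with vxdisk list <device>.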

Hi Gaurav,

We are experiencing the same problem: the disk label has been corrupted. Before this it was running fine.
Right now VxVM has a performance issue and responds very slowly when it is imported into VxVM. The bigger the load, the slower it becomes. Please help us resolve the issue. We tried to relabel the disk, but it fails with an error:


Disk not labeled.  Label it now? y
Warning: error setting drive geometry.
Warning: error writing VTOC.
Warning: no backup labels
Write label failed

Hi Raunaz,

Your symptoms above indicate that the device may have issues with writes. Are there any entries in syslog from the disk driver for that disk (at the time of the re-label)?

cheers

These are the errors.

 

Apr  4 08:13:51 xxxxxx vxdmp: [ID 443116 kern.notice] NOTICE: VxVM vxdmp V-5-0-0 i/o error occured (errno=0x0) on dmpnode 324/0x2ba
 
BTW, we are seeing reservation conflict errors continuously.

Hi Raunaz,

Are you talking about the same issue we discussed in the following thread?

https://www-secure.symantec.com/connect/forums/host-mode-option-xp24000-veritas-cluster-server

If yes, it seems you'll need to erase the VTOC first, as the disk will not allow a relabel unless it has a good table of contents.

Regarding "BTW, we are seeing reservation conflict error continuously":

>> Are you using fencing? If yes, could you check for any stale fence keys on the disk using vxfenadm (vxfenadm -s <disk_path>)?

>> Do you also see any OS SCSI errors?

>> Do these errors (reservation conflict + vxdmp I/O error) occur when you try to label the disk?

Hi Kishore,

Please check the following:

1. # prtvtoc <error_disk_path> ; run this on both the /dev/rdsk and /dev/rdmp paths of the error disk

2. # dd if=<error_disk_path> of=/dev/null bs=1024 count=1024

     > While running the above command, run tail -f /var/adm/messages from another terminal to see whether any errors occur.

3. # vxdisk list <error_disk> ; check that all the paths are enabled.
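
For check 3, a small filter can make a disabled path easier to spot. This is only a sketch: the function name disabled_paths is hypothetical, and it assumes the multipathing section of vxdisk list <disk> prints one line per path in the form "<path> state=<value>".

```shell
# Hypothetical helper: reads `vxdisk list <disk>` output on stdin and prints
# every path whose state is anything other than "enabled".
# Assumes path lines look like "<path> state=<value> ...".
disabled_paths() {
    awk '$2 ~ /^state=/ && $2 != "state=enabled" { print $1, $2 }'
}

# Usage on a host with VxVM installed:
#   vxdisk list <error_disk> | disabled_paths
```

No output means every path listed for the disk reported state=enabled.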

Hi Har-D,

I haven't tried to relabel by erasing the VTOC; we will only do that as a last resort.

Yes, we use SCSI-3 fencing. I'm not sure how to check for stale fence keys, but the output I found looks OK to me:

 

Reading SCSI Registration Keys...
 
Device Name: /dev/rdsk/c2t50060E80056F1178d27s2
Total Number Of Keys: 2
key[0]:
        [Numeric Format]:  67,86,67,83,0,0,0,0
        [Character Format]: CVCS
        [Node Format]: Cluster ID: unknown  Node ID: 2   Node Name: esesbuh0003
key[1]:
        [Numeric Format]:  67,86,67,83,0,0,0,0
        [Character Format]: CVCS
        [Node Format]: Cluster ID: unknown  Node ID: 2   Node Name: esesbuh0003
 
And some disks have no keys:
Reading SCSI Registration Keys...
 
Device Name: /dev/rdsk/c2t50060E80056F1178d20s2
Total Number Of Keys: 0
No keys...
 
FYI, we have rebooted the machine a few times, but the error keeps coming back.
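
For anyone comparing key listings like the ones above across many LUNs, the output can be condensed to one line per device. This is a minimal sketch that assumes the vxfenadm output format shown in this post; the function name summarize_fence_keys is hypothetical.

```shell
# Hypothetical helper: reads vxfenadm key listings (format as pasted above)
# on stdin and prints, per device, the key count and the distinct node IDs.
summarize_fence_keys() {
    awk '
        function emit() {
            if (dev != "")
                print dev, "keys=" total, "nodes=" (nodes == "" ? "-" : nodes)
        }
        /Device Name:/ {
            emit()                      # print the previous device, if any
            dev = $3; total = 0; nodes = ""
            split("", seen)             # portable way to clear the array
        }
        /Total Number Of Keys:/ { total = $NF }
        /Node ID:/ {
            id = ""
            for (i = 2; i < NF; i++)
                if ($i == "ID:" && $(i - 1) == "Node") id = $(i + 1)
            if (id != "" && !(id in seen)) {
                seen[id] = 1
                nodes = (nodes == "" ? id : nodes "," id)
            }
        }
        END { emit() }
    '
}

# Usage, with the saved command output in a file:
#   summarize_fence_keys < fence_keys.txt
```

A device that shows keys from only one node ID, or no keys at all while its neighbours have them, is a candidate for the inconsistent-keys situation discussed in this thread.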
 
 
 
Accepted Solution!

I believe you have an I/O fencing issue here. That is the reason you are unable to write, and the fencing keys are also inconsistent.

You can register the keys manually, but if possible do a clean reboot of the systems after clearing the keys manually, or at least restart VCS completely after clearing them.

So here is what I would suggest:

1. Clear the I/O fencing keys using the "vxfenclearpre" command:

https://sort.symantec.com/public/documents/sf/5.0/solaris/manpages/vcs/vxfenclearpre_1m.html

2. Shut down all the applications in the cluster.

3. Shut down VCS (hastop -all).

4. Shut down fencing (/etc/init.d/vxfen stop); this needs to be run on both nodes.

5. Ensure the fencing module is unloaded (modinfo | grep -i vxfen).

6. Confirm with gabconfig that only port a exists on all the nodes (gabconfig -a).

7. Restart I/O fencing on both nodes (/etc/init.d/vxfen start).

8. Start the cluster on both nodes, one at a time (hastart).

The above procedure should help clear the keys, and you should get access back to the disk.

Gaurav
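
The check in step 6 can be scripted. This is a minimal sketch, not part of the original procedure: the function name extra_gab_ports is hypothetical, and it assumes gabconfig -a prints membership lines of the form "Port a gen <id> membership 01".

```shell
# Hypothetical helper: reads `gabconfig -a` output on stdin and prints any
# port other than "a" that still has a membership line.
# Assumes membership lines look like "Port a gen <id> membership 01".
extra_gab_ports() {
    awk '$1 == "Port" && $2 != "a" { print $2 }'
}

# Usage on each cluster node:
#   gabconfig -a | extra_gab_ports
```

Empty output on every node means no port other than a is listed, which is the condition step 6 asks for before fencing is restarted.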
 
