11-26-2013 07:50 AM
Hi all,
I have Red Hat Linux 5.9 64-bit with SFHA 5.1 SP1 RP4 with fencing enabled (our storage device is an IBM Storwize V3700 SFF, SCSI-3 compliant).
[root@mitoora1 ~]# vxfenadm -d
I/O Fencing Cluster Information:
================================
Fencing Protocol Version: 201
Fencing Mode: SCSI3
Fencing SCSI3 Disk Policy: dmp
Cluster Members:
* 0 (mitoora1)
1 (mitoora2)
RFSM State Information:
node 0 in state 8 (running)
node 1 in state 8 (running)
********************************************
In /etc/vxfenmode: scsi3_disk_policy=dmp and vxfen_mode=scsi3
vxdctl scsi3pr
scsi3pr: on
[root@mitoora1 etc]# more /etc/vxfentab
#
# /etc/vxfentab:
# DO NOT MODIFY this file as it is generated by the
# VXFEN rc script from the file /etc/vxfendg.
#
/dev/vx/rdmp/storwizev70000_000007
/dev/vx/rdmp/storwizev70000_000008
/dev/vx/rdmp/storwizev70000_000009
******************************************
[root@mitoora1 etc]# vxdmpadm listctlr all
CTLR-NAME ENCLR-TYPE STATE ENCLR-NAME
=====================================================
c0 Disk ENABLED disk
c10 StorwizeV7000 ENABLED storwizev70000
c7 StorwizeV7000 ENABLED storwizev70000
c8 StorwizeV7000 ENABLED storwizev70000
c9 StorwizeV7000 ENABLED storwizev70000
main.cf
cluster drdbonesales (
UserNames = { admin = hlmElgLimHmmKumGlj }
ClusterAddress = "10.90.15.30"
Administrators = { admin }
UseFence = SCSI3
)
**********************************************
I configured coordinator fencing, so I have 3 LUNs in a Veritas disk group (DMP coordinator).
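(For reference, the fencing configuration files behind this setup look roughly like this; the contents are a sketch based on the settings quoted above, and the coordinator disk group name is a placeholder:)
/etc/vxfenmode:
vxfen_mode=scsi3
scsi3_disk_policy=dmp
/etc/vxfendg:
<name of the coordinator disk group>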
Everything seems to work fine, but I noticed a lot of reservation conflicts in the messages on both nodes.
In the server log I constantly see these messages in /var/log/messages:
Nov 26 15:14:09 mitoora2 kernel: sd 7:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:0:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 10:0:0:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 10:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 9:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 9:0:0:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 7:0:1:3: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:0:3: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:1:3: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 10:0:1:3: reservation conflict
Do you have any idea?
Best Regards
Vincenzo
11-26-2013 07:52 PM
Hi Vincenzo,
Typically "reservation conflict" is SCSI3-PGR key write failure which could be caused by existing key or HW write failure.
Pls check the disk access capability by "dd" likely method and if there is existing key on disks by "vxfenclearpre"
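A minimal sketch of those two checks, assuming the DMP device names from the /etc/vxfentab above (block size and count are only illustrative):
# dd if=/dev/vx/rdmp/storwizev70000_000007 of=/dev/null bs=512 count=1   (read test on one coordinator LUN)
# vxfenadm -s all -f /etc/vxfentab   (list the SCSI-3 keys registered on the coordinator disks)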
11-27-2013 01:06 AM
But I only created the fencing disk group, and I have not installed Oracle RAC!
The problem also occurs both during server boot and when switching the cluster service group.
From the log, the problem seems to recur every 20 minutes in /var/log/messages.
11-27-2013 02:41 AM
Hi Vincenzo,
Are there any operational effects because of these errors? Did you see any other SCSI- or DMP-related messages around the reservation conflicts?
I saw a tech article
http://www.symantec.com/docs/TECH192940
The above states that 6.0.1 should have a fix; however, I have checked the release notes of 6.0.1 and 6.1, and neither has anything specific to reservation conflicts. There is an issue mentioned in the known issues section which appears if PowerPath is in use (I assume you are not using PowerPath).
If there are no operational issues, I would consider these messages ignorable, as I understand they are logged when a write operation is attempted on the disks.
On another thought, to isolate the problem, is it possible to try the "raw" mode of fencing instead of DMP mode?
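For reference, switching fencing to raw mode would look roughly like this (a sketch only, not the full documented procedure; stop VCS or freeze the service groups first):
# /etc/init.d/vxfen stop      (on each node)
edit /etc/vxfenmode on each node so that it contains:
vxfen_mode=scsi3
scsi3_disk_policy=raw
# /etc/init.d/vxfen start     (on each node)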
G
11-27-2013 03:33 AM
Hi,
I have no DMP or SCSI error messages.
I tried this procedure: http://www.symantec.com/docs/TECH192940
1. hastop -all
2. /etc/init.d/vxfen stop
3. vxdg -o groupreserve -o clearreserve -t import dgFence
4. /etc/init.d/vxfen start
Starting vxfen..
Loaded 2.6.18-128.el5 on kernel 2.6.18-348.el5
WARNING: No modules found for 2.6.18-348.el5, using compatible modules for 2.6.18-128.el5.
Starting vxfen.. Done
Please see the log file /var/VRTSvcs/log/vxfen/vxfen.log
in vxfen.log (VXFEN vxfenconfig NOTICE Driver will use SCSI-3 compliant disks)
5. hastart
6. reboot, but the problem remains:
Nov 26 12:39:54 mitoora1 kernel: LLT INFO V-14-1-10024 link 2 (bond0) node 1 active
Nov 26 12:39:55 mitoora1 rc: Starting xprtld: succeeded
Nov 26 12:39:55 mitoora1 rc: Starting vxodm: succeeded
Nov 26 12:39:57 mitoora1 kernel: GAB INFO V-15-1-20036 Port a gen 327d08 membership 01
Nov 26 12:39:57 mitoora1 kernel: GAB INFO V-15-1-20036 Port b gen 327d07 membership 01
Nov 26 12:39:57 mitoora1 kernel: sd 7:0:1:1: reservation conflict
Nov 26 12:39:57 mitoora1 kernel: sd 9:0:0:1: reservation conflict
Nov 26 12:39:57 mitoora1 kernel: sd 9:0:1:1: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 8:0:1:1: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 8:0:0:1: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 10:0:1:1: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 10:0:0:1: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 7:0:0:3: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 9:0:1:3: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 9:0:0:3: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 8:0:1:3: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 8:0:0:3: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 10:0:1:3: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 10:0:0:3: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 7:0:1:2: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 9:0:0:2: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 9:0:1:2: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 8:0:1:2: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 8:0:0:2: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 10:0:0:2: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: sd 10:0:1:2: reservation conflict
Nov 26 12:39:58 mitoora1 kernel: VXFEN INFO V-11-1-35 Fencing driver going into RUNNING state
And I do not have PowerPath.
On another server, with the cluster up and the service group online, I have the same problem.
From the log, the problem seems to recur every 20 minutes in /var/log/messages!
Best regards
Vincenzo
11-27-2013 05:01 AM
Hi Enzo,
Did you check which VxVM disks and OS disks these errors correlate to?
From your vxfenadm output I can see that this is a 2-node cluster with 8 paths to the disks.
On each of the disks there are 16 keys, so this error shouldn't arise from the coordinator disks.
My best guess is that some other disks visible to the host have keys from another machine, hence you see the message on every discovery cycle.
Maybe some LUNs are zoned to another cluster as well and are in use there?
Can you run the command below to find out which devices generate the error? The output would look like:
#lsscsi
[3:0:0:0] disk IBM 2105 0113 /dev/sdb
[3:0:0:1] disk IBM 2105 0113 /dev/sdf
[3:0:0:2] disk IBM 2105 0113 /dev/sdh
[3:0:0:3] disk IBM 2105 0113 /dev/sdl
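(A quick way to pull out the conflicting targets from the log, just a sketch; the awk field position assumes the syslog format shown in your messages and may need adjusting:)
# grep 'reservation conflict' /var/log/messages | awk '{print $7}' | sort -u
This lists the unique H:C:T:L tuples, which you can then match against the lsscsi output above.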
Once you know the OS device names, see which VxVM devices they belong to:
# vxdisk path
SUBPATH DANAME DMNAME GROUP STATE
sde ibm_shark0_0 - - ENABLED
sdb ibm_shark0_0 - - ENABLED
sdd ibm_shark0_0 - - ENABLED
sdc ibm_shark0_0 - - ENABLED
Then run vxfenadm -s on these disks to see the keys.
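For example (the device name here is only taken from the /etc/vxfentab you posted; substitute whichever disks turn out to be involved):
# vxfenadm -s /dev/vx/rdmp/storwizev70000_000007
or, for all coordinator disks at once:
# vxfenadm -s all -f /etc/vxfentab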
As stated in the guide, such errors for the coordinator disks should only be seen at boot time, not every 20 minutes.
If it is any other disks (which I assume), then check whether they are in use by another cluster/machine, or whether the keys were left behind after an outage, for example.
If they are used by another machine, hide the LUNs; if keys are left over after an outage, remove the keys.
It might also be that another application is writing keys on the LUNs.
You might also see this kind of error if the LUN setting is not r/w; in that case please check with your SAN team or HW vendor.
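(If leftover keys from a previous outage of this cluster turn out to be the cause, the vxfenclearpre utility mentioned earlier in this thread can remove them. A sketch only; it must be run with VCS and fencing stopped on all nodes:)
# hastop -all
# /etc/init.d/vxfen stop      (on each node)
# vxfenclearpre
# /etc/init.d/vxfen start     (on each node)
# hastart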
regards,
Dan
11-27-2013 05:59 AM
Hi Dan,
The keys seem to be correct; attached is the output of the command "vxfenadm -s all -f /etc/vxfentab".
11-27-2013 11:58 PM
Hi.
Can you attach the dmpevents.log file from the server?
Also, do you know the firmware version of the storage here? It should be 4.2.1x or above.
G
11-28-2013 12:33 AM
Hi Gaurav
Attached is the dmpevents.log.
The Storwize V3700 firmware version is 7.1.0 Build 79.8.1307111000.
Best regards
Vincenzo
11-28-2013 01:00 AM
Hi,
The dmpevents log suggests that you are receiving reservation errors with DMP on the second set of highlighted devices, i.e.
sdy storwizev70000_000008 - - ENABLED
sdu storwizev70000_000008 - - ENABLED
sdag storwizev70000_000008 - - ENABLED
sdac storwizev70000_000008 - - ENABLED
sdi storwizev70000_000008 - - ENABLED
sde storwizev70000_000008 - - ENABLED
sdm storwizev70000_000008 - - ENABLED
sdq storwizev70000_000008 - - ENABLED
& if these all are paths to storwizev70000_000008
please give the details of this device
# vxdisk list
# vxdisk -e list
# vxddladm listsupport all
# vxddladm listexclude all
# vxddladm list devices
Along with the devices listed above, there are many others which are reporting reservation conflicts.
I would still recommend raising a support case to see if they have any fix/patch available for this. I believe they would have one.
G
11-28-2013 01:18 AM
Hi,
Attached are the output files of the commands you suggested (lists.txt).
Thanks!
11-28-2013 09:03 AM
All the outputs look OK, and the ASL is also claiming the devices; nothing wrong here.
Do you know what failover mode is set on the array? DMP is recommended to run best in ALUA mode (worth looking at this as well).
G
11-29-2013 08:22 AM
Hi Gaurav,
The vendor IBM has confirmed to me that the Storwize V3700 is an ALUA array.
Could the problem be the ASL library?
rpm -qa|grep VRTSaslapm
VRTSaslapm-5.1.134.000-SP1_RHEL5
vxddladm listsupport all | grep -i alua    (I don't see an IBM ALUA entry)
libvxhdsalua.so HITACHI DF600, DF600-V, DF600F, DF600F-V
libvxhpalua.so HP, COMPAQ HSV101, HSV111 (C)COMPAQ, HSV111, HSV200, HSV210, HSV300, HSV400, HSV450, HSV340, HSV360
vxdmpadm list dmpnode all |grep array-type
array-type = Disk
array-type = A/A-A-IBMSVC
array-type = A/A-A-IBMSVC
array-type = A/A-A-IBMSVC
array-type = A/A-A-IBMSVC
array-type = A/A-A-IBMSVC
array-type = A/A-A-IBMSVC
array-type = A/A-A-IBMSVC
array-type = A/A-A-IBMSVC
array-type = A/A-A-IBMSVC
array-type = A/A-A-IBMSVC
vxdmpadm listenclosure all
ENCLR_NAME ENCLR_TYPE ENCLR_SNO STATUS ARRAY_TYPE LUN_COUNT
=======================================================================================
disk Disk DISKS CONNECTED Disk 1
storwizev70000 StorwizeV7000 00c020207110XX00 CONNECTED A/A-A-IBMSVC 10
Best Regards
Vincenzo
11-30-2013 05:31 AM
Hi ,
Yep, it's worth asking support about this. As per Symantec in the article below:
http://www.symantec.com/business/support/index?page=content&id=TECH47728
page 35 says Storwize arrays are best supported by DMP in ALUA mode.
And as per the article below:
http://www.symantec.com/business/support/index?page=content&id=TECH77062 there is no addition of ALUA support in the change log.
Unfortunately, this is the last updated ASL/APM software package for Linux. Support or the backend teams can answer whether there is an upcoming plan to upgrade libvxibmsvc.so for ALUA support.
Also, whether there are any recently found known issues can be answered by support.
All the best,
G
12-06-2013 04:39 AM
Hi,
This is the answer from Symantec support:
"......As per the discussion with you because of these messages there will be no impact on the functionality of the product.
You may also refer :
http://www.symantec.com/docs/TECH170352
However will try to give the feedback internally so it get addressed in the newer releases."
Vincenzo
12-06-2013 04:46 AM
Hi,
As I mentioned in my first post on this thread, I was of the same opinion that these messages are ignorable (if there are no operational issues). I was expecting that support would say the same; however, it's good to have confirmation that it's an identified bug and will be fixed.
Thanks for the update.
G