
The behavior of VxDMP when one of its multi-path down

When we tested VxDMP 5/6 with the default MinimumQ I/O policy for both VxVM and LVM, via the same HDS G1K array and the vxbench program, on AIX 5/6, a strange phenomenon appeared!
That is, once one path of a VxVM multipath device went down, I/O on all the other good paths would hang for a short time (about 15 s), whereas VxDMP multipathing native LVM would continue I/O on all the other good paths without any disruption!

So how can this phenomenon be explained?

14 Replies

Re: The behavior of VxDMP when one of its multi-path down

@liuyl

If you believe there is a problem with VxDMP, then please log a call with Veritas Support.

They will need to analyze your config and logs to see why this happens.

Your experience is certainly not normal. 


Re: The behavior of VxDMP when one of its multi-path down

There are many parameters and tunables for VxVM, DMP, and the SCSI controllers.

https://sort.veritas.com/public/documents/dmp/5.1sp1/aix/productguides/html/dmp_admin/ch08s02.htm

https://sort.veritas.com/public/documents/sfha/6.0/linux/productguides/html/sf_admin/ch32s02s02s01.h...

If you are not very familiar with tuning VxVM/DMP and troubleshooting this kind of issue, collect the data below and log a support case with Veritas:

1. a detailed description of your tests (including the names of the volumes and file systems involved)

2. VRTSexplorer (Data Collector) report

3. Firstlook report (must be collected during the tests: https://www.veritas.com/support/en_US/article.000038518)

Likely an escalation of the case is required.

Cheers,

Frank

 


Re: The behavior of VxDMP when one of its multi-path down

Unfortunately, this problem has dragged on for almost a year and is still not resolved!
My local regional technical support is so weak that in the end they could only acknowledge the behavior, without giving any reasonable explanation!

Note: if you prefer, I can give you the case ID!


Re: The behavior of VxDMP when one of its multi-path down

Do this without delay.

Call the Veritas support hotline during Australian business hours and request that the case be escalated to a backline engineer immediately.

Before calling, make sure that you have collected the data I requested in my previous post.

Of course, you also need to check that your hardware is supported: https://sort.veritas.com/hcl

You can also send me the case ID


Re: The behavior of VxDMP when one of its multi-path down

Thank you very much!!
If you like, I can forward all the e-mails for case ID 180718-000720 to you!

Note: you can also see all parts of the case ID 180718-000720 correspondence under the public "Enterprise Support" account!

 


Re: The behavior of VxDMP when one of its multi-path down

180718-000720 is not a valid case ID. In the past, a valid case ID was 9 digits; this later changed to 8 digits.

If you like, you can send the email correspondence as an attachment so that I can review it.


Re: The behavior of VxDMP when one of its multi-path down

@liuyl

Please log a call with Veritas Support.

Why wait so long when you realize that your 'local region technical support' is useless?


Re: The behavior of VxDMP when one of its multi-path down

Hi, please paste the output of the commands below here. That would be helpful:

1. vxprint

2. vxdmpadm listenclosure all

3. the version of the VxVM/VxDMP packages

For the LVM volume as well, please provide the volume layout if possible. For LVM, I am assuming you are using VxDMP for multipathing (the DMP native support feature).
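As a sketch, the outputs above could be gathered into a single file to attach to the support case. The command names come from this thread; the output file name is illustrative, and each command is guarded so the script also degrades gracefully on a host where VxVM is not installed:

```shell
#!/bin/sh
# Collect the requested diagnostics into one file for the support case.
out=/tmp/vxdmp_diag.txt
: > "$out"
for cmd in "vxprint" "vxdmpadm listenclosure all" "lslpp -l"; do
    set -- $cmd                       # split "name args..." into words
    if command -v "$1" >/dev/null 2>&1; then
        { echo "### $cmd"; $cmd; } >> "$out" 2>&1
    else
        echo "### $cmd (not available on this host)" >> "$out"
    fi
done
echo "Collected $(wc -l < "$out") lines into $out"
```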

Thanks,

Sumit


Re: The behavior of VxDMP when one of its multi-path down

root@jcsggissrv2:/# lslpp -l|grep -Ei "vxvm|vxfs"
Files for VxFS by Symantec
VRTSvxfs 6.0.500.0 APPLIED Veritas File System by
VRTSvxvm 6.0.500.0 APPLIED Veritas Volume Manager by
VRTSvxfs 6.0.500.0 APPLIED Veritas File System by
VRTSvxvm 6.0.500.0 APPLIED Veritas Volume Manager by
root@jcsggissrv2:/# vxdmpadm listenclosure all
ENCLR_NAME ENCLR_TYPE ENCLR_SNO STATUS ARRAY_TYPE LUN_COUNT
=======================================================================================
disk Disk DISKS CONNECTED Disk 2
tagmastore-usp0 TagmaStore-USP 50461 CONNECTED A/A 3
root@jcsggissrv2:/# vxdmpadm getattr enclosure tagmastore-usp0 iopolicy
ENCLR_NAME DEFAULT CURRENT
============================================
tagmastore-usp0 MinimumQ MinimumQ
root@jcsggissrv2:/# vxdisk -eoalldgs list
DEVICE TYPE DISK GROUP STATUS OS_NATIVE_NAME ATTR
disk_0 auto:LVM - - LVM hdisk1 -
disk_1 auto:LVM - - LVM hdisk0 -
tagmastore-usp0_3f2b auto:cdsdisk testdg02_01 testdg02 online hdisk10 lun fc RAID_1
tagmastore-usp0_3f2c auto:cdsdisk testdg02_02 testdg02 online hdisk8 lun fc RAID_1
tagmastore-usp0_3f04 auto:LVM - - LVM hdisk9 lun fc RAID_1
root@jcsggissrv2:/# vxprint -g testdg02 -pvsd
TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
dm testdg02_01 hdisk3 - 41875568 - - - -
dm testdg02_02 hdisk4 - 41875568 - - - -
sd testdg02_01_sd01 test02vol-01 ENABLED 41870000 0 - - -
pl test02vol-01 test02vol ENABLED 41870000 - ACTIVE - -
v test02vol fsgen ENABLED 41870000 - ACTIVE - -
root@jcsggissrv2:/# vxdmpadm gettune dmp_native_support
Tunable Current Value Default Value
------------------------------ ------------- -------------
dmp_native_support on off
root@jcsggissrv2:/# vxdisk list tagmastore-usp0_3f04|tail -6
udid: HITACHI%5FOPEN-V%5F0C51D%5F3F04
site: -
Multipathing information:
numpaths: 2
hdisk9 state=enabled
hdisk2 state=enabled
root@jcsggissrv2:/# lsvg -l testvg
testvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
testlv jfs2 400 400 1 closed/syncd /testjfs2
root@jcsggissrv2:/# nohup /opt/VRTSspt/FS/VxBench/vxbench -w write -i maxfilesize=2000m,nthreads=10,iosize=512,nrep=280000 /dev/rtestlv &
[1] 3736202
root@jcsggissrv2:/# Sending nohup output to nohup.out.

root@jcsggissrv2:/# iostat -a 1 66|grep -i fcs
fcs0 792167.9 3094.4 0 787712
fcs1 792167.9 3094.4 0 787712
fcs0 786615.3 3072.7 0 791040
fcs1 786360.7 3071.7 0 790784
fcs0 788282.0 3079.2 0 789760
fcs1 788793.0 3081.2 0 790272
fcs0 786106.2 3070.7 0 790528
fcs1 785851.6 3069.7 0 790272
fcs0 786708.3 3073.1 0 787200
fcs1 786964.1 3074.1 0 787456
fcs0 790969.4 3089.7 0 788992
fcs1 790969.4 3089.7 0 788992
fcs0 1535.0 6.0 0 1536
fcs1 790034.2 3086.1 0 790528
fcs0 0.0 0.0 0 0
fcs1 793600.0 3100.0 0 793600
fcs0 0.0 0.0 0 0
fcs1 793055.3 3097.9 0 792064
fcs0 0.0 0.0 0 0
fcs1 793856.0 3101.0 0 793856
fcs0 0.0 0.0 0 0
fcs1 794352.5 3102.9 0 793856
fcs0 0.0 0.0 0 0
fcs1 785996.0 3070.3 0 793856
fcs0 0.0 0.0 0 0
fcs1 793856.0 3101.0 0 793856
fcs0 0.0 0.0 0 0
fcs1 793360.1 3099.1 0 793856
fcs0 0.0 0.0 0 0
fcs1 793856.0 3101.0 0 793856
fcs0 0.0 0.0 0 0
fcs1 792353.6 3095.1 0 793344
fcs0 0.0 0.0 0 0
fcs1 792114.8 3094.2 0 793600
fcs0 0.0 0.0 0 0
fcs1 792864.9 3097.1 0 793856
fcs0 0.0 0.0 0 0
fcs1 790652.9 3088.5 0 794112
fcs0 0.0 0.0 0 0
fcs1 793856.0 3101.0 0 793856
fcs0 0.0 0.0 0 0
fcs1 795347.3 3106.8 0 793856
fcs0 0.0 0.0 0 0
fcs1 797059.5 3113.5 0 792576
fcs0 0.0 0.0 0 0
fcs1 790890.2 3089.4 0 793856
fcs0 94483.8 369.1 0 94720
fcs1 789067.3 3082.3 0 791040
fcs0 787064.5 3074.5 0 790016
fcs1 787319.6 3075.5 0 790272
fcs0 786708.3 3073.1 0 787200
fcs1 786708.3 3073.1 0 787200
fcs0 785982.3 3070.2 0 787456
fcs1 785982.3 3070.2 0 787456
fcs0 789778.4 3085.1 0 790272
fcs1 789522.5 3084.1 0 790016
fcs0 785450.2 3068.2 0 786432
fcs1 785705.9 3069.2 0 786688
fcs0 787024.4 3074.3 0 788992
fcs1 787535.2 3076.3 0 789504
fcs0 786340.3 3071.6 0 790272
fcs1 786595.0 3072.6 0 790528
[1] + Done nohup /opt/VRTSspt/FS/VxBench/vxbench -w write -i maxfilesize=2000m,nthreads=10,iosize=512,nrep=280000 /dev/rtestlv &
root@jcsggissrv2:/# nohup /opt/VRTSspt/FS/VxBench/vxbench -w write -i maxfilesize=2000m,nthreads=10,iosize=512,nrep=280000 \
> /dev/vx/dsk/testdg02/test02vol &
[1] 4719214
root@jcsggissrv2:/# Sending nohup output to nohup.out.

root@jcsggissrv2:/# iostat -a 1 66|grep -i fcs
fcs0 66861.0 12773.7 0 73756
fcs1 66498.4 12769.2 0 73356
fcs0 72667.7 14054.6 0 72804
fcs1 72743.6 14081.6 0 72880
fcs0 73169.8 14150.8 0 72804
fcs1 72739.7 14158.8 0 72376
fcs0 73338.2 14101.2 0 73384
fcs1 73434.1 14081.2 0 73480
fcs0 73442.5 14017.2 0 73672
fcs1 73410.6 14024.2 0 73640
fcs0 59085.2 11385.7 0 59196
fcs1 59460.5 11433.6 0 59572
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
fcs0 0.0 0.0 0 0
fcs1 41709.5 4400.0 0 47236
fcs0 0.0 0.0 0 0
fcs1 124472.0 12910.0 0 124472
fcs0 0.0 0.0 0 0
fcs1 121464.0 13007.0 0 121464
fcs0 0.0 0.0 0 0
fcs1 122604.0 12989.0 0 122604
fcs0 0.0 0.0 0 0
fcs1 119968.8 13152.4 0 119144
fcs0 0.0 0.0 0 0
fcs1 118618.9 12804.9 0 120324
fcs0 0.0 0.0 0 0
fcs1 122171.6 12987.9 0 122248
fcs0 0.0 0.0 0 0
fcs1 118014.4 13015.7 0 118752
fcs0 0.0 0.0 0 0
fcs1 124028.6 12996.4 0 123796
fcs0 0.0 0.0 0 0
fcs1 122847.0 13096.7 0 122156
fcs0 0.0 0.0 0 0
fcs1 123069.9 12869.5 0 123916
fcs0 0.0 0.0 0 0
fcs1 122083.4 13002.7 0 122236
fcs0 0.0 0.0 0 0
fcs1 119637.8 12990.0 0 120236
fcs0 0.0 0.0 0 0
fcs1 121196.0 13045.0 0 121196
fcs0 0.0 0.0 0 0
fcs1 120080.9 13033.9 0 120156
fcs0 0.0 0.0 0 0
fcs1 121027.4 13291.4 0 119212
fcs0 0.0 0.0 0 0
fcs1 121672.2 13053.6 0 121368
fcs0 0.0 0.0 0 0
fcs1 121876.0 12989.0 0 121876
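As an aside, the stall window is easy to quantify from captures like the two above. A minimal sketch (the adapter-name pattern matches the fcs0/fcs1 adapters in these runs; the embedded sample lines are illustrative) that counts how many one-second intervals each FC adapter spent at zero throughput:

```shell
# Count zero-throughput intervals per FC adapter in "iostat -a 1" output.
# Field 2 is the KB/s column; "0.0" marks a stalled interval.
awk '$1 ~ /^fcs/ && $2 == "0.0" { stall[$1]++ }
     END { for (a in stall) printf "%s: %d stalled intervals\n", a, stall[a] }' <<'EOF'
fcs0 0.0 0.0 0 0
fcs1 793600.0 3100.0 0 793600
fcs0 0.0 0.0 0 0
fcs1 0.0 0.0 0 0
EOF
```

Fed the VxVM capture above, both adapters show a long run of zero-throughput intervals, consistent with the roughly 15 s hang described in the original post, while in the LVM capture only the failed adapter goes to zero.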

 


Re: The behavior of VxDMP when one of its multi-path down

Hi,

Based on the vxbench command lines provided, a small clarification is needed:

 

LVM - vxbench was run on the raw device interface, /dev/rtestlv.

VxVM - here you are using the block device interface (/dev/vx/dsk/) and not the raw one (/dev/vx/rdsk/).

Is there a reason for this? Performing I/O through the raw and block interfaces behaves slightly differently, hence the question.

If possible, please repeat the test on the raw VxVM device (/dev/vx/rdsk/testdg02/test02vol) and check the result again.
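For reference, the raw device path differs from the block one only in the directory name, so it can be derived mechanically (a sketch; the sed expression simply swaps /dsk/ for /rdsk/, and the disk group and volume names are the ones used in this thread):

```shell
# Derive the VxVM raw (character) device path from the block device path.
blockdev=/dev/vx/dsk/testdg02/test02vol
rawdev=$(echo "$blockdev" | sed 's|/dsk/|/rdsk/|')
echo "$rawdev"    # /dev/vx/rdsk/testdg02/test02vol
```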

 

Thanks,

Sumit


Re: The behavior of VxDMP when one of its multi-path down

In fact, I have already tested many times with both block and raw devices, with the same result every time!
Note: as I said before, my local regional technical support confirmed the same phenomenon in their own lab tests!

 

 

 


Re: The behavior of VxDMP when one of its multi-path down

Recently I tested this problem with VxVM again, and found a certain "regularity"!
That is, the I/O disruption with raw devices is indeed much rarer than with block devices!

Note: the I/O disruption occurs only about 20% of the time with raw devices, versus 100% of the time with block devices!


Re: The behavior of VxDMP when one of its multi-path down

If you think this is a defect, you need to have your support case escalated to backline/engineering so the issue can be investigated by VxVM/DMP engineering. Engineering may ask you to upgrade your Veritas software to the latest version/patch level, as a fix may already be included in the latest patch.

 

Anyway, have your case escalated to backline/engineering ASAP.

 


Re: The behavior of VxDMP when one of its multi-path down

Now I have opened a new case, ID 190226-000193, and it has been put in the Global Technical Support queue!
So I hope you receive the corresponding e-mail, sent by "Enterprise Technical Support" <enterprise_technical_support_veritas@custhelp.com>, as soon as possible!