
Disk Group Import Slow

tzvb23
Level 2

We're seeing disk group import times of 5+ minutes for disk groups that contain more than 25 LUNs.

I suspect the issue is related to the number of paths per LUN, which is 4, so VxVM is potentially scanning N x 4 paths?
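For reference, the per-LUN path count can be confirmed from DMP itself; the enclosure and device names below are just examples, not from our actual config:

# vxdmpadm getdmpnode enclosure=emc_vmax0
# vxdmpadm getsubpaths dmpnodename=emc_vmax0_0123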

Has anyone else seen this behavior, and more importantly, what was done to reduce the disk group import times?

Environment

  • SF HA Enterprise v6.1 and EMC PowerPath v5.5 SP2
  • Solaris 10 and Solaris 11.
  • EMC V-Max storage
  • Each disk group has ~25 LUNs, each LUN has 4 paths, and there are 3 disk groups per host.
  • Using VCS for HA and failover.
6 REPLIES

Gaurav_S
Moderator
   VIP    Certified

Hello,

I wouldn't expect an import time of 5 minutes regardless of the number of paths. Even if only one or two paths are visible, as long as the configuration disks are accessible, the disk group should import.

I believe the array is configured as A/A. EMC PowerPath 5.5 has been supported since SF 5.1, so I don't see a problem there either. Do check the firmware version in the HCL, though, to make sure it meets the minimum requirement:

http://www.symantec.com/docs/TECH47728

Also, paste output of

# vxdmpadm gettune all

I am specifically interested in the value of dmp_monitor_osevent.

With EMC PowerPath, Symantec recommends setting this parameter to off to avoid any issues.
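If it turns out to be on, it can be changed with settune (this tunable should take effect without a reboot, but verify for your version):

# vxdmpadm settune dmp_monitor_osevent=off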

 

G

tzvb23
Level 2

V-Max Enginuity Build Version: 5876.229.145

# vxdmpadm gettune all
            Tunable               Current Value  Default Value
------------------------------    -------------  -------------
dmp_cache_open                           on               on
dmp_daemon_count                         10               10
dmp_delayq_interval                      15               15
dmp_fast_recovery                        on               on
dmp_health_time                          60               60
dmp_log_level                             1                1
dmp_low_impact_probe                     on               on
dmp_lun_retry_timeout                     0                0
dmp_path_age                            300              300
dmp_pathswitch_blks_shift                 9                9
dmp_probe_idle_lun                       on               on
dmp_probe_threshold                       5                5
dmp_restore_cycles                       10               10
dmp_restore_interval                    300              300
dmp_restore_policy             check_disabled   check_disabled
dmp_restore_state                   enabled          enabled
dmp_retry_count                           5                5
dmp_scsi_timeout                         30               30
dmp_sfg_threshold                         1                1
dmp_stat_interval                         1                1
dmp_monitor_ownership                    on               on
dmp_native_multipathing                 off              off
dmp_monitor_fabric                       on               on
dmp_monitor_osevent                     off               on
dmp_native_support                      off              off

Marianne
Moderator
Partner    VIP    Accredited Certified

Have you excluded controllers connected to EMC arrays from DMP control?

Check if these controller(s) are listed in /etc/vx/vxdmp.exclude.

Double-check that they are excluded by running 'vxdmpadm listexclude'.
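If they are not excluded and you want to test that, vxdmpadm can do it directly; 'c2' below is an example controller name, so verify the exact syntax in vxdmpadm(1M) for your version:

# vxdmpadm exclude ctlr=c2
# vxdmpadm listexclude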

Also check /var/adm/messages for errors/issues during diskgroup import.

PS:
You forgot to tell us the SF/HA version (including patch level).

tzvb23
Level 2

We haven't explicitly excluded any controllers from DMP control, but I'll try that and see if it makes a difference in the DG import time. The version of SFHA I'm working with now is v6.1 (no patches).

But I've seen the same issue with SFHA v5.1 in other environments: whenever a large number of disks is presented to a host, the vxdisk scandisks command takes a long time to complete, and the DG import performance is slow as well.
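For what it's worth, the scan can be timed separately from the import to see where the minutes go (results will obviously vary per host):

# time vxdisk scandisks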

# vxdmpadm listexclude
Devices excluded from VxVM:
--------------------------

Paths : None

Controllers : None

VID:PID : None

--------------------------

Gaurav_S
Moderator
   VIP    Certified

I would recommend getting this analyzed thoroughly. I have seen environments with thousands of disks where deport/import operations are very smooth, without delays.

The best approach I can recommend is: once you import a disk group and it's stuck, try to capture thread lists:

(# echo "$<threadlist" | mdb -k > /var/tmp/tlist1 )

Take a couple of them while the import operation continues (make sure to change the output file name for each run) and have them analyzed by Symantec support.
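A simple way to rotate the file names is a small loop; the 30-second interval and file names here are just examples:

# for i in 1 2 3; do echo "$<threadlist" | mdb -k > /var/tmp/tlist$i; sleep 30; done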

Ideally, a crash dump is the best source to look into. While the import is running, if you can take a full memory dump (halt -d), it will cause an outage but will produce a crash dump, which gives Symantec support much better evidence to analyze. Make sure you have Solaris dump devices ready, though. If you can't go with a crash dump, use the thread lists as suggested above.
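To check the dump configuration in advance, the stock Solaris dumpadm utility shows the current dump device and savecore settings:

# dumpadm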

G

al_from_indiana
Level 3

If there is an "access LUN" presented to the host, make sure to remove it from the host's "host group" or equivalent, as DMP will attempt to scan it on boot-up and that will cause a delay.
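If EMC Solutions Enabler is installed on the host, syminq can help identify such devices; on a V-Max, gatekeepers show up as very small (a few MB) Symmetrix LUNs (output format varies by Solutions Enabler version):

# syminq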