Forum Discussion

ebotang
Level 3
9 years ago

cfsmount1 & cfsmount2 resources could not go offline

I have run into a problem with VCS.

Environment:

Hardware: 2 x T5220 servers + AX4-5.

Problem description:

When I run "init 6" or "hastop -all" on the cluster, the resources cfsmount1 and cfsmount2 do not go offline normally.

I checked the hardware state (EMC connectivity, disks, system, iostat -En) and the output of vxdisk, vxprint, vxdmpadm, fuser, mount -v, and so on.

I tried to unmount /var/opt/mediation/MMStorage manually, but it did not succeed, and the umount process appears to hang.

Please see the checklist in the attached files check_point.log, engine_A.log, and main.cf.

Could you give me some advice on how to fix the problem?

9 Replies

  • Hi,

    I could not find the attached files.

    If a cfsmount resource cannot go offline, you normally need to check whether some application is still accessing the file system, for example with fuser -uc /mountpoint.

    Before taking the cfsmount resource offline, make sure the application resources that use that file system are already offline.

    If you unmount a cfsmount file system manually, you also need to take the mount lock into account.
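
The mount-lock check mentioned above can be sketched in shell. This is a minimal sketch, not an official tool: the function name `mntlock_key` is hypothetical, it assumes `mount -p` prints the mount point in field 3 and the option list in field 7, and it assumes a POSIX awk (on Solaris 10 that may mean nawk or /usr/xpg4/bin/awk).

```shell
# Hypothetical helper: print the mntlock key (if any) for a mount point.
# Reads `mount -p` style lines on stdin: field 3 is the mount point,
# field 7 is the comma-separated option list.
mntlock_key() {
    awk -v mp="$1" '
        $3 == mp {
            n = split($7, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] ~ /^mntlock=/) {
                    sub(/^mntlock=/, "", opts[i])
                    print opts[i]
                }
        }'
}

# Usage on a live system (not run here):
#   mount -p | mntlock_key /var/opt/mediation/MMStorage
```

If a key is printed, the file system is mount-locked, and a manual umount generally needs the matching -o mntunlock=<key> option.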

  • Hi,

    Not sure where the attachments are; I did see them and then they were gone. From the logs, I see that the cfsmount resources could not go offline in time, but the later clean of the resources succeeded:

    2016/05/25 16:45:52 VCS WARNING V-16-2-13011 (HBCG14BER) Resource(cfsmount1): offline procedure did not complete within the expected time

    2016/05/25 16:45:52 VCS WARNING V-16-2-13011 (HBCG14BER) Resource(cfsmount2): offline procedure did not complete within the expected time.

     

    2016/05/25 16:45:53 VCS INFO V-16-2-13068 (HBCG14BER) Resource(cfsmount2) - clean completed successfully.
    2016/05/25 16:45:53 VCS INFO V-16-2-13068 (HBCG14BER) Resource(cfsmount1) - clean completed successfully.

     

    A few things I can suggest:

    1. Check that the app was offline and that it was no longer writing to the filesystem.

    2. As a test, you can try increasing the timeout of the cfsmount resource (hares -modify MonitorTimeout .. )

    3. As clean is able to succeed later, I strongly believe that something is holding the filesystem and is being cleared by the clean procedure. You may need to dig further from the filesystem perspective to understand what is holding or accessing the filesystem.

     

    G
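
To see which resources hit the offline timeout, the engine log can be filtered for the message ID quoted above. This is a minimal sketch: the function name `offline_timeouts` is hypothetical, and the log path in the usage comment is an assumption about where engine_A.log lives on the node.

```shell
# Hypothetical helper: list resources whose offline timed out
# (message ID V-16-2-13011 in the log lines quoted above),
# one name per line, without duplicates.
# Reads engine log lines on stdin.
offline_timeouts() {
    grep 'V-16-2-13011' |
        sed -n 's/.*Resource(\([^)]*\)).*/\1/p' |
        sort -u
}

# Usage on a live system (log path is an assumption, not run here):
#   offline_timeouts < /var/VRTSvcs/log/engine_A.log
```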

  • When the timeout occurred I tried fuser -c /mountpoint, but got no response.

    I also tried umount /mountpoint and umount -o mntunlock=vcs /mountpoint. Neither succeeded.

    :(((

  • What was the error that you got when you ran:

    umount -o mntunlock=vcs /mountpoint

    "mntunlock=vcs" --> Is it a typo? VCS should be in caps.

    If you used caps and it still could not unmount, then something is holding the filesystem, and that needs to be figured out.

     

  • Hi,

    As you said, if something is holding the filesystem, then the fuser command should show the process that is using it. But in practice, when I run fuser -c /mountpoint, nothing is shown on screen:

    # umount /var/opt/mediation/MMStorage
    ^C
    # fuser -c /var/opt/mediation/MMStorage
    /var/opt/mediation/MMStorage: ^C
    # fuser -c /var/opt/mediation/MMDB
    /var/opt/mediation/MMDB: ^C
    #

    (I pressed Ctrl+C each time because nothing was returned.)

    Do you have any other suggestions? Thank you!

  • If fuser hangs, that does not mean the FS is not being held. Can you give us the outputs below:

    uname -a

    pkginfo -l VRTSvcs

    pkginfo -l VRTSvxvm

    pkginfo -l VRTSvxfs

    modinfo | grep -i vx

     

    df -k

    mount -p

    fuser -cu /<mount point>

    fuser -fu /<mount point>

    fuser -ku /<mount point>

     

    If fuser hangs, don't kill the process. Follow the technote below and keep your evidence ready in case you need to open a support case to analyze a live core and the truss output of the hung processes.

    https://www.veritas.com/support/en_US/article.000020115

     

     

  • Before executing hastop -all:

    # uname -a
    SunOS HBCG13BER 5.10 Generic_150400-18 sun4v sparc SUNW,SPARC-Enterprise-T5220
    # pkginfo -l VRTSvcs
       PKGINST:  VRTSvcs
          NAME:  Veritas Cluster Server by Symantec
      CATEGORY:  system
          ARCH:  sparc
       VERSION:  5.1
       BASEDIR:  /
        VENDOR:  Symantec Corporation
          DESC:  Veritas Cluster Server by Symantec
        PSTAMP:  5.1.104.000-5.1SP1RP4-2013-08-08_16.00.00
      INSTDATE:  Jan 09 2016 12:44
        STATUS:  completely installed
         FILES:      284 installed pathnames
                      28 shared pathnames
                      61 directories
                     105 executables
                  237447 blocks used (approx)

    # pkginfo -l VRTSvxvm
       PKGINST:  VRTSvxvm
          NAME:  Binaries for VERITAS Volume Manager by Symantec
      CATEGORY:  system
          ARCH:  sparc
       VERSION:  5.1,REV=10.06.2009.22.05
       BASEDIR:  /
        VENDOR:  Symantec Corporation
          DESC:  Virtual Disk Subsystem
        PSTAMP:  5.1.104.000-5.1SP1RP4-2013-08-07-142629-19
      INSTDATE:  Jan 09 2016 12:35
       HOTLINE:  http://support.veritas.com/phonesup/phonesup_ddProduct_.htm
         EMAIL:  support@veritas.com
        STATUS:  completely installed
         FILES:      955 installed pathnames
                      44 shared pathnames
                     116 directories
                     428 executables
                  419176 blocks used (approx)

    # pkginfo -l VRTSvxfs
       PKGINST:  VRTSvxfs
          NAME:  VERITAS File System
      CATEGORY:  system,utilities
          ARCH:  sparc
       VERSION:  5.1,REV=7Oct2009
       BASEDIR:  /
        VENDOR:  VERITAS Software
          DESC:  Commercial File System
        PSTAMP:  5.1.104.000-5.1SP1RP4-2013-08-12-FS-142634-13
      INSTDATE:  Jan 09 2016 12:41
       HOTLINE:  (800) 342-0652
         EMAIL:  support@veritas.com
        STATUS:  completely installed
         FILES:      332 installed pathnames
                      35 shared pathnames
                       4 linked files
                      53 directories
                     108 executables
                  117239 blocks used (approx)

    # modinfo | grep -i vx
     34 7be00000  51628 361   1  vxdmp (VxVM 5.1SP1RP4 DMP Driver)
     35 7ba00000 221b08 362   1  vxio (VxVM 5.1SP1RP4 I/O driver)
     37 7be48df0   11a8 363   1  vxspec (VxVM 5.1SP1RP4 control/status d)
    244 7af153c0    d40 364   1  vxportal (VxFS 5.1SP1RP4 portal driver)
    245 7a200000 206470  21   1  vxfs (VxFS 5.1SP1RP4 SunOS 5.10)
    248 79e00000  6d760 368   1  vxfen (VRTS Fence 5.1SP1RP4)
    249 7af16000  24f90 369   1  vxglm (VxGLM 5.1_SP1RP2P1 SunOS 5.10)
    250 7a7c2000   48a8 370   1  vxgms (VxGMS 5.1_SP1 (Solaris 5.10))
    266 7a7e6000   baa8 365   1  fdd (VxQIO 5.1SP1RP4 Quick I/O drive)
    # df -k
    Filesystem            kbytes    used   avail capacity  Mounted on
    /dev/md/dsk/d0       108687644 18906254 88694514    18%    /
    /devices                   0       0       0     0%    /devices
    ctfs                       0       0       0     0%    /system/contract
    proc                       0       0       0     0%    /proc
    mnttab                     0       0       0     0%    /etc/mnttab
    swap                 56583952    1904 56582048     1%    /etc/svc/volatile
    objfs                      0       0       0     0%    /system/object
    sharefs                    0       0       0     0%    /etc/dfs/sharetab
    /platform/SUNW,SPARC-Enterprise-T5220/lib/libc_psr/libc_psr_hwcap2.so.1
                         108687644 18906254 88694514    18%    /platform/sun4v/lib/libc_psr.so.1
    /platform/SUNW,SPARC-Enterprise-T5220/lib/sparcv9/libc_psr/libc_psr_hwcap2.so.1
                         108687644 18906254 88694514    18%    /platform/sun4v/lib/sparcv9/libc_psr.so.1
    fd                         0       0       0     0%    /dev/fd
    swap                 56582312     264 56582048     1%    /tmp
    swap                 56582080      32 56582048     1%    /var/run
    swap                 56582048       0 56582048     0%    /dev/vx/dmp
    swap                 56582048       0 56582048     0%    /dev/vx/rdmp
    /dev/odm                   0       0       0     0%    /dev/odm
    /dev/vx/dsk/mmdbdg/vol01
                         209673216  329715 196260240     1%    /var/opt/mediation/MMDB
    /dev/vx/dsk/mmdatadg/vol01
                         13507384320 3861851 13235095333     1%    /var/opt/mediation/MMStorage
    #
    #
    # mount -p
    /dev/md/dsk/d0 - / ufs - no rw,intr,largefiles,logging,xattr,onerror=panic
    /devices - /devices devfs - no
    ctfs - /system/contract ctfs - no
    proc - /proc proc - no
    mnttab - /etc/mnttab mntfs - no
    swap - /etc/svc/volatile tmpfs - no xattr
    objfs - /system/object objfs - no
    sharefs - /etc/dfs/sharetab sharefs - no
    /platform/SUNW,SPARC-Enterprise-T5220/lib/libc_psr/libc_psr_hwcap2.so.1 - /platform/sun4v/lib/libc_psr.so.1 lofs - no
    /platform/SUNW,SPARC-Enterprise-T5220/lib/sparcv9/libc_psr/libc_psr_hwcap2.so.1 - /platform/sun4v/lib/sparcv9/libc_psr.so.1 lofs - no
    fd - /dev/fd fd - no rw
    swap - /tmp tmpfs - no xattr
    swap - /var/run tmpfs - no xattr
    swap - /dev/vx/dmp tmpfs - no xattr
    swap - /dev/vx/rdmp tmpfs - no xattr
    /dev/odm - /dev/odm odm - no smartsync
    /dev/vx/dsk/mmdbdg/vol01 - /var/opt/mediation/MMDB vxfs - no rw,suid,delaylog,largefiles,qio,cluster,ioerror=mdisable,crw,mntlock=VCS
    /dev/vx/dsk/mmdatadg/vol01 - /var/opt/mediation/MMStorage vxfs - no rw,suid,delaylog,largefiles,qio,cluster,ioerror=mdisable,crw,mntlock=VCS
    #
    #
    # fuser -cu /var/opt/mediation/MMStorage/
    /var/opt/mediation/MMStorage/:     6583om(root)    5985om(root)    5853o(root)    5729o(root)    5605o(root)
    # fuser -fu /var/opt/mediation/MMStorage/
    /var/opt/mediation/MMStorage/:
    # fuser -ku /var/opt/mediation/MMStorage/
    /var/opt/mediation/MMStorage/:

     

    After executing hastop -all:

    # fuser -cu /var/opt/mediation/MMStorage
    /var/opt/mediation/MMStorage: fuser: Invalid argument
    #

  • # fuser -cu /var/opt/mediation/MMStorage
    /var/opt/mediation/MMStorage: fuser: Invalid argument
    #

     

    A possible reason for the "Invalid argument" error is that the FS is already unmounted. Did you check with the df command whether the FS is already unmounted?
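
The df check suggested above could be scripted roughly as follows. This is a sketch under assumptions: the function name `is_mounted` is hypothetical, it relies on the mount point being the last field of a `df -k` line (which also holds for the wrapped lines shown earlier in the thread), and it needs a POSIX awk.

```shell
# Hypothetical helper: succeed (exit 0) if the mount point appears
# as the last field of any line of df -k output read from stdin.
is_mounted() {
    awk -v mp="$1" '$NF == mp { found = 1 } END { exit !found }'
}

# Usage on a live system (not run here):
#   if df -k | is_mounted /var/opt/mediation/MMStorage; then
#       echo "still mounted"
#   else
#       echo "not mounted"
#   fi
```

The match is exact, so the mount point should be passed without a trailing slash.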

  • I would say let's first try to isolate the layer where the issue occurs: is it an issue with the mount or the app accessing it, or with an HA component?

    Can you try bringing up CVM, mounting the filesystems using the cfsmount command, and starting the app? Then stop the application and see whether you can offline the mount using the cfsumount command. This will help us figure out whether the issue is happening at the HA layer or the FS layer. If the cfsumount succeeds, I would suggest repeating the procedure and timing the cfsumount command.

    Based on the time taken by cfsumount, you may need to adjust the MonitorTimeout of the cfsmount resources to ensure VCS is given sufficient time to stop the mount resource.


    G
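
Timing the cfsumount command, as suggested above, could be done with a small wrapper. This is only a sketch: the function name `time_cmd` is hypothetical, and it assumes a `date` that supports `+%s`, which is not guaranteed on every Solaris release.

```shell
# Hypothetical helper: run a command, then print how many whole
# seconds it took, and return the command's own exit status.
# Assumes `date +%s` (seconds since the epoch) is supported.
time_cmd() {
    tc_start=$(date +%s)
    "$@"
    tc_rc=$?
    tc_end=$(date +%s)
    echo "elapsed_seconds=$((tc_end - tc_start))"
    return "$tc_rc"
}

# Usage on a live system (not run here):
#   time_cmd cfsumount /var/opt/mediation/MMStorage
```

The elapsed time can then be compared against the timeout configured for the cfsmount resources.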