Fencing and Reservation Conflict
Hi to all, I have Red Hat Linux 5.9 64-bit with SFHA 5.1 SP1 RP4 with fencing enabled (our storage device is an IBM Storwize V3700 SFF, SCSI-3 compliant).

[root@mitoora1 ~]# vxfenadm -d

I/O Fencing Cluster Information:
================================
 Fencing Protocol Version: 201
 Fencing Mode: SCSI3
 Fencing SCSI3 Disk Policy: dmp
 Cluster Members:
   * 0 (mitoora1)
     1 (mitoora2)
 RFSM State Information:
   node 0 in state 8 (running)
   node 1 in state 8 (running)

In /etc/vxfenmode I have scsi3_disk_policy=dmp and vxfen_mode=scsi3:

vxdctl scsi3pr
scsi3pr: on

[root@mitoora1 etc]# more /etc/vxfentab
#
# /etc/vxfentab:
# DO NOT MODIFY this file as it is generated by the
# VXFEN rc script from the file /etc/vxfendg.
#
/dev/vx/rdmp/storwizev70000_000007
/dev/vx/rdmp/storwizev70000_000008
/dev/vx/rdmp/storwizev70000_000009

[root@mitoora1 etc]# vxdmpadm listctlr all
CTLR-NAME   ENCLR-TYPE      STATE     ENCLR-NAME
=====================================================
c0          Disk            ENABLED   disk
c10         StorwizeV7000   ENABLED   storwizev70000
c7          StorwizeV7000   ENABLED   storwizev70000
c8          StorwizeV7000   ENABLED   storwizev70000
c9          StorwizeV7000   ENABLED   storwizev70000

main.cf:

cluster drdbonesales (
        UserNames = { admin = hlmElgLimHmmKumGlj }
        ClusterAddress = "10.90.15.30"
        Administrators = { admin }
        UseFence = SCSI3
        )

I configured coordinator fencing, so I have 3 LUNs in a Veritas disk group (dmp coordinator). All seems to work fine, but I noticed a lot of reservation conflicts in the messages file on both nodes. I constantly see these messages in /var/log/messages:

Nov 26 15:14:09 mitoora2 kernel: sd 7:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:0:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 10:0:0:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 10:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 9:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 9:0:0:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 7:0:1:3: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:0:3: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:1:3: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 10:0:1:3: reservation conflict

Do you have any idea?
Best Regards,
Vincenzo
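For anyone reading along, one way to see whether the conflicts relate to the coordinator disks themselves is to dump the SCSI-3 registration keys. This is only a generic sketch, not output from the poster's cluster; it reuses the device path from the /etc/vxfentab listing above.

# Read the SCSI-3 registration keys from every coordinator disk listed in /etc/vxfentab
vxfenadm -s all -f /etc/vxfentab

# Keys on a single coordinator disk (path taken from the vxfentab listing above)
vxfenadm -s /dev/vx/rdmp/storwizev70000_000007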
VCS Error Codes for all platforms

Hello Gents,
Do we have a list of all error codes for VCS? Also, are the error codes generic and common to all platforms (including Linux, Solaris, Windows and AIX)? I need this confirmation urgently, as I am planning to design a common monitoring agent.
Best Regards,
Nimish
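Not an official list, but as a rough illustration of what such a common monitoring agent could key on: VCS messages carry a unique message identifier of the form V-16-<category>-<id>, and the identifiers already seen on a node can be pulled out of the engine log. The log path below is the default location and is an assumption about the setup.

# Collect the distinct VCS message identifiers logged on this node
grep -o 'V-16-[0-9]*-[0-9]*' /var/VRTSvcs/log/engine_A.log | sort -u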
Freeze SG or Freeze Node ?

Hi there,
I have a scenario where applications are under cluster control [2-node cluster]. I had an issue in the past where hastop -local was executed before the applications were brought down manually, which caused appsg to go into a FAULTED state. My question is: should I freeze the service groups individually, or should I freeze the nodes while performing OS patch maintenance? Does freezing a node serve the same purpose as freezing the SGs on that particular node? (See the sketch below for the two operations side by side.)
Regards,
Saurabh
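For comparison, the two operations being weighed up; a minimal sketch only, reusing the appsg group name from the post, with nodeA as a placeholder system name. Persistent freezes need the configuration to be writable first.

haconf -makerw

# Freeze just one service group: VCS will not online, offline or fail it over
hagrp -freeze appsg -persistent

# Freeze the whole node: every group on that system is left alone during patching
hasys -freeze -persistent nodeA

# After the maintenance window
hagrp -unfreeze appsg -persistent
hasys -unfreeze -persistent nodeA

haconf -dump -makero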
VOM missing disks

I'm having issues trying to determine the best place to look to find out why VOM is reporting different information from commands run at the command line. E.g.:

[root@svctesu02 ~]# vxdisk list -o alldgs
DEVICE        TYPE          DISK             GROUP           STATUS
fj_dxm0_000a  auto:cdsdisk  meth01-fs_dg01   meth01-fs_dg    online
fj_dxm0_000b  auto:cdsdisk  -                (meth02-fs_dg)  online
fj_dxm0_000c  auto:cdsdisk  -                (meth03-fs_dg)  online
fj_dxm0_0006  auto:cdsdisk  -                (meth01-db_dg)  online
fj_dxm0_0007  auto:cdsdisk  -                (meth02-db_dg)  online
fj_dxm0_0008  auto:cdsdisk  meth03-db_dg01   meth03-db_dg    online
fj_dxm0_0032  auto:cdsdisk  -                (vxfendg)       online
fj_dxm0_0033  auto:cdsdisk  -                (vxfendg)       online
fj_dxm0_0034  auto:cdsdisk  -                (vxfendg)       online
sda           auto:none     -                -               online invalid
sdb           auto:LVM      -                -               LVM
sdc           auto:LVM      -                -               LVM

Yet in VOM, fj_dxm0_000a is not shown, even though it is there on the system. VOM does show that its disk group is imported and healthy and its volume is online, but it has no details of the disk. How can this be? It's not causing any issues to the "solution", but it's worrying that VOM has no information about a disk that is visible at the OS level. Where should I start looking to investigate this further?

[root@svctesu02 ~]# vxprint -hrt -g meth01-fs_dg
DG NAME         NCONFIG      NLOG     MINORS   GROUP-ID
ST NAME         STATE        DM_CNT   SPARE_CNT         APPVOL_CNT
DM NAME         DEVICE       TYPE     PRIVLEN  PUBLEN   STATE
RV NAME         RLINK_CNT    KSTATE   STATE    PRIMARY  DATAVOLS  SRL
RL NAME         RVG          KSTATE   STATE    REM_HOST REM_DG    REM_RLNK
CO NAME         CACHEVOL     KSTATE   STATE
VT NAME         RVG          KSTATE   STATE    NVOLUME
V  NAME         RVG/VSET/CO  KSTATE   STATE    LENGTH   READPOL   PREFPLEX UTYPE
PL NAME         VOLUME       KSTATE   STATE    LENGTH   LAYOUT    NCOL/WID MODE
SD NAME         PLEX         DISK     DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
SV NAME         PLEX         VOLNAME  NVOLLAYR LENGTH   [COL/]OFF AM/NM    MODE
SC NAME         PLEX         CACHE    DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
DC NAME         PARENTVOL    LOGVOL
SP NAME         SNAPVOL      DCO
EX NAME         ASSOC        VC       PERMS    MODE     STATE
SR NAME         KSTATE

dg meth01-fs_dg default default 28000 1439298345.37.svctesu01.XXXXXX.local

dm meth01-fs_dg01 fj_dxm0_000a auto 65536 3867082448 -

v  meth01-fs         -            ENABLED  ACTIVE   3867080704 SELECT    -        fsgen
pl meth01-fs-01      meth01-fs    ENABLED  ACTIVE   3867080704 CONCAT    -        RW
sd meth01-fs_dg01-01 meth01-fs-01 meth01-fs_dg01 0  3867080704 0         fj_dxm0_000a ENA
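A first place to compare the two views (just a sketch, not taken from the post) is what VxVM itself reports for the disk VOM is not displaying, so it can be set against what VOM discovery shows:

# Everything VxVM knows about the disk that VOM does not show
vxdisk list fj_dxm0_000a

# The DMP paths behind the same device
vxdmpadm getsubpaths dmpnodename=fj_dxm0_000a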
Space is almost full on host that stores the performance statistics

What does the VOM server fault condition "Space is almost full on host that stores the performance statistics" mean? I am seeing this on the server fault tab in VOM for a server that is NOT the VOM server, and if I drill down to the host that shows the issue there are no filesystems with alerts on them, and the VOM server has its DB on a filesystem that is only 15% used.
Mike
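Since the fault points at the host that stores the statistics rather than at the VOM server, one hedged check on the reported host is the space used by the managed-host agent's own data area. The /var/opt/VRTSsfmh path below is an assumption based on the default VRTSsfmh managed-host install location and may differ on other systems.

# Filesystem holding the managed-host (VRTSsfmh) data area -- path is an assumption
df -h /var/opt/VRTSsfmh

# Space consumed by the agent's stored data, including any performance statistics
du -sh /var/opt/VRTSsfmh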
Why does 1 subpath in multipathing use slice 2 and not the other??

Wondering why disk 2 shows slice 2 whereas the other subpaths show only the disk. Does this mean that this disk has been formatted/labeled differently? They should all be labeled EFI.

[2250]$ vxdisk path
SUBPATH                   DANAME        DMNAME  GROUP  STATE
c0t0d0s2                  disk_0        -       -      ENABLED
c3t500601683EA04599d0     storageunit1  -       -      ENABLED
c3t500601613EA04599d0     storageunit1  -       -      ENABLED
c2t500601603EA04599d0     storageunit1  -       -      ENABLED
c2t500601693EA04599d0     storageunit1  -       -      ENABLED
c3t500601613EA04599d2s2   storageunit2  -       -      ENABLED
c3t500601683EA04599d2s2   storageunit2  -       -      ENABLED
c2t500601693EA04599d2s2   storageunit2  -       -      ENABLED
c2t500601603EA04599d2s2   storageunit2  -       -      ENABLED
c3t500601613EA04599d1     storageunit3  -       -      ENABLED
c3t500601683EA04599d1     storageunit3  -       -      ENABLED
c2t500601603EA04599d1     storageunit3  -       -      ENABLED
c2t500601693EA04599d1     storageunit3  -       -      ENABLED

When I attempt to initialise the LUN I get this error:

[2318]$ vxdisksetup -i storageunit2
vxedvtoc: No such device or address

I can see from this output that paths are disabled:

[2303]$ vxdmpadm -v getdmpnode all
NAME          STATE    ENCLR-TYPE    PATHS ENBL DSBL ENCLR-NAME     SERIAL-NO                        ARRAY_VOL_ID
========================================================================================================
disk_0        ENABLED  Disk          1     1    0    disk           600508B1001C0CB883CB65D7A794AC54 -
storageunit1  ENABLED  EMC_CLARiiON  4     4    0    emc_clariion0  60060160D2202F004C1C9AB085F2E111 1
storageunit2  ENABLED  EMC_CLARiiON  4     2    2    emc_clariion0  60060160D2202F00D6FACBE385F2E111 5
storageunit3  ENABLED  EMC_CLARiiON  4     4    0    emc_clariion0  60060160D2202F00EE397F2986F2E111 9
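To confirm whether that one LUN really carries a different (SMI/VTOC) label from the others, the label can be read straight off one of its paths. Only a sketch, reusing path and device names from the output above:

# Partition table / label on the suspect path (the one appearing with an s2 suffix)
prtvtoc /dev/rdsk/c3t500601613EA04599d2s2

# How VxVM sees the device and which of its paths are enabled or disabled
vxdisk list storageunit2
vxdmpadm getsubpaths dmpnodename=storageunit2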
Disk names do not match within a two-node cluster

Hi,
I have a two-node cluster, and today I ran a Virtual Fire Drill to see if everything is OK. I noticed I'm getting errors such as:

* udid.vfd: Failed to get disk information for disk san_vc0_13.
* udid.vfd: UDIDs for device san_vc_8 do not match on cluster nodes.
[...]

Anyway, I connected to both servers; san_vc0_13 doesn't exist on the second node. As for the other errors, the disk names do not match. Reviewing this case, I've seen that the first node has two additional temporary disks that have nothing to do with the cluster (this situation is OK). The problem is that these two disks took the numbers 2 and 3 (san_vc0_2/3), so now the disk names are mismatched between the nodes and it seems I'm missing a disk on the second node. The cluster disk groups are visible from both nodes and they are OK (even within the cluster); the only problem is that the names do not match and the Virtual Fire Drill is warning about it. Is there any way to change the disks' names and make them permanent (by UDID or so)?
Thanks,
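A sketch of the kind of checks usually involved when device names and UDIDs drift between nodes (run on both nodes; the disk name is a placeholder, and whether this clears the fire-drill warning depends on what is actually mismatched):

# Naming scheme currently in use and whether device names persist across reboots
vxddladm get namingscheme

# UDID recorded for a given device (disk name is a placeholder)
vxdisk list san_vc0_2 | grep -i udid

# Make enclosure-based names persistent so both nodes keep stable device names
vxddladm set namingscheme=ebn persistence=yes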
Problem with VRTS Oracle alarm

Hi all,
Recently I came up against an Oracle issue; I'm getting the following error message:

VCS ERROR V-16-2-13027 (mdsu1a) Resource(mdsuOracleLog_lv) - monitor procedure did not complete within the expected time.

My question is: why can this error appear? It is also causing a failover of the servers, and the database changes to FAULTED state. The DBA's solution was to adjust the interval/timeout values, and that might help, but I want to analyze this problem and understand why it is happening on my system. I have attached logs and useful information, if someone can help me with this.
Thanks,
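For what it is worth, V-16-2-13027 means the resource's monitor entry point ran past its timeout rather than returning a failure. A sketch of how the timing attributes can be inspected and, if needed, raised for just this resource; MonitorTimeout is a static type attribute, so a per-resource change needs an override first, and 120 seconds is only an example value.

# Which agent type the failing resource belongs to
hares -value mdsuOracleLog_lv Type

# Monitor timing in force (look at the type printed above)
hatype -display -attribute MonitorInterval MonitorTimeout FaultOnMonitorTimeouts

# Raise the timeout for this resource only
haconf -makerw
hares -override mdsuOracleLog_lv MonitorTimeout
hares -modify mdsuOracleLog_lv MonitorTimeout 120
haconf -dump -makero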