MDSU1b:
=======
[root@mdsu1b log]# tail -20 /var/VRTSvcs/log/engine_A.log
2010/07/13 21:02:06 VCS ERROR V-16-2-13027 (mdsu1a) Resource(mdsuOracleLog_lv) - monitor procedure did not complete within the expected time.
2010/07/13 21:02:06 VCS INFO V-16-2-13026 (mdsu1a) Resource(mdsuData_lv) - monitor procedure finished successfully after failing to complete within the expected time for (4) consecutive times.
2010/07/13 21:04:33 VCS INFO V-16-1-10306 Resource mdsuData_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
2010/07/13 21:04:33 VCS INFO V-16-1-10306 Resource mdsuDbData_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
2010/07/13 21:04:33 VCS INFO V-16-1-10306 Resource mdsuOracleLog_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
2010/07/17 13:33:47 VCS INFO V-16-1-10306 Resource listener (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
2010/07/29 05:31:48 VCS ERROR V-16-2-13027 (mdsu1a) Resource(Notifier) - monitor procedure did not complete within the expected time.
2010/07/29 05:34:08 VCS INFO V-16-1-10306 Resource Notifier (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
2010/07/29 06:30:05 VCS WARNING V-16-1-17049 Notifier:Message 81 is being deleted since message cannot be delivered for more than a hour
2010/08/10 04:58:06 VCS ERROR V-16-2-13027 (mdsu1a) Resource(EdnProxy) - monitor procedure did not complete within the expected time.
2010/08/10 04:58:07 VCS INFO V-16-2-13026 (mdsu1a) Resource(EdnProxy) - monitor procedure finished successfully after failing to complete within the expected time for (3) consecutive times.
2010/09/11 10:27:59 VCS ERROR V-16-2-13027 (mdsu1a) Resource(mdsuOracleLog_lv) - monitor procedure did not complete within the expected time.
2010/09/11 10:27:59 VCS ERROR V-16-2-13027 (mdsu1a) Resource(mdsuDbData_lv) - monitor procedure did not complete within the expected time.
2010/09/11 10:27:59 VCS ERROR V-16-2-13027 (mdsu1a) Resource(mdsuData_lv) - monitor procedure did not complete within the expected time.
2010/09/11 10:27:59 VCS INFO V-16-2-13026 (mdsu1a) Resource(mdsuOracleLog_lv) - monitor procedure finished successfully after failing to complete within the expected time for (2) consecutive times.
2010/09/11 10:27:59 VCS INFO V-16-2-13026 (mdsu1a) Resource(mdsuDbData_lv) - monitor procedure finished successfully after failing to complete within the expected time for (5) consecutive times.
2010/09/11 10:27:59 VCS INFO V-16-2-13026 (mdsu1a) Resource(mdsuData_lv) - monitor procedure finished successfully after failing to complete within the expected time for (3) consecutive times.
2010/09/11 10:30:25 VCS INFO V-16-1-10306 Resource mdsuData_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
2010/09/11 10:30:25 VCS INFO V-16-1-10306 Resource mdsuDbData_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
2010/09/11 10:30:25 VCS INFO V-16-1-10306 Resource mdsuOracleLog_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
[root@mdsu1b log]#
[root@mdsu1b ~]# tail -5 /var/VRTSvcs/log/engine_A.log
2010/09/11 10:27:59 VCS INFO V-16-2-13026 (mdsu1a) Resource(mdsuData_lv) - monitor procedure finished successfully after failing to complete within the expected time for (3) consecutive times.
2010/09/11 10:30:25 VCS INFO V-16-1-10306 Resource mdsuData_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
2010/09/11 10:30:25 VCS INFO V-16-1-10306 Resource mdsuDbData_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
2010/09/11 10:30:25 VCS INFO V-16-1-10306 Resource mdsuOracleLog_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
2010/09/12 20:30:27 VCS INFO V-16-1-10307 Resource Mount_data_backup (Owner: unknown, Group: APPDB) is offline on mdsu1a (Not initiated by VCS)
[root@mdsu1b ~]# hares -state Mount_data_backup
#Resource          Attribute  System  Value
Mount_data_backup  State      mdsu1a  OFFLINE
Mount_data_backup  State      mdsu1b  ONLINE

MDSU1a:
=======
[root@mdsu1a ~]# tail -20 /var/VRTSvcs/log/engine_A.log
2010/08/10 04:54:05 VCS WARNING V-16-1-10023 Agent Proxy not sending alive messages since Tue 10 Aug 2010 04:51:39 AM MVT
2010/08/10 04:54:05 VCS NOTICE V-16-1-53026 Agent Proxy ipm connection still valid
2010/08/10 04:54:05 VCS NOTICE V-16-1-53030 Termination request sent to Proxy agent process with pid 11668
2010/08/10 04:54:05 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/Proxy/ProxyAgent for resource type Proxy successfully started at Tue 10 Aug 2010 04:54:05 AM MVT
2010/09/11 10:20:52 VCS ERROR V-16-2-13027 (mdsu1a) Resource(mdsuOracleLog_lv) - monitor procedure did not complete within the expected time.
2010/09/11 10:20:52 VCS ERROR V-16-2-13027 (mdsu1a) Resource(mdsuDbData_lv) - monitor procedure did not complete within the expected time.
2010/09/11 10:20:52 VCS ERROR V-16-2-13027 (mdsu1a) Resource(mdsuData_lv) - monitor procedure did not complete within the expected time.
2010/09/11 10:20:52 VCS INFO V-16-2-13026 (mdsu1a) Resource(mdsuOracleLog_lv) - monitor procedure finished successfully after failing to complete within the expected time for (2) consecutive times.
2010/09/11 10:20:52 VCS INFO V-16-2-13026 (mdsu1a) Resource(mdsuDbData_lv) - monitor procedure finished successfully after failing to complete within the expected time for (5) consecutive times.
2010/09/11 10:20:52 VCS INFO V-16-2-13026 (mdsu1a) Resource(mdsuData_lv) - monitor procedure finished successfully after failing to complete within the expected time for (3) consecutive times.
2010/09/11 10:23:07 VCS WARNING V-16-1-10023 Agent LVMLogicalVolume not sending alive messages since Sat 11 Sep 2010 10:20:52 AM MVT
2010/09/11 10:23:07 VCS NOTICE V-16-1-53026 Agent LVMLogicalVolume ipm connection still valid
2010/09/11 10:23:07 VCS NOTICE V-16-1-53027 Waiting one more try for ipm connection for agent LVMLogicalVolume to go away
2010/09/11 10:23:18 VCS WARNING V-16-1-10023 Agent LVMLogicalVolume not sending alive messages since Sat 11 Sep 2010 10:20:52 AM MVT
2010/09/11 10:23:18 VCS NOTICE V-16-1-53026 Agent LVMLogicalVolume ipm connection still valid
2010/09/11 10:23:18 VCS NOTICE V-16-1-53030 Termination request sent to LVMLogicalVolume agent process with pid 18032
2010/09/11 10:23:18 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/LVMLogicalVolume/LVMLogicalVolumeAgent for resource type LVMLogicalVolume successfully started at Sat 11 Sep 2010 10:23:18 AM MVT
2010/09/11 10:23:18 VCS INFO V-16-1-10306 Resource mdsuData_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
2010/09/11 10:23:18 VCS INFO V-16-1-10306 Resource mdsuDbData_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
2010/09/11 10:23:18 VCS INFO V-16-1-10306 Resource mdsuOracleLog_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
[root@mdsu1a ~]#
[root@mdsu1a ~]#
[root@mdsu1a ~]# grep "2010/09/11" /var/VRTSvcs/log/engine_A.log
2010/09/11 10:20:52 VCS ERROR V-16-2-13027 (mdsu1a) Resource(mdsuOracleLog_lv) - monitor procedure did not complete within the expected time.
2010/09/11 10:20:52 VCS ERROR V-16-2-13027 (mdsu1a) Resource(mdsuDbData_lv) - monitor procedure did not complete within the expected time.
2010/09/11 10:20:52 VCS ERROR V-16-2-13027 (mdsu1a) Resource(mdsuData_lv) - monitor procedure did not complete within the expected time.
2010/09/11 10:20:52 VCS INFO V-16-2-13026 (mdsu1a) Resource(mdsuOracleLog_lv) - monitor procedure finished successfully after failing to complete within the expected time for (2) consecutive times.
2010/09/11 10:20:52 VCS INFO V-16-2-13026 (mdsu1a) Resource(mdsuDbData_lv) - monitor procedure finished successfully after failing to complete within the expected time for (5) consecutive times.
2010/09/11 10:20:52 VCS INFO V-16-2-13026 (mdsu1a) Resource(mdsuData_lv) - monitor procedure finished successfully after failing to complete within the expected time for (3) consecutive times.
2010/09/11 10:23:07 VCS WARNING V-16-1-10023 Agent LVMLogicalVolume not sending alive messages since Sat 11 Sep 2010 10:20:52 AM MVT
2010/09/11 10:23:07 VCS NOTICE V-16-1-53026 Agent LVMLogicalVolume ipm connection still valid
2010/09/11 10:23:07 VCS NOTICE V-16-1-53027 Waiting one more try for ipm connection for agent LVMLogicalVolume to go away
2010/09/11 10:23:18 VCS WARNING V-16-1-10023 Agent LVMLogicalVolume not sending alive messages since Sat 11 Sep 2010 10:20:52 AM MVT
2010/09/11 10:23:18 VCS NOTICE V-16-1-53026 Agent LVMLogicalVolume ipm connection still valid
2010/09/11 10:23:18 VCS NOTICE V-16-1-53030 Termination request sent to LVMLogicalVolume agent process with pid 18032
2010/09/11 10:23:18 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/LVMLogicalVolume/LVMLogicalVolumeAgent for resource type LVMLogicalVolume successfully started at Sat 11 Sep 2010 10:23:18 AM MVT
2010/09/11 10:23:18 VCS INFO V-16-1-10306 Resource mdsuData_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
2010/09/11 10:23:18 VCS INFO V-16-1-10306 Resource mdsuDbData_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
2010/09/11 10:23:18 VCS INFO V-16-1-10306 Resource mdsuOracleLog_lv (Owner: unknown, Group: APPDB) is offline on mdsu1a (Previous State = OFFLINE)
[root@mdsu1a ~]# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  mdsu1a               RUNNING              0
A  mdsu1b               RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  APPDB           mdsu1a               Y          N               OFFLINE|FAULTED
B  APPDB           mdsu1b               Y          N               ONLINE
B  Daemons         mdsu1a               Y          N               ONLINE
B  Daemons         mdsu1b               Y          N               ONLINE

-- RESOURCES FAILED
-- Group           Type                 Resource             System

C  APPDB           Mount                Mount_data_backup    mdsu1a
[root@mdsu1a ~]#
[root@mdsu1a ~]# hastatus
attempting to connect....
attempting to connect....connected

group           resource             system               message
--------------- -------------------- -------------------- --------------------
                                     mdsu1a               RUNNING
                                     mdsu1b               RUNNING
APPDB                                mdsu1a               *FAULTED* OFFLINE
APPDB                                mdsu1b               ONLINE
-------------------------------------------------------------------------
Daemons                              mdsu1a               ONLINE
Daemons                              mdsu1b               ONLINE
                Onguard              mdsu1a               OFFLINE
                Onguard              mdsu1b               ONLINE
                EdnVIP               mdsu1a               OFFLINE
-------------------------------------------------------------------------
                EdnVIP               mdsu1b               ONLINE
                mdsuData_lv          mdsu1a               OFFLINE
                mdsuData_lv          mdsu1b               ONLINE
                mdsuDbData_lv        mdsu1a               OFFLINE
                mdsuDbData_lv        mdsu1b               ONLINE
-------------------------------------------------------------------------
                mdsuOracleLog_lv     mdsu1a               OFFLINE
                mdsuOracleLog_lv     mdsu1b               ONLINE
                Mount_data_backup    mdsu1a               *FAULTED*
                Mount_data_backup    mdsu1b               ONLINE
                Mount_data_oracle    mdsu1a               OFFLINE
-------------------------------------------------------------------------
                Mount_data_oracle    mdsu1b               ONLINE
                Mount_data_oracle_log mdsu1a              OFFLINE
                Mount_data_oracle_log mdsu1b              ONLINE
                listener             mdsu1a               OFFLINE
                listener             mdsu1b               ONLINE
-------------------------------------------------------------------------
                Notifier             mdsu1a               OFFLINE
                Notifier             mdsu1b               ONLINE
                Oracle_APPDB         mdsu1a               OFFLINE
                Oracle_APPDB         mdsu1b               ONLINE
                EdnProxy             mdsu1a               ONLINE
-------------------------------------------------------------------------
                EdnProxy             mdsu1b               ONLINE
                DashboardAgent       mdsu1a               ONLINE
                DashboardAgent       mdsu1b               ONLINE
                DashboardSystemMonitor mdsu1a             ONLINE
                DashboardSystemMonitor mdsu1b             ONLINE
-------------------------------------------------------------------------
                EdnNIC               mdsu1a               ONLINE
                EdnNIC               mdsu1b               ONLINE
[root@mdsu1a ~]#
[root@mdsu1a ~]# date
Sun Sep 12 20:20:50 MVT 2010
[root@mdsu1a ~]#
[root@mdsu1a ~]#
[root@mdsu1a ~]#
[root@mdsu1a ~]# df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda5             21827172   4714800  15983004  23% /
/dev/sda7              5743520    131220   5315236   3% /tmp
/dev/sda6             11494428    131228  10768544   2% /dd
/dev/sda3             21827204   1114400  19583436   6% /var
/dev/sda1               233336     16009    205184   8% /boot
tmpfs                  6236460         0   6236460   0% /dev/shm
[root@mdsu1a ~]# sar 3 3
Linux 2.6.18-53.el5PAE (mdsu1a)         09/12/2010

08:21:08 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
08:21:11 PM       all      0.00      0.00      0.00      0.00      0.00    100.00
08:21:14 PM       all      0.00      0.00      0.00      0.08      0.00     99.92
08:21:17 PM       all      0.04      0.00      0.08      0.00      0.00     99.88
Average:          all      0.01      0.00      0.03      0.03      0.00     99.93
[root@mdsu1a ~]# vmstat 3 3
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd     free   buff   cache   si   so    bi    bo   in   cs us sy  id wa st
 0  0      0 10835856 342332 1037004    0    0     0     2    0    0  0  0 100  0  0
 0  0      0 10837096 342332 1037004    0    0     0     0 1181 4080  0  0 100  0  0
 0  0      0 10837220 342332 1037004    0    0     0    45 1183 4258  0  0 100  0  0
[root@mdsu1a ~]#
[root@mdsu1a ~]# ps -ef |grep -i ora |wc -l
2
[root@mdsu1a ~]# ps -ef |grep -i ora
root     16333     1  0 Jun22 ?        00:00:23 /opt/VRTSagents/ha/bin/Oracle/OracleAgent -type Oracle -agdir /opt/VRTSagents/ha/bin/Oracle
root     24935 24750  0 20:22 pts/0    00:00:00 grep -i ora
[root@mdsu1a ~]# ps -ef |grep -i back |wc -l
1
[root@mdsu1a ~]# ps -ef |grep -i sql |wc -l
1
[root@mdsu1a ~]# hagrp -clear APPDB -sys mdsu1a
[root@mdsu1a ~]# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  mdsu1a               RUNNING              0
A  mdsu1b               RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  APPDB           mdsu1a               Y          N               OFFLINE
B  APPDB           mdsu1b               Y          N               ONLINE
B  Daemons         mdsu1a               Y          N               ONLINE
B  Daemons         mdsu1b               Y          N               ONLINE
[root@mdsu1a ~]# date
Sun Sep 12 20:23:28 MVT 2010
[root@mdsu1a ~]#

Messages:
Sep 11 01:09:19 mdsu1b aide: MD5 : Dcu2RsyxdFUE3AlCdquzcw== , fHKzTA5ikqdbmmAT+hOcBQ==
Sep 11 01:09:19 mdsu1b aide: File: /sbin/pp_inq
Sep 11 01:09:19 mdsu1b aide: Mtime : 1970-01-01 05:00:00 , 2008-08-01 01:38:04
Sep 11 01:09:19 mdsu1b aide: MD5 : jgwIzrqs0bx+aP/arEj03Q== , JvzqN4mnBpuq/QVTyq6kfg==
Sep 11 01:09:19 mdsu1b aide: File: /sbin/tc
Sep 11 01:09:19 mdsu1b aide: Mtime : 1970-01-01 05:00:00 , 2007-01-09 11:54:45
Sep 11 01:09:19 mdsu1b aide: MD5 : FB8NNBgM8Pt4A2nlcN6Yhg== , iVfCt94Ztbn6MH+0dOK3RA==
Sep 11 10:28:00 mdsu1b Had[3940]: VCS ERROR V-16-1-13027 (mdsu1a) Resource(mdsuOracleLog_lv) - monitor procedure did not complete within the expected time.
Sep 11 10:28:00 mdsu1b Had[3940]: VCS ERROR V-16-1-13027 (mdsu1a) Resource(mdsuDbData_lv) - monitor procedure did not complete within the expected time.
Sep 11 10:28:00 mdsu1b Had[3940]: VCS ERROR V-16-1-13027 (mdsu1a) Resource(mdsuData_lv) - monitor procedure did not complete within the expected time.
Sep 12 01:00:01 mdsu1b aide: File /bin in databases has different attributes, 61,317
[root@mdsu1b ~]# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  mdsu1a               RUNNING              0
A  mdsu1b               RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  APPDB           mdsu1a               Y          N               OFFLINE
B  APPDB           mdsu1b               Y          N               ONLINE
B  Daemons         mdsu1a               Y          N               ONLINE
B  Daemons         mdsu1b               Y          N               ONLINE
[r