Not receiving email alerts when the CPU usage threshold is breached on one node, but receiving them for the other node

paragarw
Level 4

Hi Guys,

 

I have a two-node cluster with the same configuration on both nodes, but email alerts are received for only one node.

Yesterday CPU usage exceeded the threshold on the other node, but I didn't get any alert. What could be the issue?

The configuration shown by hasys -display is the same on both nodes.

 

VCS Warning for System abc, CPU Usage exceeded the threshold on the system

Event Time: Sat Oct 17 22:10:29 2015
Entity Name: abc
Entity Type: System
Entity Subtype: m: x86_64, sys: Linux, r: 2.6.18-308.el5
Entity State: CPU Usage exceeded the threshold on the system
Traps Origin: Veritas_Cluster_Server
System Name: abc
Entities Container Name: dsa-cluster
Entities Container Type: VCS

 

See the configuration below for the problematic node. Nothing is captured in the engine logs either.

 

abc   AgentsStopped      0
abc   AvailableCapacity  100
abc   CPUThresholdLevel  Critical   90     Warning       80     Note   70     Info   60
abc   CPUUsage           0
abc   CPUUsageMonitoring Enabled    0      ActionThreshold      0      ActionTimeLimit      0       Action NONE   NotifyThreshold      0      NotifyTimeLimit      0
abc   Capacity           100
abc   ConfigBlockCount   285
abc   ConfigCheckSum     55879
abc   ConfigDiskState    CURRENT
abc   ConfigFile         /etc/VRTSvcs/conf/config
abc   ConfigInfoCnt      0
abc   ConfigModDate      Mon 19 Oct 2015 01:52:56 PM CDT
abc   ConnectorState     Down
abc   CurrentLimits      
abc   DiskHbStatus       
abc   DynamicLoad        0
abc   EngineRestarted    0
abc   EngineVersion      5.1.10.0
abc   FencingWeight      0
abc   Frozen             0
abc   GUIIPAddr          
abc   HostUtilization    CPU 1      Swap   5
abc   LLTNodeId          1
abc   LicenseType        PERMANENT_SITE
abc   Limits             
abc   LinkHbStatus       eth0 UP     eth6   UP
abc   LoadTimeCounter    0
abc   LoadTimeThreshold  600
abc   LoadWarningLevel   80
abc   NoAutoDisable      0
abc   NodeId             1
abc   OnGrpCnt           3
abc   ShutdownTimeout    600
abc   SourceFile         ./main.cf
abc   SwapThresholdLevel Critical   90     Warning       80     Note   70     Info   60
abc   SysInfo            Linux:abc,#1 SMP Fri Jan 27 17:17:51
abc   SysName            abc
abc   SysState           RUNNING
abc   SystemLocation     
abc   SystemOwner        
abc   TFrozen            0
abc   TRSE               0
abc   UpDownState        Up
abc   UserInt            0
abc   UserStr            
abc   VCSFeatures        DR
abc   VCSMode            VCS
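One way to double-check that the two nodes really report the same attributes is to save the hasys -display output for each system to text and diff the attribute values. The following is only a minimal sketch: it assumes the three-column layout shown in the dump above (system, attribute, value), the function names are made up for illustration, and the values for the second node ("xyz") are hypothetical sample data, not output from this cluster.

```python
def parse_hasys_display(text):
    """Parse saved `hasys -display <system>` text into {attribute: value}.

    Assumes each line is: system name, attribute name, optional value.
    """
    attrs = {}
    for line in text.splitlines():
        parts = line.split(None, 2)  # system, attribute, rest of line
        if len(parts) < 2 or parts[0].startswith("#"):
            continue  # skip blank lines and a "#System ..." header line
        # Collapse runs of whitespace so column padding does not matter.
        value = " ".join(parts[2].split()) if len(parts) == 3 else ""
        attrs[parts[1]] = value
    return attrs


def diff_attrs(a, b, ignore=()):
    """Return {attribute: (value_a, value_b)} for attributes that differ."""
    diffs = {}
    for key in sorted(set(a) | set(b)):
        if key in ignore:
            continue
        if a.get(key) != b.get(key):
            diffs[key] = (a.get(key), b.get(key))
    return diffs


# Lines for "abc" are taken from the dump above (shortened);
# lines for "xyz" are HYPOTHETICAL values used only to show a difference.
node_abc = """\
abc   CPUUsageMonitoring Enabled    0      NotifyThreshold      0
abc   LLTNodeId          1
abc   GUIIPAddr
"""
node_xyz = """\
xyz   CPUUsageMonitoring Enabled    1      NotifyThreshold      80
xyz   LLTNodeId          0
xyz   GUIIPAddr
"""

# Attributes that are expected to differ per node can be ignored.
differences = diff_attrs(
    parse_hasys_display(node_abc),
    parse_hasys_display(node_xyz),
    ignore={"LLTNodeId"},
)
for name, (left, right) in differences.items():
    print(f"{name}: abc={left!r} other={right!r}")
```

In the dump above, CPUUsageMonitoring shows Enabled 0 and NotifyThreshold 0 for abc, so that attribute may be worth comparing first against the node that does send alerts.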

 

 

1 REPLY

sudhir_h
Level 4
Employee

Hi,

 

Were you receiving email alerts from the problematic node before, and have they now stopped all of a sudden?

Have you checked whether any firewall rules are blocking outgoing packets to the SMTP port? Also, try restarting the notifier agent on the problematic node and see if that solves the problem.
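A quick way to test the firewall idea is to try opening a TCP connection to the mail server from the problematic node. This is just a rough sketch: the function name can_connect is made up for illustration, port 25 is assumed to be the SMTP port in use, and the host name in the usage comment is a placeholder for whatever SMTP server the notifier is configured with.

```python
import socket


def can_connect(host, port=25, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    A False result means the connection was refused, timed out, or the
    host could not be resolved; it does not say which device blocked it.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage (placeholder host name, run on each cluster node):
#   print(can_connect("smtp.example.com", 25))
```

If this returns True on the node that sends alerts and False on the other, the problem is more likely in the network path than in the cluster configuration.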

 

Kindly enable debug logging for the notifier agent when you perform any operation, to see if it reports any issues.

 

Regards,

Sudhir