System Heavily Loaded
Environment
HA Nodes = 2
Product = SFHA
Version = 6.1.1
ERROR / WARNING
June 12 18:30:00 NODE4 kernel: LLT INFO V-14-1-10035 timer not called for 3235 ticks
June 12 18:30:00 NODE4 kernel: LLT INFO V-14-1-10205 link 0 (eth3) node 0 in trouble
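To put the first message in perspective, the tick count can be converted into an approximate stall duration. The sketch below is only illustrative: the helper name stalled_seconds is hypothetical, and the conversion assumes one LLT tick is 1/100 of a second, which should be verified against the documentation for the installed version.

```python
import re

# Assumption: one LLT tick is 1/100 second. Verify this for your LLT version.
ASSUMED_TICKS_PER_SECOND = 100

# Matches the "timer not called" message shown in the log excerpt above.
TIMER_RE = re.compile(r"LLT INFO V-14-1-10035 timer not called for (\d+) ticks")


def stalled_seconds(line, ticks_per_second=ASSUMED_TICKS_PER_SECOND):
    """Return the approximate stall length in seconds, or None if no match."""
    match = TIMER_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)) / ticks_per_second


if __name__ == "__main__":
    sample = ("June 12 18:30:00 NODE4 kernel: LLT INFO V-14-1-10035 "
              "timer not called for 3235 ticks")
    print(stalled_seconds(sample))
```

If that assumption holds, a stall of this length is long enough that a resource sample taken only every couple of minutes could miss whatever caused it.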
Understanding
My understanding is that either the HA node is heavily loaded or the network is experiencing delays. Furthermore, the HostMonitor log under /var/VRTSvcs/log shows high-memory alerts such as the following:
2016/xx/xx 18:30:01 VCS INFO V-16-10061-14064 HostMonitor:VCShm:monitor:Updating System attribute with Mem usage = 99%.
It seems evident that the errors/warnings above are occurring due to a lack of hardware resources.
Contradiction
In parallel, we have already set up a crontab job that redirects the output of SAR and free -m to a text file every couple of minutes, and that output does not show any spike in CPU or memory usage. The SAR results show the CPU almost 95% idle, and free -m shows 80% of the 10 GB of memory free.
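One thing that may be worth checking is whether the two tools count buffers and page cache in the same way. How HostMonitor derives its figure is not confirmed here, so the following is only a sketch of the comparison. It computes memory usage from /proc/meminfo-style text both ways, and the function names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values in kB."""
    info = {}
    for raw in text.splitlines():
        key, sep, rest = raw.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def used_percent(info, count_cache_as_used):
    """Return memory usage in percent.

    With count_cache_as_used=True, only MemFree counts as free.
    Otherwise Buffers and Cached are also treated as reclaimable.
    """
    total = info["MemTotal"]
    free = info["MemFree"]
    if not count_cache_as_used:
        free += info.get("Buffers", 0) + info.get("Cached", 0)
    return 100.0 * (total - free) / total


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live memory counters (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


if __name__ == "__main__":
    live = read_meminfo()
    print("cache counted as used: %.1f%%" % used_percent(live, True))
    print("cache counted as free: %.1f%%" % used_percent(live, False))
```

If the two percentages differ widely on the affected node, the gap between the HostMonitor figure and the free -m output may be a matter of accounting rather than real memory pressure. Running the comparison every second around the time of the alerts, rather than every couple of minutes, would also show whether a short spike is being missed.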
Query
- Why are these errors/warnings occurring if system resources are almost idle?
- Why is the HostMonitor log showing the wrong result? HostMonitor reports high memory usage, while free -m shows plenty of available memory.