02-08-2013 03:06 AM
Hi All,
We just migrated our NetBackup master server to version 7.5.0.4 running on RHEL 6.1. The configuration also consists of Veritas GCO Cluster and VVR. The server is running on VMware 5.0.
Now the system keeps crashing/rebooting: sometimes once a day, sometimes it is stable for 2 days and then crashes again, especially when lots of backups are running. I have also opened a case with Red Hat, but does anyone have experience with this? Is there any parameter that needs to be changed?
Regards,
Iwan
04-07-2013 06:18 PM
Hi All,
Just want to give an update.
After we escalated to Red Hat, they said something was wrong with VxVM/VxDMP, so we escalated to Symantec. Symantec suggested the following (this solution may be really site-specific):
1) Yes, patch installation is not required.
What you need to do is add the lines below to the vxattachd script.
========================
Update /usr/lib/vxvm/bin/vxattachd with the lines below (lines 323-329 of the script):
# === Add below check to ignore LVM disks ===

lvm=`$VXDISK list | grep $dmpnode | awk '{print $5}'`
if [ "$lvm" = "LVM" ]
then
continue
fi
Then restart vxattachd by restarting vxvm-recover:
# /etc/init.d/vxvm-recover restart
========================
No crash has happened so far (12 days of uptime; previously the system crashed every 5 days or less).
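For readers who want to see what the added check does, here is a minimal sketch that runs the same fifth-column test against sample text. The sample lines and the device names (sda, sdb, disk01, datadg) are hypothetical and only mimic the shape of `vxdisk list` output; the helper `is_lvm_disk` is not part of vxattachd. The sketch also matches the device name exactly with awk instead of using grep, which is a deliberate difference from the lines Symantec supplied.

```shell
#!/bin/sh
# Hypothetical sample shaped like `vxdisk list` output
# (columns: DEVICE TYPE DISK GROUP STATUS). Real output will differ.
sample='DEVICE       TYPE            DISK         GROUP        STATUS
sda          auto:LVM        -            -            LVM
sdb          auto:cdsdisk    disk01       datadg       online'

# Succeeds when the fifth column of the row for device $1 is "LVM".
is_lvm_disk() {
    status=$(printf '%s\n' "$sample" | awk -v d="$1" '$1 == d { print $5 }')
    [ "$status" = "LVM" ]
}

for dev in sda sdb; do
    if is_lvm_disk "$dev"; then
        # Same effect as the "continue" in the added check: skip this disk.
        echo "$dev: LVM disk, skipped"
        continue
    fi
    echo "$dev: not LVM, would be processed"
done
```

On a real system the sample text would be replaced by the live output of the vxdisk command, as in the original lines.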
02-08-2013 05:28 AM
Hi,
Please provide the logs from:
/usr/openv/db/log/server.log
/var/log/messages
02-08-2013 06:46 AM
Depending on the errors and symptoms you are seeing, it could be a space issue or a tuning issue.
1) Space - Make sure you have at least 10% free space on the file system where NetBackup is installed; 7% is the bare minimum. Once free space goes below this threshold, NetBackup will stop services to avoid data corruption.
2) Tuning - Review these tech notes:
http://www.symantec.com/docs/DOC4483 -
Symantec NetBackup™ 7.0 - 7.1 Backup Planning and Performance Tuning Guide
http://www.symantec.com/docs/TECH28934
3rd PARTY RELATED ISSUE: Linux kernel tuning recommendations for NetBackup (tm)
/etc/xinetd.conf
cps = 50 10
instances = 50
per_source = 10
These values have been adjusted in other cases and could limit the number of connections, for example to 50 incoming connections per second (with a 10-second pause once that rate is exceeded), 50 simultaneous instances of each service, and 10 instances per source IP address. Additional information on these parameters is available at: http://linux.die.net/man/5/xinetd.conf
You can increase these limits or set them to unlimited, in agreement with your system administrator.
kernel parameters - /sbin/sysctl -a
kernel.sem = 300 32000 32 1024
kernel.msgmnb = 65536
kernel.msgmni = 256
kernel.msgmax = 65536
kernel.shmmni = 4096
kernel.shmall = 4294967296
kernel.shmmax = 68719476736
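As a rough illustration, the single-valued parameters above can be compared with the running kernel by reading /proc/sys/kernel. This is only a sketch: the helper name `compare_param` is made up for this example, it assumes that a current value equal to or higher than the listed one is acceptable, and it skips kernel.sem because that parameter holds four values.

```shell
#!/bin/sh
# Print "ok" when the current value ($1) is at least the listed value ($2).
# awk is used for the comparison because some kernels report values
# larger than a signed 64-bit shell integer.
compare_param() {
    awk -v h="$1" -v w="$2" 'BEGIN { if (h + 0 >= w + 0) print "ok"; else print "low" }'
}

# Values copied from the list in this post.
while read -r key want; do
    have=$(cat "/proc/sys/kernel/$key" 2>/dev/null)
    [ -n "$have" ] || continue   # parameter not exposed on this system
    echo "kernel.$key current=$have listed=$want $(compare_param "$have" "$want")"
done <<'EOF'
msgmnb 65536
msgmni 256
msgmax 65536
shmmni 4096
shmall 4294967296
shmmax 68719476736
EOF
```

Any parameter reported as "low" is a candidate for review in /etc/sysctl.conf, subject to what the tech note and your system administrator recommend.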
ulimit -a
ulimit tuning
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1056767
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 8192
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) unlimited
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
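The free-space thresholds from point 1 of this post (10% recommended, 7% bare minimum) can be checked with a short script. This is a sketch under assumptions: it takes /usr/openv as the NetBackup installation directory and falls back to / when that directory does not exist, and the helper name `classify_free` is invented for this example.

```shell
#!/bin/sh
# Classify a free-space percentage against the thresholds in this post.
classify_free() {
    if [ "$1" -lt 7 ]; then
        echo critical
    elif [ "$1" -lt 10 ]; then
        echo warning
    else
        echo ok
    fi
}

dir=${1:-/usr/openv}        # assumed NetBackup install directory
[ -d "$dir" ] || dir=/

# df -P prints one line per file system; column 5 is the used percentage.
used=$(df -P "$dir" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
if [ -n "$used" ]; then
    free=$((100 - used))
    echo "$dir: ${free}% free ($(classify_free "$free"))"
fi
```

Running it from cron and alerting on "warning" would give some notice before the 7% threshold is reached.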
02-12-2013 11:16 PM
Have you found the primary cause of the reboots yet? Why did your server crash?
Panic? HW watchdog? Do you have any statistics on system activity and memory usage?
NetBackup itself has no kernel module on RHEL, so I think it is hard for NetBackup to cause a panic, a stall, or the like. If the reboot is caused by a panic, you should collect a kernel dump and send it to Red Hat. They may be able to find what is killing the system.
If you have enabled a HW watchdog, please try disabling it. A watchdog sometimes resets the system even while the system is running well.
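Before waiting for the next crash, it may be worth confirming that the system is actually able to capture a kernel dump. The sketch below assumes the dump is taken through a kexec crash kernel (as kdump does) and reads /sys/kernel/kexec_crash_loaded, which reports 1 when a crash kernel is loaded. The helper name `kdump_state` is made up for this example.

```shell
#!/bin/sh
# Translate the content of /sys/kernel/kexec_crash_loaded into words.
kdump_state() {
    case "$1" in
        1) echo "loaded" ;;
        0) echo "not loaded" ;;
        *) echo "unknown" ;;
    esac
}

state_file=/sys/kernel/kexec_crash_loaded
value=
if [ -r "$state_file" ]; then
    value=$(cat "$state_file")
fi
echo "crash kernel: $(kdump_state "$value")"
```

If the result is "not loaded", a panic will most likely reboot the machine without leaving a dump to send to Red Hat.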
02-24-2013 10:47 PM
Hi All, thanks a lot for the replies; I will look into them one by one. I am sorry for the very late reply, which was due to the migration activity. Currently the system is a bit more stable, but the crash could happen again soon. In the meantime I have also logged a case with Red Hat.
I will let you know the outcome. Thank you again.
iwan
02-25-2013 12:53 AM
Please have a look at this TN that was published a couple of days ago:
http://www.symantec.com/docs/TECH202840
04-08-2013 06:16 PM
Thank you for sharing.
BTW, can you post any clues, such as a kernel backtrace, by which we can identify this issue?