04-26-2009 10:19 PM
Hi All,
Could you share your best practices for doing a "health check" of a backup environment?
I monitor jobs through the Veritas 6.5 console daily, and I would like some simple, systematic steps to go through (checking elapsed time, kbps, drives, etc.) that will help me easily identify any problems in the environment.
I sometimes have trouble identifying problems such as "Hung Backup", "Hung Environment", "Slow Job", etc.
Also, is there an easier way to identify jobs that haven't triggered automatically in an environment, without manual intervention?
Your expert advice and best tips will help me monitor the server environment more efficiently and diagnose problems on the servers in a timely fashion!
Thanks in advance.
Chitra
04-27-2009 03:37 PM
04-26-2009 10:37 PM
04-26-2009 11:20 PM
04-26-2009 11:42 PM
04-26-2009 11:50 PM
04-26-2009 11:59 PM
04-27-2009 03:37 PM
05-28-2009 09:31 AM
05-28-2009 01:03 PM
While others will focus on "put out the fire and rerun", I believe in repairing things for good across the board and making everything run smoothly. Depending on what rights you have, it's important to check a few things:
1) Do you have a lot of requeuing or failed jobs?
--- If you have a lot of requeuing or failed jobs, it's important to find out what they have in common. Is there a down drive? Is there a mismapped drive? Is there a ghost tape in the robot? Do the media server and master server need their services restarted?
2) How long are the jobs being queued?
----- A lot of database jobs will fail or seem hung if they have waited a long time to get access to drives. It's important that database policies have the highest priority, since they need to finish so the database can get back into active mode. Flat-file backups generally don't need as high a priority.
3) How long are the backups running? Is this causing delays for others?
---- Sometimes you need to take a look at what is getting backed up. If it's the /tmp directory with a ton of small files, there's no need to back it up, and it is only delaying your finish time. You can also look at the KB/sec throughput to see how the job is performing. Most 100 Mbit full-duplex connections will write at roughly 8,000 KB/sec to 10,000 KB/sec if the network is healthy and everything is set up correctly. Gigabit connections can write significantly more (I've seen 55,000 KB/sec and more). Making sure that the network switches and network card ports are set to their highest speed and to full duplex, with no auto-negotiate, can resolve a lot of backup performance issues. A ton of small files significantly slows down backups because the backup has to write to the header, read the tiny file, write it, then stop its writing pattern and do it all again for the next one. For those, it's worth asking "can we zip and archive them at a certain point?" or whether VCB or FlashBackup would be more appropriate.
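To put the KB/sec figures above in context, here is a minimal Python sketch that compares an observed throughput with the theoretical line rate of the link. It is only an illustration: the function names, the 1 KB = 1024 bytes convention, and the 40% "looks slow" cut-off are my own assumptions, not anything NetBackup defines.

```python
# Rough sanity check for a job's KB/sec figure against the link's line rate.
# Assumptions (not from NetBackup): 1 KB = 1024 bytes, and a job is called
# "slow" when it reaches less than 40% of the theoretical line rate.


def line_rate_kb_per_sec(link_mbit):
    """Theoretical ceiling of a link in KB/sec (bits -> bytes -> KB)."""
    return link_mbit * 1_000_000 / 8 / 1024


def link_utilisation(observed_kb_per_sec, link_mbit):
    """Fraction of the theoretical line rate that the job achieved."""
    return observed_kb_per_sec / line_rate_kb_per_sec(link_mbit)


def looks_slow(observed_kb_per_sec, link_mbit, min_fraction=0.4):
    """True when the observed rate is below the (arbitrary) cut-off."""
    return link_utilisation(observed_kb_per_sec, link_mbit) < min_fraction


if __name__ == "__main__":
    samples = [
        ("100 Mbit", 100, 8_000),
        ("100 Mbit", 100, 10_000),
        ("Gigabit", 1000, 55_000),
        ("100 Mbit", 100, 2_500),
    ]
    for label, mbit, observed in samples:
        share = link_utilisation(observed, mbit)
        flag = "check this job" if looks_slow(observed, mbit) else "ok"
        print(f"{label}: {observed} KB/sec is {share:.0%} of line rate ({flag})")
```

By this arithmetic, 8,000 to 10,000 KB/sec on a 100 Mbit link is roughly two thirds to four fifths of what the wire can carry, so a job far below that range is worth a closer look at duplex settings or small-file overhead.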
There are tons upon tons of recommendations you can get on how to react and improve your environment. It all depends on your current circumstances and your availability.
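Checks 1) and 2) above can be roughed out in a script once you have the job list exported somewhere. The sketch below is Python and works on a made-up `Job` record; the field names, the one-hour queue limit, and the sample data are all hypothetical and do not reflect NetBackup's actual export format. It simply counts which drives and media servers keep showing up in failed or retried jobs, and lists policies that sat in the queue too long.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Job:
    # Hypothetical record; fields are illustrative, not NetBackup's format.
    policy: str
    media_server: str
    drive: str
    status: int            # 0 = success; anything else is treated as a problem
    attempts: int          # more than 1 means the job was retried or requeued
    queued_for: timedelta  # time spent waiting before the job went active


def common_failure_points(jobs, min_count=2):
    """Return drives and media servers shared by several troubled jobs."""
    troubled = [j for j in jobs if j.status != 0 or j.attempts > 1]
    by_drive = Counter(j.drive for j in troubled)
    by_server = Counter(j.media_server for j in troubled)
    drives = {d: n for d, n in by_drive.items() if n >= min_count}
    servers = {s: n for s, n in by_server.items() if n >= min_count}
    return drives, servers


def long_queued(jobs, limit=timedelta(hours=1)):
    """Return policies whose jobs waited longer than the limit for a drive."""
    return [j.policy for j in jobs if j.queued_for > limit]


# Made-up sample data, purely for illustration.
SAMPLE_JOBS = [
    Job("oracle_prod", "media1", "drive3", 0, 1, timedelta(minutes=95)),
    Job("fs_web", "media1", "drive3", 196, 2, timedelta(minutes=10)),
    Job("fs_app", "media2", "drive3", 84, 1, timedelta(minutes=5)),
    Job("fs_mail", "media2", "drive1", 0, 1, timedelta(minutes=2)),
]

if __name__ == "__main__":
    drives, servers = common_failure_points(SAMPLE_JOBS)
    print("Drives seen in repeated trouble:", drives)
    print("Media servers seen in repeated trouble:", servers)
    print("Policies queued too long:", long_queued(SAMPLE_JOBS))
```

If one drive or one media server dominates the counts, that is the place to start looking for a down or mismapped drive before rerunning anything.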
05-28-2009 11:47 PM
05-29-2009 12:29 AM
07-24-2009 06:34 AM
Not sure how to ask this question...
I can't see anything in Activity Monitor when I launch it on the master and media servers.
Any ideas? I'm running 6.5.2A.
Any help is appreciated.
Thanks
ABT
07-24-2009 07:03 AM