12-04-2013 12:23 PM
RHEL 5.9, 64 bit Master Server. Windows 2008 Standard 64 bit Media Servers. NBU 7.5.0.6 all around.
1. Where can I get good explanations of the kind of information I can expect to find in a given legacy log?
2. Where can I find detailed information about which log files to retrieve, and from where, when troubleshooting a problem?
i.e. For problem "A" look in logs x, y and z on your client, a and b on the Media server and c on the Master. (I understand that I will have to make determinations of the kind of entries I'm looking for given "problem "A"" but once I've determined what kinds of log entries I'm after I need to know what log files to look in.)
3. Which log files should exist (or are relevant) only on a client, only on a MS, or only on a Master server, and which on two of the three or on all three?
I've read through the NBU manuals and guides and searched the web, but I haven't been able to find this kind of specific information, only which process / daemon a log file belongs to and, if I'm lucky, a very brief explanation (four to ten words). I believe the info must be there, but I'm spending a lot of time trying to dig it out myself.
This has frustrated me for some time. What prompted this post is that I am trying to determine definitively which interfaces, IPs and host names are used between client and MS, and between client and Master, for specific backups. So help with this in the short term would be valuable.
Terry
12-04-2013 01:37 PM
These are pretty much impossible questions
1. Where can I get good explanations of the kind of information I can expect to find in a given legacy log?
It doesn't exist - NBU is too complex to document like this. The best advice is to know which processes communicate with each other (see the troubleshooting guide for process flows), so that when a failure happens you know (hopefully) whereabouts to start looking. Activity Monitor often shows which process failed. Sometimes this can be misleading, e.g. backups fail with status 6, but the real status code will be somewhere else ... So an error can happen in one log and then filter through others, but only one of the later failures is reported in Activity Monitor. It can be a game of seek and ye will hopefully find ...
2. Where can I find detailed information about which log files to retrieve, and from where, when troubleshooting a problem?
See answer above - it doesn't exist in any official documentation. Again, use process flows to see what talks to what as a starting point. If I'm stuck, I usually enable everything on a test machine, run the job and then simply see which logs have changed. It's not perfect, as some logs (e.g. bpdbm) are constantly changing, but it's another way to get some idea. It generally doesn't work on the vx logs, as they constantly change (well, the common ones do).
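The "see which logs have changed" trick can be sketched roughly as below. This is only an illustration, not a NetBackup tool: the function names `snapshot` and `changed_logs` are made up here, and it simply compares file sizes and modification times under whatever log directory you point it at.

```python
import os

def snapshot(log_root):
    """Map every file under log_root to its (size, mtime_ns) signature."""
    state = {}
    for dirpath, _dirs, files in os.walk(log_root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file disappeared between walk and stat
            state[path] = (st.st_size, st.st_mtime_ns)
    return state

def changed_logs(before, after):
    """Return sorted paths that are new or whose signature differs."""
    return sorted(p for p, sig in after.items() if before.get(p) != sig)
```

Take one snapshot before the test job, one after, and print `changed_logs(before, after)`. Busy logs such as bpdbm will show up every time, so they still need to be filtered out by eye.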
3. Which log files should exist (or are relevant) only on a client, only on a MS, or only on a Master server, and which on two of the three or on all three?
It depends which processes you want to monitor.
On a master I would keep nbemm, nbjm, nbrb, nbpem, bpdbm and bprd as a minimum. However, bpdbm / nbjm / nbrb / nbemm get very large at high verbose levels, so it's usually not possible to leave them running.
Then again, you could add in mds (which logs into emm) ... but this isn't always needed and makes the log bigger. vx logs 156 and 137 are often needed for comms-type issues (also on media), but it would be impossible to turn these up for long on a busy system, as they log into many of the other logs and make them massive (I'm talking GBs of logs).
Then if you use other features, e.g. SLP, a load of other logs come into play.
For SLP I could list about 18 logs that you could enable to monitor SLPs across the master and media - but depending on where the problem is, you might only need maybe 3 or 4, so as you see, this is an impossible question to answer.
media - bpdm, bptm, bpbrm, tpcommand, ltid, robots
client - tar, bpbkar, bpcd
Again, certainly for the media server you could add in a load of vx logs depending on what features you use.
Unfortunately the answer really is just experience - I'd advise tackling one issue at a time and making a note of which logs you needed. Hopefully the details above will help a little.
12-04-2013 03:45 PM
12-04-2013 08:58 PM
My 2c....
The most important part of troubleshooting and creating logs is to understand the process flow. This can be found in Appendix A of the Troubleshooting Guide (along with the definitions of processes that you found).
If we know that bpbrm on the media server connects to bpcd on the client at the start of the backup, then we understand that these are the logs where we will find the IP addresses used by media server and client for connection and backup. The bpcd log will show us how the client resolves the media server IP to a hostname and compares it with its Server entries. So, these 2 logs are needed when we troubleshoot connection issues.
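To get a quick list of the addresses that appear in a bpbrm or bpcd log, something like the following could help. It is a generic sketch that assumes nothing about the NetBackup line format: it just pulls out anything that looks like an IPv4 address and counts it. The name `ip_counts` is made up for this example.

```python
import re
from collections import Counter

# Four dot-separated groups of 1-3 digits, not glued to other word characters.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def ip_counts(lines):
    """Count IPv4-looking strings (each octet 0-255) across the given lines."""
    counts = Counter()
    for line in lines:
        for candidate in IPV4.findall(line):
            if all(int(octet) <= 255 for octet in candidate.split(".")):
                counts[candidate] += 1
    return counts
```

Feed it an open log file, e.g. `ip_counts(open(path))`, and compare the result with the addresses of your backup interface. Be aware that a four-part version string such as 7.5.0.6 also looks like an address to this pattern, so check the matching lines before drawing conclusions.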
If we know that bpbkar on client sends data to bptm on the media server and metadata to bpbrm on the media server, we understand that these are the logs that we need to troubleshoot backups that fail after data transfer has started.
If we know that all restore requests are first sent to bprd on the master, then we understand that troubleshooting of restores starts with this log.
Once restore parameters have been confirmed with bpdbm on the master, bprd hands the restore instruction to bpbrm on the media server. From there a connection is made to bpcd on the client, which starts the tar process; in the meantime bpbrm starts bptm to read the data and send it to tar on the client. So, now we know which logs we need to troubleshoot restores.
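The process flows described in this post can be written down as a small lookup table. This is only a summary of the paragraphs above, not an official list, and the scenario labels and the `logs_for` helper are made up for illustration.

```python
# (host role, log) pairs per scenario, taken from the process flows above.
LOGS_BY_SCENARIO = {
    "connection": [
        ("media server", "bpbrm"),
        ("client", "bpcd"),
    ],
    "backup data transfer": [
        ("client", "bpbkar"),
        ("media server", "bptm"),
        ("media server", "bpbrm"),
    ],
    "restore": [
        ("master", "bprd"),
        ("master", "bpdbm"),
        ("media server", "bpbrm"),
        ("client", "bpcd"),
        ("client", "tar"),
        ("media server", "bptm"),
    ],
}

def logs_for(scenario, role=None):
    """List the logs for a scenario, optionally limited to one host role."""
    pairs = LOGS_BY_SCENARIO[scenario]
    return [log for host, log in pairs if role is None or host == role]
```

For example, `logs_for("restore", "client")` gives the client-side logs for a restore. Extend the table as you work through real cases and note which logs you actually needed.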
Here is how I use logs: https://www-secure.symantec.com/connect/forums/how-use-logs#comment-6505511
Process flow in pdf format: NetBackup 7.x Process Flow
12-04-2013 01:12 PM
Legacy logs are located in /usr/openv/netbackup/logs. The location is the same on master, media server and client.
You need to understand exactly where the problem is, i.e. at which end the real problem lies.
Suppose the backup of client A runs and then fails at a certain KB with error code 13. Then you need not look at the bpcd logs on the client; you need to look at the bpbkar logs.
Suppose most Windows server backups fail with some error code and they all have the same media server. Then the problem is at the media server, so search the bpbrm logs on the media server.
There are various directories you need to create, like bpjobd, bpdbm etc.
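A legacy process only writes a log if its directory exists, so creating the directories is the first step. A minimal sketch, assuming the UNIX path mentioned above as the default; the helper name `ensure_log_dirs` is made up for this example:

```python
import os

def ensure_log_dirs(names, base="/usr/openv/netbackup/logs"):
    """Create base/<name> for each process name; return the paths created."""
    created = []
    for name in names:
        path = os.path.join(base, name)
        if not os.path.isdir(path):
            os.makedirs(path)
            created.append(path)
    return created
```

For example, `ensure_log_dirs(["bpcd", "bpbkar"])` on a client. Remember to remove the directories (or turn verbosity back down) once the problem is solved, since the logs keep growing.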
Search the logs at specific time intervals, then look for <16> or <32>.
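That search can be sketched as below. It assumes each legacy log line starts with a time stamp, a bracketed process id and the severity in angle brackets (HH:MM:SS.mmm [pid] <severity>); check this against your own logs before relying on it. The name `filter_lines` is made up for this example.

```python
import re
from datetime import time

# Assumed layout: "HH:MM:SS.mmm [pid] <severity> ..." (pid may be "pid.tid").
LINE = re.compile(r"^(\d{2}):(\d{2}):(\d{2})\.\d+\s+\[[\d.]+\]\s+<(\d+)>")

def filter_lines(lines, start, end, min_severity=16):
    """Keep lines inside [start, end] whose severity is >= min_severity."""
    hits = []
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # continuation line or unexpected format
        hh, mm, ss, severity = (int(g) for g in match.groups())
        if start <= time(hh, mm, ss) <= end and severity >= min_severity:
            hits.append(line.rstrip("\n"))
    return hits
```

For example, `filter_lines(open(path), time(12, 0), time(12, 30))` keeps the <16> and <32> lines from that half hour. Lower `min_severity` if nothing useful turns up, because the interesting line is not always flagged as an error.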
12-04-2013 01:27 PM
DOCUMENTATION: A comprehensive list of NetBackup (tm) 6.0 directories and commands relating to Unified Logging
http://www.symantec.com/docs/TECH43251
DOCUMENTATION: Best Practice recommendations for enabling and gathering NetBackup (tm) logging
http://www.symantec.com/docs/TECH47372
HOWTO: Obtain NetBackup legacy log entries specific to a particular job and process id.
12-04-2013 01:29 PM
Thank you for getting back rookie11. Please refer to the above.
"(I understand that I will have to make determinations of the kind of entries I'm looking for given "problem "A"" but once I've determined what kinds of log entries I'm after I need to know what log files to look in.)"
I know I have to do the troubleshooting for best guess of what might have gone wrong but what I need to know is what kind of information can I expect to find in "bpbkar" or "bpdm" or "bpcd" and so on so that I can match up my troubleshooting with which logs to look in.
Similar to the example I gave at the bottom of my post. What logs have content that are going to provide me with IP / host name connection information? Do I look in "bpresolver", "bprd" or what? And where do I look for the right log file(s)? On the Master, MS or client or some combination?
I'm not after a post that lays out all this information. Rather, I'm looking for direction to documentation that contains all this information in a complete form, rather than what I found in the NetBackup Troubleshooting Guide PDF. See the excerpt below.
bpbrm NetBackup backup and restore manager.
bpcd NetBackup client daemon or manager. The NetBackup Client service starts this process
bpdbjobs NetBackup jobs database manager program.
12-04-2013 01:32 PM
DOCUMENTATION: A comprehensive list of NetBackup (tm) 6.0 directories and commands relating to Unified Logging
http://www.symantec.com/docs/TECH43251
12-04-2013 01:41 PM
Thank you Stumpr2.
Some time ago, at first glance, I got excited about unified logging, until I tried to use it and discovered why it didn't seem to be popular with NBU admins. Even with Symantec support, I have only been directed to use it once, by one tech.
I've been through the other documents (at least partially; there's so much to wade through that I realize I could easily miss what I'm after, and each has links to several more documents, and that's the kind of searching I've been doing). They seem to be more about setting up logging, creating the directories and managing them, but not so much about what kind of content is in each log file.
12-04-2013 01:43 PM
Status 13 is not usually a bpcd log issue. If it happens mid-backup, it is more likely to show in bpbkar / bpbrm, for example, with a root cause of something network related.
There is no real way to say status code xx IS this log.
Not all errors are <16> or <32>. These are a very good place to start, but some important log lines can be listed as low as <2>.
12-04-2013 01:52 PM
Thank you mph999. I like your humility, just under a thousand. I appreciate the time you put into your answer. I suspect that you are right, but I wanted to put it out to the community anyway. Every once in a while luck floats.
So isn't my search for which log files to find the connection information between client / MS and master easier? Can you help me with just that one piece then? I want to verify by log file that my clients are using the backup interface between client and MS and what they are using to talk to the Master (Master is not a MS).
thx
Terry
12-04-2013 03:45 PM
12-04-2013 03:55 PM
12-04-2013 03:55 PM