
How can I read bpcd and bpbkar logs for basic troubleshooting?

RaghavAneja
Level 4

Hi,

Please share some knowledge on basic backup troubleshooting with the help of log files.

BR

Raghav Aneja 

1 ACCEPTED SOLUTION

Accepted Solutions

mph999
Level 6
Employee Accredited

You are asking an almost impossible question. It depends on what the problem is. Why do you specifically mention bpcd and bpbkar?

The simple answer is that legacy logs (in the ...netbackup/logs dir or the volmgr/debug dir) are single threaded. Although a log may contain multiple PIDs (the number in the square brackets, [xxx]), each PID shows what the process was doing for a particular activity. For a log such as bptm, considering a backup, you will see the same PID on the lines where bptm gets the resources from nbjm, mounts the tape, runs the backup, etc.

The numbers in <xx> show the severity of the line, so <2> is informational, whereas <16> or <32> are generally errors, with <32> being the most serious.
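As a rough illustration of using the severity number, here is a minimal Python sketch that keeps only the lines at <16> or above. It assumes each line contains a bracketed PID followed by the severity in angle brackets; real lines may carry extra fields in front, and the sample messages are made up for the example.

```python
import re

# Assumed line shape: "... [PID] <severity> message". Real lines may have
# extra fields (for example a timestamp) before the bracketed PID.
LINE_RE = re.compile(r"\[(?P<pid>\d+)\]\s*<(?P<sev>\d+)>\s*(?P<msg>.*)")


def errors_only(lines, min_severity=16):
    """Return (pid, severity, message) for lines at or above min_severity."""
    hits = []
    for line in lines:
        m = LINE_RE.search(line)
        if m and int(m.group("sev")) >= min_severity:
            hits.append((int(m.group("pid")), int(m.group("sev")), m.group("msg")))
    return hits


# Hypothetical sample lines, not taken from a real log.
sample = [
    "[1234] <2> hypothetical informational message",
    "[1234] <16> hypothetical error message",
    "[1244] <32> hypothetical serious error message",
]

for pid, sev, msg in errors_only(sample):
    print(pid, sev, msg)
```

This is only a first pass to find where to start reading. The <2> lines around an error usually still matter for context.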

vx logs are more difficult to read because they are multithreaded. By contrast, if we look in, say, the bptm log (not a vx log, and single threaded), you can read it like a book.

 
[1234] xxxx
[1234] xxxx
[1234] xxxx
[1244] yyyy
[1244] yyyy
[1244] yyyy
 
Where xxxx and yyyy are lines relating to a particular 'activity'
 
So if I search for all the lines containing a given PID (shown in [xxx]), those lines relate to a single activity, e.g. mounting a tape.
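Searching by PID can be sketched in a few lines of Python. This is only an illustration, assuming the PID is the first number in square brackets on each line; the sample lines are the placeholder ones from above, interleaved on purpose.

```python
import re
from collections import defaultdict

# Assumption: the PID is the first bracketed number on the line.
PID_RE = re.compile(r"\[(\d+)\]")


def split_by_pid(lines):
    """Group log lines by PID, keeping the original order within each PID."""
    groups = defaultdict(list)
    for line in lines:
        m = PID_RE.search(line)
        if m:
            groups[int(m.group(1))].append(line)
    return dict(groups)


# Placeholder lines, interleaved as they might appear in a busy log.
sample = [
    "[1234] xxxx",
    "[1244] yyyy",
    "[1234] xxxx",
    "[1244] yyyy",
]

for pid, pid_lines in split_by_pid(sample).items():
    print(pid, len(pid_lines), "lines")
```

Each group can then be read top to bottom as one activity.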
 
With VX logs, both the PID (process ID) and the TID (thread ID) are shown:
 
nbemm 111 PID:20454 TID:6 
 
The problem shows up if xxxx and yyyy are again lines relating to particular activities:
 
nbemm 111 PID:20454 TID:6  xxxx
nbemm 111 PID:20454 TID:6  yyyy
nbemm 111 PID:20454 TID:7  xxxx
 
One second, TID 6 could be dealing with the 'activity' (xxxx); the next moment, TID 6 is dealing with some other activity (yyyy) and TID 7 has taken over 'xxxx'.
 
That means if I search for all the TID 6 lines, I get multiple lines returned that could have nothing to do with each other.
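To make that concrete, here is a small Python sketch that groups the three placeholder lines above by TID. It assumes the "PID:n TID:n" fields appear exactly as in that example; a real VX log line has more fields than this.

```python
import re
from collections import defaultdict

# Assumption: lines carry "PID:<n> TID:<n>" followed by the message,
# as in the placeholder example above.
VX_RE = re.compile(r"PID:(?P<pid>\d+)\s+TID:(?P<tid>\d+)\s+(?P<msg>.*)")


def by_thread(lines):
    """Group message text by thread ID."""
    threads = defaultdict(list)
    for line in lines:
        m = VX_RE.search(line)
        if m:
            threads[int(m.group("tid"))].append(m.group("msg").strip())
    return dict(threads)


sample = [
    "nbemm 111 PID:20454 TID:6  xxxx",
    "nbemm 111 PID:20454 TID:6  yyyy",
    "nbemm 111 PID:20454 TID:7  xxxx",
]

threads = by_thread(sample)
print(threads)
```

TID 6 comes back with both an xxxx and a yyyy line, and the xxxx activity is spread over TID 6 and TID 7, so the TID alone does not isolate one activity the way a PID does in a legacy log.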
 
Unfortunately, NBU is too complex for any one person to know exactly what the log lines will be for every possible thing NBU can do. Therefore there are only two ways to read these logs:
 
1/ You need to know what NBU will do next and to be able to find the next line in the sequence.
 
or
 
2/  Run the job on a working machine and compare the logs line by line to spot the differences.
 
 
Generally, I use both 1/ and 2/ when reading these logs.  For something 'simple' like a backup, I know the main steps that appear in the logs, so I can find one line and then look for the next.
 
For example, nbpem will start a backup and then submit the job to nbjm.
 
So, I find the line in nbpem that shows the job starting, and I then know that the next line I have to look for is where nbpem goes to contact nbjm (which may or may not be on the same TID).
 
But...
 
If I have an issue I have never seen before, I have no idea what the logs should be showing. I then usually run the job on a working system to get the logs, and compare these to the non-working logs.
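The comparison itself can be partly automated. The Python sketch below is one possible approach, not a standard NBU tool: it masks the bracketed PID (which will always differ between two runs) and then diffs the two logs with the standard difflib module. The line format and the sample messages are assumptions for the example; real logs would also need timestamps, hostnames and job IDs masked before the diff is useful.

```python
import difflib
import re


def normalise(line):
    """Mask the bracketed PID so that two different runs can be compared."""
    return re.sub(r"\[\d+\]", "[PID]", line.rstrip("\n"))


def compare_logs(working, failing):
    """Return a unified diff of the two logs after masking PIDs."""
    a = [normalise(line) for line in working]
    b = [normalise(line) for line in failing]
    return list(
        difflib.unified_diff(a, b, fromfile="working", tofile="failing", lineterm="")
    )


# Hypothetical log lines, invented for the example.
working = [
    "[1234] <2> step one",
    "[1234] <2> step two",
    "[1234] <2> step three",
]
failing = [
    "[5678] <2> step one",
    "[5678] <16> step two failed",
]

diff = compare_logs(working, failing)
for line in diff:
    print(line)
```

Lines starting with "-" appear only in the working log and lines starting with "+" only in the failing one, so the first "+" or "-" line is usually the place to start reading.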
 
With legacy logs (bptm etc.) we don't have this issue; you can just separate the logs out by searching for the individual PIDs.


2 REPLIES 2

Marianne
Level 6
Partner    VIP    Accredited Certified

Have a look at my post in this discussion:

How to use logs 

It is also extremely important that you understand the NBU process flow diagram in order to pinpoint the faulting process on the master, media server or client.

Short version of process flow diagram over here: NetBackup 7.x Process Flow 

Detailed info in Appendix A of Troubleshooting Guide.
Here is the link to NBU 7.6 version: NetBackup Troubleshooting Guide 
 
