
Online/archive DB2 backups through NetBackup failing after upgrade from 7.6.0.2 to 8.1

N_J
Level 3
Hi All,

I have a DB2 client on Red Hat Linux 2.6 running SAP. Its online and archive backups ran fine on NetBackup 7.6.0.2 before the upgrade to 8.1. There are 4 DB2 databases on this client; after the upgrade, 2 back up fine and the other 2 fail with EC 130. Below is the error from the job details:

-----------------------------------------------
27 juil. 2018 09:39:39 - Error bprd (pid=6411) Unable to write progress log  on client ******. Policy=DB2_*****_ONLINE Sched=Default-Application-Backup
27 juil. 2018 09:39:39 - Error bprd (pid=6411) CLIENT ****** POLICY DB2_*****_ONLINE  SCHED Default-Application-Backup  EXIT STATUS 130 (system error occurred)
-------------------------------------------------

Since 2 DBs on this client are writing fine, there is no communication issue with the master server (also a Red Hat Linux 2.6 machine, NB version 8.1). The same problem occurred earlier when we tried upgrading the client to 7.7.3, after which we rolled back to 7.6.0.2.

I have tried adding the progress log path to BPCD_WHITELIST_PATH in the client's bp.conf, but still no luck. I would be happy to provide further logs.
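To double-check what the client actually has whitelisted, something like the following could be used. It is only a sketch: the function name `whitelist_paths` is made up for illustration, and it assumes the usual `KEY = value` layout of bp.conf with the default location /usr/openv/netbackup/bp.conf.

```shell
#!/bin/sh
# Print every path listed under BPCD_WHITELIST_PATH in a bp.conf file.
# The default path is the standard client location; pass another file to override.
whitelist_paths() {
    sed -n 's/^[[:space:]]*BPCD_WHITELIST_PATH[[:space:]]*=[[:space:]]*//p' \
        "${1:-/usr/openv/netbackup/bp.conf}"
}
```

The output can then be compared against the progress log directory that the failing backups try to use.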
7 REPLIES

Tape_Archived
Moderator
   VIP   

Please compare the permissions on /usr/openv/netbackup/logs/user_ops/ with those on a client that is working fine.
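A rough way to capture the permissions for comparison is sketched below. The helper name `list_perms` is hypothetical, and it assumes GNU `stat` (as found on Red Hat Linux).

```shell
#!/bin/sh
# Print octal mode, owner:group and path for everything under a directory.
# Uses GNU stat's -c format option; output is sorted by path.
list_perms() {
    find "$1" -exec stat -c '%a %U:%G %n' {} + | sort -k3
}

# Possible usage on each client, followed by a diff of the two listings:
#   list_perms /usr/openv/netbackup/logs/user_ops > /tmp/user_ops_perms.txt
```

Because the listing includes owners as well as modes, it also shows whether files under user_ops belong to a different DB2 user than expected.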

With 8.1 you will also have to check if client has been issued a certificate for secure communication.

 

Marianne
Moderator
Partner    VIP    Accredited Certified

@N_J

The issue seems to be with the 'progress log' :

 Error bprd (pid=6411) Unable to write progress log  on client

Is there something different about the progress logs for these 2 DBs ?
Have you tried to compare scripts with those of the working DBs? 
Have you checked the bprd log on the master to see if there is more info w.r.t. the log path? 

@Tape_Archived 

Actually, as I said, 2 DBs on the same client are working fine and 2 aren't. Also, the permissions are 777 on user_ops and its subdirectories.

The DBs that succeed write the progress log to the folder, but the other 2 DBs don't.

Since 2 DBs are writing fine, this doesn't look like a secure connection issue.

Since it started happening after upgrading from 7.6.0.2 to 8.1, I checked what has changed but couldn't find anything except the WHITELIST that I mentioned. 

Marianne
Moderator
Partner    VIP    Accredited Certified

Have you tried to check the bprd log on the master server?
The name and path of the log file that is causing the issue are normally recorded in the bprd log. 

About the scripts - I was wondering if there was something different about the 2 non-working DBs. 
Maybe some other path is specified for progress logs. That is why I asked if the scripts could be compared. 

@Marianne Thanks for taking the time to look into it.

I don't find anything wrong with the script. Moreover, the backups were fine before the upgrade. I am attaching the scripts, along with the bpdb2 logs for a working DB and a non-working DB, for your reference.

PS: bprd logs were not being generated on the master, and I guess I have to restart the services on the master, which I cannot do right now (unless you can suggest another way to get it to start logging).

Marianne
Moderator
Partner    VIP    Accredited Certified

Yes - bprd needs to be restarted to enable logging. 

About the scripts - I can see that there are no output files specified in the backup scripts. 

The only difference seems to be the Database name and User.
And the fact that the non-working one loads a profile before the backup command is started:

. /db2/BID/sqllib/db2profile;db2 BACKUP DATABASE $MY_DB2 $MY_SCHED online LOAD $MY_LIB BUFFER 1024

Any idea why? (I realize that this was working on 7.6, but we need to look at this as if it is a new issue.)
What is set in this db2profile that the user for the working DB does not need? 
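One way to see what the profile changes is to diff the environment before and after sourcing it. This is only a sketch: the function name `profile_env_delta` is made up, and it only shows exported variables.

```shell
#!/bin/sh
# Print the environment variables that sourcing a profile adds or changes.
# Everything runs in a subshell, so the caller's environment is untouched.
profile_env_delta() {
    (
        before=$(mktemp) && after=$(mktemp) || exit 1
        env | sort > "$before"
        . "$1" > /dev/null 2>&1
        env | sort > "$after"
        # Lines present only in the "after" snapshot are new or changed.
        comm -13 "$before" "$after"
        rm -f "$before" "$after"
    )
}
```

Run as the affected user, for example `profile_env_delta /db2/BID/sqllib/db2profile`, and compare the result with the environment of the user whose backups work.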

My suggestion: 

Find a time to restart NBU so that we can have bprd logs.

Log on to the client as user db2bid. 
Manually run the commands in CMD_LINE and watch the output.

 

I will have a look at the logs a bit later in the day (if time permits...)

Marianne
Moderator
Partner    VIP    Accredited Certified

Hi @N_J

I had a look at the bpdb2 logs - there are 3 - 
db2bid.073018_00001.txt
db2d20.073018_00001.txt
log.073018.txt

None of these logs contains complete info for a backup from beginning to end. 

There is no reference whatsoever to a backup in db2d20.073018_00001.txt.

The other 2 logs contain errors and the name of user_ops log files, but the beginning of the backup and progress of the backup do not appear in these logs. 
If the backups started the previous day, we need those as well. 

log.073018.txt  starts with an error shortly after midnight - we can only assume that the backup started the previous evening?

Error: 
00:21:25.230 [28002] <16> readCommFile: ERR - timed out after 3600 seconds while reading from /usr/openv/netbackup/logs/user_ops/dbext/logs/28002.0.1532899283
00:21:25.231 [28002] <32> serverResponse: ERR - could not read from comm file </usr/openv/netbackup/logs/user_ops/dbext/logs/28002.0.1532899283>
00:21:25.231 [28002] <16> CreateNewImage: ERR - serverResponse() failed
00:21:25.231 [28002] <16> VxBSACreateObject: ERR - Could not create new image with file /DB2/AUDBOE14/LOGFILE/node0000/db2bid/C0000000_S0020131.LOG.

It seems as if the backup failed on an archive log file : 
/DB2/AUDBOE14/LOGFILE/node0000/db2bid/C0000000_S0020131.LOG

What is worrying is the timeout after 1 hour (3600 sec).
Can you see what is logged in /usr/openv/netbackup/logs/user_ops/dbext/logs/28002.0.1532899283 ?
And find the  bpdb2 logs for the previous day to see what happened since the beginning of the backup?
Especially in the hour before 00:21. 

Another timeout error for another log a while later: 
00:54:29.610 [6764] <16> readCommFile: ERR - timed out after 3600 seconds while reading from /usr/openv/netbackup/logs/user_ops/dbext/logs/6764.0.1532901267
00:54:29.610 [6764] <32> serverResponse: ERR - could not read from comm file </usr/openv/netbackup/logs/user_ops/dbext/logs/6764.0.1532901267>
00:54:29.610 [6764] <16> CreateNewImage: ERR - serverResponse() failed
00:54:29.610 [6764] <16> VxBSACreateObject: ERR - Could not create new image with file /DB2/BOE14/LOGFILE/node0000/db2bid/C0000000_S0033833.LOG.

There are more user_ops logs listed here along with backup failure for DB2 archive logs.
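To pull all of these timeouts out of a bpdb2 log in one go, a small filter along these lines may help. It is a sketch that assumes the log line layout shown in the errors above; `comm_timeouts` is a made-up name.

```shell
#!/bin/sh
# List time, PID and comm file for every readCommFile timeout in bpdb2 logs.
# Assumes the layout: time [pid] <severity> readCommFile: ERR - ... from <path>
comm_timeouts() {
    awk '/readCommFile: ERR - timed out/ {
        pid = $2
        gsub(/\[|\]/, "", pid)   # strip the square brackets around the PID
        print $1, pid, $NF       # $NF is the comm file path at end of line
    }' "$@"
}
```

For example, `comm_timeouts log.073018.txt` prints one line per timeout, which gives the list of user_ops comm files worth inspecting.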

The fact that the backup seems to fail on archive logs poses another question :

Is there a difference between archive log methods for working and non-working backups?

Does the profile that is loaded in the non-working script (/db2/BID/sqllib/db2profile) perhaps reference a different db2.conf file with a different archive log method? 
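The archive log settings can be compared by filtering the database configuration of a working and a non-working database. A minimal sketch follows; `archive_methods` is a hypothetical helper, and `<DBNAME>` is a placeholder for the real database name.

```shell
#!/bin/sh
# Reduce "db2 get db cfg" output (read from stdin) to the archive log settings.
archive_methods() {
    grep -E 'LOGARCHMETH[12]|LOGARCHOPT[12]'
}

# As each instance owner:
#   db2 get db cfg for <DBNAME> | archive_methods > /tmp/<DBNAME>_archmeth.txt
# then diff the files from a working and a non-working database.
```

Any difference in LOGARCHMETH1/LOGARCHMETH2 between the two groups of databases would be worth mentioning in the Support call.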

PS: 
It will probably be best to log a Support call with Veritas and work with a knowledgeable support engineer to track down the issue. 

You will need all of the logs listed in the NBU for DB2 manual (including bprd on the master, and bpbrm and bptm on the master server). 
Support will ask for high level logs - they have the time and the tools to sift through these massive logs.