Forum Discussion

N_J
Level 3
6 years ago

online/archive backup of DB2 backing up through netbackup failing after upgrade from 7.6.0.2 to 8.1

Hi All,

I have a DB2 client on Red Hat Linux 2.6, running SAP, whose online and archive backups ran fine on version 7.6.0.2 before the upgrade to 8.1. There are 4 DB2 databases on this client. After the upgrade, 2 back up fine and the other 2 fail with EC 130. Below is the error from the job details:

-----------------------------------------------
27 juil. 2018 09:39:39 - Error bprd (pid=6411) Unable to write progress log  on client ******. Policy=DB2_*****_ONLINE Sched=Default-Application-Backup
27 juil. 2018 09:39:39 - Error bprd (pid=6411) CLIENT ****** POLICY DB2_*****_ONLINE  SCHED Default-Application-Backup  EXIT STATUS 130 (system error occurred)
-------------------------------------------------

Since 2 DBs on this client are writing fine, there are no issues with communication with the master server (also a Red Hat Linux 2.6 machine, NB version 8.1). The same problem occurred earlier when we tried upgrading the client to 7.7.3, after which we rolled back to 7.6.0.2.

I have tried adding the progress log path to BPCD_WHITELIST_PATH in bp.conf on the client, but still no luck. I would be happy to provide further logs.
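For anyone wanting to double-check the whitelist step, here is a minimal Python sketch that lists the BPCD_WHITELIST_PATH entries in bp.conf-style text and tests whether a given progress log path falls under one of them. The sample bp.conf content, host names and log paths are made up for illustration, and the assumption that an entry covers everything beneath that directory is mine, not something confirmed in this thread.

```python
from pathlib import PurePosixPath


def whitelist_paths(bp_conf_text):
    """Return every BPCD_WHITELIST_PATH value found in bp.conf-style text."""
    paths = []
    for line in bp_conf_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "BPCD_WHITELIST_PATH" and value.strip():
            paths.append(value.strip())
    return paths


def is_covered(log_path, allowed):
    """True if log_path equals, or lies beneath, one of the allowed directories.

    Assumption: a whitelist entry covers the whole directory tree below it.
    """
    target = PurePosixPath(log_path)
    return any(
        target == PurePosixPath(entry) or PurePosixPath(entry) in target.parents
        for entry in allowed
    )


# Hypothetical bp.conf content and progress log paths, for illustration only.
sample = """SERVER = master.example.com
CLIENT_NAME = db2client.example.com
BPCD_WHITELIST_PATH = /db2/backup_logs
"""

allowed = whitelist_paths(sample)
print(allowed)
print(is_covered("/db2/backup_logs/online.log", allowed))
print(is_covered("/tmp/online.log", allowed))
```

Running it against the real bp.conf for each of the 4 databases' progress log paths would show whether the two failing ones are the ones left uncovered.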

7 Replies

  • Marianne

    The issue seems to be with the 'progress log' :

     Error bprd (pid=6411) Unable to write progress log  on client

    Is there something different about the progress logs for these 2 DBs?
    Have you tried to compare the scripts with those of the working DBs?
    Have you checked the bprd log on the master to see if there is more info w.r.t. the log path?

    • N_J
      Level 3

      Marianne, thanks for taking the time to look into it.

      I don't find anything wrong with the script. Moreover, the backups were fine before the upgrade. I am attaching the scripts, along with the bpdb2 logs for a working DB and a non-working DB, for your reference.

      PS: bprd logs were not being written on the master. I guess I would have to restart the services on the master, which I cannot do right now (unless you can suggest another way to get the logs generated).

      • Marianne
        Level 6

        Hi N_J

        I had a look at the bpdb2 logs - there are 3 - 
        db2bid.073018_00001.txt
        db2d20.073018_00001.txt
        log.073018.txt

        None of these logs contains complete info for a backup from beginning to end.

        There is no reference whatsoever to a backup in db2d20.073018_00001.txt.

        The other 2 logs contain errors and the name of user_ops log files, but the beginning of the backup and progress of the backup do not appear in these logs. 
        If the backups started the previous day, we need those as well. 

        log.073018.txt  starts with an error shortly after midnight - we can only assume that the backup started the previous evening?

        Error: 
        00:21:25.230 [28002] <16> readCommFile: ERR - timed out after 3600 seconds while reading from /usr/openv/netbackup/logs/user_ops/dbext/logs/28002.0.1532899283
        00:21:25.231 [28002] <32> serverResponse: ERR - could not read from comm file </usr/openv/netbackup/logs/user_ops/dbext/logs/28002.0.1532899283>
        00:21:25.231 [28002] <16> CreateNewImage: ERR - serverResponse() failed
        00:21:25.231 [28002] <16> VxBSACreateObject: ERR - Could not create new image with file /DB2/AUDBOE14/LOGFILE/node0000/db2bid/C0000000_S0020131.LOG.

        It seems as if the backup failed on an archive log file : 
        /DB2/AUDBOE14/LOGFILE/node0000/db2bid/C0000000_S0020131.LOG

        What is worrying is the timeout after 1 hour (3600 sec).
        Can you see what is logged in /usr/openv/netbackup/logs/user_ops/dbext/logs/28002.0.1532899283 ?
        And find the  bpdb2 logs for the previous day to see what happened since the beginning of the backup?
        Especially in the hour before 00:21. 

        Another timeout error for another log a while later: 
        00:54:29.610 [6764] <16> readCommFile: ERR - timed out after 3600 seconds while reading from /usr/openv/netbackup/logs/user_ops/dbext/logs/6764.0.1532901267
        00:54:29.610 [6764] <32> serverResponse: ERR - could not read from comm file </usr/openv/netbackup/logs/user_ops/dbext/logs/6764.0.1532901267>
        00:54:29.610 [6764] <16> CreateNewImage: ERR - serverResponse() failed
        00:54:29.610 [6764] <16> VxBSACreateObject: ERR - Could not create new image with file /DB2/BOE14/LOGFILE/node0000/db2bid/C0000000_S0033833.LOG.

        There are more user_ops logs listed here along with backup failure for DB2 archive logs.
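To pull every such timeout out of a large bpdb2 log in one pass, something like the following Python sketch could help. The regular expressions are written against the four log lines quoted above and may need adjusting for other line formats. It groups the timeout and the failed archive log by process ID.

```python
import re

# Patterns modelled on the bpdb2 lines quoted in this thread.
TIMEOUT_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) \[(?P<pid>\d+)\] <\d+> readCommFile: "
    r"ERR - timed out after (?P<secs>\d+) seconds while reading from (?P<comm>\S+)"
)
IMAGE_RE = re.compile(
    r"^\S+ \[(?P<pid>\d+)\] <\d+> VxBSACreateObject: "
    r"ERR - Could not create new image with file (?P<file>\S+?)\.?$"
)


def summarize_timeouts(log_text):
    """Group comm-file timeouts and failed archive logs by process ID."""
    by_pid = {}
    for raw in log_text.splitlines():
        line = raw.strip()
        m = TIMEOUT_RE.match(line)
        if m:
            by_pid.setdefault(m["pid"], {}).update(
                time=m["time"], seconds=int(m["secs"]), comm_file=m["comm"]
            )
            continue
        m = IMAGE_RE.match(line)
        if m:
            # The trailing full stop of the log message is not part of the path.
            by_pid.setdefault(m["pid"], {})["archive_log"] = m["file"]
    return by_pid


SAMPLE = """\
00:21:25.230 [28002] <16> readCommFile: ERR - timed out after 3600 seconds while reading from /usr/openv/netbackup/logs/user_ops/dbext/logs/28002.0.1532899283
00:21:25.231 [28002] <32> serverResponse: ERR - could not read from comm file </usr/openv/netbackup/logs/user_ops/dbext/logs/28002.0.1532899283>
00:21:25.231 [28002] <16> CreateNewImage: ERR - serverResponse() failed
00:21:25.231 [28002] <16> VxBSACreateObject: ERR - Could not create new image with file /DB2/AUDBOE14/LOGFILE/node0000/db2bid/C0000000_S0020131.LOG.
"""

for pid, info in summarize_timeouts(SAMPLE).items():
    print(pid, info)
```

Feeding it the full log.073018.txt (and the previous day's log) would give a list of comm files to inspect and the archive logs that failed.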

        The fact that the backup seems to fail on archive logs poses another question :

        Is there a difference between archive log methods for working and non-working backups?

        Does the profile that is loaded in the non-working script (/db2/BID/sqllib/db2profile) perhaps reference a different db2.conf file with a different archive log method?

        PS: 
        It will probably be best to log a Support call with Veritas and work with a knowledgeable support engineer to track down the issue. 

        You will need all of the logs listed in the NBU for DB2 manual (including bprd on the master and bpbrm and bptm on the master server).
        Support will ask for high level logs - they have the time and the tools to sift through these massive logs.

  • Please compare the permissions on /usr/openv/netbackup/logs/user_ops/ with those on the client that is working fine.

    With 8.1 you will also have to check if client has been issued a certificate for secure communication.
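A quick way to do the permissions check is a small Python sketch that walks the user_ops tree and flags anything that is not world-writable. Treating "world-writable" as the criterion is an assumption made for illustration; the ownership of the directories and the DB2 instance user may matter just as much.

```python
import os
import stat

USER_OPS = "/usr/openv/netbackup/logs/user_ops"


def not_world_writable(root):
    """Return (relative path, mode string) for entries lacking the o+w bit."""
    flagged = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            mode = os.lstat(full).st_mode
            if not mode & stat.S_IWOTH:
                flagged.append((os.path.relpath(full, root), stat.filemode(mode)))
    return sorted(flagged)


# Only runs where the directory exists, i.e. on a NetBackup client.
if os.path.isdir(USER_OPS):
    for rel_path, mode in not_world_writable(USER_OPS):
        print(mode, rel_path)
```

Running the same script on a client where all backups work and comparing the two outputs gives the comparison suggested above.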

     

    • N_J
      Level 3

      Tape_Archived 

      Actually, as I said, 2 DBs on the same client are working fine and 2 aren't. Also, the permissions are 777 on user_ops and its subdirectories.

      The DBs that back up successfully write the progress log to the folder without problems, but the other 2 DBs don't.

      Since 2 DBs are writing fine, this doesn't look like a secure connection issue.

      Since it started happening after upgrading from 7.6.0.2 to 8.1, I checked what has changed but couldn't find anything except the WHITELIST that I mentioned. 

      • Marianne
        Level 6

        Have you tried to check bprd log on the master server?
        The log file name and path that is having the issue is normally documented in bprd log. 

        About the scripts - I was wondering if there was something different about the 2 non-working DBs. 
        Maybe some other path is specified for progress logs. That is why I asked if the scripts could be compared.
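The script comparison can be done mechanically. Below is a minimal Python sketch using difflib that shows only the lines that differ between a working and a non-working backup script. The two sample scripts are hypothetical stand-ins (only /db2/BID/sqllib/db2profile comes from this thread); in practice the text would be read from the real script files.

```python
import difflib


def changed_lines(working, failing):
    """Return only the added/removed lines between two script texts."""
    diff = difflib.unified_diff(
        working.splitlines(),
        failing.splitlines(),
        fromfile="working",
        tofile="failing",
        lineterm="",
    )
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Hypothetical script contents, for illustration only.
working_script = """. /db2/XYZ/sqllib/db2profile
echo "starting online backup"
"""
failing_script = """. /db2/BID/sqllib/db2profile
echo "starting online backup"
"""

for line in changed_lines(working_script, failing_script):
    print(line)
```

Any difference in profile, log path or options between the working and non-working scripts shows up as a `-`/`+` pair, which narrows down where the two failing DBs diverge.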