cancel
Showing results for 
Search instead for 
Did you mean: 

NBU Catalog takes long to complete

Maverick101
Level 4

Good day,

We noticed that our catalog backup (incr) included takes about 2 hours to complete. We run a full every morning at 6am. Our catalog is around 230GB with incremental that runs every 6 hours after that. The size of the inc is around 9gb in size.

The full backup takes just over 80 mins to complete on average while the inc takes about 100mins to complete. The issue is that during this time our tlog jobs go into a queued state and cannot run adn we have 5 mins tlog jobs as well. 

From the job details in can see that the data processes but then just hangs on image validation.

On the media servers i can see that bptm/bprm completes and that on the master server bpbkar32 is still running. 

Thanks in advance. 

 

 

7 REPLIES 7

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

Do you have bpbkar log folder on the master server? 
Please increase logging level on the master to 3, then check bpbkar log after the next incremental backup. 

If you want us to have a look at bpbkar log, please copy the log to bpbkar.txt and upload as attachment, along with job details. 

You should not have any jobs going into queued state because of this - unless there is a resource issue. 

Please share text in Job details of queued job. 

I have the required folders and have set the log level to 3. 

Yes, we dont have resource issues as thats how i noticed the issue, that we would have no job running besides a few replication jobs and then all our tlogs jobs would go into a queued state with very little activity on the master server. 

I then noticed on the Catalog job details when the Image validation runs all jobs go into a queued state 

16 Apr 2020 4:19:34 PM - Info bptm (pid=10248) waited for full buffer 4572 times, delayed 13726 times
16 Apr 2020 4:19:37 PM - Info bptm (pid=10248) EXITING with status 0 <----------
16 Apr 2020 5:08:08 PM - Info bpbrm (pid=14796) validating image for client CBNBMGT1001
16 Apr 2020 5:33:40 PM - Info bpbkar32 (pid=1084) done. status: 0: the requested operation was successfully completed
16 Apr 2020 5:33:40 PM - end writing; write time: 1:17:47
the requested operation was successfully completed  (0)

Now this validation takes on average i have seen about 30-40 minutes. I will share the bpbkar32 log soon as well. 

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

Can you show us job details of the queued job? 

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified
16 Apr 2020 5:08:08 PM - Info bpbrm (pid=14796) validating image for client CBNBMGT1001
16 Apr 2020 5:33:40 PM - Info bpbkar32 (pid=1084) done. status: 0: the requested operation was succe


I suggest you look at level 3 bpbrm log as well to see what is happening in both processes.  

Queued Job

17 Apr 2020 10:49:00 AM - Info nbjm (pid=6440) starting backup job (jobid=11767492) for client CBBSP1002, policy BLIS_PRD_E012_04_TL, schedule Tlogs
17 Apr 2020 10:49:00 AM - Info nbjm (pid=6440) requesting MEDIA_SERVER_WITH_ATTRIBUTES resources from RB for backup job (jobid=11767492, request id:{AF30E357-1FB9-4753-8AC8-C4906F680960})
17 Apr 2020 10:49:00 AM - requesting resource  CBDDPRDPA1001_TLogsT1
17 Apr 2020 10:49:00 AM - requesting resource  cbnbmgt1001.NBU_CLIENT.MAXJOBS.CBBSP1002
17 Apr 2020 10:49:00 AM - granted resource  cbnbmgt1001.NBU_CLIENT.MAXJOBS.CBBSP1002
17 Apr 2020 10:49:00 AM - granted resource  CBDDPRDPA1001_TLogsT1
17 Apr 2020 11:18:52 AM - estimated 0 kbytes needed
17 Apr 2020 11:18:52 AM - Info nbjm (pid=6440) started backup (backupid=CBBSP1002_1587115132) job for client CBBSP1002, policy BLIS_PRD_E012_04_TL, schedule Tlogs on storage unit CBDDPRDPA1001_TLogsT1
17 Apr 2020 11:18:53 AM - started process bpbrm (pid=10324)
17 Apr 2020 11:18:53 AM - Info bpbrm (pid=10324) CBBSP1002 is the host to backup data from
17 Apr 2020 11:18:53 AM - Info bpbrm (pid=10324) reading file list for client
17 Apr 2020 11:18:54 AM - Info bpbrm (pid=10324) starting bphdb on client
17 Apr 2020 11:18:54 AM - Info bphdb (pid=9952) Backup started
17 Apr 2020 11:18:54 AM - Info bphdb (pid=9952) The script <D:\Apps\Veritas\NetBackup\DbExt\MsSql\E012_Tlog.bch> has been approved to run and is stored locally.
17 Apr 2020 11:18:54 AM - connecting
17 Apr 2020 11:18:54 AM - connected; connect time: 0:00:00
17 Apr 2020 11:18:57 AM - Info dbclient (pid=36352) INF - Skipping backup of availability group database <E012BAND> per configuration. The database is not hosted on the preferred backup replica.
17 Apr 2020 11:18:58 AM - Info dbclient (pid=36352) INF - OPERATION #1 of batch D:\Apps\Veritas\NetBackup\DbExt\MsSql\E012_Tlog.bch SUCCEEDED with STATUS 0 (0 is normal). Elapsed time = 2(0) seconds
17 Apr 2020 11:19:00 AM - Info dbclient (pid=36352) INF - Results of executing <D:\Apps\Veritas\NetBackup\DbExt\MsSql\E012_Tlog.bch>: 
17 Apr 2020 11:19:00 AM - Info dbclient (pid=36352) <1> operations succeeded. <0> operations failed.
17 Apr 2020 11:19:00 AM - Info bpbrm (pid=10324) validating image for client CBBSP1002
17 Apr 2020 11:19:01 AM - Info bphdb (pid=9952) done. status: 0: the requested operation was successfully completed
17 Apr 2020 11:19:01 AM - end writing
the requested operation was successfully completed  (0)

------------------------------------------------------------------

Catalog Job

17 Apr 2020 10:08:48 AM - Info nbjm (pid=6440) starting backup job (jobid=11766485) for client CBNBMGT1001, policy BLIS_Hot_Catalog, schedule Differential-Inc
17 Apr 2020 10:08:48 AM - Info nbjm (pid=6440) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=11766485, request id:{CDB083A3-4E49-4E03-A4AE-EEAF5E9D8CB8})
17 Apr 2020 10:08:48 AM - requesting resource  CBDDPRDPA1001_NBU_Catalog
17 Apr 2020 10:08:48 AM - requesting resource  cbnbmgt1001.NBU_CLIENT.MAXJOBS.CBNBMGT1001
17 Apr 2020 10:08:48 AM - requesting resource  cbnbmgt1001.NBU_POLICY.MAXJOBS.BLIS_Hot_Catalog
17 Apr 2020 10:08:49 AM - granted resource  cbnbmgt1001.NBU_CLIENT.MAXJOBS.CBNBMGT1001
17 Apr 2020 10:08:49 AM - granted resource  cbnbmgt1001.NBU_POLICY.MAXJOBS.BLIS_Hot_Catalog
17 Apr 2020 10:08:49 AM - granted resource  MediaID=@aaaaf;DiskVolume=NBU_Catalog;DiskPool=CBDDPRDPA1001_NBU_Catalog;Path=NBU_Catalog;StorageServer=CBDDPRDPA1001;MediaServer=CBNBP1002
17 Apr 2020 10:08:49 AM - granted resource  CBDDPRDPA1001_NBU_Catalog
17 Apr 2020 10:11:24 AM - estimated 0 kbytes needed
17 Apr 2020 10:11:24 AM - Info nbjm (pid=6440) started backup (backupid=CBNBMGT1001_1587111084) job for client CBNBMGT1001, policy BLIS_Hot_Catalog, schedule Differential-Inc on storage unit CBDDPRDPA1001_NBU_Catalog
17 Apr 2020 10:11:25 AM - Info bpbrm (pid=10764) CBNBMGT1001 is the host to backup data from
17 Apr 2020 10:11:25 AM - Info bpbrm (pid=10764) reading file list for client
17 Apr 2020 10:11:25 AM - started process bpbrm (pid=10764)
17 Apr 2020 10:11:25 AM - connecting
17 Apr 2020 10:11:26 AM - Info bpbrm (pid=10764) starting bpbkar32 on client
17 Apr 2020 10:11:26 AM - connected; connect time: 0:00:00
17 Apr 2020 10:11:27 AM - Info bpbkar32 (pid=8628) Backup started
17 Apr 2020 10:11:27 AM - Info bpbkar32 (pid=8628) change time comparison:<disabled>
17 Apr 2020 10:11:27 AM - Info bpbkar32 (pid=8628) archive bit processing:<enabled>
17 Apr 2020 10:11:27 AM - Info bptm (pid=16016) start
17 Apr 2020 10:11:27 AM - Warning bpbrm (pid=10764) from client CBNBMGT1001: WRN - old TIR info file 'D:\Apps\Veritas\NetBackup\tir_info\D\Apps\Veritas\NetBackup\vault\sessions\NetBackup_file_info.BLIS_Hot_Catalog' is missing. Backing up everything in 'D:\Apps\Veritas\NetBackup\vault\sessions'
17 Apr 2020 10:11:27 AM - Warning bpbrm (pid=10764) from client CBNBMGT1001: WRN - old TIR info file 'D:\Apps\Veritas\NetBackup\tir_info\D\Apps\Veritas\NetBackup\var\global\customer_registration\NetBackup_file_info.BLIS_Hot_Catalog' is missing. Backing up everything in 'D:\Apps\Veritas\NetBackup\var\global\customer_registration'
17 Apr 2020 10:11:27 AM - Info bptm (pid=16016) using 787456 data buffer size
17 Apr 2020 10:11:27 AM - Info bptm (pid=16016) setting receive network buffer to 3150848 bytes
17 Apr 2020 10:11:27 AM - Info bptm (pid=16016) using 512 data buffers
17 Apr 2020 10:11:27 AM - Warning bpbrm (pid=10764) from client CBNBMGT1001: WRN - old TIR info file 'D:\Apps\Veritas\NetBackup\tir_info\D\Apps\Veritas\NetBackup\var\global\nbstserv\NetBackup_file_info.BLIS_Hot_Catalog' is missing. Backing up everything in 'D:\Apps\Veritas\NetBackup\var\global\nbstserv'
17 Apr 2020 10:11:27 AM - Warning bpbrm (pid=10764) from client CBNBMGT1001: WRN - old TIR info file 'D:\Apps\Veritas\NetBackup\tir_info\D\Apps\Veritas\NetBackup\var\global\reports\NetBackup_file_info.BLIS_Hot_Catalog' is missing. Backing up everything in 'D:\Apps\Veritas\NetBackup\var\global\reports'
17 Apr 2020 10:11:29 AM - Info bptm (pid=16016) start backup
17 Apr 2020 10:11:30 AM - Info bptm (pid=16016) backup child process is pid 16144.8428
17 Apr 2020 10:11:30 AM - Info bptm (pid=16144) start
17 Apr 2020 10:11:30 AM - begin writing
17 Apr 2020 10:11:45 AM - Warning bpbrm (pid=10764) from client CBNBMGT1001: WRN - old TIR info file 'D:\Apps\Veritas\NetBackup\tir_info\D\Apps\Veritas\NetBackup\var\global\telemetry\NetBackup_file_info.BLIS_Hot_Catalog' is missing. Backing up everything in 'D:\Apps\Veritas\NetBackup\var\global\telemetry'
17 Apr 2020 10:12:01 AM - Warning bpbrm (pid=10764) from client CBNBMGT1001: WRN - old TIR info file 'D:\Apps\Veritas\NetBackup\tir_info\D\Apps\Veritas\NetBackup\var\global\wmc\NetBackup_file_info.BLIS_Hot_Catalog' is missing. Backing up everything in 'D:\Apps\Veritas\NetBackup\var\global\wmc'
17 Apr 2020 10:12:01 AM - Warning bpbrm (pid=10764) from client CBNBMGT1001: WRN - old TIR info file 'D:\Apps\Veritas\NetBackup\tir_info\D\Apps\Veritas\NetBackup\var\global\wsl\config\NetBackup_file_info.BLIS_Hot_Catalog' is missing. Backing up everything in 'D:\Apps\Veritas\NetBackup\var\global\wsl\config'
17 Apr 2020 10:12:01 AM - Warning bpbrm (pid=10764) from client CBNBMGT1001: WRN - old TIR info file 'D:\Apps\Veritas\NetBackup\tir_info\D\Apps\Veritas\NetBackup\var\VxSS\NetBackup_file_info.BLIS_Hot_Catalog' is missing. Backing up everything in 'D:\Apps\Veritas\NetBackup\var\VxSS'
17 Apr 2020 10:12:01 AM - Warning bpbrm (pid=10764) from client CBNBMGT1001: WRN - old TIR info file 'D:\Apps\Veritas\NetBackup\tir_info\D\Apps\Veritas\NetBackup\var\VxUpdate\NetBackup_file_info.BLIS_Hot_Catalog' is missing. Backing up everything in 'D:\Apps\Veritas\NetBackup\var\VxUpdate'
17 Apr 2020 10:12:02 AM - Info bpbrm (pid=10764) from client CBNBMGT1001: TRV - last message being suppressed after 10 occurrences
17 Apr 2020 10:13:51 AM - Info bpbkar32 (pid=8628) 43981 entries sent to bpdbm
17 Apr 2020 10:13:59 AM - Info bptm (pid=16016) waited for full buffer 2943 times, delayed 8958 times
17 Apr 2020 10:14:02 AM - Info bptm (pid=16016) EXITING with status 0 <----------
17 Apr 2020 10:46:59 AM - Info bpbrm (pid=10764) validating image for client CBNBMGT1001
17 Apr 2020 11:18:51 AM - Info bpbkar32 (pid=8628) done. status: 0: the requested operation was successfully completed
17 Apr 2020 11:18:51 AM - end writing; write time: 1:07:21
the requested operation was successfully completed  (0)

 

I would like to add the bkbkar32 log as an attachment but i do not see any option for it. Please advise as its quite large.

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

Looking at the client backup job, I don't see job being queued, buth certainly a delay between resource allocation and job only starting after catalog job has finished at 11:18:51 :

17 Apr 2020 10:49:00 AM - granted resource  CBDDPRDPA1001_TLogsT1
17 Apr 2020 11:18:52 AM - estimated 0 kbytes needed
17 Apr 2020 11:18:52 AM - Info nbjm (pid=6440) started backup (backupid=CBBSP1002_1587115132) job for client CBBSP1002, policy BLIS_PRD_E012_04_TL, schedule Tlogs on storage unit CBDDPRDPA1001_TLogsT1

Looking at the above, my suspision is now with the resource broker (nbrb) on the master server or with OS resources, like memory exhaustion.

We have seen in older NBU versions (6.5.x and early 7.x) where there was recommended tuning for nbrb, but I believe those are no longer valid for current versions. 

Have you perhaps checked resource utilization on the master while the catalog backup is running? 

Here is a 10-year-old post for reference (please don't try any of the suggestions or click on any of the links):
https://vox.veritas.com/t5/NetBackup/Many-Jobs-delayed-quot-granting-resources-quot-and-quot/td-p/29...

I found a TN that was recently updated, but it still references old NBU versions. 
https://www.veritas.com/support/en_US/article.100025235

What could be handy in above TN are the references to nbjm and nbrb logs - seems those are going to be most relevant.
Please do not try to view these logs as regular text - use vxlogview to display user-readable format. 

I want to further suggest that you log a cal with Veritas Support to assist with troubleshooting. 
My guess is that they will request all of the legacy logs that we have mentioned here as well as unified logs like nbjm, nbrb and probably nbpem too. 

In view of the Job details that you posted, I do not see much value in bpbkar log at this stage. 
Just for the record, it is possible to zip large log files and use the 'Choose Files' link below a reply to upload files.