Multiple Clients Failing - VMware - status 196

mhaif
Level 2

Hi All, 

I am pretty new to Veritas NetBackup and have been dealing with status 196 in a client environment for the past month. Multiple policies are failing because jobs are still queued when the backup window closes.

The failing policies are VMware policies configured to use the vSphere Intelligent Policy backup method with annotations. I raised a Veritas support case, and based on his investigation the TSE advised reducing the number of backup jobs from each policy running at any given time.

However, at least for now this cannot be done because the client requires that no backups run during business hours.

Currently the backups are set to full backups, with sizes of 4 TB, 2 TB, and 1 TB.

Can anyone offer assistance or share knowledge to help troubleshoot and diagnose this issue?

Happy to provide more information if that helps.

Many Thanks!

4 REPLIES

Marianne
Moderator
Partner    VIP    Accredited Certified

@mhaif 

Status 196 normally means that backups were queued, but there were no resources that became available before the backup window closed.

The first step is to look at the reason the jobs are queued. Can you share that, please?
How many jobs are going active at a time?
How many concurrent jobs are allowed by the Storage unit config?
What type of backup storage?
What is the backup throughput that you are seeing in Activity Monitor?
One or more than one media server?

Similar problem posted some time ago:
https://vox.veritas.com/t5/NetBackup/Backups-failing-due-a-196-error/m-p/869566

 

Hi Marianne, 

Thank you for your reply. I was on a call with a Veritas TSE last week, and apparently the issue was under Master Server Host Properties > Resource Limits: the Snapshot and Datastore limits were set very low (4, to be exact). We have now increased these values to 25. I/O is set to 120, and the policy job limits are the same as before.

We made the change to accommodate the number of backup jobs running in the environment, and this has helped significantly. I am now looking at slow backup speeds and am considering checking the datastores and the speeds between them and the NBU host.

Marianne
Moderator
Partner    VIP    Accredited Certified

You are in a catch-22 situation.
I understand that you need to activate all VM backup jobs within the backup window, but overloading a single datastore with too many simultaneous backups will slow down performance.

See https://www.veritas.com/content/support/en_US/doc/21902280-127283730-0/v19546127-127283730
VMware recommends that you run no more than four simultaneous backups of virtual machines that reside on the same datastore.

Have a discussion with the VMware admin to agree on Resource Limits that best fit this environment.

What type of backup storage is in this environment?
Do you have dedupe storage where Accelerator and Changed Block Tracking can be used to speed up backups?

I would strongly recommend reducing your datastore limit back down to 4 and incrementing it upwards by 1-2 at a time. As Marianne suggested, work with your VMware team to find a good number, as overloading the datastore can bring your array down, and then EVERYBODY is going to be very unhappy.
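To get a feel for the trade-off before touching the real limits, here is a rough Python sketch of a single datastore. It is a toy model and not how NetBackup schedules jobs: the function name `simulate`, the throughput figures, the 8-hour window, and the mix of twelve VMs are all assumptions made up for illustration. It assumes running jobs share the datastore's throughput equally, and that a job still queued when the window closes ends with status 196.

```python
import heapq

def simulate(sizes_tb, limit, window_h, per_job_tb_h, datastore_tb_h):
    """Toy model of one datastore with a concurrent-backup limit.

    Returns (never_started, last_finish_h): jobs still queued when the
    window closed, and the hour at which the last started job finishes.
    """
    # Assumption: running jobs share datastore throughput equally, capped per job.
    rate = min(per_job_tb_h, datastore_tb_h / limit)
    running = []        # min-heap of finish times, in hours after the window opens
    started = 0
    last_finish = 0.0
    for size in sizes_tb:
        # A job starts at once if a slot is free, else when the earliest running job ends.
        start = 0.0 if len(running) < limit else running[0]
        if start >= window_h:
            break       # window closed; this job and all later ones never start
        if len(running) >= limit:
            heapq.heappop(running)
        finish = start + size / rate
        heapq.heappush(running, finish)
        last_finish = max(last_finish, finish)
        started += 1
    return len(sizes_tb) - started, last_finish

if __name__ == "__main__":
    vms = [4, 2, 1] * 4     # hypothetical mix of 4 TB, 2 TB and 1 TB VMs
    for limit in (2, 4, 8):
        queued, done = simulate(vms, limit, window_h=8,
                                per_job_tb_h=0.5, datastore_tb_h=2.0)
        print(f"limit={limit}: {queued} jobs never started, last job ends at hour {done:.0f}")
```

With these made-up numbers, raising the limit from 4 to 8 leaves fewer jobs queued at window close but pushes the last finish time later, which is the trade-off described above. Real arrays can degrade much worse than an equal split under load, so treat this only as a way to reason about the direction of the effect.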

Status 196s are not backup "failures", but failures to back up (if that makes sense). There is no one simple solution to rectifying them, as they are usually indicative of architectural and configuration issues.

Do you see some of the backups from the policy completing successfully, then the last jobs to kick off finishing with status 196, or does the entire policy fail with 196 after sitting queued? How many jobs are running at one time? Do the snapshots complete successfully and then the backups hang, or do the snapshots not initiate either? Are all of your VM policies kicking off simultaneously, or are they staggered? There are lots of variables in play here, such as I/O limitations, network constraints, policy and server config, etc.

NBU can handle 1000s of VM backups in a 12-16 hour "off-hour" setup, so you should be able to accommodate the request of only backing up during non-production hours.
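As a quick sanity check on whether a window is realistic, the average aggregate throughput needed is just total data divided by window length. A minimal sketch, where the helper name `required_mb_per_s` is made up for illustration and the 7 TB total assumes only one VM of each size mentioned in the original post (the real VM count was not given):

```python
def required_mb_per_s(total_tb, window_hours):
    """Average aggregate throughput (MB/s) needed to move total_tb within window_hours."""
    total_mb = total_tb * 1_000_000     # decimal units: 1 TB = 1,000,000 MB
    return total_mb / (window_hours * 3600)

if __name__ == "__main__":
    total_tb = 4 + 2 + 1    # assumed: one VM each of 4 TB, 2 TB and 1 TB
    for window in (12, 16):
        print(f"{window} h window: {required_mb_per_s(total_tb, window):.0f} MB/s needed")
```

Comparing that figure with the throughput Activity Monitor actually reports, summed across concurrent jobs, gives a first indication of whether the bottleneck is the window or the data path.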