Backups hang or fail with error code 196, and very few backups show a byte count

Dany123
Not applicable

Hi Team,

Master Server OS is Solaris

Media server OS is Solaris

NetBackup version on the master and media servers is 7.6.0.4

We have one master server and two media servers. There are 24 tape drives in our environment; 19 of them are working and the other 5 are faulty and need to be replaced.

 

We set "Maximum concurrent write drives" to 24 and "Maximum streams per drive" to 25.

We set "Media multiplexing" in the policy schedules to 3.

We enabled "Allow multiple data streams" in the policy attributes for almost all policies.

 

The issue: about 800 backups sit in the queued state every day, and we get many error code 196 failures daily.

If 120 backups are in the active state, only 12-15 of them show a byte count.

 

This started happening one week ago.

 

Accepted Solutions

RamNagalla
Moderator
Partner    VIP    Certified

"Maximum streams per drive" is the multiplexing value, and 25 is very high. That is never recommended.

What multiplexing value is set in the policy schedules? Do not set it higher than 6; a better value is 4.
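
As a rough illustration of how the two settings interact: to my understanding NetBackup applies the lower of the storage unit's "Maximum streams per drive" and the schedule's "Media multiplexing", so with the values from the question the effective multiplexing would be 3, not 25. The Python sketch below only does that arithmetic. The function names are made up for illustration and are not part of any NetBackup tool.

```python
def effective_multiplexing(max_streams_per_drive: int, schedule_mpx: int) -> int:
    """Assumption: the lower of the two settings is the one that applies."""
    return min(max_streams_per_drive, schedule_mpx)


def max_concurrent_streams(working_drives: int,
                           max_streams_per_drive: int,
                           schedule_mpx: int) -> int:
    """Upper bound on jobs that can write to tape at the same time."""
    return working_drives * effective_multiplexing(max_streams_per_drive,
                                                   schedule_mpx)


# Values taken from the question: 19 working drives, 25 streams per drive
# on the storage unit, media multiplexing 3 in the schedules.
mpx = effective_multiplexing(25, 3)
streams = max_concurrent_streams(19, 25, 3)
print(mpx, streams)  # 3 57
```

This is only an upper bound. How many jobs really write at once also depends on how the drives are shared between the media servers and on any other job limits in the environment.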

"Allow multiple data streams" lets you trigger multiple jobs for each client, which results in more processing time, more jobs, and more failures. I would suggest removing the "Allow multiple data streams" option from the policies, since you have only 19 tape drives (24 once the faulty ones are fixed). This should reduce the number of jobs (and in turn the number of failures).

Also check whether any slow backups or slow clients are holding the drives for a long time. If you find any, fix them to get the jobs running faster.

Also check the backup windows and see whether you can increase the backup window times as per your requirements.

Once all this is done, if you still see the 196 failures, then it is time to verify the tuning and capacity planning:

-> How much data needs to be backed up every day?

-> How much data can the drives back up every day?

-> Are the drives running at optimal speed? Can anything be done to improve the drive/backup speed?

If the amount of data is more than the drives can back up with good performance, then it is time to expand the storage capacity.
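
The capacity questions above can be turned into a quick back-of-the-envelope check. This is a minimal sketch, assuming every drive sustains the same average speed for the whole window; the drive speed, window length, and data volumes below are hypothetical placeholders, not figures from this environment.

```python
def daily_drive_capacity_gb(drives: int,
                            mb_per_sec_per_drive: float,
                            window_hours: float) -> float:
    """Data (GB) the drives can write in one backup window at a sustained rate."""
    seconds = window_hours * 3600
    return drives * mb_per_sec_per_drive * seconds / 1024


def window_is_sufficient(daily_data_gb: float,
                         drives: int,
                         mb_per_sec_per_drive: float,
                         window_hours: float) -> bool:
    """True if the daily data volume fits into what the drives can write."""
    capacity = daily_drive_capacity_gb(drives, mb_per_sec_per_drive, window_hours)
    return daily_data_gb <= capacity


# Hypothetical numbers: 19 drives, 100 MB/s sustained per drive, 8-hour window.
capacity = daily_drive_capacity_gb(19, 100.0, 8.0)
print(f"{capacity:.1f} GB per window")  # 53437.5 GB per window
print(window_is_sufficient(60000.0, 19, 100.0, 8.0))  # False
```

Use the sustained speed you actually measure in your job details rather than the rated drive speed, otherwise the result will be far too optimistic.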

 


RonCaplinger
Level 6

This looks like your first post, so please excuse the assumption that you are new to NetBackup if that is incorrect.

The NetBackup environment is not set up properly and is causing backups to perform slowly. That in turn causes the error code 196 (the backup window closed before the job could start), because the jobs ahead of it in the queue have not been able to complete in a timely manner.  Adding tape drives to an environment does NOT increase how much data can be backed up in the backup window.  There must be an equivalent amount of bandwidth from all parts of the backup environment.

First, Solaris is not the best choice for a media server.  It works fine for the Master, but media servers need higher Input/Output bandwidth that, in my experience, a 64-bit Linux or 64-bit Windows OS can handle better.  Consider changing those media servers to Linux if at all possible.

Second, I know of no media server that can handle enough throughput to send data to more than 2 or 3 tape drives at once.  I would suggest reducing the number of concurrent write drives to just TWO drives per media server.  And as Ram suggested, change the backup policies to not allow multiple streams.  See if the tape drives are able to back up some clients faster than they have been.  If so, then consider adding more media servers, with only 2 or 3 tape drives per media server. If not, then your clients likely cannot stream data fast enough and you should investigate the client side more.
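
To put the point about drives per media server into numbers, here is a small sketch. It assumes you know roughly how much data a media server can move end to end and how fast a drive has to be fed to keep streaming. Both figures below are hypothetical, and the function name is only for illustration.

```python
import math


def drives_kept_streaming(server_throughput_mb_s: float,
                          drive_speed_mb_s: float) -> int:
    """How many drives one media server can feed at full speed."""
    if drive_speed_mb_s <= 0:
        raise ValueError("drive speed must be positive")
    return math.floor(server_throughput_mb_s / drive_speed_mb_s)


# Hypothetical: a media server that moves 400 MB/s in total, feeding
# drives that each need 160 MB/s to keep streaming.
print(drives_kept_streaming(400.0, 160.0))  # 2

# 24 drives shared evenly between the 2 media servers from the question.
print(24 // 2)  # 12 drives per media server
```

If the first number comes out much lower than the second, the extra drives mostly compete for the same bandwidth instead of adding throughput.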

Once you are able to see better throughput, then start allowing multiple data streams in the policy, but no higher than about 4 streams.  Going higher than that will only hurt performance.

And at all times, monitor the master and media server CPU and network I/O speeds, especially during the backup window.  If you are consistently seeing >70% CPU usage, or 100% network bandwidth, those indicate it is time to either add more powerful media servers, or look at disk-to-disk-to-tape backups, like the Symantec NetBackup appliances.
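
One way to apply those thresholds to numbers you have already collected (for example with sar or vmstat during the backup window) is sketched below. How to read "consistently" is my assumption: here it means at least 80% of the samples are over the threshold, and network saturation is treated as anything above 95% utilisation. The sample values are invented.

```python
from typing import Sequence


def consistently_above(samples: Sequence[float],
                       threshold: float,
                       min_fraction: float = 0.8) -> bool:
    """True if at least min_fraction of the samples exceed the threshold."""
    if not samples:
        return False
    over = sum(1 for s in samples if s > threshold)
    return over / len(samples) >= min_fraction


# Invented utilisation samples (percent) taken during a backup window.
cpu_samples = [82, 75, 91, 68, 88]
net_samples = [60, 72, 99, 65, 70]

print(consistently_above(cpu_samples, 70))  # True: 4 of 5 samples over 70%
print(consistently_above(net_samples, 95))  # False: 1 of 5 samples over 95%
```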

