07-28-2014 01:37 AM
Hello all
We currently use disks (disk type BasicDisk) as primary storage for NetBackup. One or more BasicDisk storage units are grouped together and used in backup jobs; the storage unit selection is Round Robin. On each storage unit we defined a low and a high water mark, as well as a maximum number of concurrent jobs. These settings work well everywhere except on one storage unit, where the high water mark (currently set to 90%) is ignored and the disk fills up to almost 100%. The % full value is reported correctly, but the high water mark is still ignored. The second disk in this group is still below the low water mark, so there is still space left to write backups to.
Has anyone seen this before? The strange thing is that the high water mark is honored on every other storage unit, but not on this one.
Windows Server 2008 R2, NetBackup 7.6
Thanks in advance for any insight
Christian
07-28-2014 02:31 AM
What is the size of the basic disk storage unit?
And when it crosses the high water mark, is the EMM server still allocating this disk to new backup jobs? Or is the remaining 10% (90% to 100%) being used by jobs that were allocated and active while utilization was still below 90%?
What about the size of the backups that are still running after it reaches 90%?
07-28-2014 02:39 AM
Hello
You should lower that value to something far below 90%. NetBackup will only start duplicating images when it reaches 90%. It will duplicate and remove images until the low water mark is reached. The problem in most cases is that once 90% is reached, there is not enough space left to take the current backup set, and then you get disk full errors.
Are these storage units the same size?
Please see this previous post.
https://www-secure.symantec.com/connect/forums/netbackup-job-does-not-wait-disk-full-condition-processing
07-28-2014 03:51 AM
Sorry... must disagree with Riaan....
NBU will duplicate as per the staging schedule in the DSU configuration.
When the high water mark is reached, the oldest images that have been duplicated will be expired/cleaned out until the LWM is reached.
See:
Disk Staging Relocation Behavior: http://www.symantec.com/docs/TECH44719
Disk Staging Cleanup Behavior: http://www.symantec.com/docs/TECH66149
To troubleshoot DSSU issues, check that the following log folders exist on the media server: admin, bptm, bpdm.
On the master: admin.
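A quick way to check for those folders is sketched below. The log root shown is an assumption (a common default install location on Windows); adjust it to your own install path. The helper name `missing_log_dirs` is made up for this example and is not part of NetBackup.

```python
from pathlib import Path

# ASSUMPTION: typical default legacy log location on a Windows install.
# Replace with the actual <install_path>\NetBackup\logs on your server.
ASSUMED_LOG_ROOT = r"C:\Program Files\Veritas\NetBackup\logs"

MEDIA_SERVER_LOGS = ["admin", "bptm", "bpdm"]
MASTER_SERVER_LOGS = ["admin"]


def missing_log_dirs(log_root, names):
    """Return the names from `names` that are not existing folders under `log_root`."""
    root = Path(log_root)
    return [name for name in names if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_log_dirs(ASSUMED_LOG_ROOT, MEDIA_SERVER_LOGS)
    if missing:
        print("Missing log folders:", ", ".join(missing))
    else:
        print("All media server log folders exist.")
```

Run it on the media server with `MEDIA_SERVER_LOGS` and on the master with `MASTER_SERVER_LOGS`. It only reports which folders are absent and does not create anything.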
07-28-2014 04:15 AM
There are two units in this group, one with ~14 TB and one with ~26 TB, so they are not the same size. Both are set to a low / high water mark of 80% / 90%, and no staging is configured, as the files on this backup storage are short-term backups only.
As far as I can tell it still allocates backups to this disk. 90% of 14 TB equals roughly 12.6 TB, but according to our monitoring it still gets backup allocations once this is reached. The largest backup files are > 800 GB, so they are split into one 500 GB file and one smaller file.
Should I still enable admin, bptm and bpdm logs to troubleshoot low / high water mark?
07-28-2014 04:18 AM
PS: here's an image where you can see when the high water mark is crossed (Sunday ~9am)
07-28-2014 05:07 AM
.... no staging is configured ....
That is your problem.
High and Low water marks for Basic Disk only work when staging is configured and duplications have been successful.
There are 2 factors that determine the lifespan of images on disk:
Have you had a look at TECH66149 yet?
07-28-2014 05:08 AM
Apologies, duplication happens, but the removal / cleanup doesn't start until it hits 90%.
So your comment
"As far as I can tell it still alocates backup to this disk, 90% of 14 TB equals to something like 12.6 TB, but according to our monitoring once this is reached it still gets allocations of backups. The largest backup files are > 800 GB, so they are split in one 500GB and one smaller file."
It will not stop sending backups to the storage unit once it reaches 90%. It continues, but at that stage it will start trying to make space available. It will stop making space available when it hits the LOW mark.
So you'll see this is maybe a bit late, as you only have 1.4 TB left on the disk. If your backup for the day is 2.0 TB, you have a problem. If you lower this to 80% you'll have 2.8 TB, and at 70% you'll have 4.2 TB available at the time the cleanup would start. After that, depending on your LOW water mark, you might have double that. You can work it out.
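The headroom figures above can be reproduced with a few lines of Python. This is only the arithmetic from this post; the function name `headroom_tb` is invented for the example, and the 14 TB capacity is the smaller unit mentioned earlier in the thread.

```python
def headroom_tb(capacity_tb: float, high_water_mark_pct: float) -> float:
    """Free space in TB left at the moment the high water mark is reached."""
    return capacity_tb * (100.0 - high_water_mark_pct) / 100.0


CAPACITY_TB = 14.0  # the smaller storage unit from this thread

for hwm in (90, 80, 70):
    print(f"HWM {hwm}%: {headroom_tb(CAPACITY_TB, hwm):.1f} TB free when cleanup would start")
```

Compare the result with the size of a typical day's backups to pick a high water mark that leaves enough room.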
If you don't have staging then you're going to hit a full condition. Maybe you should consider creating an advanced disk pool where you can utilize both sets of disks in the same pool. Or split the policies between the storage units.
07-28-2014 11:51 AM
In my opinion, the high water mark does work with Basic Disk, but not to expire images (unless it is a disk staging storage unit). Its purpose is to stop the EMM server from allocating jobs to this storage unit.
In other words, the EMM server will not allocate new jobs to a DSU that has reached the high water mark value.
But jobs that were already allocated (while the STU was below the high water mark) will continue to write and can fill the DSU up to 100%.
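The behavior described above can be illustrated with a small toy model. This is not NetBackup code and makes no claim about how EMM is implemented; the class and function names are hypothetical, and the only rule modeled is "check the high water mark at allocation time, never while the job is writing".

```python
from dataclasses import dataclass


@dataclass
class StorageUnit:
    capacity_gb: float
    used_gb: float
    high_water_mark_pct: float

    def pct_full(self) -> float:
        return 100.0 * self.used_gb / self.capacity_gb

    def accepts_new_jobs(self) -> bool:
        # Evaluated only when a job is assigned, not during the write.
        return self.pct_full() < self.high_water_mark_pct


def run_job(unit: StorageUnit, size_gb: float) -> bool:
    """Try to allocate a job to `unit`; return True if it was allocated."""
    if not unit.accepts_new_jobs():
        return False
    # Once allocated, the job keeps writing until done or the disk is full.
    unit.used_gb = min(unit.capacity_gb, unit.used_gb + size_gb)
    return True


if __name__ == "__main__":
    # 14 TB unit at 89% full with a 90% high water mark (values from this thread).
    unit = StorageUnit(capacity_gb=14000, used_gb=12460, high_water_mark_pct=90)
    print(run_job(unit, 800), f"{unit.pct_full():.1f}% full")  # allocated below HWM
    print(run_job(unit, 800), f"{unit.pct_full():.1f}% full")  # refused above HWM
```

In this sketch the first 800 GB job is accepted at 89% and pushes the unit well past 90%, while the next job is refused. With several concurrent jobs allocated just below the mark, the disk can fill completely.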
07-29-2014 12:00 AM
Marianne, Nagalla
Many thanks for your comments. It makes sense that already allocated jobs will complete even above the high water mark, so the storage unit gets filled beyond it. I will look into advanced disk pools and staging to resolve this.
07-29-2014 12:07 AM
I think Nagalla has summed it up very well.
HWM does indeed apply to Basic Disk, but only as far as the initial assignment of jobs is concerned.
If the disk is 89% full when a job starts, the job will not stop when the 90% HWM is reached during the backup. No cleanup can happen because no staging/duplication has been configured, so the backup continues to write until the disk is 100% full.
So, yes, you need to add staging for cleanup to work when HWM is reached.