Storage space is getting filled up and no space is being reclaimed by any of the Recovery Solution jobs

ShaunyaChavis · ‎04-18-2009

Problem/Symptoms

The storage locations are filling up with blob files. None of the space is ever being reclaimed.
- or -
Blobs are not shrinking after computer accounts are marked for deletion.

Environment

NS 6.0.6074 SP3
RS 6.2.2106
RS 6.2.2332
RS 6.2.2760

Cause

There can be many causes for this.

Server Space Management job was not scheduled to run.
There are old accounts that need to be cleaned up in “Manage Lost Recovery Solution Clients”.
Unnecessary files are being backed up because the exclude list has not been customized to fit the end users' needs.

Resolution

Here are some storage space optimization practices for reclaiming used space and helping to prevent the storage volume from filling up.

Schedule the SSM, IC, and Delete jobs to run once a week. The Server Space Management (SSM) job runs a Delete job when it finishes; however, if the SSM job does not finish, space will not be reclaimed unless the Delete job is also scheduled independently. Jobs are enabled under “Configuration>Solution Settings>Incident Management>Recovery Solution>Recovery Solution Clusters>Recovery Solution Cluster Configuration”: click on your cluster, then open the “Server Jobs Schedule” sub tab. If you are using RS 6.2.2332 or later, also make sure that “Force deletion of the recently excluded files” is checked under the Server Space Management job.
Make sure the Storage Space’s default rules are enabled and have an appropriate time frame in the “Delete snapshots older than” and “Delete files from the Recovery Server” settings. This is listed under “Configuration>Solution Settings>Incident Management>Recovery Solution>Recovery Agent Settings>Default Recovery Agent Settings” under the “Space Management” sub tab.
Use the “File Extensions Backed Up by Cluster” report under “Reports>Incident Management>Recovery Solution>Server Reports>File Extensions Backed Up by Cluster” to streamline the exclude list. Adding the extensions that use the most space to the excludes reclaims the most space.
Note: This report takes a great deal of SQL cycles and may time out in large database or blob environments.
Note: *.log files are commonly among the top ten space users. If *.log is added to the excludes to reclaim space, add WinFAL.log to the exceptions to this exclude. The data in this log file is used by the RS in the creation of some FSR data (see KB 41066).
Use the “Cluster Disk Space Currently Used” report under “Reports>Incident Management>Recovery Solution>Server Reports>Cluster Disk Space Currently Used” to discover the computers that are using the most space. There may be computers in this list that are no longer in the network or are not a priority to back up. To mark these accounts for deletion, right click on the computer and choose “Recovery Solution tasks>Administration>Mark Computer Account for Deletion…” Note: This report takes a great deal of SQL cycles and may time out in large database or blob environments.
Discover why machines are in the “Manage Lost Recovery Solution Clients” list and clean up the list. Go to “Configuration>Solution Settings>Incident Management>Recovery Solution>Recovery Agent Settings>Manage Lost Recovery Solution Clients.” KB article 34389, Why are my computers in the “Manage Lost Recovery Solution Clients”, discusses some common causes.
Set up the NS Purging Maintenance to delete inactive machines after 60 days instead of retiring them. This will also help to reclaim licenses that are no longer in use. “Purging Maintenance” is found under “Configuration>Server Settings>Notification Server Settings>Purging Maintenance”.
Make sure there are not duplicate machines in the NS or RS. There are a few articles about finding and merging duplicate computers in the NS: 2093, 17630, 22745, 31766, and more.
Before you run out of storage space, set a fault limit that is less than the size of the storage. Don't run out of space if you can help it. Snapshots and server jobs take space and are less efficient if the storage space is near or at 100%. If you have a fault limit, you get a heads up that you have a storage problem before you are completely out of space. If you can reserve 5% or 10% of space, you will be able to give the RS some of the reserved room while you are working on freeing up space.
If there are machines which do not need to be backed up as much as others, it is a good idea to prioritize machines by deleting the less essential machine accounts. It is better to continue the backups of mission critical machines than to run out of space because of machines on which the OS can simply be reinstalled and the user can continue with work as usual.
Make sure you have the latest related hotfixes installed, such as KB 32726 “HotFix #4: Recovery Solution 6.2.2332 - Duplicate blobs are created in the same storage group”.
If the previous steps are unable to free up space, the space is being legitimately used and more storage will need to be added.
If there are specific files that need to be deleted from the server, they can be marked for deletion at the cluster level using the cluster's right click menu option “Recovery Solution Tasks>Administration>Mark Files for Deletion...”. The cluster is found in “Configuration>Solution Settings>Incident Management>Recovery Solution>Recovery Solution Clusters>Recovery Solution Cluster Configuration”. Remember that if the file still exists on the client, it will be backed up again during the next snapshot. This option is similar in function to the “Force deletion of the recently excluded files” option that was added in RS 6.2.2332, but it can be found in previous versions of RS.
It is recommended to store the SQL database on a different IDE channel than the blobs. A large database file can cause server jobs to run for days. There are related articles about keeping the database size under control, such as KB 29130. The transaction log of the database can also be a size and performance problem, which is discussed in KBs 32005 and 1068.
Check for unreferenced blob files by comparing the count of blobs in the storage volume to the result of this query:
USE AeXRSDatabase;
SELECT COUNT(*) FROM DiskFile;
This is discussed in a draft article: 35861.
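The file-count side of the comparison above can be scripted. The following is a minimal sketch, not a supported tool: it assumes each blob is stored as one regular file and that the storage directory contains nothing but blobs. The function names and the example path are hypothetical, and the DiskFile row count is expected to be supplied by hand from the query result.

```python
import os


def count_files(root):
    """Count regular files under root, recursing into subdirectories."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def unreferenced_estimate(files_on_disk, diskfile_rows):
    """Return how many more files are on disk than rows in DiskFile.

    Assumption: one DiskFile row per blob file. Never returns a negative
    number; a database count above the file count is a different problem.
    """
    return max(0, files_on_disk - diskfile_rows)


# Example use (hypothetical storage path and row count):
#   on_disk = count_files(r"D:\RSStorage")
#   print(unreferenced_estimate(on_disk, 1250000))
```

A result well above zero suggests that the storage volume holds blob files the database no longer references. Treat the number as an indication only, since any non-blob file in the directory inflates the count.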

Note: Remember that you will not see disk space freed up after changing settings until the next Delete job has completed successfully. In some cases, the affected clients must also have updated their NS policy configuration before that Delete job runs.
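For the fault limit practice described above (reserving 5% or 10% of the storage), the arithmetic can be sketched as follows. This is only an illustration of how a limit value might be worked out before entering it in the console; the function names and the example drive path are hypothetical, and the script does not read or change any Recovery Solution setting.

```python
import shutil


def fault_limit_bytes(total_bytes, reserve_percent=10):
    """Return the limit that leaves reserve_percent of the volume unused.

    Integer arithmetic is used so the result is exact.
    """
    if not 0 <= reserve_percent < 100:
        raise ValueError("reserve_percent must be in the range 0..99")
    return total_bytes * (100 - reserve_percent) // 100


def over_limit(used_bytes, total_bytes, reserve_percent=10):
    """True when usage has reached or passed the computed limit."""
    return used_bytes >= fault_limit_bytes(total_bytes, reserve_percent)


if __name__ == "__main__":
    # Hypothetical storage volume; replace with the real storage path.
    usage = shutil.disk_usage("D:\\")
    print("Suggested limit (bytes):", fault_limit_bytes(usage.total, 10))
    print("Already over limit:", over_limit(usage.used, usage.total, 10))
```

With a 10% reserve on a 1000 GB volume, the limit works out to 900 GB, which leaves 100 GB that can be released to the RS while space is being freed.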
