SSR 2011 fails to clean up .v2i recovery points after a backup

PCJunky
Level 4

We have Symantec System Recovery 2010/2011 across several sites, and we have noticed a common problem where previous recovery points do not get cleaned up after a backup as per the rules in the backup job.

Is there anything else we can do to make this more robust without resorting to post-backup scripts that delete files? Those would mess up SSR's recovery point history and run the risk of deleting the last good backup if the current backup fails.

Currently we are aware of only two places where we can automate this process: in the backup job you can set the number of recovery points to keep, and in the manage backup destination settings you can set a threshold based on capacity.

So, for example, I am creating a backup that is 350 GB (final compressed .v2i files), and I have set "Limit the number of recovery points for this backup" to 1, set the "manage destination settings" to "monitor disk space usage for the backup storage" with the threshold at 360 GB, and also enabled "automatically optimize storage".

Have I missed something here or is there a better way we should be doing this?

From what I can see in the logs, when it fails, SSR seems to think the backup device is unavailable at the point where the process should be purging the previous day's .v2i files. However, the backup device in this specific example is an Iomega NAS drive on a 1 Gb LAN...

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions

PCJunky
Level 4

Incremental fixes this, I guess by working round the main issues of a big single backup - you still have to get that initial snapshot image, but once that's done you're set and the backups are far more reliable.

Thoughts:

SSR has issues creating large single backup images to a network device, so ideally you want to be using SSR with a locally attached device. Further evidence suggests it's more of a speed issue, though, so a large backup to a slow local device, say a USB 1 hard disk, will have similar issues. SSR reporting the network connection dropping is a red herring, as separate diagnostics show no network dropouts at the times SSR says there were. It's more likely a timeout that SSR then interprets as a network dropout.

When it starts to go wrong, it goes wrong quite quickly, over the space of one or maybe two weeks, as SSR slowly fills the drive. This is mostly due to the old backup cleanup never taking place when a job crashes or fails, compounded by SSR not properly managing the backup location or being able to intelligently groom it - see http://www.symantec.com/docs/TECH167758, a known issue with SSR not grooming old backups. As grooming works OK for incremental backups and the issue only affects single image backups, it's another reason to go with incremental.

SSR was meant to be a low-cost, set-up-and-forget backup solution for our small business customers who have no internal IT skills and who wanted single image 'snapshot' backups. While I cannot sing the praises of BE enough, and both Symantec's telephone and forum support are without equal in our experience, sadly SSR has missed the mark for us.

Here's hoping SSR 2012 continues to build on this and overcomes these issues...


42 REPLIES

criley
Moderator
Employee Accredited

So, for example, I am creating a backup that is 350 GB (final compressed .v2i files), and I have set "Limit the number of recovery points for this backup" to 1, set the "manage destination settings" to "monitor disk space usage for the backup storage" with the threshold at 360 GB, and also enabled "automatically optimize storage".

I'm not sure this is going to work. Even though you have limited the number of recovery points to 1, you still need enough room for 2. This is because the 2nd recovery point needs to be created before we remove the first.

Also, are you doing incrementals as well or just full (independent backups) each day? If incrementals as well, you obviously need to factor in the approx size of those when you consider setting the threshold for space used.
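
As a rough sanity check of the sizing point above, here is a minimal Python sketch. It assumes, as described in this reply, that the new recovery point is written before the old one is removed. The function name `required_space_gb` is hypothetical and not part of SSR; the sizes are the ones quoted in the original post.

```python
def required_space_gb(full_size_gb, keep, incremental_sizes_gb=()):
    """Peak space needed at the backup destination, in GB.

    Assumption (taken from the reply above): the new recovery point is
    created before the oldest one is removed, so keep + 1 full backups
    exist side by side for a while.
    """
    return (keep + 1) * full_size_gb + sum(incremental_sizes_gb)


full_gb = 350        # one compressed full backup, from the original post
keep = 1             # "Limit the number of recovery points" setting
threshold_gb = 360   # threshold configured in the destination settings

needed = required_space_gb(full_gb, keep)
print(f"peak space needed: {needed} GB, threshold: {threshold_gb} GB")
if threshold_gb >= needed:
    print("threshold leaves room for the overlap")
else:
    print("threshold is too small for the overlap")
```

With the numbers from the original post the peak comes out at 700 GB, so a 360 GB threshold is well short of what the overlap needs. If incrementals are in use, pass their approximate sizes as the third argument.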

RS
Level 3

I was testing this today.  Last time, I ended up using an independent recovery point, where I used a script to delete the previous backup files before the backup starts.

Backup Job

  • Folder: C:\Recovery point set
    Using a local folder for testing this time.
  • [X] Limit the number of recovery point sets saved for this backup
    Maximum: 1 (one incremental recovery point - Auto Incremental)
  • Start a new recovery point set (base) - Auto Base:
    Monthly

Backup Destination

  • [X] Monitor disk space usage for backup storage.
    Threshold amount of disk space to use for backup storage:
    20,0 GB
  • [X] Automatically optimize storage.
  • [_] Delay changes until next backup.
    Confirm that recovery points will be automatically cleaned up.

Results after running 4 scheduled backups (changing the schedule, in case there is some special treatment using Run now):

Space currently used for recovery point storage:
40,0 GB

Recall that the threshold is 20,0 GB.

Files in the backup destination:

10 GB - NS_C_Drive001.v2i (Auto Base)
10 GB - NS_C_Drive001_i001.iv2i (Auto Incremental)
10 GB - NS_C_Drive001_i002.iv2i (Auto Incremental)
10 GB - NS_C_Drive001_i003.iv2i (Auto Incremental)

Basically, it doesn't do what it says it should do: here we have three incrementals, when I would expect either an error message (threshold exceeded) or only one incremental.

Version: 10.0.1.41704

PS: I also see "Consolidate incrementals: Never" in the backup job properties, but I haven't found the setting for it anywhere, including the user guide.
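
For anyone wanting to repeat this kind of check without reading sizes off the console, here is a minimal read-only Python sketch that totals the .v2i and .iv2i files in a destination folder and compares the total to a threshold. It deletes nothing. The helper names are hypothetical, and treating 1 GB as 1024^3 bytes is an assumption, since the thread does not say which unit SSR uses for its threshold.

```python
import sys
from pathlib import Path

# Base images (.v2i) and incrementals (.iv2i), as seen in the listing above.
RECOVERY_POINT_SUFFIXES = {".v2i", ".iv2i"}


def recovery_point_usage(folder):
    """Return (total_bytes, files) for recovery point files directly in folder."""
    files = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in RECOVERY_POINT_SUFFIXES
    )
    return sum(p.stat().st_size for p in files), files


def over_threshold(folder, threshold_gb):
    """True when recovery point files use more than threshold_gb.

    Assumption: 1 GB is counted as 1024**3 bytes.
    """
    total_bytes, _ = recovery_point_usage(folder)
    return total_bytes > threshold_gb * 1024 ** 3


if __name__ == "__main__":
    if len(sys.argv) == 3:
        folder, threshold_gb = sys.argv[1], float(sys.argv[2])
        total_bytes, files = recovery_point_usage(folder)
        for p in files:
            print(f"{p.stat().st_size / 1024 ** 3:8.2f} GB  {p.name}")
        print(f"total: {total_bytes / 1024 ** 3:.2f} GB, threshold: {threshold_gb} GB")
        print("OVER threshold" if over_threshold(folder, threshold_gb) else "within threshold")
    else:
        print("usage: python check_usage.py <destination folder> <threshold in GB>")
```

Run against the test above it would be called with the destination folder and 20 as arguments. Because it only reads file sizes, it should not interfere with SSR's own recovery point history.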

criley
Moderator
Employee Accredited

@RS

We have a known issue that is specific to independent backups:

http://www.symantec.com/docs/TECH167758

Backing up C to itself is obviously not good practice. Can you test backing up to a different location to see if you see the same results?

RS
Level 3

@Chris

Noted, however, the post refers to a recovery point set, not an independent recovery point, so it is perhaps broken in both cases.

 

I originally got the same result with removable RDX cartridges, that's why I had to switch to using an independent recovery point, with a custom script to delete the previous set.

 

Update: I changed the backup destination settings to Warn me when backup storage exceeds threshold.

While I did not get a warning on-screen when the backup started, the log file dumps a line with the following errors:

Error EC8F17E5 -> Error E0BB0083 -> Error E7640001 -> EBAB03F1 -> 0x800700E

..saying that there is not enough storage space, and recommending that the user start the cleanup process.
 

[Clean Up...] is ghosted/disabled, however.

Whether it backs up to the same drive (C) or not, it should follow the rules with the threshold.

PCJunky
Level 4

Thanks for the quick response. The first backup ran successfully last night. Something I forgot to add is that in this example it's two servers backing up to the same location/NAS, albeit in separate folders.

\\nas\\sharename\server01

\\nas\\sharename\server02

Sharename is the day of the week, so mon/tues/wed/thurs/fri. The NAS has 1.8 TB formatted capacity, and the total of both server backups with standard compression is 350 GB, hence the need for it not to leave any files behind, as 5 x 350 GB is 1.75 TB.

I see your point about the folder size, so I have upped the share size limit to 700 GB per share. This will allow for two copies, but then offers no protection from the problem we started with...
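
The arithmetic behind that concern can be sketched in a few lines of Python. The figures are the ones given in this post; `worst_case_gb` is a hypothetical helper, and the worst case simply assumes every daily share ends up holding the stated number of copies.

```python
NAS_CAPACITY_GB = 1800   # 1.8 TB formatted capacity, from the post
DAILY_TOTAL_GB = 350     # both servers combined, standard compression
SHARES = ["mon", "tues", "wed", "thurs", "fri"]


def worst_case_gb(copies_per_share):
    """Space used if every daily share holds this many copies of the backup."""
    return len(SHARES) * copies_per_share * DAILY_TOTAL_GB


for copies in (1, 2):
    total = worst_case_gb(copies)
    verdict = "fits" if total <= NAS_CAPACITY_GB else "does not fit"
    print(f"{copies} per share: {total} GB of {NAS_CAPACITY_GB} GB ({verdict})")
```

One copy per share comes to 1750 GB, which just fits. Two copies per share, which is what a 700 GB limit per share permits, comes to 3500 GB, so the per-share limits alone cannot stop the NAS filling up if cleanup keeps failing.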

It will now take 6 more days before we have gone full circle and are rewriting to a previously used share.

Backup type is full (independent backups) each working day, no backups at the weekend.

Thanks

*NOTE: there was only Chris's initial response when I typed this...

criley
Moderator
Employee Accredited

@RS

Yes, I know the original post was regarding recovery point sets. I was just making you both aware of the existing known issue.

Whether it backs up to the same drive (C) or not, it should follow the rules with the threshold.

Whilst I agree with you, it's still not good practice and not something that 99.9% of customers will be doing. If you see the same issue when backing up to a different location, then obviously we can look at that in more detail.

criley
Moderator
Employee Accredited

@PCJunky

Backup type is full (independent backups) each working day, no backups at the weekend.

So it sounds as though you have hit this problem then..?

http://www.symantec.com/docs/TECH167758

PCJunky
Level 4

As I see it, there are two separate tools to manage this problem. In the job itself, it should remove previous recovery points based on rules, so on an independent backup set to 1 recovery point there should be no more than 1 set of recovery points for the last good backup at any given time - this sometimes has issues and files get left behind.

The second tool is the one covered by http://www.symantec.com/docs/TECH167758, which currently does not work for independent recovery points, hence it won't work for us. The result is that we are not seeing the safety net of that second 'pruning' of recovery points taking place.

Would that about sum it up?

Thanks

criley
Moderator
Employee Accredited

I guess so, yes.

Can you get back to us and confirm if you are still having issues with the 'Limit number of recovery points..' option after ensuring you have enough disk space for at least 2 full backups?

PCJunky
Level 4

Day 2 of the new 'clean' backup routine and everything is still working as expected. However, on one of the two servers backing up I have noticed the following:

Image of Status Page

This backup was first set up on the 2nd and ran at 18:30 successfully. The second backup (last night) ran on the 3rd at 18:30 and completed successfully - now compare those facts with what the status screen is telling us...

Not the first time I have seen this. Am I missing something?

Thanks

criley
Moderator
Employee Accredited

Hmm, strange. You are certain the backup was successful?

Any errors in the event logs around the time of backup start/end?

What version of SSR (check in Help/About) on this server?

PCJunky
Level 4

Server 1: (screenshot not preserved)

Server 2: (screenshot not preserved)

PCJunky
Level 4

Server 1's backup completed on Friday, well within the expected time frame and 4 hours before server 2's job started.

Server 2's backup started at 5:30 am on Saturday morning and was still running when I checked at about 10 am this morning (Monday), which is considerably longer than the projected 2-hour run time.

SSR progress monitor on the main screen showed 95% completion, opening the progress dialogue showed 1% remaining but after an hour it had not changed.

Checking the backup destination folder for server 2 revealed that all the backup files exist and are the size we would expect them to be, which led me to conclude the backup had completed but hung at the final step, when it runs the file cleanup of previous backups.

There was no option to cancel the job in the progress dialog, so I had to restart the SSR service to break out and stop the job from running. Once I did, the completed backup files are there, but SSR has no knowledge of them and there is no link to the files created in recovery point manager.

Here is my theory, based on what we are seeing here and what I have seen on other servers. The most likely place for SSR to crash or hang is the point after the backup when it tries to remove previous files. At this point 'something' goes wrong and the NAS drops off the network, or is at least unavailable from SSR's perspective. The job hangs, previous backups are not cleaned up, and SSR does not record the job as complete, so the next time it runs SSR has no record of the recovery points and the problem is compounded, as they will never be groomed (that's a guess...).

Related to this, and supporting it, we have on various occasions come across SSR reporting too many connections to a backup device (i.e. a NAS), which requires a server reboot to clear. During this time SSR cannot access the NAS but Windows can.

In most if not all cases like the one above, there are log entries to support the job starting and files in place to support the job completing, but either no log entries for recovery point removal, or log entries saying recovery point removal failed due to errors accessing the device.

Hope that makes sense...

 

 

criley
Moderator
Employee Accredited

Both of those servers are running 9.0.0 which is the original release of BESR 2010. That is old and you should update to SP3 (9.0.3).

If backups are hanging at 95%, please review this:

http://www.symantec.com/docs/TECH141655

RS
Level 3

@Chris

That's OK.

Regarding backing up to a different location.

We did this at first, using RDX (removable hard disk cartridges), but had to resort to using scripts to delete files because of the cleanup problems, along with using an independent recovery point.

 

Backing up to C was noted for testing purposes.

 

I will do a backup to an external drive later today with the same configuration.

Product: Symantec System Recovery 2011
Version: 10.0.1.41704

RS
Level 3

@Chris

Tested now with an external HDD through USB.

Drive: E:
Capacity: 465 GB

Backup Job

  • Folder: E:
    Using an external drive this time.
  • [X] Limit the number of recovery point sets saved for this backup
    Maximum: 1 (one incremental recovery point - Auto Incremental)
  • Start a new recovery point set (base) - Auto Base:
    Monthly

Backup Destination

  • [X] Monitor disk space usage for backup storage.
    Threshold amount of disk space to use for backup storage:
    15,0 GB
  • [X] Automatically optimize storage.
  • [_] Delay changes until next backup.
    Confirm that recovery points will be automatically cleaned up.

This time I ran 5 scheduled backups (changing the schedule, in case there is some special treatment using Run now):

Space currently used for recovery point storage:
20,1 GB

Recall that the threshold is 15,0 GB.

Files in the backup destination (E:):

10,0 GB - NS_External001.v2i (Auto Base)
2,7 MB - NS_External001_i001.iv2i (Auto Incremental)
2,1 MB - NS_External001_i002.iv2i (Auto Incremental)
- Note that it does not stay true to the recovery point limit, which is 1.
1,8 MB - NS_External001_i003.iv2i (Auto Incremental)
10,0 GB - NS_External001_i004.iv2i (Auto Incremental)
- Filled the C drive with the 10 GB backup prior to this scheduled job.
- It still does not consider the threshold, since it's now 5 GB over the limit.

criley
Moderator
Employee Accredited

- Note that it does not stay true to the recovery point limit, which is 1.

Actually, this is working. The limit is for the set (set = full + all associated incrementals).

I will see if I can test the issue with the threshold here.

Are you using SSR 2011 SP1 (10.0.1)?
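
To illustrate how the limit counts sets rather than individual files, here is a minimal Python sketch that groups the file names from the listing above into recovery point sets. The naming pattern (NAME.v2i for a base, NAME_iNNN.iv2i for an incremental) is inferred from that listing only, not from any documentation, and `group_into_sets` is a hypothetical helper, not part of SSR.

```python
import re
from collections import defaultdict

# Pattern inferred from the listing above (an assumption, not documented):
#   base image:   NAME.v2i
#   incremental:  NAME_iNNN.iv2i
INCREMENTAL = re.compile(r"^(?P<base>.+)_i\d+\.iv2i$", re.IGNORECASE)
BASE = re.compile(r"^(?P<base>.+)\.v2i$", re.IGNORECASE)


def group_into_sets(filenames):
    """Group file names into sets: one base plus its incrementals."""
    sets = defaultdict(lambda: {"base": None, "incrementals": []})
    for name in filenames:
        match = INCREMENTAL.match(name)
        if match:
            sets[match.group("base")]["incrementals"].append(name)
            continue
        match = BASE.match(name)
        if match:
            sets[match.group("base")]["base"] = name
    return dict(sets)


FILES = [
    "NS_External001.v2i",
    "NS_External001_i001.iv2i",
    "NS_External001_i002.iv2i",
    "NS_External001_i003.iv2i",
    "NS_External001_i004.iv2i",
]

sets = group_into_sets(FILES)
for set_name, members in sets.items():
    print(f"{set_name}: base={members['base']}, "
          f"{len(members['incrementals'])} incremental(s)")
print(f"recovery point sets: {len(sets)}")
```

All five files fall into a single set, so a limit of 1 recovery point set is satisfied even though there are four incrementals on disk.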

RS
Level 3

@Chris

Yeah, I later figured out it is exactly what it says: a set of recovery points, so if there are two backup jobs to the same media, it will retain one of them if the limit is 1 recovery point set.

Product: Symantec System Recovery 2011
Version: 10.0.1.41704

No further updates from LiveUpdate.

Thank you.

PCJunky
Level 4

Chris, fair point, I'll update them both asap.

Today we found that server 1's backup had completely failed, citing the following issue:

Error EC8F17B7: Cannot create recovery points for job: 1_Monday.
Error E7D1001F: Unable to write to file.
Error EBAB03F1: The specified network name is no longer available. 

However, server 2's backup had completed successfully, and it's backing up to a folder in the same share. We do try to stagger them so they don't overlap, with server 1 going first and then server 2, but in our experience the NAS would not fail and recover, and there is no history of network issues at this site...

Info 6C8F1F7F: A scheduled independent recovery point of drive C:\ was created successfully.
Details:
Source: Backup Exec System Recovery
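
When comparing several days of these entries across two servers, it can help to pull the codes out mechanically. Below is a minimal Python sketch that parses lines of the form shown above ("Error CODE: message" or "Info CODE: message"). The line format is taken only from the excerpts in this post, and `parse_log_lines` is a hypothetical helper, not an SSR tool.

```python
import re

# Format assumed from the excerpts above: level, 8 hex digits, colon, message.
LOG_LINE = re.compile(
    r"^(?P<level>Error|Info)\s+(?P<code>[0-9A-Fa-f]{8}):\s*(?P<message>.*\S)$"
)


def parse_log_lines(text):
    """Return a list of dicts with level, code and message for matching lines."""
    entries = []
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


SAMPLE_LINES = [
    "Error EC8F17B7: Cannot create recovery points for job: 1_Monday.",
    "Error E7D1001F: Unable to write to file.",
    "Error EBAB03F1: The specified network name is no longer available.",
    r"Info 6C8F1F7F: A scheduled independent recovery point of drive C:\ was created successfully.",
]

entries = parse_log_lines("\n".join(SAMPLE_LINES))
errors = [e for e in entries if e["level"] == "Error"]
for entry in errors:
    print(entry["code"], "-", entry["message"])
```

Collecting the codes this way makes it easier to see whether the same code, such as EBAB03F1, shows up on both servers at the same point in the job.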