08-08-2012 01:58 PM
We have been using Backup Exec 2010 R3 for almost a year, and almost as soon as we bought the product it has caused nothing but trouble for us. The core issue is that the deduplication storage does not perform. We still have 12 support cases open, and we are still working with Dev and the backline support team to resolve them.
The problem is that backups to either the CASO or the MMS server, each using its own local deduplication storage, just sit in a queued state, sometimes for hours. We were one of the first to apply Hotfix 191248 in July 2012, but as far as we can see it made no difference.
We have around 17 concurrent threads configured for each deduplication folder, but currently only 3 jobs are active on one folder and 1 on the other. Server load is low: 11% CPU, 30% memory used, and a disk queue length of 0.1. To keep our backups working at all we have fallen back to plain disk-based backups, which experience no issues. The server was specced with assistance from Symantec, but because dedup is not working correctly we are relying on disk backups and running out of disk space.
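For anyone wanting to check the same counters on their own media server, the figures above (CPU, memory, disk queue length) can be sampled with the built-in `typeperf` tool. A minimal sketch, assuming the standard English-locale counter paths; the sample count is illustrative:

```python
import subprocess

# Standard Windows performance counter paths (English locale assumed)
COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\Memory\% Committed Bytes In Use",
    r"\PhysicalDisk(_Total)\Avg. Disk Queue Length",
]

def typeperf_command(counters, samples=5):
    """Build a typeperf command that takes `samples` readings of each counter."""
    return ["typeperf", *counters, "-sc", str(samples)]

if __name__ == "__main__":
    cmd = typeperf_command(COUNTERS)
    print(" ".join(cmd))  # on the media server, execute via subprocess.run(cmd)
```

A sustained Avg. Disk Queue Length near 0.1, as above, suggests the disks are not the bottleneck.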
Symantec have run a number of disk checks, which identified bad media in the dedup folder, but each check takes 24 hours to complete. Their backline team have been helpful, but there is no solution in sight.
We have also rebuilt the server twice, with the same issues.
We have been a Backup Exec customer for 12 years, but the response from the Dev team has been terrible. I have even emailed the CEO.
Our case details are below:
OPEN CASES [12] (changes or new in red)
Priority 1 (as the backups take so long and are unreliable):
416-022-872 Backup job on Dedup folder fails with error "A hardware error occurred"
Status: Initial investigation suggested this issue is occurring due to corrupt OST media in Dedup folder
STSINV utility was run and moved the corrupt OST media into the retired media set
The first set of weekly backups, and all daily backups since, ran successfully (except a few jobs that failed because of other errors)
Second set of weekly backups ran successfully
Issue no longer seen
Issue most likely related to the 2TB backup job / duplicate job which runs for days.
Symantec enabled detailed debugging for further analysis
Backup and duplicate jobs ran successfully with debugging enabled
Issue hasn’t been seen in the last few days
POA: Monitor the situation, see if the issue reappears, and then decide whether re-enabling debugging is possible
416-375-655 Duplicate job is prompting to import an OST media which was last used in the backup job and is now offline
Status: Backup job was successful and used a lot of OST media, as it covers over 1.6 TB of data
No verify was run
Duplicate job starts to run and then prompts to import OST media xxx
This OST media, used in the backup job, is now OFFLINE, so there is no way to bring it back online or import it as prompted
Symantec enabled detailed debugging for further analysis
Backup and duplicate jobs ran successfully with debugging enabled
Issue hasn’t been seen in the last few days
POA: Monitor the situation, see if the issue reappears, and then decide whether re-enabling debugging is possible
416-400-068 Backup to dedup folder fails with: 0xe00084ec - A backup storage read/write error has occurred.
Status: Issue most likely related to the 2TB backup job / duplicate job which runs for days.
Backup and duplicate jobs ran successfully with debugging enabled
A different job failed with this error
Logs collected
Issue hasn’t been seen in the last few days
POA: Symantec to analyse collected logs
Priority 2:
416-022-858 Exchange backup on Dedup folder fails with error "-1022 There is a disk IO error"
Status: Exchange Incremental backups are failing with all sorts of errors and are slow
Full backup is successful and takes 4 hours
Incremental backups take up to 18 hours or fail
Customer decided, starting today, to use Differential instead of Incremental backups
Weekly Full backup ran successfully
Differential backup is reporting the same error again
3 of 9 storage groups (SGs) affected
Not always the same SGs are affected, but every failing job has 1-3 SGs failing with that error
Since we moved the backup job's run time, no further errors
Full backup has now started to report this error
When the same job runs at a different (less busy) time it is successful
Issue seen again this week on all daily backups
The last few days the Exchange backup has run successfully
POA: Monitor the backups and see if the issue reappears
Priority 4:
416-340-993 Exchange duplicate job is failing with error:V-79-57344-33329 - Library - cleaning media was mounted
Status: The Exchange backup had never been successful
Now that the backup succeeds, the duplicate fails with the above error
Other jobs report the same error
Applied Technote: http://www.symantec.com/docs/TECH51468
Issue still seen
Duplicate Jobs from Server 2 to Server 3 work, issue affects duplicate jobs from Server 3 to Server 2
Applied the technote again on both servers
Issue hasn’t been seen in the last few days
POA: Monitoring the situation
Priority 5:
416-341-021 Backup / duplicate jobs to / from dedup folder goes into "device queued" status
Status: Backup / duplicate jobs to / from the dedup folder go into "device queued" status
Jobs already running to the same device keep running
All new jobs just sit queued until the "Device queued" job is cancelled, after which the other jobs start to run
Issue not seen in the last two days
Issue happened again
Issue hasn’t been seen all week
POA: Symantec to keep monitoring and if issue reappears we need to gather new set of logs
Priority 7:
416-345-707 Application event alert on BKUPEXEC: SQL Server has encountered 1 occurrence(s) of cachestore flush for the 'Bound Trees' cachestore
Status: Alert was reported a few times in the last few days
SQL Studio Express installed
Issue as described in: http://www.symantec.com/docs/TECH129073
Technote applied
No alerts seen the last few days
POA: Symantec and customer to monitor situation and confirm issue is solved
Priority 8:
416-345-165 Beserver crash
Status: This crash happened the first time at the weekend of Feb 3rd
Crash has not been seen since
Another beserver crash happened, but it was not identical to the previous one
No crash seen since SP2 was applied
POA: Symantec to keep monitoring the situation
Priority 9:
416-345-115 Media alert: "Do you want to overwrite imported media: OSTxxx" holds up all other backups until it is acknowledged
Status: This alert stopped all backups from continuing until it was acknowledged
Reason it wasn't automatically cancelled identified: the alert response timeout was set to 3 days; this has now been changed to 1 minute
Workaround applied: unchecked the "prompt for imported media" alert
POA: Symantec to clarify:
Why are we asked to overwrite imported media when it is OST media?
Priority 10:
416-263-851 Media alert on DEDUP folder: "Media is unrecognized. The media in drive '%drive%' is written in an unrecognized format. Erase the media to make it usable"
Status: Alert happened 35 times
These 35 alerts were reported against 3 OST media
Alerts happen again and again about the same 3 OST media, which have already been retired
Customer has recreated all jobs (today, Friday Feb 3rd)
Alerts happened again over the weekend, same 3 OST media
POA: Workaround applied by suppressing the prompt for this alert
Symantec to research this issue
Priority 11:
415-876-929 When adding Remote Agent for Direct Access (RADA) servers, the Backup Exec console freezes
Status: The moment the customer adds around 10 or more RADA servers, the above issue is seen
Currently customer is not using RADA because of the issue reported in this case.
Symantec explained the underlying issue and workaround:
Workaround is to recycle all beremote services on the remote servers that use client-side deduplication
Beta patch available
Customer is not too keen to test a BETA patch
POA: This issue is currently not the highest priority
Customer would prefer an official hotfix rather than just a BETA patch
Customer would like to see test results from other customers before applying a BETA patch to his setup
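For anyone else hitting the RADA console freeze, the workaround Symantec described amounts to restarting (recycling) the beremote service on each remote server that uses client-side dedup. A minimal sketch of how that could be scripted, assuming the Windows service name `BackupExecAgentAccelerator` for the remote agent (verify the actual name with `sc query` on your own servers); the server list is illustrative and the default is a dry run that only prints the commands:

```python
import subprocess

# Hypothetical list of remote servers using client-side deduplication
SERVERS = ["SRV-FILE01", "SRV-SQL01"]

# Assumed service name for the Backup Exec Remote Agent (beremote);
# confirm with `sc query` on the remote host before running for real.
SERVICE = "BackupExecAgentAccelerator"

def recycle_commands(servers, service):
    """Build the sc stop/start command lines for each remote server."""
    cmds = []
    for srv in servers:
        cmds.append(["sc", rf"\\{srv}", "stop", service])
        cmds.append(["sc", rf"\\{srv}", "start", service])
    return cmds

def recycle(servers, service, dry_run=True):
    """Print (dry run) or execute the stop/start commands in order."""
    for cmd in recycle_commands(servers, service):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)

if __name__ == "__main__":
    recycle(SERVERS, SERVICE)  # dry run: prints the commands only
```

Running `sc` against `\\server` requires admin rights on the remote machines; a stop followed immediately by a start may also need a short wait between them depending on how quickly the service shuts down.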
Priority 12:
415-873-137 CAS server status shows "Discovering Devices" when restarting services after adding remote servers for RADA
Status: The moment the customer adds around 10 or more RADA servers, the above issue is seen
Currently customer is not using RADA because of the issue reported in this case.
Symantec explained the underlying issue and workaround:
Workaround is to recycle all beremote services on the remote servers that use client-side deduplication
Beta patch available
Customer is not too keen to test a BETA patch
POA: This issue is currently not the highest priority
Customer would prefer an official hotfix rather than just a BETA patch
Customer would like to see test results from other customers before applying a BETA patch to his setup
CLOSED CASES [3]:
416-346-317 Backup job fails and causes the media to go "end marker unreadable".
Status: Only one duplicate job affected
The OST media at the opt-dedup target device ends up in that state
Issue identified and probably caused by the BE 11d agent still running on this one system
Workaround / solution applied: a separate policy both stops the 11d services and performs share-level backups followed by duplicates
Case closed March 2nd
416-263-556 Monthly backup are failing with: "0xe0009444 - The requested source duplicate backup sets catalog record could not be found"
Status: Monthly backups did fail with the above error
Issue caused by the fact that we were still trying to duplicate data older than one month
Symantec adjusted the selection lists for these jobs
Customer has recreated all jobs (today Friday Feb 3rd)
Case closed on Feb 15th
416-264-317 Duplicate jobs to tape library is failing with: 0xe00084ed - A hardware error occurred
Status: Some duplicate jobs did fail with the above error
Issue is clearly with the tape library
Backups are working fine again
Case closed on Feb 15th