
After SDR success backup set not displayed in storage and catalog of GRT backup set fails

M_Strong
Level 4

Hello again,

     I've been encountering some odd behavior, seemingly at random, after some backups complete.  The issue seems to occur primarily with GRT backups, but it is not limited to them, as I've seen the same behavior when non-GRT backups complete.  I can only assume that the actual backup process completes successfully, as the system generates an alert notification of 'SDR Success...' and I see that the correct amount of data (xx GB) was backed up during a given job; however, the actual backup job is reported as 'Failed' with an error code of E0000602, which is ...query for media sequence...unsuccessful...  I can't be certain that the non-GRT backups have the same exact root cause, but they report the same error code even though I have an SDR Success alert corresponding to the backup that has been reported as 'Failed'.
     I have a feeling that the error code may be a bit of a red herring in this situation as well; after the backup completes, generates the 'SDR Success' message and begins to verify, the verification process can't find some of the backup sets created by the recently completed backup job.  As such, the entire job fails.  All jobs in question are B2D jobs, so I don't have any reason to suspect any configuration issues as we've had the same configuration and backup jobs running weekly for several years now and this behavior has only just begun to occur within the past few months.

     What leads me to believe that something else is going on here is that, consistently, when this behavior occurs, I can look at the backup sets list in the B2D storage location that was used for the failed job and some of the backup sets that I expect to see are missing.  This is reliably the case when our Exchange backup reports failure after the 'SDR Success...' alert is generated: I check the B2D storage location and I can see that the 'system state' and individual drive letters were backed up, but I don't see the mailbox or public folder database backup sets.  I'll then check the actual file system (Windows Explorer) and verify that there are indeed img###### folders which contain the mailbox and public folder databases - created at the date/time of the backup job that failed.  If I then inventory/catalog the B2D location from within the storage view of the BE GUI, the missing backup sets appear.  I can then manually re-run the failed catalog job (corresponding to the failed GRT backup job) and everything succeeds as if nothing was ever wrong.
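The Windows Explorer check described above can be scripted. What follows is only a minimal Python sketch, not a Backup Exec tool: the function name, the folder path in the usage note and the five-minute slack are all hypothetical, and it uses each item's last-modified time as a stand-in for "created at the date/time of the backup job". It lists the IMG folders and .bkf files in a B2D folder that were touched during a job's time window, so they can be compared by hand against the backup sets list in the BE GUI.

```python
from datetime import datetime, timedelta
from pathlib import Path


def find_recent_backup_artifacts(b2d_root, job_start, job_end,
                                 slack=timedelta(minutes=5)):
    """Return names of IMG* folders and .bkf files modified in the job window.

    b2d_root is the B2D folder on disk (hypothetical path supplied by caller).
    slack widens the window slightly to allow for clock and flush differences.
    """
    window_start = job_start - slack
    window_end = job_end + slack
    found = []
    for entry in Path(b2d_root).iterdir():
        is_img = entry.is_dir() and entry.name.upper().startswith("IMG")
        is_bkf = entry.is_file() and entry.suffix.lower() == ".bkf"
        if not (is_img or is_bkf):
            continue
        # Last-modified time is used as an approximation of creation time.
        modified = datetime.fromtimestamp(entry.stat().st_mtime)
        if window_start <= modified <= window_end:
            found.append(entry.name)
    return sorted(found)
```

Called with something like `find_recent_backup_artifacts(r"E:\B2D1", start, end)` (the path is made up), anything it returns that has no counterpart in the backup sets list would be a candidate for the manual inventory/catalog.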

     So, as far as I can tell, the first behavioral problem contributing to this issue is that when the backup completes, the backup sets that were just created are not being listed in the backup sets list in storage - before the verification process for that particular backup job begins.  As such, the verification process can't find one or more expected backup sets and fails the job with the E0000602 error code citing that it can't find the media.  A manual inventory/catalog of the storage location finds the missing backup sets, lists them in the backup sets view in the GUI and then all is happy once again in backup land until the next failure occurs.

     At one point I was under the impression that it had to do with the fact that some backup jobs run a little long sometimes and, due to our short backup job expiration times, the media was being deleted before the verification process could complete.  (We B2D2T all the time, so having short expiration for the B2D portion keeps the B2D locations empty and available for subsequent backup jobs.)  This, however, is not the case, since I'm beginning to see the same behavior occur more frequently when there is only one backup job running with no linked duplicate backup sets job, so the expiration should make no difference since BE won't delete the most recent available backup unless it has been duplicated by BE to other media.
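The timing theory in the paragraph above can be ruled in or out per job with a little date arithmetic. This is a minimal sketch, assuming the retention period is counted from the moment the backup set is created (an assumption on my part, not something confirmed in this thread); the function names and the one-hour margin are hypothetical.

```python
from datetime import datetime, timedelta


def expires_before_job_ends(set_created, retention, job_end):
    """True if the set's retention lapses before the job (incl. verify) ends.

    Assumes retention is measured from set_created; that is an assumption.
    """
    return set_created + retention < job_end


def minimum_retention(set_created, job_end, margin=timedelta(hours=1)):
    """Smallest retention that outlasts the job, plus a safety margin."""
    return (job_end - set_created) + margin
```

For example, a set created at 20:00 with a 4-hour retention, in a job whose verify finishes at 01:30 the next morning, would lapse before the job ends, and the shortest retention that clears it with a one-hour margin would be 6.5 hours.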

     So in short, the issue here seems to be that Backup Exec is not properly adding newly created backup sets to the storage inventory in all cases, seemingly at random.  The behavior is not limited to one backup job and occurs for both GRT and non-GRT backup jobs.  After a failure is reported, I can manually inventory/catalog the storage location containing the backup sets from the failed job, which fills in the missing backup sets and allows a manual re-run of the catalog operation to succeed.

     Any chance anyone has seen this before and/or is this possibly a known issue?  If so, I could use a fix or a handy work-around so I don't have to continually baby-sit my backup jobs and re-inventory the B2D storage locations several times a week.

 

         Much appreciate any insight,
                     Thank you,

5 REPLIES

M_Strong
Level 4

(P.S.  This is unrelated to migrating the B2D volumes I'd recently inquired about.  I'm only just about to begin that process and I don't expect anything to change when complete since that change is hopefully going to be transparent to BE.  These inventory/backup set issues have been ongoing weekly for a few months now.)

M_Strong
Level 4

Hello?  Bueller?  Anyone?  Bueller?

Another couple of observations which might help nail down a root cause:


--It seems that whenever a server backup fails due to this anomaly, what ends up missing is the C: volume.  (System State is backed up successfully.)  Now, I've double and triple checked that for each of these backup job definitions, SDR is turned on, the C: volume is selected and there is nothing wonky in the Selection Details tab of the backup selections edit window.

--It seems that the correct amount of data is being backed up.  When I compare a 'failed' backup job to a successful backup job, the amounts of data in GB are very close, usually to within 1 GB (which is normal for a full system backup as files are added/deleted/compressed, etc. due to operations such as Windows updates, logfiles rolling, etc.), so it seems that the backup portion is actually succeeding.

--During a backup job that I suspect might fail due to this anomaly, I can see backup sets (.bkf files) in Windows File Explorer that I believe should correspond to the various volumes that a server has, so again I suspect that the backup process is succeeding for the most part.  However, it seems (as stated above) that the C: volume(s) of the server(s) that experience this failure never show up in the 'backup sets' list in the BE GUI.  In conjunction with that, it seems that the catalog process that fails - after the GRT backups that fail - is looking for media GUIDs that don't exist in the backup.  When I review the job log for the failed catalog operation that is linked to the failed backup, I can see that the catalog is looking for media having GUIDs that don't show in the backup job's log.  The backup job's log provides details about the media, including a Media GUID and which media set files (.bkf files and IMG folders) belong to that GUID.  The GUIDs that appear in the catalog operation's job log do not appear in the backup job log - and likewise, the Media GUIDs that appear in the backup job's log do not appear in the catalog operation's job log.
Now, not all backup jobs that fail due to this anomaly have a catalog operation; only the email server's backup job has a linked catalog operation.  The other 2 servers that fall into the scope of this condition do not have any linked catalog operation.  The other 2 servers are SQL and a DC/File server.  I guess in retrospect, all 3 of the servers that regularly exhibit this symptom seem to be utilizing GRT backup methodology, so there may be something to that commonality as well.  However, not ALL GRT backups exhibit this symptom; it is consistently the same servers, though.
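The GUID comparison described above is tedious to do by eye, so here is a minimal Python sketch of one way to do it. It is an illustration under stated assumptions: both job logs have already been read into strings by the caller (file locations and encodings are not covered here), and the regular expression simply picks up anything GUID-shaped, so it may also catch GUIDs that are not Media GUIDs. The function names are hypothetical.

```python
import re

# Matches GUID-shaped strings, with or without surrounding braces.
GUID_RE = re.compile(
    r"\{?[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\}?"
)


def extract_guids(log_text):
    """Return the set of GUIDs in log_text, lower-cased and without braces."""
    return {m.group(0).strip("{}").lower() for m in GUID_RE.finditer(log_text)}


def compare_job_logs(backup_log_text, catalog_log_text):
    """Show which GUIDs appear in only one of the two logs, and which in both."""
    backup = extract_guids(backup_log_text)
    catalog = extract_guids(catalog_log_text)
    return {
        "only_in_backup": sorted(backup - catalog),
        "only_in_catalog": sorted(catalog - backup),
        "in_both": sorted(backup & catalog),
    }
```

If the symptom is as described, a failed pair of jobs should come back with an empty (or nearly empty) `in_both` list, while a healthy backup/catalog pair would be expected to share at least the Media GUIDs.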

Near as I can figure at this point, based on my observations above, there may be something about the C: volume on each of these servers that is preventing the backup operation from completing: either a file on the C: volume that is locked, or some other process that is interfering with the BE Agent being able to reference what it needs in order to back up the C: volume.  System State is backed up properly, and I have not found any entries in the various Windows event log(s), during the time-frame(s) in which the failed backups occurred, that point me in the direction of the root cause.  I don't see any evidence of failures or other errors.  All servers are patched and rebooted regularly.  All BE services on the servers being backed up are running and are stable to the best of my knowledge.  VSS writers are also healthy.  Not quite sure where else to look regarding this one.

 

              Any Available Insight Appreciated,

                          Thank you,

VJware
Level 6
Employee Accredited Certified

Is this error generated during the "verify" portion of the backup job?

You may be facing this issue as well - https://support.symantec.com/en_US/article.TECH230359.html

I would recommend logging a formal support case for an engineer to have a look @ your setup.

Colin_Weaver
Moderator
Employee Accredited Certified

Has the delayed catalog run against the GRT sets, OR can you turn delayed cataloging off as a test to see if it changes your symptoms?

If a delayed catalog has not run then the GRT sets will not be shown until it has, and the delayed catalog will run as a separate job.

 

Note if you are actually getting errors during the GRT parts of the backup then this is not directly related to a delayed catalog setup (but may mean that we do not even attempt a delayed catalog because of the error), so you would really need to look into the error in that instance.

M_Strong
Level 4

To answer in turn:

VJware:  I don't watch the entirety of the backup job as it takes several hours and I'm not always looking at it when the error occurs, but based on what I've seen so far, the error usually occurs during or at the end of the verify portion of the backup operation.  I typically receive an 'SDR Backup Success' message in the general alerts list and attached to the B2D storage volume when the backup completes, so as far as I can tell, the backup is completing successfully including all volumes.
That said, when I look at the storage during the backup/verify I am not seeing backup sets that correspond to all of the volumes that I am expecting to see (I'm expecting to see 3 volumes + System State; C: (OS); D: (DB Files) and E: (DB Logfiles) as well as the GRT backup of the Mailbox and Public folder databases (2 more backup sets) for a total of 6 backup sets for this server).  I usually only see System State, D: and E:.  For some reason (when the issue is going to occur) I don't see C: in the backup sets list.

To Colin's point (and to answer his question), I usually see the GRT backup sets in the storage before the job completes, but not always.  There is a separate (delayed, but only slightly) Catalog operation against the particular backup job that just completed; it usually runs immediately after the job (with verify) completes.  Either the success of that Catalog operation is linked to the success of the verification portion of the backup job, or it is coincidental and whatever is causing the verification failure is also causing the Catalog operation to fail.
When the Catalog operation fails, there are GRT images on the physical storage (I can see the IMG######### folders in Windows Explorer and within those folders are clearly 'Mailbox Database.edb' and 'Public Folder Database.edb') but they are not visible in the BE Storage Tab for that B2D volume.

As an aside - outside the scope of any one particular failure, but related to the underlying issue I suspect - if I run a manual Inventory and Catalog operation against one of the B2D storage locations (we have 3 for providing parallelism and increased pathways to the underlying iSCSI storage) I may have old GRT backup sets appear where they were not previously visible in the storage tab of BE.  Typically the dates of those backup sets correspond to one of the backup job failures, so it lends more to the observation that backup sets are being created by the backup job but are not inventoried correctly when that particular backup set completes and so are not visible in Storage and for some reason are not available to BE during the verify phase and likewise the Catalog operation.

This may be caused by the issue mentioned in the article linked to by VJware, and I've been attempting to adjust the retention period to allow for that circumstance.  Sometimes it seems that whenever I adjust the retention period based on a previous failure (to prevent the situation described in the article) the next run of the backup job takes even longer than the previous one.  That has to be only coincidence, but I can proceed to set some ridiculously high retention period and manually expire backup sets until we have this ironed out.  That is going to cause a bunch of additional work for me, but so be it in the interest of troubleshooting.
Actually, in thinking more about the possibility that it is that bug - I would be more inclined to believe that the bug were the root cause if it were random backup sets being deleted and/or any/all backup sets that were expired during the process of the backup were deleted.  However it seems that only the C: volume is ever missing from the backup sets which is what I suspect is causing the verification to fail.  (Between that and the sometimes absent GRT backup sets.)
Unfortunately, when the failure occurs it doesn't give me any particular details on what caused the failure - whether it was a missing backup set and which one.  At least the job log doesn't provide those details.  If there is a debug log (I'm pretty sure there is, but it has to be enabled manually I believe) those may have more details on exactly what is going on.

Additionally, none of this explains why the Catalog operation is looking for a media GUID that doesn't exist in the backup set that immediately preceded it - to which I am under the impression it should be linked.