
Azure backups hanging at Loading Media

gcpato
Level 3

We set up several backup jobs to Azure using Backup Exec 16.  Everything worked well for a couple of weeks, but recently most or all backup jobs have been getting stuck at Loading Media...

When this issue occurs, the Backup Exec console still runs, but the services appear to be hung.  For example, scheduled jobs that should run during this time do not start.  Any attempt we make to change, cancel, or hold the jobs or device settings times out; however, the Elapsed Time on the running jobs continues to increase and we can still click around the various tabs in the BE console.  Also during this condition, pvlsvr.exe (Backup Exec Device and Media Service) is using CPU but does appear to be hung, because any attempt to stop or restart the Backup Exec services fails or times out.  We can either reboot the server or forcibly kill the Backup Exec service processes to break out of this condition (a rough sketch of that workaround is below).  We have let the condition sit for over 24 hours to see if it would eventually clear, but it did not.
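For reference, this is roughly how we force-kill the hung processes when the services refuse to stop.  It is only a sketch: pvlsvr.exe is the Device and Media Service mentioned above, and the other image names are our assumption about which Backup Exec service processes are involved on your install.

import subprocess

# Backup Exec processes to force-terminate when the services hang.
# pvlsvr.exe is the Device and Media Service; beserver.exe and
# bengine.exe are assumptions about the other BE service processes.
PROCESSES = ["pvlsvr.exe", "beserver.exe", "bengine.exe"]

for name in PROCESSES:
    # /F forces termination, /IM selects the process by image name.
    result = subprocess.run(
        ["taskkill", "/F", "/IM", name],
        capture_output=True, text=True
    )
    print(f"{name}: {result.stdout.strip() or result.stderr.strip()}")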

The Azure storage logs show various errors on GetBlob and DeleteBlob requests for files such as:

1.0;2017-05-30T14:21:41.9770882Z;DeleteBlob;ClientOtherError;404;6;6;authenticated;**storageaccountname**;**storageaccountname**;blob;"https://**storageaccountname**.blob.core.windows.net:443/container/BEOST_00000196/META_BLOCK_MAP_FILE";"/**storageaccountname**/container/BEOST_00000196/META_BLOCK_MAP_FILE";815c11b7-0001-00f9-2150-d9e47b000000;0;**IP**:62534;2015-02-21;354;0;145;215;0;;;;;;"APN/1.0 Veritas/1.0 BackupExec/16.0";;
1.0;2017-05-30T14:20:57.7580368Z;GetBlob;ClientOtherError;404;78;78;authenticated;**storageaccountname**;**storageaccountname**;blob;"https://**storageaccountname**.blob.core.windows.net:443/container/BEOST_SDrv5/META_BLOCK_MAP_FILE";"/**storageaccountname**/container/BEOST_SDrv5/META_BLOCK_MAP_FILE";e32ceac0-0001-0039-484f-d96e3f000000;0;**IP**:62515;2015-02-21;348;0;145;215;0;;;;;;"APN/1.0 Veritas/1.0 BackupExec/16.0";;
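For anyone digging through these, the lines follow the semicolon-delimited Storage Analytics v1.0 log format, so a short script can pull out just the operation, status, and request URL.  This is a sketch under that assumption (field positions taken from the version shown in the lines above; the file name is hypothetical):

import csv

def summarize_storage_log(path):
    # Storage Analytics v1.0 lines are semicolon-delimited with
    # quoted URL fields, so the csv module handles them cleanly.
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter=";", quotechar='"'):
            if not row or row[0] != "1.0":
                continue  # skip blank lines / unexpected versions
            timestamp, operation, status, http_code = row[1], row[2], row[3], row[4]
            request_url = row[11]
            if status != "Success":
                print(f"{timestamp}  {operation}  {http_code}  {request_url}")

summarize_storage_log("storage-analytics.log")  # hypothetical file name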

In the adamm.log file, we see these entries:

[03388] 05/30/17 10:20:57.696 DeviceIo: beost: Critical: Azure: CAzureManager::open_image:
[03388] 05/30/17 10:21:32.495 DeviceIo: ostscsi: move error image BEOST_00000196 could not be get_proped 2060018, attempting to recreate

This same issue is happening on 2 different Backup Exec servers.  The servers are pointing to 2 separate Azure storage accounts (and 2 separate containers) and they are the only software using these storage accounts.

The data is being duplicated from a Data Domain appliance, but we have also experienced the issue with a test backup job that writes directly to Azure.  It does not seem to matter whether the job contains VMware VMDK files, SQL data, etc.; the issue happens regardless.

I do have a case open with Backup Exec, but we have not gotten anywhere yet.  The ticket has been escalated.

We have tried setting the "concurrent operations" value to 1, 5, and 8, among others, with no effect on the issue.

 

Has anyone else seen this issue or can you think of anything else we should check?

 

Thank you!


6 REPLIES

Colin_Weaver
Moderator
Employee Accredited Certified

This is probably going to need a formal support case (unless someone on the forums has seen it before)

We've had a case open since last week and haven't gotten anywhere yet.  It's been escalated and they are analyzing logs at the moment.

Colin_Weaver
Moderator
Employee Accredited Certified

Can you send me the case number (as a private message)?

I believe I have figured out how to avoid the jobs hanging at Loading Media.  The fix appears to be to have only a single container in the Azure storage account.  Having multiple containers seems to cause BE to randomly hang at Loading Media.  Maybe someone else can test this to confirm?

I had discovered this problem a couple of weeks ago and corrected it.  However, we also had an issue where Duplicate jobs were failing at a Byte Count of 50GB, 100GB, etc. (probably at the point where a new backup file is loaded).  In order to troubleshoot that issue, I had enabled diagnostic logging on the storage account.  What I realized yesterday is that enabling the logging actually creates another container in the storage account to hold the logs, which triggers the Loading Media hang.
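As a sanity check, we now enumerate the containers in the storage account before running jobs, to confirm nothing (such as diagnostic logging) has created a second one.  This is only a sketch using the azure-storage-blob Python package; the connection string is a placeholder, and the $logs container used by classic Storage Analytics is a system container that may not appear in a default listing, so it is worth checking the portal or Storage Explorer as well.

from azure.storage.blob import BlobServiceClient

# Placeholder connection string for the storage account used by Backup Exec.
CONNECTION_STRING = (
    "DefaultEndpointsProtocol=https;"
    "AccountName=<storageaccountname>;"
    "AccountKey=<accountkey>;"
    "EndpointSuffix=core.windows.net"
)

service = BlobServiceClient.from_connection_string(CONNECTION_STRING)

# List every container; with a single-container setup there should be
# exactly one (the container the BE storage device points at).
containers = [c.name for c in service.list_containers()]
print(f"{len(containers)} container(s) found: {containers}")

# Note: the $logs container created by classic Storage Analytics is a
# system container and may not show up here even when logging is enabled.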

I've created a new storage account and did not enable logging on it.  I am now able to run jobs to Azure again; however, we are still experiencing the issue with jobs randomly failing.  I will create a separate thread for that issue.

 

Thanks

Scratch that.  Over the weekend, an Azure job hung at Loading Media and hung all of the other jobs that were running at the time.  This is with the new storage account, which has only one container.

Once again, at the very end of the adamm.log file, we have this line shortly after the Azure job started:

[04476] 06/05/17 03:06:32.375 DeviceIo: ostscsi: move error image BEOST_00000197 could not be get_proped 2060018, attempting to recreate

 

Ultimately, I determined that the problem was related to how our Data Domain was set up.

Both backup servers were pointed to the same Data Domain logical storage unit.  Strangely, we only had problems with this setup when we began using Azure.

After creating a new Data Domain storage unit and switching one of the servers to it, everything is working very well now.