
Duplications fail with media open error(83)

Beavisrulz
Level 4

Hello, we're having random issues where a few duplications fail each day out of an easy hundred or so. Here's our current configuration:

- NetBackup 7.1.0.3 on master, all media servers, and most clients

- NetBackup Disk Appliance 5020: one set at the production site and one set at the DR site, all running v1.4.1.1 and 6.6.3.49048

- Master server running on Windows 2008 R2 SP1 64bit.  Media servers are Windows 2003 32bit.

We are backing up to our local production appliances and duplicating to the DR appliances. Several times a day, duplication jobs fail with the "media open error(83)" status message. It appears to be the same client: the dupe job runs and fails 4 times within a short period, then waits 10 hours, tries again, and fails again. Here is a copy of the backup status log:

6/12/2012 12:40:22 PM - requesting resource LCM_KMC_Pool_1_All
6/12/2012 12:40:22 PM - granted resource LCM_KMC_Pool_1_All
6/12/2012 12:40:22 PM - started process RUNCMD (8096)
6/12/2012 12:40:22 PM - requesting resource @aaaan
6/12/2012 12:40:22 PM - reserving resource @aaaan
6/12/2012 12:40:22 PM - reserved resource @aaaan
6/12/2012 12:40:22 PM - granted resource MediaID=@aaaan;DiskVolume=PureDiskVolume;DiskPool=KMC_Pool_1;Path=PureDiskVolume;StorageServer=khapb...
6/12/2012 12:40:23 PM - Info bpdm(pid=4356) started           
6/12/2012 12:40:23 PM - started process bpdm (4356)
6/12/2012 12:40:48 PM - Critical bpdm(pid=4356) get image properties failed: error 2060013: no more entries   
6/12/2012 12:40:48 PM - Error (pid=8096) ReplicationJob::Replicate: Replication failed for backup id khfsvs34_1339468206: media open error (83) 
6/12/2012 12:40:48 PMReplicate failed for backup id khfsvs34_1339468206 with status 83
6/12/2012 12:40:48 PM - end operation
media open error(83)

 

I've searched the forums and knowledgebase to no avail. Any ideas? Thanks!

 

1 ACCEPTED SOLUTION

Accepted Solutions

Mark_Solutions
Level 6
Partner Accredited Certified

Back after a 3 week break (needed it!!) and just picking up on things.

Just a thought on the difference between linking and renaming when this issue occurs ...

A link makes both the upper and lower case versions valid, whereas a rename only makes one valid.

The case sensitivity only shows up when we duplicate or verify the backup images. The trouble is that this will also apply to restoring them, so if we use the link method then restores from when the client or policy was in one case will always work; if we rename, we may find an issue in the future when we come to do restores from old backups.

I haven't been in a position to test this but I would still prefer the link method just to be on the safe side as, after all, we only back things up in order to be able to restore them!

Hope this gives more food for thought
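
The difference between the two methods can be seen in a scratch directory. This is only a minimal sketch: the directory names `client01` and `CLIENT01` are hypothetical, nothing here touches a NetBackup catalog, and it assumes a case-sensitive filesystem.

```shell
#!/bin/sh
# Scratch-directory demonstration of rename versus symlink.
# Assumptions: case-sensitive filesystem; client01/CLIENT01 are made-up names.
demo=$(mktemp -d)

# Rename: after mv, only the new-case name resolves.
mkdir -p "$demo/rename/client01"
mv "$demo/rename/client01" "$demo/rename/CLIENT01"
[ -d "$demo/rename/CLIENT01" ] && echo "rename: CLIENT01 resolves"
[ -d "$demo/rename/client01" ] || echo "rename: client01 no longer resolves"

# Link: the original directory stays, and the other-case name points at it,
# so both names resolve to the same content.
mkdir -p "$demo/link/client01"
ln -s "$demo/link/client01" "$demo/link/CLIENT01"
[ -d "$demo/link/client01" ] && echo "link: client01 resolves"
[ -d "$demo/link/CLIENT01" ] && echo "link: CLIENT01 resolves"
```

Whether older images are actually looked up under the old-case name at restore time is the untested part mentioned above; the sketch only shows which names exist on disk after each method.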


21 REPLIES

Mark_Solutions
Level 6
Partner Accredited Certified

I have also seen this when an image has been corrupted (glitch in the system somewhere)

In that case you have no choice but to "lose" that backup

You would need to cancel that backup id from the SLP to prevent it being duplicated (nbstlutil cancel .... -backupid ***)

You can then expire it via the Catalog Section

Not a great solution but stops it constantly failing if that is all it will ever do.

You could try duplicating it manually once it has been cancelled from the SLP, but that depends on whether you are using Opt De-Dupe or AIR for replication (I am guessing you are using AIR, in which case you could not do such a duplication)

Hope this helps

Beavisrulz
Level 4

Yeah, I've canceled them before, but I just don't understand why I get them on a daily basis. It started happening more frequently over the past few weeks, and we have not changed anything.

I will keep an eye on it some more and see if it's coming from the same client, media server, etc.

Thanks!
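
One way to see whether the failures cluster on a single client is to save the job details of the failed duplications to a text file and tally the backup IDs. A minimal sketch, assuming the details were copied by hand into a file (the name `failed_jobs.txt` is hypothetical) and that backup IDs have the `<client>_<timestamp>` form seen in the log above:

```shell
#!/bin/sh
# Tally status 83 replication failures per client from saved job details.
# Assumption: the input file is job details text pasted from the Activity
# Monitor; this does not query NetBackup itself.
count_failures_by_client() {
    # Extract the backup id from each "Replicate failed ... with status 83"
    # line, strip the trailing _<timestamp>, then count per client.
    sed -n 's/.*Replicate failed for backup id \([^ ]*\) with status 83.*/\1/p' "$1" |
        sed 's/_[0-9][0-9]*$//' |
        sort | uniq -c | sort -rn
}

# Usage: count_failures_by_client failed_jobs.txt
```

The same approach could be extended to the `MediaServer=` field of the "granted resource" line if the media server is the suspect.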

Mark_Solutions
Level 6
Partner Accredited Certified

OK - keep an eye on the All Log Entries report - maybe your disk pools are going Up and Down or similar and something needs tuning or repairing

You may also want to check for errors on the 5020's (/Storage/log;  /Storage/log/spoold/storaged.log;  /Storage/log/pdwfe.log;  /Storage/log/agent.log and /Storage/tmp/workflow.xxx)
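
A rough first pass over those logs could look like the sketch below. It assumes shell access on the appliance; the search pattern and the 20-line limit are arbitrary choices for illustration, not a documented message format.

```shell
#!/bin/sh
# Show the most recent lines mentioning errors or failures in each log given.
# Assumptions: the 'error|fail' pattern and the 20-line cap are guesses
# chosen for a quick look, not anything defined by the appliance.
scan_logs() {
    for f in "$@"; do
        [ -r "$f" ] || { echo "skipping $f (not readable)"; continue; }
        echo "== $f"
        grep -iE 'error|fail' "$f" | tail -n 20
    done
}

# Usage, with the log paths listed above:
# scan_logs /Storage/log/spoold/storaged.log /Storage/log/pdwfe.log /Storage/log/agent.log
```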

Hope this helps

RVellore
Level 2
Run this command on the master server:

/usr/openv/netbackup/bin/admincmd/nbdelete -allvolumes -force

Then run a cleanup job:

/usr/openv/netbackup/bin/admincmd/bpimage -cleanup -allclients

RVellore
Level 2
Sorry ERROR 83

SAF
Level 3
Partner Accredited Certified

Any update on this case?

Beavisrulz
Level 4

We still do have them occasionally, but it's dropped down a little. I did receive one last night.

We had to apply an EEB to our production appliances to fix an issue with the content router crashing, and I'm planning on applying 1.4.2 early next week. We are then scheduled to upgrade NetBackup to 7.5 either Thursday or the following week.

I will update the thread with more info once we get to that point and see if the issue still exists.

Thanks!

AV-IT
Level 3
Partner Accredited

Hi,

Any updates on this case? Has it been resolved with 7.5?

zafar1907
Level 3
Partner Accredited

Hi,

BUG REPORT: SLP duplication fails with read failed, media open error (83).

http://www.symantec.com/business/support/index?page=content&pmv=print&impressions=&viewlocale=&id=TECH160733

zafar1907
Level 3
Partner Accredited

I have found one more link; I hope this will help you.

http://www.symantec.com/business/support/index?page=content&id=TECH176616

fjacquet94250
Level 2
Partner Accredited Certified

I think I have the same issue with an NBU5200 on 2.5.2 (so NBU 7.5.0.5); I get the same message.

Another regression?

 

28.03.2013 06:52:04 - requesting resource LCM_stu_disk_chback01
28.03.2013 06:52:04 - granted resource LCM_stu_disk_chback01
28.03.2013 06:52:05 - started process RUNCMD (25479)
28.03.2013 06:52:05 - requesting resource @aaaad
28.03.2013 06:52:05 - reserving resource @aaaad
28.03.2013 06:52:06 - reserved resource @aaaad
28.03.2013 06:52:06 - granted resource MediaID=@aaaad;DiskVolume=PureDiskVolume;DiskPool=dp_disk_chback01;Path=PureDiskVolume;StorageServer=chback01;MediaServer=chback01
28.03.2013 06:52:07 - Info bpdm(pid=25492) started            
28.03.2013 06:52:07 - started process bpdm (25492)
28.03.2013 06:52:09 - Critical bpdm(pid=25492) get image properties failed: error 2060013: no more entries    
28.03.2013 06:52:09 - Error nbreplicate(pid=25479) ReplicationJob::Replicate: Replication failed for backup id LSFP01_1364425226: media open error (83)  
28.03.2013 06:52:09Replicate failed for backup id LSFP01_1364425226 with status 83
28.03.2013 06:52:09 - end operation
media open error(83)

Jeff_Foglietta
Level 5
Partner Accredited Certified

I have just come across an issue similar to the one Mark refers to regarding the "case" of the client name in the "/disk/databases/catalog/2" directory of an appliance.

However, rather than the hostname of the client being out of synch regarding case it was the underlying policy name that was out of synch.

Navigate to the directory as stated above. Find the client that the replication is failing on. Perform a directory listing on that client and note the "policy name" subdirectory. If the name does not match the case of the current policy name, your replications will fail with an error 83 and a media open error 13.

To resolve this, simply change directory to the failing hostname directory.

# cd /disk/databases/catalog/2/<Failing Host>

Then change the name of the directory to reflect the current policy name

# mv <Directory Name in wrong case>  <Directory Name in correct case>

Replications will now proceed normally.
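
The check described above can be sketched as a small read-only shell function. It prints any subdirectory of a client directory whose name equals the current policy name when case is ignored but differs in exact case. The function name and its arguments are hypothetical; the catalog path is the one given above.

```shell
#!/bin/sh
# List policy subdirectories that differ from the expected name only by case.
# Read-only: this reports candidates and renames nothing.
# Assumption: find_case_mismatch and its arguments are illustrative names.
find_case_mismatch() {
    client_dir=$1   # e.g. /disk/databases/catalog/2/<Failing Host>
    policy=$2       # policy name exactly as currently defined
    want=$(printf '%s' "$policy" | tr '[:upper:]' '[:lower:]')
    for d in "$client_dir"/*/; do
        [ -d "$d" ] || continue
        name=$(basename "$d")
        have=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
        # Same letters, different case: a candidate for the mv shown above.
        if [ "$have" = "$want" ] && [ "$name" != "$policy" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Usage: find_case_mismatch "/disk/databases/catalog/2/<Failing Host>" "<Current Policy Name>"
```

Reviewing the output before running `mv` by hand keeps the rename a deliberate step rather than an automated one.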

smakovits
Level 6

What is the difference between this method and another that Mark mentions in another thread where he is creating a link to a folder as opposed to renaming it?

 

ln -s /disk/databases/catalog/2/<Directory Name in wrong case> /disk/databases/catalog/2/<Directory Name in correct case>

Jeff_Foglietta
Level 5
Partner Accredited Certified

They address two entirely different situations. Mark is referring to hostnames in different case, which require a link to open the images, whereas I am referring to policy names that don't match case, likely because the policy was copied and renamed to a different policy with only a case change and the original deleted, e.g. WINDOWS_OS -> Windows_OS ...

smakovits
Level 6

Ah, makes sense.  You are running the move command inside the failing client folder, while his link is at the /2/ folder location.

So stupid question, what stops you from creating a link here as well as opposed to moving the folder?  Would the end result not be similar to the other case?

Jeff_Foglietta
Level 5
Partner Accredited Certified

You could create a link but why would you? From an administrative viewpoint it is much cleaner to rename the directory via the move command. In the case where there are duplicate hostnames it would make more sense to link due to the underlying structures.

smakovits
Level 6

I guess I was not thinking of it as duplicate host names.  For instance, we have some physical servers that were converted to virtual.  For physical servers we always added them to policies using lower case host names.  For the VMs, we use display name which is upper case.  So, in my case, f01 and F01 are the same file server, but because it was originally backed up lower case, I think NBU re-used the folder, but the SLP was looking for upper case resulting in the same status 83.

As noted by Mark, good thing this is still an issue in 2.5.2...

In the end, at least for me in this case, I think I can use your method and do away with the link altogether. It is 2 folders, and I can confirm that the link method worked, as both SLPs have now completed.
