Backup completed partially

Sagar_Kolhe
Level 6

Hello All,

 

I am facing a partial backup issue. Since yesterday, the backup has been completing only partially. Please find the error below and suggest a fix.

 

Error bpbrm (pid=8495136) from client xxxx: ERR - Read error at block 177479801 reading 65536 bytes in file /dumps0/BACKUP/preeod/UBSFCR/full/FCRpreeod.dmp. Errno = 110: The media surface is damaged.

 

 

Thanks and Regards,

Sagar

16 REPLIES

Nicolai
Moderator
Partner    VIP   

Most likely this file is being changed at the time of the backup. Judging by the file path, it looks like a SQL dump of the database UBSFCR.

From a NetBackup perspective there is not much you can do other than talk to the SQL admin to find out when the SQL dumps are being made.

It could also be file system corruption, but in my experience that is less likely.

Sagar_Kolhe
Level 6

Hi Nicolai,

 

We have tried this backup 3 different times today, with the same result. Are there any logs we can check for this issue so that we can approach the DBA team?

 

Regards,

Sagar

mph999
Level 6
Employee Accredited

You have the error in the line that you posted in your first post, I doubt you'll get any more information than that.

The line says 'from client', so the client sent that error to bpbrm. It will be the bpbkar process that sent it, so you could look in the bpbkar log on the client, but I suspect you will only see the same line.

Can you copy that file, or do you get a similar error? If you are able to stop the database, does the backup then work?

It's not a NetBackup error, it's a Windows error; NBU is only reporting it, which is why the NBU logs won't help much.

From

http://msdn.microsoft.com/en-us/library/windows/desktop/ms681382(v=vs.85).aspx

ERROR_OPEN_FAILED
110 (0x6E)

The system cannot open the device or file specified.

I'm a bit surprised it says the media surface is damaged; that's one hell of a conclusion to jump to just because a file cannot be opened. That said, I would have thought you would see a file busy message.

Worth checking the bpbkar log: look for file busy and also any lines that contain asc or ascq errors, because if the media surface really is damaged you might see such a line.

Sagar_Kolhe
Level 6

HI Mph,

 

This is an AIX server, so please suggest steps for Unix servers. You are right: the bpbkar log states the same thing that is shown in the detailed status. We have logged a case with Symantec; I will update here if I get any reply from them.

 

 

Regards,

Sagar

Marianne
Level 6
Partner    VIP    Accredited Certified
Use normal OS backup commands like tar to read the folder and back it up to /dev/null. Or do a simple copy to another filesystem. I would also look in AIX errpt for issues with this filesystem.
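
For anyone who wants to script the same check, here is a minimal sketch of the idea: read the file front to back without NetBackup involved, throw the data away, and report where the first read fails. The function name and the returned tuple are my own invention for illustration. The block index is counted in units of the chosen block size, so it will not necessarily line up with the block number bpbkar reports.

```python
import errno
import os

BLOCK_SIZE = 65536  # same read size as in the reported error line


def find_first_read_error(path, block_size=BLOCK_SIZE):
    """Read path sequentially and discard the data (like tar to /dev/null).

    Returns None if the whole file reads cleanly. Otherwise returns
    (block_index, errno_value, message) for the first failure. A failure
    to open the file at all is reported with block_index 0.
    """
    block_index = 0
    try:
        # buffering=0 so each read() maps to one OS-level read request
        with open(path, "rb", buffering=0) as handle:
            while True:
                data = handle.read(block_size)
                if not data:
                    return None  # reached end of file without an error
                block_index += 1
    except OSError as exc:
        message = os.strerror(exc.errno) if exc.errno else str(exc)
        return (block_index, exc.errno, message)
```

If this returns an error tuple for the dump file, the problem sits below NetBackup (filesystem or hardware), which is the point of the test.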

mph999
Level 6
Employee Accredited

Sorry, it is Unix, apologies I missed the fact that you showed / and not \ ...

If you post the case number I will take a look when I get time. TBH, I don't think we're going to be able to help much from the NetBackup side; as Marianne says, it looks like a hardware error or filesystem error to me, which is why I suggested seeing if you could copy the files.

I'm not sure what error 110 relates to on AIX. On Solaris, man -s2 intro gives you a list of errors, and I was hoping that, being Unix, they might be the same, but it seems not, as 110 is not listed.

errpt -a might show something, as Marianne suggests.  If the media surface is damaged I would expect to see an error like this:

key = 0x3, asc = 0xyy, ascq = 0xzz  - (with yy and zz being some number)
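
If the logs are large, a small script can pick out such lines. The sketch below assumes the sense data appears in exactly the "key = 0x.., asc = 0x.., ascq = 0x.." layout shown above; real bpbkar or errpt output may format it differently, so treat the pattern as a starting point. The function name is hypothetical. For reference, SCSI sense key 0x3 is MEDIUM ERROR.

```python
import re

# Matches e.g. "key = 0x3, asc = 0x11, ascq = 0x0" (layout assumed from the post above)
SENSE_PATTERN = re.compile(
    r"key\s*=\s*0x([0-9a-f]+)\s*,\s*asc\s*=\s*0x([0-9a-f]+)\s*,\s*ascq\s*=\s*0x([0-9a-f]+)",
    re.IGNORECASE,
)


def find_sense_lines(lines):
    """Return (line_number, key, asc, ascq) for every line carrying sense data.

    Line numbers start at 1; key, asc and ascq are returned as integers.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        match = SENSE_PATTERN.search(line)
        if match:
            key, asc, ascq = (int(value, 16) for value in match.groups())
            hits.append((number, key, asc, ascq))
    return hits
```

Feed it the lines of whichever log you are checking, for example `find_sense_lines(open(log_path))`, where `log_path` is whatever file you saved the log output to.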

jim_dalton
Level 6

As mvdb says, take NetBackup out of the picture altogether and tar the file in question to tape or another disk. I'm betting that will fail, in which case it has nothing to do with NetBackup. Equally, I'm surprised it's pointing out a media error, but it's been years since I went near AIX. Jim

Nicolai
Moderator
Partner    VIP   

According to this page, both the error number and the text "110 The media surface is damaged" are correct:

http://www.ioplex.com/~miallen/errcmp.html

But I think it's fairly safe to conclude it is the same as a read error from the file system.
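
The confusion earlier in the thread came from the number 110 meaning different things on different platforms. A quick sketch to illustrate: the table holds only the two meanings quoted in this thread, and the helper (a hypothetical name) shows what the machine you run it on makes of errno 110, which will vary by OS.

```python
import errno
import os

# Meanings of the number 110 as quoted in this thread; not an exhaustive list.
MEANING_OF_110 = {
    "AIX errno": "The media surface is damaged.",
    "Windows system error code": (
        "ERROR_OPEN_FAILED: The system cannot open the device or file specified."
    ),
}


def describe_110_locally():
    """Return (symbolic_name, message) for errno 110 on the local platform."""
    name = errno.errorcode.get(110, "not defined on this platform")
    return name, os.strerror(110)
```

Running `describe_110_locally()` on the client itself is the reliable way to see how that OS interprets the number, rather than looking it up in another platform's documentation.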

mph999
Level 6
Employee Accredited

I'm going to go 'Maybe' with this one ...

I would say that a surface error would lead to a read error, but a read error would not necessarily mean a surface error.

As mentioned above, it is a big assumption to make that any read error is going to be reported as a surface error.

Or, looking at it another way, we don't see surface errors reported for every read error (I'm talking generally here, across all the read errors we see), so I'm thinking that in this case the system knows more than has been found so far. That is why I think there might be some SCSI sense errors (asc/ascq) coming from the drive, which would more accurately suggest there is some surface issue.

The other possibility, of course, is that I am wrong ....  :0(

mph999
Level 6
Employee Accredited

PS. Nice link by the way, like that ....

mph999
Level 6
Employee Accredited

Unless I misunderstood, of course ...

Yes, a filesystem read error and a surface read error would cause the same problem ...

Nicolai
Moderator
Partner    VIP   

In theory, a SCSI read error and a read error caused by incorrectly formatted data should be different.

So the surface error *may* indicate a SCSI error of some sort, and not a read error caused by an invalid data format, e.g. file system corruption.

Is that what you had in mind in the previous post?

mph999
Level 6
Employee Accredited

Yes ...

Nicolai
Moderator
Partner    VIP   

ACK :)

Sagar_Kolhe
Level 6

Hi All,

There was a hardware issue (a mount point issue). We raised this with the OS team, and they confirmed that it was a mount point issue (fsck failed at some point). They have now taken a new database dump on another mount point, and the backup is now completing properly.

 

Thanks for your views and technotes.

 

Regards,

Sagar

Marianne
Level 6
Partner    VIP    Accredited Certified
Trust NetBackup to reveal issues/errors/faults in your environment! Please remember to mark Solution(s) for post(s) that pointed you in the right direction.