'Error: can't query catalog' NBU5230

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

Hello,

 

This should maybe go in the appliance forum but I don't know how many experts look there so we'll keep it here for now.

 

Master W2K8 - 7.5.0.6

Media - NBU 5230 App - 2.5.3

Client - Redhat - Linux 7.5.0.6

 

Started getting this error yesterday on 1 client only. All other clients work fine.

 

Status 84 and then in details 'impl_image_handle: impl_get_imh_image_prop: unexpected error (2060029:authorization failure)'

 

Will post the full details, but it's not really relevant: when I checked the spoold logs on the appliance I noticed that it was having trouble accessing the "database" kept in /disk/databases/catalog/2

 

I then found that if I put the specific client in a policy by itself it runs fine again. This indicates to me that there is nothing wrong with the client. It creates a new folder structure in /disk/databases/catalog/2/CLIENT and goes about its business.

 

So the question is what exactly is going on in this folder structure (/disk/databases/catalog/2/) and how do we clean it?

 

Errors from the spad session logs

 

severity: error
server:
source: spad
description: Error: can't query catalog
***DONE***
January 22 09:29:11 INFO [1094347072]: [_handle_find] filter [1|0||-1|-1|/CLIENT/LINUX_BACKUP1|*|*|*|*|*|*|*|*|*|-1|-1|-1|-1|-1|-1|-1|-1|-1|-1|*]
January 22 09:29:11 ERR [1094347072]: 25004: can't open FilePO file /disk/databases/catalog/2/CLIENT/LINUX_BACKUP1/CLIENT_1390338067_C1_F1.img
January 22 09:29:11 ERR [1094347072]: 25004: Could not load po /disk/databases/catalog/2/CLIENT/LINUX_BACKUP1/CLIENT_1390338067_C1_F1.img
January 22 09:29:11 ERR [1094347072]: 25004: can't get po /disk/databases/catalog/2/CLIENT/LINUX_BACKUP1/CLIENT_1390338067_C1_F1.img
January 22 09:29:11 ERR [1094347072]: -1: spad request failed:
***ERROR***
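
For anyone hitting the same thing, a minimal sketch for pulling the affected file paths out of a saved copy of the spad log. The function name list_failing_pos and the log file argument are hypothetical; the only thing taken from the output above is the "25004: can't open FilePO file" message format.

```shell
# list_failing_pos LOGFILE
# Print each distinct path named in a "can't open FilePO file" error line.
# LOGFILE is a saved copy of the spad log (hypothetical location, pass your own).
list_failing_pos() {
    grep "25004: can't open FilePO file " "$1" \
        | sed "s/.*can't open FilePO file //" \
        | sort -u
}
```

Called as list_failing_pos /path/to/saved/spad.log, it prints one line per catalog file that spad could not open, which narrows things down to a client, policy and image before looking at the directory itself.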

 

Has anybody come across this before?

 

 

1 ACCEPTED SOLUTION

Accepted Solutions

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

Hi,

 

Support managed to identify and resolve the issue.

 

There appears to be a corrupt image in the directory for this client and particular policy. We see the following in the logs.

January 22 09:29:11 ERR [1094347072]: 25004: can't open FilePO file /disk/databases/catalog/2/CLIENT/LINUX_BACKUP1/CLIENT_1390338067_C1_F1.img
January 22 09:29:11 ERR [1094347072]: 25004: Could not load po /disk/databases/catalog/2/CLIENT/LINUX_BACKUP1/CLIENT_1390338067_C1_F1.img
January 22 09:29:11 ERR [1094347072]: 25004: can't get po /disk/databases/catalog/2/CLIENT/LINUX_BACKUP1/CLIENT_1390338067_C1_F1.img
January 22 09:29:11 ERR [1094347072]: -1: spad request failed:
***ERROR***
4
severity: error
server:
source: spad
description: Error: can't query catalog
***DONE***

If we look at the files for the policy we can see corrupt files for this
image, one is 0 bytes and another has some temporary name:-

hct-nbu-app-02:/home/maintenance # ls -la /disk/databases/catalog/2/CLIENT/LINUX_BACKUP1/
total 4400
drwx------ 2 root root 32768 Jan 29 04:15 .
drwx------ 6 root root 8192 Jan 22 22:06 ..
-rw-r----- 1 root root 76 Dec 26 15:37 __dirpo__
<snip>
<snip>
-rw-r----- 1 root root 0 Jan 22 01:02 CLIENT_1390338067_C1_F1.img
-rw-r----- 1 root root 161 Jan 22 01:02 CLIENT_1390338067_C1_F1.info[R_2]
-rw-r----- 1 root root 162 Jan 22 01:02 CLIENT_1390338067_C1_HDR.img

 

I removed the files relating to that specific image (CLIENT_1390338067) from the /disk/databases/catalog/2/CLIENT/LINUX_BACKUP1/ folder and the backups are now working again within the original policy.

 

We don't really know what caused the file to be created with 0 size, but it must have happened at the beginning of the job, so it never accepted any data and there was nothing in the NBU catalog. So I didn't have to expire this image. No data loss.
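
A rough sketch of how the same check could be scripted, assuming GNU find as on the appliance shell. It only lists files and deletes nothing; the function names are made up for illustration, and any removal from the catalog directory should still go through support as it did here.

```shell
# list_empty_images DIR
# List zero-byte .img files directly inside a policy directory.
list_empty_images() {
    find "$1" -maxdepth 1 -type f -name '*.img' -size 0
}

# list_image_files DIR IMAGE_ID
# List every file belonging to one image, e.g. IMAGE_ID=CLIENT_1390338067.
list_image_files() {
    find "$1" -maxdepth 1 -type f -name "${2}_*" | sort
}
```

For example, list_empty_images /disk/databases/catalog/2/CLIENT/LINUX_BACKUP1 would point at the 0 byte .img file, and list_image_files with the same directory and CLIENT_1390338067 would show the .img, .info and HDR files that belong to it.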


6 REPLIES 6

Mark_Solutions
Level 6
Partner Accredited Certified

Catalog errors like this are usually down to a case sensitivity issue. They usually show up during a duplication, restore or verify rather than during a backup.

As you can see from the structure, it all goes under catalog/2/ followed by the client name and then the policy name.

If in a policy the name gets changed or the policy gets changed (same names but different case) then it can make things go wrong.

First check whether you have both CLIENT and client directories. If so, the files from the upper case one need to be copied into the folder structure of the lower case one, and then the upper case one deleted. This can be tricky if you have the same named files in both; if so, raise a support case.

Once you have just one directory, add a softlink in the catalog/2/ directory.

As yours is currently in UPPER case you would just do the following:

cd into the catalog/2 directory

ln -s client CLIENT

I have the feeling, since you say it works "if you put it in a policy on its own", that the case sensitivity issue may be at policy level rather than client level, so you may need to go one level deeper and add the softlink there.
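
A small sketch for spotting names that differ only by case, at either the client or the policy level. The function name find_case_variants is made up; it reads names on standard input and prints, in lower case, any name that occurs more than once when case is ignored.

```shell
# find_case_variants
# Read one name per line on stdin and print (lower-cased) every name
# that appears more than once when case is ignored.
find_case_variants() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}
```

Usage would be something like ls -1 /disk/databases/catalog/2 | find_case_variants for the client level, or the same against a client directory for the policy level. No output means there are no case-only duplicates at that level.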

Case sensitivity was supposed to be fixed in 7.5.0.6, which you get with 2.5.3, so it shouldn't be happening to you!

What it may be is that the structure within the policy folder under catalog/2/ has become corrupt, so the client will only work when it is in a different policy. The answer may be to just create a new policy and put the clients in that so that it gets a new structure.

As there is this possible corruption, it would be worth logging a call to get it investigated further. Support have a recoverCR tool that can be supplied to check and correct your de-dupe database, which may sort it all out for you.

Hope some of this helps and makes sense!

 

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

Hi Mark,

 

Thanks, yes I'm aware of the CASE issues, I've fixed many of those.

 

I suppose it must be corrupt as it works for the new one.

 

Will see if support can clean it up.

Mark_Solutions
Level 6
Partner Accredited Certified

I think it is worth checking

A new policy should be OK as a workaround while you wait, but it may well be that you cannot restore from the old backups that used that policy either if there is corruption there.

You could run a verify against an older image - if it fails you need to get it fixed

Keep us updated on what support find. It may just be a rogue file in the directory; anything that shouldn't be there will stop the whole directory working.

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

I've checked for dodgy files that shouldn't be there. I'll try a verify, didn't think of that.

 

Thanks

Mark_Solutions
Level 6
Partner Accredited Certified

Glad it is sorted. That is pretty much what I said on 23rd Jan, when I said that the files in that location may have become corrupted, which would stop it working for that policy.

Good to know you are up and running