Operation requested by an invalid server (37)

Aurelien59
Level 4

Hi all,

Our media server (PureDisk) had a problem with one of the hard drives in its storage.

We replaced the drive and rebuilt the RAID, with no apparent problem.

But since then, all the backups that should run on this media server fail with error 2074.

When I try to update the Disk_Pool, I get error 37 (see the attached screenshot).

I have already checked that this file is present: <Install_path>\NetBackup\bin\ost-plugins\srvrname.cfg

I have already managed to bring the Disk_Pool UP from the command line.

I also checked these:

https://vox.veritas.com/t5/NetBackup/Error-Code-37-operation-requested-by-an-invalid-server/m-p/7425...

https://vox.veritas.com/t5/NetBackup/status-code-37-advanced-disk-backup-failed-Operation-requested/...

14 REPLIES

Aurelien59
Level 4

And now, after rebooting both the media and master servers, I get this:

Capture3.JPG

Hi @Aurelien59 

The second image from your first post is indicating that the system is unable to see the disk volume. 

Are you able to see the contents of the MSDP volume - is it mounted back onto the same drive/mountpoint as before?

Is the size of the volume consistent with what you are expecting? 

Can you perform a file system check on the volume successfully?

Doesn't look good at the moment, but hopefully it will be something simple.

As an aside - this pool appears to be 94% full - which is never a great idea for a dedupe pool (best to keep [well] below 90%). 

Good luck
David
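As a rough way to keep an eye on the fill level mentioned above, here is a minimal Python sketch. It assumes the pool sits on an ordinary file system path that Python can read. The function names and the 90% default are illustrative and are not part of NetBackup. It reports file system usage, which is not necessarily what the NetBackup console shows for the pool.

```python
import shutil


def pool_usage_percent(path):
    """Return the percentage of used space on the volume that holds `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_above_threshold(percent_used, threshold=90.0):
    """True when usage has reached the threshold (90% is the figure from the post)."""
    return percent_used >= threshold


# Demo on the current directory; on the media server you would pass the
# storage path instead (for example the S:\Dedup_NBU path seen in this thread).
current = pool_usage_percent(".")
print(f"{current:.1f}% used, above threshold: {is_above_threshold(current)}")
```

Calling `is_above_threshold(94.0)` returns `True`, which matches the situation described in the post.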

Hi @davidmoline 

"Are you able to see the contents of the MSDP volume - is it mounted back onto the same drive/mountpoint as before?"

Yes and yes.

Aurelien59_0-1622460709347.png

"Is the size of the volume consistent with what you are expecting?"

I am not sure, but I have the feeling that the used space is smaller than it was before we replaced the defective disk. I have no record of how full the storage was before the crash.

Aurelien59_1-1622460755154.png

"Can you perform a file system check on the volume successfully?"

I successfully ran SFC /scannow, which reported no errors.

"As an aside - this pool appears to be 94% full - which is never a great idea for a dedupe pool (best to keep [well] below 90%)."

True, it is precisely one of the projects in progress for the weeks to come.

X2
Moderator
   VIP   

Is it a communication error?

Run this from master server: bptestbpcd -client <media-server-name> -debug -verbose

What output does it give you?

 

Here's the result of the command:

M:\Veritas\NetBackup\bin\admincmd>bptestbpcd -client srv-nbumed01-ft -debug -verbose
16:09:42.581 [9400.18876] <2> bptestbpcd: VERBOSE = 0
16:09:42.596 [9400.18876] <2> vnet_pbxConnect: pbxConnectEx Succeeded
16:09:42.596 [9400.18876] <2> logconnections: BPCD CONNECT FROM 123.1.3.184.29240 TO 123.8.54.180.1556 fd = 528
16:09:42.612 [9400.18876] <2> vnet_pbxConnect: pbxConnectEx Succeeded
16:09:42.643 [9400.18876] <8> do_pbx_service: [vnet_connect.c:2186] via PBX VNETD CONNECT FROM 123.1.3.184.29241 TO 123.8.54.180.1556 fd = 548
16:09:42.643 [9400.18876] <8> vnet_vnetd_connect_forward_socket_begin: [vnet_vnetd.c:455] VN_REQUEST_CONNECT_FORWARD_SOCKET 10 0xa
16:09:42.846 [9400.18876] <8> vnet_vnetd_connect_forward_socket_begin: [vnet_vnetd.c:480] ipc_string 62484
16:09:43.766 [9400.18876] <2> bpcr_get_version_rqst: bpcd version: 08000000
1 1 1
123.1.3.184:29240 -> 123.8.54.180:1556
123.1.3.184:29241 -> 123.8.54.180:1556
16:09:43.969 [9400.18876] <2> bpcr_get_peername_rqst: Server peername length = 26
16:09:44.172 [9400.18876] <2> bpcr_get_hostname_rqst: Server hostname length = 27
16:09:44.390 [9400.18876] <2> bpcr_get_clientname_rqst: Server clientname length = 27
16:09:44.609 [9400.18876] <2> bpcr_get_version_rqst: bpcd version: 08000000
16:09:44.812 [9400.18876] <2> bpcr_get_platform_rqst: Server platform length = 7
16:09:45.030 [9400.18876] <2> bpcr_get_version_rqst: bpcd version: 08000000
16:09:45.264 [9400.18876] <2> bpcr_patch_version_rqst: theRest == > <
16:09:45.264 [9400.18876] <2> bpcr_get_version_rqst: bpcd version: 08000000
16:09:45.685 [9400.18876] <2> bpcr_patch_version_rqst: theRest == > <
16:09:45.685 [9400.18876] <2> bpcr_get_version_rqst: bpcd version: 08000000
PEER_NAME = srv-nbumast-ai.process.dkm
HOST_NAME = srv-nbumed01-ft.process.dkm
CLIENT_NAME = srv-nbumed01-ft.process.dkm
VERSION = 0x08000000
PLATFORM = win_x64
PATCH_VERSION = 8.0.0.0
SERVER_PATCH_VERSION = 8.0.0.0
MASTER_SERVER = srv-nbumast-ai.process.dkm
EMM_SERVER = srv-nbumast-ai.process.dkm
NB_MACHINE_TYPE = MEDIA_SERVER
16:09:45.919 [9400.18876] <2> vnet_pbxConnect: pbxConnectEx Succeeded
16:09:45.950 [9400.18876] <8> do_pbx_service: [vnet_connect.c:2186] via PBX VNETD CONNECT FROM 123.1.3.184.29253 TO 123.8.54.180.1556 fd = 560
16:09:45.950 [9400.18876] <8> vnet_vnetd_connect_forward_socket_begin: [vnet_vnetd.c:455] VN_REQUEST_CONNECT_FORWARD_SOCKET 10 0xa
16:09:46.169 [9400.18876] <8> vnet_vnetd_connect_forward_socket_begin: [vnet_vnetd.c:480] ipc_string 62486
123.1.3.184:29253 -> 123.8.54.180:1556
<2>bptestbpcd: EXIT status = 0
16:09:46.590 [9400.18876] <2> bptestbpcd: EXIT status = 0

 

Looks like everything is OK here?
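For anyone who wants to check this kind of output automatically, here is a small Python sketch that pulls the `KEY = value` summary lines and the exit status out of bptestbpcd output. The parsing rules are inferred only from the output pasted above, so treat them as an assumption rather than a documented format. The function name is made up for illustration.

```python
import re


def parse_bptestbpcd(output):
    """Extract KEY = value summary lines and the last exit status."""
    fields = {}
    exit_status = None
    for raw in output.splitlines():
        line = raw.strip()
        m = re.search(r"EXIT status = (-?\d+)", line)
        if m:
            exit_status = int(m.group(1))
            continue
        # Summary lines start with an upper-case key; timestamped debug
        # lines start with digits and are skipped by fullmatch.
        m = re.fullmatch(r"([A-Z_]+) = (.+)", line)
        if m:
            fields[m.group(1)] = m.group(2)
    return fields, exit_status


# A few lines copied from the output in this thread.
sample = """\
PEER_NAME = srv-nbumast-ai.process.dkm
HOST_NAME = srv-nbumed01-ft.process.dkm
NB_MACHINE_TYPE = MEDIA_SERVER
16:09:46.590 [9400.18876] <2> bptestbpcd: EXIT status = 0
"""
fields, status = parse_bptestbpcd(sample)
print(status, fields["NB_MACHINE_TYPE"])
```

An exit status of 0 together with the expected PEER_NAME and HOST_NAME is what suggests that basic master-to-media communication is working.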

Hi @Aurelien59 

Comms look fine, as expected.

My reading of the situation is that the media server is unable to reattach the MSDP volume for some reason. 

Review these two articles and check whether all the configuration settings on your system are correct:
https://www.veritas.com/support/en_US/article.100024630 
https://www.veritas.com/support/en_US/article.100032612 

In particular, the contents of the registry key tell the media server where to look for the MSDP volume. 

The etc directory then indicates how the volume is set up (directory locations etc.). 

Cheers
David

M:\Veritas\pdde>spad --test
Warning: 25002: __openReaderA: could not open file S:\Dedup_NBU\etc\puredisk\spa.cfg
Error: 25002:
BadValue : (none)
File : S:\Dedup_NBU\etc\puredisk\spa.cfg
Section : Logging
Entry : HistoryPath
Reason : Expected a non-empty value.
S:\Dedup_NBU\etc\puredisk\spa.cfg: 5 error(s)

 

Seeing this, I checked the path and found that spa.cfg was missing. So I copied a spa.cfg from another media server and edited it. In the registry, everything was OK.

contentrouter.cfg was also missing, so I did the same for it.
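Since several key files turned out to be missing, a quick presence check can save time. The sketch below is only an illustration. The relative paths come from the error messages in this thread, except for contentrouter.cfg, whose location next to spa.cfg is an assumption. The function name and the file list are hypothetical and not an official checklist.

```python
import tempfile
from pathlib import Path

# Paths relative to the storage root (S:\Dedup_NBU in this thread).
EXPECTED_FILES = [
    "etc/puredisk/spa.cfg",
    "etc/puredisk/contentrouter.cfg",  # assumed to sit next to spa.cfg
    "data/.format",
    "spool/.tlogid",
]


def missing_pool_files(storage_root, expected=EXPECTED_FILES):
    """Return the expected files that do not exist under storage_root."""
    root = Path(storage_root)
    return [rel for rel in expected if not (root / rel).is_file()]


# Demo against a throwaway directory that contains only spa.cfg.
with tempfile.TemporaryDirectory() as tmp:
    cfg_dir = Path(tmp) / "etc" / "puredisk"
    cfg_dir.mkdir(parents=True)
    (cfg_dir / "spa.cfg").write_text("")
    demo_missing = missing_pool_files(tmp)
print(demo_missing)
```

On the media server you would call `missing_pool_files("S:/Dedup_NBU")`. An empty list only means the files exist, not that their contents are valid.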

M:\Veritas\pdde>spoold --trace
Error [0000000000265DA0]: -1: Failed to load storage format version from S:\Dedup_NBU\data\.format
Error [0000000000265DA0]: -1: The storage format file S:\Dedup_NBU\data\.format is lost or corrupted, please run the following command to fix it:
Error [0000000000265DA0]: -1: M:\Veritas\\pdde\stconv.exe --fixformatfile
Error [0000000000265DA0]: -1: NetConnectByAddr: Failed to connect to host: No connection could be made because the target machine actively refused it. (10061)
Error [0000000000265DA0]: -1: NetConnectByAddr: Failed to connect to spad on port 10102 using the following interface(s): [ 123.8.54.180 ::1 ] (No connection could be made because the target machine
-92
Error [0000000000265DA0]: 25053: Could not establish a connection to SRV-NBUMED01-FT:10102: connect failed (No connection could be made because the target machine actively refused it. )
Error [0000000000265DA0]: 25053: Connection failed connection actively refused
Error [0000000000265DA0]: -1: NetConnectByAddr: Failed to connect to host: No connection could be made because the target machine actively refused it. (10061)
Error [0000000000265DA0]: 25053: Could not establish a connection to SRV-NBUMED01-FT:10102: connect failed (No connection could be made because the target machine actively refused it. )
Error [0000000000265DA0]: 25053: Connection failed connection actively refused
Error [0000000000265DA0]: 26016: Storage Format: Check failure.
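The repeated error 10061 ("connection refused") on port 10102 usually means that nothing is listening on that port, which would be consistent with spad not running. A minimal Python sketch to test that from the media server is shown here. The function name is illustrative, and the demo uses a throwaway local listener rather than a real spad.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Self-contained demo: open a local listener, probe it, close it, probe again.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
open_result = port_is_open("127.0.0.1", demo_port)
listener.close()
closed_result = port_is_open("127.0.0.1", demo_port)
print(open_result, closed_result)
```

On the media server, the equivalent check would be `port_is_open("SRV-NBUMED01-FT", 10102)`, using the host name and port from the trace.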

M:\Veritas\pdde>stconv.exe --fixformatfile

then...

M:\Veritas\pdde>spoold --trace
Error [00000000004B3390]: -1: _dcHeaderRead: invalid version of container header 2863311530
Error [00000000004B3390]: 25032: _storeCheckContainers: failed to read index headerfrom container 111615 (data corrupt)
Error [00000000004B3390]: -1: _dcHeaderRead: invalid version of container header 2863311530
Error [00000000004B3390]: 25032: _storeCheckContainers: failed to read index headerfrom container 135245 (data corrupt)
Error [00000000004B3390]: -1: _dcHeaderRead: invalid version of container header 2863311530
Error [00000000004B3390]: 25032: _storeCheckContainers: failed to read index headerfrom container 150585 (data corrupt)
[...]
Error [0000000000445DA0]: 25002: OpenId: Could not open S:\Dedup_NBU\spool\.tlogid to read the current tlogid.
Warning [0000000000445DA0]: 25004: Could not initialize performance counter for \Process(spoold)\% Processor Time (no such object)
Warning [0000000000445DA0]: 25004: Could not initialize performance counter for \Processor(_Total)\% Processor Time (no such object)
Warning [0000000000445DA0]: 25004: Could not initialize performance counter for \Processor(0)\% Processor Time (no such object)
Warning [0000000000445DA0]: 25004: Could not initialize performance counter for \Processor(1)\% Processor Time (no such object)
[...]
Warning [0000000000445DA0]: 25004: Could not initialize performance counter for \Processor(23)\% Processor Time (no such object)
Error [0000000000445DA0]: -1: NetConnectByAddr: Failed to connect to host: No connection could be made because the target machine actively refused it. (10061)
Error [0000000000445DA0]: -1: NetConnectByAddr: Failed to connect to spad on port 10102 using the following interface(s): [ 123.8.54.180 ::1 ] (No connection could be made because the target machine
-92
Error [0000000000445DA0]: 25053: Could not establish a connection to SRV-NBUMED01-FT:10102: connect failed (No connection could be made because the target machine actively refused it. )
Error [0000000000445DA0]: 25053: Connection failed connection actively refused
Warning [0000000000445DA0]: 25053: Failed to get startup CR modes from SPA after 1 attempt, retrying in 10 seconds
Error [0000000000445DA0]: -1: NetConnectByAddr: Failed to connect to host: No connection could be made because the target machine actively refused it. (10061)
Error [0000000000445DA0]: 25053: Could not establish a connection to SRV-NBUMED01-FT:10102: connect failed (No connection could be made because the target machine actively refused it. )
Error [0000000000445DA0]: 25053: Connection failed connection actively refused
Warning [0000000000445DA0]: 25053: Failed to get startup CR modes from SPA after 2 attempts, retrying in 10 seconds
[...]
Error [0000000000445DA0]: -1: NetConnectByAddr: Failed to connect to host: No connection could be made because the target machine actively refused it. (10061)
Error [0000000000445DA0]: 25053: Could not establish a connection to SRV-NBUMED01-FT:10102: connect failed (No connection could be made because the target machine actively refused it. )
Error [0000000000445DA0]: 25053: Connection failed connection actively refused
Warning [0000000000445DA0]: 25053: Failed to get startup CR modes from SPA after 8 attempts, retrying in 10 seconds
Error [00000000004B3390]: -1: NetConnectByAddr: Failed to connect to host: No connection could be made because the target machine actively refused it. (10061)
Error [00000000004B3390]: -1: NetConnectByAddr: Failed to connect to spad on port 10102 using the following interface(s): [ 123.8.54.180 ::1 ] (No connection could be made because the target machine
-92
Error [00000000004B3390]: 25053: Could not establish a connection to SRV-NBUMED01-FT:10102: connect failed (No connection could be made because the target machine actively refused it. )
Error [00000000004B3390]: 25053: Connection failed connection actively refused
Error [0000000000445DA0]: -1: NetConnectByAddr: Failed to connect to host: No connection could be made because the target machine actively refused it. (10061)
Error [0000000000445DA0]: 25053: Could not establish a connection to SRV-NBUMED01-FT:10102: connect failed (No connection could be made because the target machine actively refused it. )
Error [0000000000445DA0]: 25053: Connection failed connection actively refused
Error [0000000000445DA0]: 25053: Failed to get startup CR modes from SPA
Error [0000000000445DA0]: -1: NetConnectByAddr: Failed to connect to host: No connection could be made because the target machine actively refused it. (10061)
Error [0000000000445DA0]: 25053: Could not establish a connection to SRV-NBUMED01-FT:10102: connect failed (No connection could be made because the target machine actively refused it. )
Error [0000000000445DA0]: 25053: Connection failed connection actively refused
Error [0000000000445DA0]: -1: NetConnectByAddr: Failed to connect to host: No connection could be made because the target machine actively refused it. (10061)
Error [0000000000445DA0]: 25053: Could not establish a connection to SRV-NBUMED01-FT:10102: connect failed (No connection could be made because the target machine actively refused it. )
Error [0000000000445DA0]: 25053: Connection failed connection actively refused
Error [0000000000445DA0]: 26016: Configuration Manager: Start failure.
Error [0000000000445DA0]: -1: NetConnectByAddr: Failed to connect to host: No connection could be made because the target machine actively refused it. (10061)
Error [0000000000445DA0]: 25053: Could not establish a connection to SRV-NBUMED01-FT:10102: connect failed (No connection could be made because the target machine actively refused it. )
Error [0000000000445DA0]: 25053: Connection failed connection actively refused

Here is the result of the "spad --test" command:

M:\Veritas\pdde>spad --test
S:\Dedup_NBU\etc\puredisk\spa.cfg: verified OK

NB: spa.cfg was missing, so I copied one from another media server and edited it. The result now looks good. I did the same for contentrouter.cfg, which was also missing.

I have also attached a .txt file containing the result of "spoold --trace".
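A trace like this is long and repetitive, so a short script can help condense it before sending it to support. The Python sketch below counts messages per level and code and collects the container numbers flagged as corrupt. The line layout is inferred only from the trace lines pasted in this thread, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Layout seen in this thread: "Error [<hex id>]: <code>: <message>"
LINE_RE = re.compile(r"^(Error|Warning) \[[0-9A-F]+\]: (-?\d+): (.*)$")
CONTAINER_RE = re.compile(r"from container (\d+) \(data corrupt\)")


def summarize_trace(text):
    """Return (Counter keyed by (level, code), list of corrupt container ids)."""
    codes = Counter()
    corrupt = []
    for raw in text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skips lines such as "-92" or "[...]"
        level, code, message = m.groups()
        codes[(level, int(code))] += 1
        c = CONTAINER_RE.search(message)
        if c:
            corrupt.append(int(c.group(1)))
    return codes, corrupt


# Lines copied from the trace in this thread.
sample = """\
Error [00000000004B3390]: -1: _dcHeaderRead: invalid version of container header 2863311530
Error [00000000004B3390]: 25032: _storeCheckContainers: failed to read index headerfrom container 111615 (data corrupt)
Error [00000000004B3390]: 25032: _storeCheckContainers: failed to read index headerfrom container 135245 (data corrupt)
Error [0000000000445DA0]: 25053: Connection failed connection actively refused
"""
codes, corrupt = summarize_trace(sample)
print(corrupt)
print(codes.most_common())
```

Running it over the full attached trace would give the complete list of containers reported as corrupt, which is probably the first thing support would ask for.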

 

Hi @Aurelien59 

I'd suggest you log a support case - the output is indicating (rightly or not) that there is some possible corruption. The fact that some key files were missing is a concern.

Are you running dedupe catalog backups on that pool (if not why not)? You may be able to retrieve the original contents of the various pool configuration files from there.

David

 

Hi David,

 

We are not running dedupe catalog backups on that pool.

We are only running a CATALOG_DRIVEN_BACKUP policy. Maybe that is the same thing you're talking about?

Hi @Aurelien59 

The dedupe catalog backup policy is not something that is set up automatically (unless you install a NetBackup appliance). It is mentioned in the Dedupe Guide but is not well known. The policy is created with the drcontrol command (<INSTALL_PATH>\Veritas\pdde\drcontrol.exe or /usr/openv/pdde/pdcr/bin/drcontrol). 

This utility can be used to create a policy that protects critical files associated with the dedupe pool (including the files you have had to recreate: spa.cfg and contentrouter.cfg). Running the command with no arguments shows which options are available and where to run it.

As I said before, I'd strongly recommend opening a support case and having them look into recovering your dedupe pool. 

Cheers
David

Hi David,

Unfortunately we cannot benefit from the support because we are still using version 8.0.

We will have to plan an upgrade to 8.2.

Anyway thank you for your help.

Hi @Aurelien59 

NetBackup 8.0 is under extended support; it is not unsupported. I would still try to log a case, explaining that you want to upgrade to 8.2 (or higher) but that this problem is blocking the upgrade. I can't guarantee that they will help you, but it is worth the attempt.  

David

pats_729
Level 6
Employee

Hello @Aurelien59 

Error 37 usually indicates a missing SERVER entry in bp.conf / the Windows Registry. If that is the case, the following commands should give us clear hints.

Could you post the output of the following commands?

From Master Server  --> bptestbpcd -host media-server -verbose

From Media Server --> bptestbpcd -host master-server -verbose

From Media Server --> bpclntcmd -pn -verbose -debug

From Media Server  --> bpclntcmd -self
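To illustrate the "missing SERVER entry" idea, here is a minimal Python sketch that lists the SERVER lines in bp.conf-style text and checks whether a given host name is among them. It only handles the text format (`SERVER = hostname`). On Windows the same entries live in the registry, which this sketch does not read. The sample content and the function names are hypothetical. Only the host names are taken from this thread.

```python
def server_entries(bp_conf_text):
    """Return the host names from 'SERVER = ...' lines, in file order."""
    servers = []
    for line in bp_conf_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().upper() == "SERVER":
            servers.append(value.strip())
    return servers


def is_listed(hostname, bp_conf_text):
    """Case-insensitive exact match against the SERVER entries."""
    wanted = hostname.lower()
    return any(s.lower() == wanted for s in server_entries(bp_conf_text))


# Hypothetical bp.conf content built from the host names in this thread.
sample_conf = """\
SERVER = srv-nbumast-ai.process.dkm
SERVER = srv-nbumed01-ft.process.dkm
CLIENT_NAME = srv-nbumed01-ft.process.dkm
"""
print(server_entries(sample_conf))
print(is_listed("srv-nbumed01-ft", sample_conf))
```

The last call returns `False` because the short name does not match the fully qualified entry. A short-name versus FQDN mismatch like that is worth ruling out alongside a completely missing entry.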