
Netbackup 7.5 Media Server Cluster in an AIX HACMP environment

Volker_Spies
Level 3

Hello all,

 

I have a few questions about a media server cluster we configured recently.

When we move the application (DB2/SAP in an HACMP cluster) from one node to the other, it takes NetBackup up to 40 minutes to register that the application has moved and to update the media server.

I can see that in /usr/openv/volmgr/debug/daemon

The application switch was at 15:05.

node1, now inactive, noticed the switch at 15:18:

<snip>

 

15:13:46.321 [8323148] <2> parsePatchVersionString: theRest = ><
15:13:46.327 [8323148] <4> UpdateClusterActiveNode: node1 is not part of a cluster
15:13:46.347 [8323148] <2> SetApplicationClusterStatus: Started
15:13:46.381 [8323148] <4> SetApplicationClusterStatus: application cluster <clsuter> is active on <node1>
15:13:46.395 [8323148] <2> SetApplicationClusterStatus: Done
15:18:46.499 [8323148] <2> mm_getnodename: (0) hostname node1 (from cached_hostname)
15:18:46.827 [8323148] <2> retrieveLocalPatchVersion: Reading from /usr/openv/netbackup/version
15:18:46.827 [8323148] <2> parsePatchVersionString: parsing = >7.5.0.3
<
15:18:46.827 [8323148] <2> parsePatchVersionString: theRest = ><
15:18:46.827 [8323148] <4> UpdateClusterActiveNode: node1 is not part of a cluster
15:18:46.827 [8323148] <2> SetApplicationClusterStatus: Started
15:18:46.974 [8323148] <4> SetApplicationClusterStatus: application cluster <cluster> is NOT active on <node1>
15:18:46.983 [8323148] <2> SetApplicationClusterStatus: Done
15:23:47.180 [8323148] <2> mm_getnodename: (0) hostname node1 (from cached_hostname)
15:23:47.208 [8323148] <2> retrieveLocalPatchVersion: Reading from /usr/openv/netbackup/version
15:23:47.212 [8323148] <2> parsePatchVersionString: parsing = >7.5.0.3
<snip>
 
node2, now active, noticed the switch at 15:45:
 
15:40:45.941 [8716342] <2> parsePatchVersionString: theRest = ><
15:40:45.941 [8716342] <4> UpdateClusterActiveNode: node2 is not part of a cluster
15:40:45.941 [8716342] <2> SetApplicationClusterStatus: Started
15:40:45.963 [8716342] <4> SetApplicationClusterStatus: application cluster <cluster> is NOT active on <node2>
15:40:45.974 [8716342] <2> SetApplicationClusterStatus: Done
15:45:46.068 [8716342] <2> mm_getnodename: (0) hostname node2 (from cached_hostname)
15:45:46.068 [8716342] <2> retrieveLocalPatchVersion: Reading from /usr/openv/netbackup/version
15:45:46.068 [8716342] <2> parsePatchVersionString: parsing = >7.5.0.3
<
15:45:46.068 [8716342] <2> parsePatchVersionString: theRest = ><
15:45:46.068 [8716342] <4> UpdateClusterActiveNode: node2 is not part of a cluster
15:45:46.068 [8716342] <2> SetApplicationClusterStatus: Started
15:45:46.149 [8716342] <4> SetApplicationClusterStatus: application cluster <cluster> is active on <node2>
15:45:46.202 [8716342] <2> SetApplicationClusterStatus: Done
15:50:46.292 [8716342] <2> mm_getnodename: (0) hostname node2 (from cached_hostname)
15:50:46.292 [8716342] <2> retrieveLocalPatchVersion: Reading from /usr/openv/netbackup/version
15:50:46.292 [8716342] <2> parsePatchVersionString: parsing = >7.5.0.3
 
So it took NetBackup 40 minutes to recognise the switch. That seems a little too long.
Or is that normal behavior? What criteria does NetBackup use to determine whether the cluster is active on a node?
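
To see exactly when each node registered the change, the status lines can be filtered out of the daemon log. This is only a minimal sketch, assuming the log lines look like the excerpts above; the function name and the log file passed to it are placeholders, not part of NetBackup:

```shell
# Hypothetical helper: print the timestamp and state each time the
# application cluster status reported in the log changes.
# $1 = path to a daemon debug log file (the file name is site-specific).
cluster_status_changes() {
    awk '/SetApplicationClusterStatus: application cluster/ {
        # Lines containing "is NOT active" count as inactive, all others as active.
        state = ($0 ~ /is NOT active/) ? "inactive" : "active"
        # Only report the first line of each new state.
        if (state != prev) print $1, state
        prev = state
    }' "$1"
}
```

Running it against the log on each node and comparing the first "inactive" on the old node and the first "active" on the new node with the time of the switch gives the delay per node.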
 
In addition to that, I get an error message in these logs right after the switch:
 
<8> file_to_cache_item: [vnet_addrinfo.c:6555] fopen() failed ERRNO=112 FILE=/usr/openv/var/host_cache/073/73053c73+0,1,41,0,1,0+10.1.30.190.txt
<8> file_to_cache_item: [vnet_addrinfo.c:6555] fopen() failed ERRNO=112 FILE=/usr/openv/var/host_cache/0a8/747370a8+0,1,41,0,1,0+10.10.30.190.txt
 
Is this something to worry about? The IP address 10.1.30.190 is the cluster IP, which switched with the cluster app.
 
Maybe there is a way to include the NetBackup UpdateClusterActiveNode commands in the scripts that run during the switch, to speed things up?
 
Thanks in advance
 
Volker
 
 
 
 

 

2 Replies

V4
Level 6
Partner Accredited

Do you have the media server clustered?

Were they upgraded from 6.x?

As of 7.x, NBU stopped supporting clustered media server configurations. Only master server clusters are supported.

Was it working earlier, or is it an intermittent issue you are facing?

Marianne
Level 6
Partner    VIP    Accredited Certified

It seems you have configured an application cluster?

Try refreshing the host cache on the master and media servers after the cluster has switched to the other node:

bpclntcmd -clear_host_cache
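
If you want this to happen automatically, the command could be wrapped in a small function and called from whatever script runs after the switch. This is a rough sketch only: the function name and the overridable BPCLNTCMD variable are made up for illustration, the path is the usual install location of bpclntcmd, and how you hook it into HACMP depends on your setup.

```shell
# Assumed default location of bpclntcmd; override BPCLNTCMD if it differs.
BPCLNTCMD="${BPCLNTCMD:-/usr/openv/netbackup/bin/bpclntcmd}"

# Hypothetical helper: clear the NetBackup host cache on the local host.
# Returns non-zero if bpclntcmd is missing or the command itself fails.
clear_nbu_host_cache() {
    if [ ! -x "$BPCLNTCMD" ]; then
        echo "bpclntcmd not found at $BPCLNTCMD, skipping" >&2
        return 1
    fi
    "$BPCLNTCMD" -clear_host_cache
}
```

Calling clear_nbu_host_cache only affects the host it runs on, so the master server would still need its cache cleared separately.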