NetBackup 7.5 Media Server Cluster in an AIX HACMP environment

Volker_Spies · ‎11-08-2012

Hello all,

I have a few questions about a media server cluster we configured recently.

When we move the application (DB2/SAP in an HACMP cluster) from one node to the other, it takes NetBackup up to 40 minutes to register that the application has moved and to update the media server.

I can see this in the logs under /usr/openv/volmgr/debug/daemon.

The application switch was at 15:05.

node1, now inactive, noticed the switch at 15:18:

<snip>

15:13:46.321 [8323148] <2> parsePatchVersionString: theRest = ><

15:13:46.327 [8323148] <4> UpdateClusterActiveNode: node1 is not part of a cluster

15:13:46.347 [8323148] <2> SetApplicationClusterStatus: Started

15:13:46.381 [8323148] <4> SetApplicationClusterStatus: application cluster <cluster> is active on <node1>

15:13:46.395 [8323148] <2> SetApplicationClusterStatus: Done

15:18:46.499 [8323148] <2> mm_getnodename: (0) hostname node1 (from cached_hostname)

15:18:46.827 [8323148] <2> retrieveLocalPatchVersion: Reading from /usr/openv/netbackup/version

15:18:46.827 [8323148] <2> parsePatchVersionString: parsing = >7.5.0.3

<

15:18:46.827 [8323148] <2> parsePatchVersionString: theRest = ><

15:18:46.827 [8323148] <4> UpdateClusterActiveNode: node1 is not part of a cluster

15:18:46.827 [8323148] <2> SetApplicationClusterStatus: Started

15:18:46.974 [8323148] <4> SetApplicationClusterStatus: application cluster <cluster> is NOT active on <node1>

15:18:46.983 [8323148] <2> SetApplicationClusterStatus: Done

15:23:47.180 [8323148] <2> mm_getnodename: (0) hostname node1 (from cached_hostname)

15:23:47.208 [8323148] <2> retrieveLocalPatchVersion: Reading from /usr/openv/netbackup/version

15:23:47.212 [8323148] <2> parsePatchVersionString: parsing = >7.5.0.3

<snip>

node2, now active, noticed the switch at 15:45:

15:40:45.941 [8716342] <2> parsePatchVersionString: theRest = ><

15:40:45.941 [8716342] <4> UpdateClusterActiveNode: node2 is not part of a cluster

15:40:45.941 [8716342] <2> SetApplicationClusterStatus: Started

15:40:45.963 [8716342] <4> SetApplicationClusterStatus: application cluster <cluster> is NOT active on <node2>

15:40:45.974 [8716342] <2> SetApplicationClusterStatus: Done

15:45:46.068 [8716342] <2> mm_getnodename: (0) hostname node2 (from cached_hostname)

15:45:46.068 [8716342] <2> retrieveLocalPatchVersion: Reading from /usr/openv/netbackup/version

15:45:46.068 [8716342] <2> parsePatchVersionString: parsing = >7.5.0.3

<

15:45:46.068 [8716342] <2> parsePatchVersionString: theRest = ><

15:45:46.068 [8716342] <4> UpdateClusterActiveNode: node2 is not part of a cluster

15:45:46.068 [8716342] <2> SetApplicationClusterStatus: Started

15:45:46.149 [8716342] <4> SetApplicationClusterStatus: application cluster <cluster> is active on <node2>

15:45:46.202 [8716342] <2> SetApplicationClusterStatus: Done

15:50:46.292 [8716342] <2> mm_getnodename: (0) hostname node2 (from cached_hostname)

15:50:46.292 [8716342] <2> retrieveLocalPatchVersion: Reading from /usr/openv/netbackup/version

15:50:46.292 [8716342] <2> parsePatchVersionString: parsing = >7.5.0.3

So it took NetBackup 40 minutes to recognise the switch. That seems a little too long.

Or is that normal behavior? What criteria does NetBackup use to determine whether the cluster is active on a node?
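
To make the delay easier to measure, the log excerpts above can be scanned for the point where each node's reported state flips. The following is only a minimal Python sketch, assuming the line layout shown in the excerpts; the helper names status_changes and minutes_between are made up for illustration, and the sample lines are copied from the excerpts (with the cluster placeholder written as <cluster>). In the excerpts the checks appear roughly five minutes apart.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the excerpts in this post, e.g.:
# 15:18:46.974 [8323148] <4> SetApplicationClusterStatus: application cluster <cluster> is NOT active on <node1>
STATUS_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2})\.\d+ \[\d+\] <\d+> SetApplicationClusterStatus: "
    r"application cluster <([^>]*)> is (NOT active|active) on <([^>]*)>"
)


def status_changes(lines):
    """Yield (time, node, active) whenever a node's reported state flips."""
    last = {}  # node name -> last reported state
    for line in lines:
        match = STATUS_RE.match(line.strip())
        if not match:
            continue  # not a status line, skip it
        stamp, _cluster, state, node = match.groups()
        active = state == "active"
        if node in last and last[node] != active:
            yield stamp, node, active
        last[node] = active


def minutes_between(start, end):
    """Minutes from start to end, both HH:MM:SS on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60


# Sample lines copied from the log excerpts above.
sample = [
    "15:13:46.381 [8323148] <4> SetApplicationClusterStatus: application cluster <cluster> is active on <node1>",
    "15:18:46.974 [8323148] <4> SetApplicationClusterStatus: application cluster <cluster> is NOT active on <node1>",
    "15:40:45.963 [8716342] <4> SetApplicationClusterStatus: application cluster <cluster> is NOT active on <node2>",
    "15:45:46.149 [8716342] <4> SetApplicationClusterStatus: application cluster <cluster> is active on <node2>",
]

SWITCH_TIME = "15:05:00"  # time of the application switch, as stated above

changes = list(status_changes(sample))
for stamp, node, active in changes:
    delay = minutes_between(SWITCH_TIME, stamp)
    label = "active" if active else "NOT active"
    print(f"{node} reported {label} at {stamp}, {delay:.1f} min after the switch")
```

To run it against a real log, pass an open file object to status_changes instead of the sample list.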

In addition to that, I get an error message in these logs right after the switch:

<8> file_to_cache_item: [vnet_addrinfo.c:6555] fopen() failed ERRNO=112 FILE=/usr/openv/var/host_cache/073/73053c73+0,1,41,0,1,0+10.1.30.190.txt

<8> file_to_cache_item: [vnet_addrinfo.c:6555] fopen() failed ERRNO=112 FILE=/usr/openv/var/host_cache/0a8/747370a8+0,1,41,0,1,0+10.10.30.190.txt

Is this something to worry about? The IP address 10.1.30.190 is the cluster IP, which switched with the cluster application.

Maybe there is a way to include the NetBackup UpdateClusterActiveNode commands in the scripts that run during the switch, to speed things up?

Thanks in advance

Volker

V4 · ‎01-20-2013

Do you have the media server clustered?

Were they upgraded from 6.x?

As of 7.x, NBU stopped supporting clustered media server configurations. Only master server clusters are supported.

Was it working earlier, or is it an intermittent issue you are facing?

Marianne · ‎01-20-2013

It seems you have configured an application cluster?

Try refreshing the host cache on the master and media servers after the cluster has switched to the other node:

bpclntcmd -clear_host_cache
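
If the refresh should happen automatically, the command could be called from whatever script runs after the switch. This is only a rough Python 3.7+ sketch, not a tested HACMP integration: the path /usr/openv/netbackup/bin/bpclntcmd is assumed to be the install location, the function name clear_host_cache is made up for illustration, and whether clearing the cache actually shortens the delay is untested here.

```python
import os
import subprocess

# Assumed install location of bpclntcmd; adjust if NetBackup lives elsewhere.
BPCLNTCMD = "/usr/openv/netbackup/bin/bpclntcmd"


def clear_host_cache(binary=BPCLNTCMD, timeout=60):
    """Run `bpclntcmd -clear_host_cache`; return True only if it exits with 0."""
    if not os.access(binary, os.X_OK):
        # Binary missing or not executable: nothing was run.
        return False
    result = subprocess.run(
        [binary, "-clear_host_cache"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode == 0


if __name__ == "__main__":
    if clear_host_cache():
        print("host cache cleared")
    else:
        print("bpclntcmd was not run or reported an error")
```

The script would need to run on the master and on each media server, since each host keeps its own cache.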
