Media Server showing as Client

PhilPruden
Level 3

NetBackup v8.2

Issue: Virtual media server appliances showing as 'Client' under Host Properties > Media Servers; their disks are seen as Offline (not Active for Disk)

4x virtual media server appliances: appliances 1 and 2 are performing as expected, appliances 3 and 4 are seen as 'Client' with disks offline.

Troubleshooting Steps Taken:

1. bpdown, bpup on both master and media servers

2. Confirm media servers are listed in Additional and Media Servers under Master server properties > Servers

3. Check registry settings, identical between all four servers

4. nbemmcmd -listhosts lists all four media servers, hosts tagged as 'media', ndmp entries also exist for all four servers

5. Run command 'nbemmcmd -updatehost -machinename <mediaserver hostname> -machinetype media -machinestateop set_disk_active -masterserver <masterserver hostname>'; nothing happens

6. Advanced Disk and Pure Disk pools exist for all four servers, all same type/size

7. bpgetconfig -g <Working Media Server> reports working media server as 'Media Host'

8. bpgetconfig -g <Broken Media Server> reports broken media server as 'Client of <NetBackup Master>'

If anyone has any suggestions that would be very much appreciated!


Marianne
Moderator

Please show us output of these commands on the master server:

nbemmcmd -listhosts -verbose

nbemmcmd -getemmserver

sdo
Moderator

Have license keys been added to all NetBackup servers, and has NetBackup been restarted on all of them?

Hi, thanks for the responses...an update since my initial issue.

I managed to bring the 'Offline' servers online by going to 'Host Properties > Media Servers', opening the Properties of each media server, and adding the missing media servers to the 'Additional Servers' tab in Server properties.

I also at the same time removed an old media server that had been decommissioned.

The issue now is that after some time those servers have disappeared from the Additional Servers section on each media server...and the media server I deleted has returned!

So almost there...the question now is, how do I make my changes stick?

sdo
Moderator

If you selected multiple media servers at the same time, then it is quite possible that missing media servers appear to be in the list, so the best approach is to update each media server one at a time.

Use the command nbemmcmd -listhosts to check whether the decomm'd media server is still known by the master server. Other posts on this site show how to delete a media server from EMM.

Marianne
Moderator

@PhilPruden 

When you make changes in Host Properties, the Registry on a Windows server, or bp.conf on a Linux/Unix server, is updated.

If there are comms problems between the master and some media servers, then this update will not happen. 
This is why I asked you to run the nbemmcmd commands - this will test and confirm comms with each media server.
It is important to run nbemmcmd -listhosts -verbose.

If you have comms/network problems with some of the media servers, then no amount of Host Properties updates will fix that.

About the decommissioned media server - what other steps were taken to properly decommission it? 
Did you go through nbdecommission process? 

Hi Marianne,

So there's definitely an issue there when the media servers disappear from the 'Server' list on each of the other media servers. Here's a snippet, with redactions, of the output of nbemmcmd -listhosts -verbose:

<Broken Media Server>
ClusterName = ""
MachineName = "<Broken Media Server>"
FQName = "<Broken Media Server>"
LocalDriveSeed = ""
MachineDescription = ""
MachineFlags = 0x17
MachineNbuType = media (1)
MachineState = not reachable by master (4)
MasterServerName = "<Master Server>"
NetBackupVersion = 8.2.0.0 (820000)
OperatingSystem = windows (11)
ScanAbility = 5


<Working Media Server>
ClusterName = ""
MachineName = "<Working Media Server>"
FQName = "<Working Media Server>"
LocalDriveSeed = ""
MachineDescription = ""
MachineFlags = 0x17
MachineNbuType = media (1)
MachineState = active for disk jobs (12)
MasterServerName = "<Master Server>"
NetBackupVersion = 8.2.0.0 (820000)
OperatingSystem = windows (11)
ScanAbility = 5

This situation is reversed shortly after the 'Broken' media servers are re-added to the 'Server' list on all of the current media servers.

My questions then are:

1. If connectivity is lost from the NBU Master to a media server, why is that media server then removed from the 'Server' list of the other media servers? Is that normal behaviour?

2. At what point in the proceedings, after NBU decides that it cannot talk to the Broken media servers, does it think to remove them and then re-add the decommissioned media server to the 'Server' list on other media servers? That does not make any logical sense. (NB: an nbdecommission command against the decommissioned media server returns 'Found no valid machine types to decommission for <Decommissioned Server>'.)

To my mind those media server values are stored somewhere else, and at some interval the values are 'refreshed' from that store, i.e. overwritten. That would explain why connectivity is lost: the media servers have been forcibly removed. It fits with what I see, because within seconds of adding them back, those media servers come back online.

Marianne
Moderator

If this was my environment, I would spend all my time and energy on troubleshooting connectivity issues with the 'broken media servers'.

MachineState = not reachable by master (4)

is a network connectivity issue.
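To pick out every host in this state from a long verbose listing, here is a minimal Python sketch. It assumes the output has the layout posted earlier in this thread (a host name on its own line, followed by `Key = Value` lines). The function names and the sample host names `media1` and `media3` are made up for illustration. Any banner or status lines in real output would be treated as extra host headers with no fields, so they do not affect the filter.

```python
def parse_listhosts(text):
    """Group 'Key = Value' lines under the host name line that precedes them."""
    hosts = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(" = ")
        if sep:
            if current is not None:
                # Drop the surrounding double quotes used for string values.
                current[key.strip()] = value.strip().strip('"')
        else:
            # Assumption: a line without " = " starts a new host section.
            current = hosts.setdefault(line, {})
    return hosts


def hosts_in_state(hosts, prefix):
    """Return sorted host names whose MachineState starts with `prefix`."""
    return sorted(
        name for name, fields in hosts.items()
        if fields.get("MachineState", "").startswith(prefix)
    )


# Hypothetical sample in the same shape as the redacted output in this thread.
SAMPLE = """\
media3
ClusterName = ""
MachineName = "media3"
MachineNbuType = media (1)
MachineState = not reachable by master (4)

media1
ClusterName = ""
MachineName = "media1"
MachineNbuType = media (1)
MachineState = active for disk jobs (12)
"""

if __name__ == "__main__":
    print(hosts_in_state(parse_listhosts(SAMPLE), "not reachable"))
```

Feed it the saved output of nbemmcmd -listhosts -verbose from the master; every name it returns is a host worth testing for connectivity first.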

The problem with your decommissioned media server is that some elements have been removed, not all. 
That is why nbdecommission should be used as 1st (and only) step to remove a media server. 
This command does everything listed under 'Decommission actions':
https://www.veritas.com/content/support/en_US/doc/18716246-129889741-0/v95674727-129889741

About decommissioning a media server 
https://www.veritas.com/content/support/en_US/doc/18716246-129889741-0/v95673631-129889741

 

Phil,

Did you ever get to the bottom of this one?

I have exactly the same problem in a freshly built environment with a brand new Appliance that has just been added to a new Domain with a newly built Master server.

The nbemmcmd -listhosts -verbose gives this output for the new Appliance:

E:\>bptestbpcd -host <new appliance> -verbose
1 1 1
127.0.0.1:64330 -> 127.0.0.1:56694 PROXY 10.69.64.63:56692 -> 10.68.38.88:1556
127.0.0.1:64330 -> 127.0.0.1:56697 PROXY 10.69.64.63:56695 -> 10.68.38.88:1556
LOCAL_CERT_ISSUER_NAME = O=vx,OU=root@<new master server>,CN=broker
LOCAL_CERT_SUBJECT_COMMON_NAME = 31642869-f356-45b7-b5bc-db0c22737af2
PEER_CERT_ISSUER_NAME = O=vx,OU=root@<new master server>,CN=broker
PEER_CERT_SUBJECT_COMMON_NAME = 7cc4489d-6958-4b42-83d7-3b5ba58008fa
PEER_NAME = <client server name from another Domain>
HOST_NAME = <new appliance server name>
CLIENT_NAME = <new appliance server name>
VERSION = 0x08200000
PLATFORM = linuxR_x86_2.6.32
PATCH_VERSION = 8.2.0.0
SERVER_PATCH_VERSION = 8.2.0.0
MASTER_SERVER = <new Master server name>
EMM_SERVER = <new Master server name>
NB_MACHINE_TYPE = CLIENT
SERVICE_TYPE = VNET_DOMAIN_CLIENT_TYPE
PROCESS_HINT = 7cc4489d-6958-4b42-83d7-3b5ba58008fa

 

Not only is the NB_MACHINE_TYPE showing as CLIENT, but the PEER_NAME (which should be the Master server) is also showing a client server name from a separate Domain!

I can't fathom where it has picked this up from or how to change it in the EMM database but I imagine that's my connectivity issue!
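For anyone comparing several appliances, the two symptoms above can be checked mechanically. This is only a rough Python sketch: it assumes the `KEY = VALUE` layout shown in the bptestbpcd output above, that the test was run from the master (so PEER_NAME is expected to be the master), and the function names and sample host names (`master01`, `appliance05`, `otherclient`) are hypothetical. A short name versus a fully qualified name would also be reported as a mismatch, so treat a hit as a prompt to look, not as proof.

```python
def parse_key_values(text):
    """Collect 'KEY = VALUE' lines; other lines (e.g. the PROXY lines) are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_bptestbpcd(fields):
    """Return a list of human-readable findings for the two symptoms in this post."""
    findings = []
    if fields.get("NB_MACHINE_TYPE") == "CLIENT":
        findings.append("NB_MACHINE_TYPE is CLIENT")
    peer = fields.get("PEER_NAME", "")
    master = fields.get("MASTER_SERVER", "")
    # Case-insensitive comparison; host names are not case sensitive.
    if peer and master and peer.lower() != master.lower():
        findings.append(
            "PEER_NAME %r differs from MASTER_SERVER %r" % (peer, master)
        )
    return findings


# Hypothetical sample modelled on the output above.
SAMPLE = """\
1 1 1
127.0.0.1:64330 -> 127.0.0.1:56694 PROXY 10.69.64.63:56692 -> 10.68.38.88:1556
LOCAL_CERT_ISSUER_NAME = O=vx,OU=root@master01,CN=broker
PEER_NAME = otherclient
HOST_NAME = appliance05
MASTER_SERVER = master01
NB_MACHINE_TYPE = CLIENT
"""

if __name__ == "__main__":
    for finding in check_bptestbpcd(parse_key_values(SAMPLE)):
        print(finding)
```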

 

Hi

Firstly, you did not provide the nbemmcmd output; can you do this please?

In the meantime, log into the appliance and show us the appliance status: from the main menu, Main -> Appliance -> Status. Then drop to an elevated shell, run the command "nbcertcmd -listCertDetails", note the "Host ID", and compare it to the LOCAL_CERT_SUBJECT_COMMON_NAME returned by the bptestbpcd command. If they don't match, then there is something wrong with your name resolution. Also, I note that the PEER_NAME returned by the appliance is not the master server name; do the new master server and the PEER_NAME returned have the same IP address?

If they do match, then something else is amiss and the nbemmcmd output may help.
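If it helps, both comparisons can be scripted. This is a small Python sketch under stated assumptions: the two IDs are copied by hand from the command outputs (nothing here parses nbcertcmd output), the helper names are my own, and the address check relies on whatever resolver the machine running the script uses, so run it on the host whose name resolution you are testing.

```python
import socket


def same_host_id(id_a, id_b):
    """Compare two host ID strings, ignoring case and surrounding whitespace."""
    return id_a.strip().lower() == id_b.strip().lower()


def resolved_addresses(name, resolver=socket.getaddrinfo):
    """Return the set of IP addresses that `name` resolves to.

    `resolver` defaults to socket.getaddrinfo and can be swapped for testing.
    """
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr),
    # and sockaddr[0] is the address string for both IPv4 and IPv6.
    return {entry[4][0] for entry in resolver(name, None)}


def share_an_address(name_a, name_b, resolver=socket.getaddrinfo):
    """True if the two names resolve to at least one common IP address."""
    return bool(
        resolved_addresses(name_a, resolver) & resolved_addresses(name_b, resolver)
    )
```

For example, `share_an_address("<new master server>", "<PEER_NAME value>")` with the real names filled in answers the IP address question; a `socket.gaierror` from that call means one of the names does not resolve at all.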

Cheers
D.