Hey everyone, been a while since I've had to post here ;)
I have a very strange issue which is blocking our ability to restore some data.
We have an old NBU 18.104.22.168 system which has been in operation for many years.
The environment consists of a VM master server, a few physical media servers, and a number of NetApp filers.
All systems aside from the master are plugged into a Fibre Channel SAN, and NDMP backups run from the filers directly to the tape drives. The media servers run standard backups of miscellaneous other systems.
I have NDMP backups of these filers from a few months ago (March). When we kick off a restore from one of them, the job immediately reports 'Tape media server is not active'.
Here is a snip from the job Detailed Status in the NBU Java GUI:
06/03/2021 13:42:51 - begin Restore
06/03/2021 13:42:52 - number of images required: 1
06/03/2021 13:42:52 - media needed: E00128
06/03/2021 13:42:52 - media needed: E00108
06/03/2021 13:42:52 - media needed: E00113
06/03/2021 13:42:52 - media needed: E00120
06/03/2021 13:42:52 - restoring from image pd-dd-07_1615860482
06/03/2021 13:42:52 - Info bpbrm (pid=7975) pd-dd-02 is the host to restore to
06/03/2021 13:42:52 - Info bpbrm (pid=7975) telling media manager to start restore on client
06/03/2021 13:42:52 - Info bpbrm (pid=7975) spawning a brm child process
06/03/2021 13:42:52 - Info bpbrm (pid=7975) child pid: 7980
06/03/2021 13:42:52 - Info bpbrm (pid=7980) start tar on client
06/03/2021 13:42:52 - Info tar (pid=7984) Restore started.
06/03/2021 13:42:53 - requesting resource E00120
06/03/2021 13:42:53 - awaiting resource E00120. Waiting for resources.
Reason: Tape media server is not active, Media server: pd-nbumaster,
Robot Type(Number): TLD(2), Media ID: N/A, Drive Name: N/A,
Volume Pool: N/A, Storage Unit: N/A, Drive Scan Host: N/A,
Disk Pool: N/A, Disk Volume: N/A
This restore hangs forever and never gets beyond this point.
I'm not sure why it fails this way. The media server it shows, pd-nbumaster, is in fact the VM master server, which has no drives or robots physically attached to it.
What is even more confusing is that restores from more recent backups, say from the last month, work fine. Those restores all appear to use one of the physical media servers (not the filers) as the media server. In the job below, 'pd-ms2' is one of the physical media servers:
06/03/2021 14:06:13 - begin Restore
06/03/2021 14:06:13 - number of images required: 1
06/03/2021 14:06:13 - media needed: E00561
06/03/2021 14:06:13 - media needed: E00574
06/03/2021 14:06:13 - media needed: E00566
06/03/2021 14:06:14 - restoring from image pd-dd-07_1622280123
06/03/2021 14:06:14 - Info bpbrm (pid=4466) pd-dd-02 is the host to restore to
06/03/2021 14:06:14 - Info bpbrm (pid=4466) telling media manager to start restore on client
06/03/2021 14:06:14 - Info bpbrm (pid=4466) spawning a brm child process
06/03/2021 14:06:14 - Info bpbrm (pid=4466) child pid: 4469
06/03/2021 14:06:15 - Info bpbrm (pid=4469) start tar on client
06/03/2021 14:06:15 - Info tar (pid=4472) Restore started.
06/03/2021 14:06:15 - Info bptm (pid=4468) Waiting for mount of media id E00574 (copy 1) on server pd-ms2.
06/03/2021 14:06:15 - started process bptm (pid=4468)
06/03/2021 14:06:15 - mounting E00574
06/03/2021 14:06:15 - Info bptm (pid=4468) INF - Waiting for mount of media id E00574 on server pd-ms2 for reading.
06/03/2021 14:06:18 - Info ndmpagent (pid=4472) pd-dd-02: Session identifier: 14867
06/03/2021 14:06:15 - requesting resource E00574
06/03/2021 14:06:15 - granted resource E00574
06/03/2021 14:06:15 - granted resource IBM.ULTRIUM-TD8.005
This restore goes on to succeed.
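To compare the two jobs side by side, a small shell sketch like the one below can pull out the media server each job names, given a copy of the Detailed Status saved as text. The function name is hypothetical, and the patterns only cover the two message shapes seen in the logs above ("Media server: NAME," and "on server NAME"), so treat it as a rough aid rather than a complete parser.

```shell
#!/bin/sh
# Hypothetical helper: list the media servers named in a saved job Detailed Status.
# Reads the status text on stdin and prints one server name per line, de-duplicated.
# Only matches the two message shapes seen in this thread.
media_servers_in_status() {
    sed -n \
        -e 's/.*Media server: \([^,]*\),.*/\1/p' \
        -e 's/.* on server \([^ ]*\).*/\1/p' |
        sed 's/\.$//' |   # drop a sentence-ending period after the host name
        sort -u
}
```

For example, `media_servers_in_status < failing_job.txt` (the file name is a placeholder) would show which server the stuck restore is waiting on.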
I have tried a few things to force the media server as described here:
I've also rebooted all systems involved at least once.
However, nothing has changed. Any time I try an NDMP restore from these older backups, it wants to use the master server as the media server. Restores from newer backups run fine, using the physical media servers.
It looks like media / image ownership has changed to a server which probably does not have tape drives installed.
You can change ownership of the tapes to the appropriate server with the following commands:
bpimage -newserver SERVER_WITH_TAPEDRIVE -oldserver OLD_SERVER -id MEDIA_ID
bpmedia -movedb -m MEDIA_ID -newserver SERVER_WITH_TAPEDRIVE -oldserver OLD_SERVER
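Since the failing restore needs four tapes, the two commands have to be repeated per media ID. Below is a minimal dry-run sketch that only prints the commands so they can be reviewed first. It assumes pd-ms2 is the server with tape drive access and pd-nbumaster is the currently recorded owner, which is what the job details in this thread suggest but has not been confirmed. The function name is made up for illustration.

```shell
#!/bin/sh
# Dry run: print the ownership-change commands for each tape.
# Nothing is executed here; review the output before running it on the master.
# Assumption: arguments are NEW_SERVER, OLD_SERVER, then one or more media IDs.
print_move_cmds() {
    new_server=$1
    old_server=$2
    shift 2
    for media_id in "$@"; do
        printf 'bpimage -newserver %s -oldserver %s -id %s\n' \
            "$new_server" "$old_server" "$media_id"
        printf 'bpmedia -movedb -m %s -newserver %s -oldserver %s\n' \
            "$media_id" "$new_server" "$old_server"
    done
}

# Media IDs taken from the "media needed" lines of the failing restore.
print_move_cmds pd-ms2 pd-nbumaster E00128 E00108 E00113 E00120
```

Printing rather than executing keeps the change reviewable, since moving ownership to the wrong server would just relocate the problem.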