NetBackup Disk Backup Failover Issue with Error 83
Hi all,
We are experiencing an unusual issue that I've spent a few days battling to no avail.
We currently run NetBackup ES 7.1 on Solaris, which talks to multiple Linux NetBackup 7.6.0.3 media servers. Each media server uses external HDDs that are swapped monthly on a four-month cycle.
For this we created four storage units, one for each disk, plus a storage unit group containing them, with the group set to failover.
On 06 Jan 2023, backups across our entire estate of media servers (41 in total) started to get stuck, showing "Reason: Disk media server is not active, Media server: <MEDIA SERVER NAME>".
I've removed the media server name for security.
I'm new to this disk backup server configuration, so I worked through some of the forums here, where I found the following command and applied it:
/usr/openv/netbackup/bin/admincmd/nbemmcmd -updatehost -machinename <MEDIA SERVER> -machinetype media -machinestateop set_disk_active -masterserver <MASTER SERVER>
This initially appeared to solve the issue; however, I'm now left with a problem with the failover itself.
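For anyone needing to apply the command above across a whole estate, here is a minimal sketch of a loop around it. The function names, the host list file, and the DRY_RUN variable are hypothetical names introduced for illustration; only the nbemmcmd invocation itself comes from the command quoted above. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: wraps the nbemmcmd command quoted in this post.
# set_disk_active, activate_all, and DRY_RUN are illustrative names, not NetBackup features.

NBEMMCMD=/usr/openv/netbackup/bin/admincmd/nbemmcmd

set_disk_active() {
    # $1 = media server name, $2 = master server name
    cmd="$NBEMMCMD -updatehost -machinename $1 -machinetype media -machinestateop set_disk_active -masterserver $2"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run (the default): print the command instead of running it
        echo "$cmd"
    else
        # Redirect stdin so the command cannot consume the host list being read
        $cmd < /dev/null
    fi
}

activate_all() {
    # $1 = file with one media server name per line, $2 = master server name
    while IFS= read -r host; do
        # Skip blank lines and lines starting with '#'
        case "$host" in
            ''|'#'*) continue ;;
        esac
        set_disk_active "$host" "$2"
    done < "$1"
}
```

After sourcing the file, `activate_all media_servers.txt <MASTER SERVER>` prints one command per media server; setting `DRY_RUN=0` beforehand would actually run them.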
If I put the disk currently connected to the media server first in the list in the storage unit group, it finds the disk and backs up as normal. If I move that disk to any position other than first, the backup fails with error 83, and it looks like the failover is not working.
When I check the disk status using nbdevquery, it shows that all disks are UP. If I manually set all but the currently active disk to DOWN, backups again work fine regardless of the disk's position in the storage unit group.
From my reading, it appears that all but the currently active disk should be in a DOWN state in order for MDS to locate the correct disk to write to; however, that status is not changing when we perform a disk change. This has worked fine for many years and has stopped all of a sudden.
Our current workaround involves setting all disks to UP and moving the currently connected disk to position 1 in the storage unit group.
Any help with the automatic status update and failover issue would be greatly appreciated.
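For reference, a rough sketch of how the manual state changes described above could be scripted with nbdevconfig -changestate. The storage server type, disk pool name, and disk volume name are placeholders that depend on how the disks are configured, and the function name and DRY_RUN variable are hypothetical; please check the nbdevconfig reference for your release before running anything like this.

```shell
#!/bin/sh
# Sketch only: change_state and DRY_RUN are illustrative names.
# The -stype, -dp, and -dv values are placeholders to be replaced with real ones.

NBDEVCONFIG=/usr/openv/netbackup/bin/admincmd/nbdevconfig

change_state() {
    # $1 = storage server type, $2 = disk pool, $3 = disk volume, $4 = UP or DOWN
    case "$4" in
        UP|DOWN) ;;
        *) echo "state must be UP or DOWN" >&2; return 1 ;;
    esac
    cmd="$NBDEVCONFIG -changestate -stype $1 -dp $2 -dv $3 -state $4"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run (the default): print the command instead of running it
        echo "$cmd"
    else
        $cmd
    fi
}
```

With the default dry run, `change_state <STYPE> <DISK POOL> <DISK VOLUME> DOWN` only prints the command it would run, which makes it easy to review the state change for each of the four disks before a swap.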