SQL cluster backup error: no storage units available for use(213)
Hi,
I'm having trouble with backups on a new SQL cluster. Here are the details of my setup:
Two physical nodes configured as media servers: swt003 and swt004
These are straight installs of NBU 7.5.0.6; on each node the browse client and client name are both set to the server name.
We use Data Domain via the DDBoost plugin, with the storage server registered under its server name, e.g.:
nbdevconfig -creatests -stype DataDomain -storage_server basdd002.dcsit.net -media_server swt003
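For completeness, the DDBoost credentials were added for each media server beforehand with tpconfig; something like this, with a placeholder user name rather than our real one:
tpconfig -add -storage_server basdd002.dcsit.net -stype DataDomain -sts_user_id ddboost -password *****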
The clustering itself:
Virtual server name: SWD005
SQL cluster name: DISQLPROD
The actual SQL instance name is also DISQLPROD, so users connecting via SQL from a remote machine use DISQLPROD\DISQLPROD.
Both media servers and the master have hosts entries for the physical nodes, the virtual cluster SWD005 and the SQL cluster DISQLPROD. I've double-checked and they are all correct and consistent.
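For illustration, every hosts file carries the same four entries (the IPs here are placeholders):
10.0.0.11  swt003
10.0.0.12  swt004
10.0.0.20  swd005
10.0.0.21  disqlprod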
When I log in to the virtual cluster swd005 and run the NetBackup SQL Client (I am logged on with a domain account that is a sysadmin on the DISQLPROD instance, and the SQL account I use, "netbackup", is also a sysadmin), I cannot connect to the SWD005 default instance. I can only connect to the DISQLPROD\DISQLPROD instance. So I then use that to create a "Backup All" script.
Given that I can only access the cluster as DISQLPROD, I created a "DISQLPROD" app_cluster object and added the hostnames of the two physical nodes as members, as shown below.
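The commands were along these lines (the exact -netbackupversion level is from memory, so treat this as a sketch):
nbemmcmd -addhost -machinename disqlprod -machinetype app_cluster
nbemmcmd -updatehost -add_server_to_app_cluster -machinename swt003 -machinetype media -clustername disqlprod -netbackupversion 7.5
nbemmcmd -updatehost -add_server_to_app_cluster -machinename swt004 -machinetype media -clustername disqlprod -netbackupversion 7.5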
I then created a new storage unit called NLDD002_DFS_DISQLPROD in our usual way for Data Domain and added DISQLPROD as the media server.
The SQL backup job has DISQLPROD as the client, runs the script from C:\Program Files\Veritas etc. and writes directly to the storage unit created above, with DISQLPROD (the app_cluster object) as the media server.
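The batch file the SQL Client generates is essentially this (I've trimmed the extra options it adds):
OPERATION BACKUP
DATABASE $ALL
SQLHOST "DISQLPROD"
SQLINSTANCE "DISQLPROD"
NBSERVER "swm009.corpadds.net"
ENDOPER TRUE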
But the backup fails immediately:
25/05/2016 17:51:40 - Info nbjm(pid=4620) starting backup job (jobid=3419675) for client disqlprod, policy DFS_DBSQL_DISQLPROD_TEST, schedule Daily_Full
25/05/2016 17:51:40 - Info nbjm(pid=4620) requesting MEDIA_SERVER_WITH_ATTRIBUTES resources from RB for backup job (jobid=3419675, request id:{BCEC00FA-DE57-4140-83D3-A19EB73C52DE})
25/05/2016 17:51:40 - requesting resource NLDD002_DFS_DISQLPROD
25/05/2016 17:51:40 - requesting resource swm009.corpadds.net.NBU_CLIENT.MAXJOBS.disqlprod
25/05/2016 17:51:40 - requesting resource swm009.corpadds.net.NBU_POLICY.MAXJOBS.DFS_DBSQL_DISQLPROD_TEST
25/05/2016 17:51:41 - Error nbjm(pid=4620) NBU status: 213, EMM status: Storage units are not available
no storage units available for use(213)
What's going on here? Any suggestions? I thought from the documentation that I was supposed to be able to address the database using the virtual cluster name swd005 and the default instance. I'm using the SQL cluster name instead, but that works fine in itself: you can RDP to the virtual cluster server using DISQLPROD as the server name, just as you can map a UNC path, e.g. \\disqlprod\c$.
As the app_cluster name is set to "disqlprod", and that's what's associated with the storage unit, I assume something is missing that chains the app_cluster to the actual physical nodes. But the correct hosts are members of the app_cluster group:
E:\Program Files\Veritas\NetBackup\bin\admincmd>nbemmcmd -listhosts -list_servers_in_app_cluster -clustername disqlprod
NBEMMCMD, Version: 7.5.0.6
The following hosts were found:
media swt003
media swt004
Command completed successfully.
Any help gratefully received :-)
Ooh that worked!
On the storage unit I didn't select "Use any available media server"; instead I explicitly selected both physical nodes. The backup succeeded and used swt004, which is currently the active cluster node:
26/05/2016 12:11:28 - Info nbjm(pid=4620) starting backup job (jobid=3426184) for client swd005, policy DFS_FS_SWD005_TEST, schedule Daily_Full
26/05/2016 12:11:28 - estimated 0 Kbytes needed
26/05/2016 12:11:28 - Info nbjm(pid=4620) started backup (backupid=swd005_1464261088) job for client swd005, policy DFS_FS_SWD005_TEST, schedule Daily_Full on storage unit NLDD002_DFS_SWD005
26/05/2016 12:11:30 - started process bpbrm (12508)
26/05/2016 12:11:31 - Info bpbrm(pid=12508) swd005 is the host to backup data from
26/05/2016 12:11:31 - Info bpbrm(pid=12508) reading file list from client
26/05/2016 12:11:31 - Info bpbrm(pid=12508) starting bpbkar32 on client
26/05/2016 12:11:31 - connecting
26/05/2016 12:11:31 - connected; connect time: 00:00:00
26/05/2016 12:11:34 - Info bpbkar32(pid=2312) Backup started
26/05/2016 12:11:34 - Info bptm(pid=7300) start
26/05/2016 12:11:43 - Info bptm(pid=7300) using 1048576 data buffer size
26/05/2016 12:11:43 - Info bptm(pid=7300) setting receive network buffer to 4195328 bytes
26/05/2016 12:11:43 - Info bptm(pid=7300) using 512 data buffers
26/05/2016 12:11:45 - Info bpbkar32(pid=2312) change journal NOT enabled for <E:\DaimlerBI Badfiles>
26/05/2016 12:11:46 - Info bptm(pid=7300) start backup
26/05/2016 12:11:47 - begin writing
26/05/2016 12:12:09 - Info bptm(pid=7300) waited for full buffer 232 times, delayed 1383 times
26/05/2016 12:12:09 - Info bpbkar32(pid=2312) bpbkar waited 0 times for empty buffer, delayed 0 times.
26/05/2016 12:12:10 - Info bptm(pid=7300) EXITING with status 0 <----------
26/05/2016 12:12:10 - Info bpbrm(pid=12508) validating image for client swd005
26/05/2016 12:12:11 - end writing; write time: 00:00:24
26/05/2016 12:12:12 - Info bpbkar32(pid=2312) done. status: 0: the requested operation was successfully completed
the requested operation was successfully completed(0)
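As an aside, you can confirm from the master which media servers a storage unit is restricted to; bpstulist is standard, e.g.:
bpstulist -label NLDD002_DFS_DISQLPROD -U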
I guess I need to fail the cluster over and confirm that when swt003 is the active node, it is also automatically selected to perform the backup.
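When I run that test I'll kick the job off manually from the master; something like this should do it:
bpbackup -i -p DFS_DBSQL_DISQLPROD_TEST -s Daily_Full -h disqlprod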