11-26-2014 12:57 AM
Hello community,
We are currently having an issue with restores; an example log is shown below.
11/25/2014 15:17:20 - begin Restore
11/25/2014 15:17:21 - number of images required: 1
11/25/2014 15:17:22 - media needed: MF071A
11/25/2014 15:17:22 - media needed: MF036A
11/25/2014 15:17:26 - restoring from image gandalf-backup_1194765667
11/25/2014 15:17:26 - Info bpbrm (pid=7049) backup is the host to restore to
11/25/2014 15:17:27 - Info bpbrm (pid=7049) telling media manager to start restore on client
11/25/2014 15:17:27 - Info bpbrm (pid=7049) spawning a brm child process
11/25/2014 15:17:27 - Info bpbrm (pid=7049) child pid: 7057
11/25/2014 15:17:27 - connecting
11/25/2014 15:17:27 - Info bpbrm (pid=7057) start tar on client
11/25/2014 15:17:27 - Info tar (pid=7063) Restore started.
11/25/2014 15:17:27 - connected; connect time: 0:00:00
11/25/2014 15:17:27 - requesting resource MF071A
11/25/2014 15:17:27 - Error nbjm (pid=4965) NBU status: 830, EMM status: No drives are available
11/25/2014 15:17:27 - Error nbjm (pid=4965) NBU status: 830, EMM status: No drives are available
11/25/2014 15:17:28 - Error bptm (pid=7052) NBJM returned an extended error status: No drives are available (2001)
11/25/2014 15:17:33 - Info bpbrm (pid=7049) got ERROR 252 from media manager
11/25/2014 15:17:33 - Info bpbrm (pid=7049) terminating bpbrm child 7057 jobid=387452
11/25/2014 15:17:33 - restored from image gandalf-backup_1194765667; restore time: 0:00:07
11/25/2014 15:17:33 - Warning bprd (pid=7030) Restore must be resumed prior to first image expiration on INFINITY
11/25/2014 15:17:33 - end Restore; elapsed time 0:00:13
Failed to get status code information (2826)
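For anyone triaging a similar job log, the error lines can be filtered out of the noise. This is only a minimal sketch, assuming the job details were saved as plain text in the format shown above; the function name nbu_errors is hypothetical and not part of NetBackup.

```shell
# Hypothetical helper (not a NetBackup command): read a job details log on
# stdin and print only the "Error" lines, dropping exact duplicates while
# keeping the original order.
nbu_errors() {
  grep -E ' - Error ' | awk '!seen[$0]++'
}

# Demo with three lines copied from the log above.
nbu_errors <<'EOF'
11/25/2014 15:17:27 - requesting resource MF071A
11/25/2014 15:17:27 - Error nbjm (pid=4965) NBU status: 830, EMM status: No drives are available
11/25/2014 15:17:27 - Error nbjm (pid=4965) NBU status: 830, EMM status: No drives are available
EOF
```

The duplicate nbjm line is printed once, which makes the distinct failures (status 830 and the extended status 2001 in the full log) easier to spot.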
Additional info:
The issue started after we relocated the server room: everything was unplugged and powered down, then set up the same way again.
Media ID: MF071A, Barcode: MF071A, Density: hcart, Access Mode: Read,
/usr/openv/volmgr/bin/tpconfig -d
Id DriveName Type Residence
Drive Path Status
****************************************************************************
0 TD01 hcart TLD(0) DRIVE=1
/dev/nst0 UP
1 TD02 hcart TLD(0) DRIVE=2
/dev/nst1 UP
Currently defined robotics are:
TLD(0) robotic path = /dev/sg2
EMM Server = backup
NetBackup is running on a linux server and I'm running the Admin console from my Windows 7 Pro PC.
And if it makes any difference, it throws up the errors before tapes are even inserted into the drives, and again once inserted and imported.
Any help would be much appreciated.
Thanks!
12-01-2014 03:12 PM
I have just noticed the following (not the cause of this issue):
CLIENT_NAME = bilbo
CLIENT_NAME = gandalf-backup
CLIENT_NAME = larry
CLIENT_NAME = morris
CLIENT_NAME = haldir
The ONLY CLIENT_NAME in bp.conf on a master must be itself.
Please delete ALL of the above entries and just add one CLIENT_NAME entry:
CLIENT_NAME = backup
Please ensure that all of these log folders exist on the master (which is also the media server here):
bprd, bptm and bpbrm.
Copy the log files in the above folders to .txt files (e.g. bprd.txt) and upload them as file attachments.
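The CLIENT_NAME rule above can be checked with a quick count. This is a rough sketch only: count_client_names is a hypothetical helper, and the demo runs against a scratch file rather than the live bp.conf.

```shell
# Hypothetical helper: count CLIENT_NAME entries in a bp.conf-style file.
# Usage: count_client_names /path/to/bp.conf
count_client_names() {
  # grep -c exits non-zero when the count is 0, so mask that exit status.
  grep -c -E '^[[:space:]]*CLIENT_NAME[[:space:]]*=' "$1" || true
}

# Demo against a scratch copy, not the live file.
tmp="$(mktemp)"
cat > "$tmp" <<'EOF'
SERVER = backup
CLIENT_NAME = bilbo
CLIENT_NAME = gandalf-backup
EMMSERVER = backup
EOF
n="$(count_client_names "$tmp")"
if [ "$n" -ne 1 ]; then
  echo "expected exactly one CLIENT_NAME entry, found $n"
fi
rm -f "$tmp"
```

Pointing the function at the real file is left to the reader; take a backup copy of bp.conf before editing it by hand.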
11-26-2014 01:11 AM
Halogen, stop all jobs from running, then type:
nbrbutil -resetAll
Note that this command will kill any jobs currently running.
This will clear out ALL allocations as the drives may be reserved.
If that fails to work, reset the drives by power cycling them (physically pull the power out of them).
11-26-2014 02:00 AM
Please verify that MF071A is really an HCART media type by running:
nbemmcmd -listmedia -mediaid MF071A
Look for the "Media Type". The error usually shows up when no drives with a compatible density are configured, e.g. the media type is HCART1 but the drives are configured as HCART2.
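The density comparison described above can be sketched as a small filter. This assumes the listmedia output uses "Media Type: VALUE" lines, as it does in this thread; media_type and same_density are hypothetical helper names, and the drive type would be read from tpconfig output by hand.

```shell
# Hypothetical helper: print the value of the first "Media Type:" line
# found on stdin (e.g. saved output of nbemmcmd -listmedia).
media_type() {
  awk -F': *' '!found && /^[[:space:]]*Media Type:/ { print $2; found = 1 }'
}

# Hypothetical helper: succeed when two density names match, ignoring case.
same_density() {
  a="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  b="$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')"
  [ "$a" = "$b" ]
}

# Demo with two lines of listmedia-style output and the drive type "hcart".
mt="$(printf '%s\n' 'Media ID: MF071A' 'Media Type: HCART' | media_type)"
if same_density "$mt" hcart; then
  echo "media type $mt matches drive type hcart"
else
  echo "media type $mt does NOT match drive type hcart"
fi
```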
11-26-2014 04:18 AM
Issue started after we relocated server room, everything was unplugged/powered down then setup the same
After the relocation, did you set up the device configuration again? What exact steps did you take?
Restart the NBU services once if you have just unplugged and re-plugged everything.
Please provide the output of nbemmcmd -listhosts -verbose
11-26-2014 09:33 AM
If you simply relocated and nothing changed, then nothing should need reconfiguring. It sounds like you've misconfigured it physically and/or then tried to fix it (possibly unnecessarily) with some NetBackup reconfiguration. I would check and double-check the connections, and check the hcart logic vs drive logic as Nicolai alluded to. Moving it from A to B and reassembling it as it was cannot make NetBackup behave differently.
Maybe something else has been added to the mix?
Jim
11-26-2014 10:32 AM
Cannot be too much of an emergency... No response in many hours....
11-26-2014 05:16 PM
Hi all,
Thanks for your response, I'm going to try the suggestions and get back to you all.
We are over in Aus so I was just leaving the office when I posted this.
Thanks
Halogen
11-26-2014 05:56 PM
Hi mate
Thanks for your response
We stopped all active jobs in the queue, and the status of every job was "Done".
We then ran nbrbutil -resetAll and assume it completed successfully (there was no output in the terminal; it just returned to the prompt after the command was run).
We then attempted another restore, which produced the following log output:
11/27/2014 09:50:16 - begin Restore
11/27/2014 09:50:17 - number of images required: 1
11/27/2014 09:50:17 - media needed: MF071A
11/27/2014 09:50:17 - media needed: MF036A
11/27/2014 09:50:21 - restoring from image gandalf-backup_1194765667
11/27/2014 09:50:22 - Info bpbrm (pid=21771) backup is the host to restore to
11/27/2014 09:50:22 - Info bpbrm (pid=21771) telling media manager to start restore on client
11/27/2014 09:50:22 - Info bpbrm (pid=21771) spawning a brm child process
11/27/2014 09:50:22 - Info bpbrm (pid=21771) child pid: 21779
11/27/2014 09:50:22 - connecting
11/27/2014 09:50:22 - Info bpbrm (pid=21779) start tar on client
11/27/2014 09:50:22 - Info tar (pid=21786) Restore started.
11/27/2014 09:50:22 - connected; connect time: 0:00:00
11/27/2014 09:50:23 - Error bptm (pid=21774) NBJM returned an extended error status: No drives are available (2001)
11/27/2014 09:50:23 - requesting resource MF071A
11/27/2014 09:50:23 - Error nbjm (pid=4965) NBU status: 830, EMM status: No drives are available
11/27/2014 09:50:23 - Error nbjm (pid=4965) NBU status: 830, EMM status: No drives are available
11/27/2014 09:50:28 - Info bpbrm (pid=21771) got ERROR 252 from media manager
11/27/2014 09:50:28 - Info bpbrm (pid=21771) terminating bpbrm child 21779 jobid=387503
11/27/2014 09:50:28 - restored from image gandalf-backup_1194765667; restore time: 0:00:07
11/27/2014 09:50:28 - Warning bprd (pid=21755) Restore must be resumed prior to first image expiration on INFINITY
11/27/2014 09:50:29 - end Restore; elapsed time 0:00:13
Failed to get status code information (2826)
11-26-2014 06:01 PM
Hiya mate,
We've reset the robot and the NBU server many times; each time everything connects up fine.
Here is the output of the nbemmcmd -listhosts -verbose command:
[root@backup admincmd]# /usr/openv/netbackup/bin/admincmd/nbemmcmd -listhosts -verbose
NBEMMCMD, Version:7.1.0.4
The following hosts were found:
backup
MachineName = "backup"
FQName = "backup.mf.com.au"
MachineDescription = ""
MachineNbuType = server (6)
backup
ClusterName = ""
MachineName = "backup"
FQName = "backup.mf.com.au"
GlobalDriveSeed = "VEND:#.:PROD:#.:IDX"
LocalDriveSeed = ""
MachineDescription = ""
MachineFlags = 0x64
MachineNbuType = master (3)
MachineState = active for tape and disk jobs (14)
NetBackupVersion = 7.1.0.4 (710400)
OperatingSystem = linux (16)
ScanAbility = 5
groucho
MachineName = "groucho"
FQName = "groucho.mf.com.au"
MachineDescription = ""
MachineNbuType = virtual_machine (10)
chico
MachineName = "chico"
FQName = "chico.mf.com.au"
MachineDescription = ""
MachineNbuType = virtual_machine (10)
harpo
MachineName = "harpo"
FQName = "harpo.mf.com.au"
MachineDescription = ""
MachineNbuType = virtual_machine (10)
samwise.mf.com.au
MachineName = "samwise.mf.com.au"
FQName = "samwise.mf.com.au"
MachineDescription = ""
MachineNbuType = virtual_machine (10)
Command completed successfully.
[root@backup admincmd]#
11-26-2014 07:56 PM
Nicolai asked you to check media output with:
nbemmcmd -listmedia -mediaid MF071A
(listmedia, not listhosts....)
This will tell us 2 things:
1. The density of the media
2. The media server that wrote the backup
You then need to check drive availability on this media server with 'tpconfig -l' and 'vmoprcmd -d'
and ensure that there are 'UP' drives matching the media density.
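The "UP drives matching the media density" check can be roughed out against saved 'tpconfig -l' output. This is a sketch under assumptions: it relies on the whitespace-separated column order seen in this thread (type in column 4, status in column 6) and on the Comment column containing no spaces; up_drives is a hypothetical helper name.

```shell
# Hypothetical helper: count drive lines on stdin whose type matches the
# given density (case-insensitive) and whose status is UP.
# Usage: up_drives DENSITY < saved_tpconfig_l_output
up_drives() {
  awk -v d="$1" '
    $1 == "drive" && tolower($4) == tolower(d) && $6 == "UP" { n++ }
    END { print n + 0 }
  '
}

# Demo using tpconfig -l lines in the layout posted in this thread.
up_drives HCART <<'EOF'
robot 0 - TLD - - - - /dev/sg2
drive - 0 hcart 1 UP - TD01 /dev/nst0
drive - 1 hcart 2 UP - TD02 /dev/nst1
EOF
```

A result of 0 for the media's density would point at a density mismatch or down drives; a non-zero result means the cause lies elsewhere.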
11-26-2014 09:00 PM
Hi mate,
We actually have replied, but the forum has placed our posts in a moderation queue, so they are waiting to be published. If this is something you can action, that would be very helpful.
11-26-2014 09:04 PM
It was the 'nbemmcmd -listhosts -verbose' output that was quarantined. I have published it in the meantime.
There are no more unpublished replies from you. (You can PM the community admin or one of the TAs when this happens.)
Please post the 'listmedia' output along with 'tpconfig' and 'vmoprcmd' commands on the relevant media server as per my post above.
11-26-2014 09:31 PM
Hi Marianne,
Please see below
nbemmcmd -listmedia -mediaid MF071A
NBEMMCMD, Version:7.1.0.4
====================================================================
Media GUID: 4b066366-211a-14e9-8004-a0986c9a69b7
Media ID: MF071A
Partner: -
Media Type: HCART
Volume Group: 000_00000_TLD
Application: Netbackup
Media Flags: 1
Description: ---
Barcode: MF071A
Partner Barcode: --------
Last Write Host: backup
Created: 11/05/2007 13:31
Time Assigned: 11/09/2007 16:40
First Mount: 11/09/2007 16:41
Last Mount: 05/09/2008 11:58
Volume Expiration: -
Data Expiration: INFINITY
Last Written: 11/12/2007 06:36
Last Read: 05/09/2008 11:58
Robot Type: TLD
Robot Control Host: backup
Robot Number: 0
Slot: 1
Side/Face: -
Cleanings Remaining: -
Number of Mounts: 11
Maximum Mounts Allowed: 0
Media Status: SUSPENDED FULL MPX
Kilobytes: 1093470079
Images: 41
Valid Images: 41
Retention Period: 9
Number of Restores: 1
Optical Header Size Bytes: 1024
Optical Sector Size Bytes: 0
Optical Partition Size Bytes: 0
Last Header Offset: 1693881
Adamm Guid: 00000000-0000-0000-0000-000000000000
Rsm Guid: 00000000-0000-0000-0000-000000000000
Origin Host: NONE
Master Host: backup
Server Group: NO_SHARING_GROUP
Upgrade Conflicts Flag:
Pool Number: 7
Volume Pool: EOM
Previous Pool Name: ScratchPool
Vault Flags: -
Vault Container: -
Vault Name: MyVault
Vault Slot: 46
Session ID: 144
Date Vaulted: 11/15/2007 15:28
Return Date: -
====================================================================
Command completed successfully.
[root@backup ~]#
________________________________________________________________
tpconfig -l
Device Robot Drive       Robot                Drive  Device     Second
Type     Num Index  Type DrNum Status Comment Name   Path       Device Path
robot      0     -   TLD     -      - -       -      /dev/sg2
drive      -     0 hcart     1     UP -       TD01   /dev/nst0
drive      -     1 hcart     2     UP -       TD02   /dev/nst1
[root@backup ~]#
________________________________________________________________
vmoprcmd -d
PENDING REQUESTS
<NONE>
DRIVE STATUS
Drv Type   Control  User  Label  RecMID  ExtMID  Ready  Wr.Enbl.  ReqId
  0 hcart  TLD            -                      No     -         0
  1 hcart  TLD            -                      No     -         0
ADDITIONAL DRIVE STATUS
Drv DriveName  Shared  Assigned  Comment
  0 TD01       No      -
  1 TD02       No      -
[root@backup ~]#
11-26-2014 09:49 PM
Hi Marianne, I've replied again but the post was queued for moderation. Can you see it?
Thanks!
11-26-2014 10:21 PM
Last write host is 'backup'.
Did you run tpconfig and vmoprcmd on this media server?
11-26-2014 10:30 PM
Hi Marianne
Thanks for your help, all the commands issued have been from the host 'backup'
We don't have any other hosts - this is the master netbackup server and attached to the robot tape library.
[root@backup ~]# hostname
backup
[root@backup ~]# uname -a
Linux backup 2.6.9-103.ELsmp #1 SMP Fri Nov 11 14:28:16 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
11-26-2014 10:49 PM
Interesting. Yes, as per nbemmcmd -listhosts, you have just the master server with the library.
The density of the drives and the media is the same. The last write host is also the master server.
Last Write Host: backup
Try a backup to see if it works and picks the drives. If it does, enable the nbrb, run the restore again and post the output. Also, when you run the restore, check that it picks the master server to restore the data and not some other media server. As the last write host is the master server, it should pick the master, but it is worth a check.
11-26-2014 11:33 PM
Last bit of output:
Show us output of the MDS Allocations section of 'nbrbutil -dump'.
Another thought: Is there maybe a FORCE_RESTORE_MEDIA...... entry in bp.conf on the master?
Next thing to do is to check logs.
On master: bprd (NBU must be restarted after log folder is created).
This will tell us which media server is chosen for the restore.
We then need to look at bptm log on the media server.
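Creating the log folders mentioned above can be sketched as follows. The base directory is passed in as a parameter; on UNIX/Linux installs the legacy log directory is commonly /usr/openv/netbackup/logs, but treat that path as an assumption and confirm it on your own system. make_log_dirs is a hypothetical helper, and the demo uses a scratch directory.

```shell
# Hypothetical helper: create one log folder per process name under a base
# directory. Usage: make_log_dirs BASE_DIR NAME [NAME...]
make_log_dirs() {
  log_base="$1"
  shift
  for log_name in "$@"; do
    mkdir -p "$log_base/$log_name"
  done
}

# Demo in a scratch directory instead of the real log location.
demo="$(mktemp -d)"
make_log_dirs "$demo" bprd bptm bpbrm
ls "$demo"
rm -rf "$demo"
```

As noted above, NBU must be restarted after the bprd folder is created before logging starts.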
11-26-2014 11:37 PM
[root@backup ~]# /usr/openv/netbackup/bin/admincmd/nbrbutil -dump
Allocation Requests
(AllocationRequestSeq )
Allocations
(AllocationSeq )
MDS allocations in EMM:
[root@backup ~]#
There is no force_restore_media entry in bp.conf; here are the contents of the file:
[root@backup netbackup]# cat bp.conf
SERVER = backup
SERVER = samwise
CLIENT_NAME = bilbo
CLIENT_NAME = gandalf-backup
CLIENT_NAME = larry
CLIENT_NAME = morris
CLIENT_NAME = haldir
EMMSERVER = backup
REQUIRED_INTERFACE = backup
VXDBMS_NB_DATA = /usr/openv/db/data
VERBOSE = 0
SERVER_SENDS_MAIL = NO
KEEP_VAULT_SESSIONS_DAYS = 45
CLIENT_READ_TIMEOUT = 7200
JOB_PRIORITY = 0 0 90000 90000 90000 90000 85000 85000 80000 80000 80000 80000 75000 75000 70000 70000 50000 50000 0 0 0 0 0 0
CLIENT_CONNECT_TIMEOUT = 1800
BPSTART_TIMEOUT = 1800
BPEND_TIMEOUT = 1800
OPS_CENTER_SERVER_NAME = samwise
VM_PROXY_SERVER = samwise
[root@backup netbackup]#
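Rather than reading the file by eye, the absence of an entry can be confirmed with a prefix search. This is a minimal sketch: list_entries is a hypothetical helper, only the prefix FORCE_RESTORE_MEDIA from the question above is searched for, and the demo uses a scratch file holding a few lines from the output above.

```shell
# Hypothetical helper: list lines (with line numbers) in a bp.conf-style
# file whose entry name starts with the given prefix, ignoring case.
# Usage: list_entries /path/to/bp.conf PREFIX
list_entries() {
  # grep exits non-zero when nothing matches, so mask that exit status.
  grep -n -i -E "^[[:space:]]*$2" "$1" || true
}

# Demo against a scratch file, not the live bp.conf.
tmp="$(mktemp)"
cat > "$tmp" <<'EOF'
SERVER = backup
SERVER = samwise
EMMSERVER = backup
REQUIRED_INTERFACE = backup
EOF
hits="$(list_entries "$tmp" FORCE_RESTORE_MEDIA)"
if [ -z "$hits" ]; then
  echo "no FORCE_RESTORE_MEDIA* entry found"
else
  echo "$hits"
fi
rm -f "$tmp"
```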
Can you advise how we can check the bprd and bptm logs once we try another restore (where are the log folders located)?
11-26-2014 11:57 PM
I still feel that you should try a backup first to isolate whether this is a drive issue or a restore issue. If backups work on these drives, then something in the restore is causing it not to pick the drives.