12-14-2011 07:18 AM
At 8 am each morning we attempt to freeze and eject all tapes written in the last 24 hours so they can be sent offsite.
My problem occurs if a tape is still in use when the script runs; a tape in use cannot be frozen or suspended:
"requested media id is in use, cannot process request"
As soon as the current backup job finishes the next one starts and again uses the tape I am trying to send offsite.
Is there any command I can use to tell NB to stop using a tape (once it's done with the current backup)?
12-14-2011 07:31 AM
and suspend the media when the job finishes
12-14-2011 02:15 PM
There is nothing in NBU besides the standard wait time (I think it's 5 or 10 minutes).
What I used previously was a command-line script that would start the vault processing. This allowed me to add a much longer wait time, and this helped, but sometimes you have backup jobs that will run for many hours, and having vault wait that long really wasn't an option for us.
I would suggest using some form of script for the vault process, see example below for unix.
# Run command to change process control limit to 512 so vault does not core dump due to large image lookup
print "$(date) Running export LDR_CNTRL=MAXDATA=0x20000000..." >> ${LogFile}
export LDR_CNTRL=MAXDATA=0x20000000
# Run the vault program; input is library/vault/profile
print "$(date) Running vltrun 0/vault1/daily..." >> ${LogFile}
/usr/openv/netbackup/bin/vltrun 0/vault1/daily
# Capture the lists of tapes to be ejected today
for Vault in vault1
do
    cd /usr/openv/netbackup/vault/sessions/${Vault}
    SID=$(cat session.last)
    MediaHost=$(head -1 sid${SID}/eject.list | awk '{print $3}')
    # The input for this next loop is the list of tapes being ejected minus
    # the NetBackup catalog tape
    print "$Vault eject list..." >> ${LogFile}
    cat sid${SID}/eject.list >> ${LogFile}
    ToBeEjectedCount=$(($(cat sid${SID}/eject.list | wc -l) - 1))
    # for Volid in $((tail +2 sid${SID}/eject.list | awk '{print $1}'; cat sid${SID}/nbudb.media 2>/dev/null) | sort | uniq -u)
    # "tail -n +2" skips the first line (portable form of the obsolete "tail +2")
    for Volid in $(tail -n +2 sid${SID}/eject.list | awk '{print $1}')
    do
        print "$Vault $MediaHost $Volid" >> ${EjectList}
    done
done
# Sleep for 10 minutes to allow time for the tapes to unmount
print "$(date) Sleeping for 10 minutes..." >> ${LogFile}
sleep 600
# If the number of tapes to be ejected is more than the 3584 library I/O station can
# handle we need to send an Email alert to OPS
if (( ToBeEjectedCount > MailSlots ))
then
    print "Please empty the 3584 I/O station in 5 minutes to allow Vault eject to continue." \
        | /usr/bin/mailx -s "${SiteCode} Vault Alert: More than ${MailSlots} tapes to eject" $NB_Admin $OPS
fi
12-14-2011 07:52 PM
@wrobbins, wouldn't using bpend_notify be a challenge here, since we need a way to identify the actual tape being used before running a suspend against it?
Doug's script to make use of Vault seems great, as Vault can run the suspend for you, but the same situation still exists: the next backup is still going to pick up the same tape before Vault can be started.
I thought about using the "maximum mounts" setting, but that is a bit too extreme, and very tedious, as we would need to reset the mount count all the time.
@sclind, is the next backup going to use the same policy and the same schedule as the one just completed? If not, you can configure them to use different volume pools, so the next backup will pick up a different tape.
12-15-2011 01:37 AM
IMHO, your actual problem is the fact that the backups are still running by the time you try to freeze and eject.
You need to check backup schedules to see if backups can start earlier, increase MPX, improve network, add more tape drives, etc... to ensure that backups are done by the time you need to eject.
Another suggestion is to suspend scheduling. This will allow the current backup to complete and will not kick off any more backups.
nbpemreq -suspend_scheduling
nbpemreq -resume_scheduling
Please consider suspending tapes instead of freezing them. Freezing is persistent past data expiration. Suspending will prevent further usage but allow normal tape recycling upon data expiration.
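To make sure scheduling is always resumed after the eject window, the two nbpemreq calls can be wrapped around whatever does the work. This is only a rough sketch: the nbpemreq path is my assumption about a default install, and the function name with_scheduling_suspended is made up for illustration.

```shell
#!/bin/sh
# Sketch only: pause new job scheduling around an eject window.
# The path below is an assumption; override NBPEMREQ for your install.
NBPEMREQ=${NBPEMREQ:-/usr/openv/netbackup/bin/admincmd/nbpemreq}

# Run the given command with scheduling suspended, then resume scheduling
# whether or not the command succeeded. (Hypothetical helper name.)
with_scheduling_suspended() {
    "$NBPEMREQ" -suspend_scheduling || return 1
    "$@"
    rc=$?
    "$NBPEMREQ" -resume_scheduling || return 1
    return $rc
}
```

Something like `with_scheduling_suspended /path/to/your_eject_script` would then run your existing eject script inside the window. Jobs that are already running are not touched, so you still have to wait for them to finish before suspending the tapes.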
12-15-2011 07:47 AM
Marianne - I agree with your analysis of the root of the problem, but I have no control over the submission of the backups.
If I knew the running backup(s) were going to complete soon I could suspend scheduling. But the people submitting the backups would then be screaming at me when their backup submissions failed. (It's tough to be in charge of the backups but not in control of the backups).
I don't understand the logic in NetBackup that does not allow a tape in use to be suspended or frozen. It would be so simple to let the current operation complete and then not allow any additional writes to that tape.
12-15-2011 11:20 AM
Identifying which tape needs to be frozen/suspended is what has kept me from jumping on using bpend_notify. More work than I really want to do right now :)
And I never know what the next backup will be (they are RMAN backups submitted by the clients).
12-19-2011 04:11 AM
I think it's not possible when you've got jobs in queue with the same retention and media server... I think you should eject only tapes not currently mounted.
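Ejecting only the tapes that are not mounted comes down to subtracting one list from another. A minimal sketch, assuming you already have two plain text files with one media ID per line (how you build the in-use list is site-specific and not shown; the function name ejectable_media is hypothetical):

```shell
#!/bin/sh
# Sketch: print candidate media IDs that are NOT in the in-use list.
# Both arguments are text files with one media ID per line.
# Producing the in-use list is left to the site (assumption).
ejectable_media() {
    inuse_file=$1
    candidates_file=$2
    # Lines from the first file are remembered as busy; lines from the
    # second file are printed only if they were not seen in the first.
    awk 'FILENAME == ARGV[1] { busy[$1] = 1; next }
         !($1 in busy)' "$inuse_file" "$candidates_file"
}
```

The FILENAME test is used instead of the common NR==FNR idiom so that an empty in-use file still lets every candidate through.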
12-19-2011 04:50 AM
I agree with Marianne and Marek on this ...
If a tape is in use, it is being written to, so a suspend (do not use freeze) will not be permitted while NetBackup is still updating the tape's database information and contents list.
The only possible workaround I can see (and it would only work for file system backups) is to enable checkpoint restart in all policies, then suspend all jobs, wait 5 or 6 minutes for the tapes to be dismounted from the drives, suspend all media, eject all used media, and resume all jobs:
bpdbjobs -suspend type=BACkup
wait 6 minutes
suspend and eject all media
bpdbjobs -resume type=BACkup
The jobs would then resume using new tapes.
Note the odd capitalization of BACkup used in this command.
Other than this, there is no workaround.
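The suspend / wait / eject / resume sequence above could be scripted roughly as follows. Treat it as a sketch under assumptions: the command paths are what I would expect on a default UNIX install, eject_media is a hypothetical placeholder for however your site ejects tapes, and the bpdbjobs arguments are copied as written in this post.

```shell
#!/bin/sh
# Sketch of the suspend / wait / eject / resume sequence.
# Paths are assumptions; override them for your install.
BPDBJOBS=${BPDBJOBS:-/usr/openv/netbackup/bin/admincmd/bpdbjobs}
BPMEDIA=${BPMEDIA:-/usr/openv/netbackup/bin/admincmd/bpmedia}
WAIT_SECS=${WAIT_SECS:-360}   # the "wait 6 minutes" step

# Hypothetical placeholder: replace with your site's eject procedure.
eject_media() {
    echo "eject $1"
}

# Usage: offsite_cycle MEDIA_ID...
offsite_cycle() {
    "$BPDBJOBS" -suspend type=BACkup || return 1
    sleep "$WAIT_SECS"          # give the drives time to dismount
    status=0
    for id in "$@"; do
        if "$BPMEDIA" -suspend -m "$id"; then
            eject_media "$id"
        else
            echo "could not suspend $id" >&2
            status=1
        fi
    done
    # Resume the jobs even if some tapes could not be suspended.
    "$BPDBJOBS" -resume type=BACkup || status=1
    return $status
}
```

Calling `offsite_cycle A00001 A00002` (made-up media IDs) would suspend the backup jobs, wait, suspend and eject each tape, and then resume the jobs. As noted above, this relies on checkpoint restart being enabled in the policies.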
12-19-2011 05:35 AM
Another thought - are the backups that get submitted all the time database archive logs by any chance?
Make a case for some disk to write the regular log backups to, then duplicate these disk backups to tape on a regular basis (e.g. every 6 or 12 hours). This way you should be able to ensure that no tapes are in use when you need to eject.