
5240 Appliances - V3.0 - Vaulting Tape Backups

SYM-AJ
Level 5

I have 2 x 5240 Appliances configured across 2 sites, all functioning well, using AIR between the 2 domains.

On a monthly basis the SLP in effect replicates to the remote site, imports and duplicates to tape.

Vault is licensed and in place.

What is the best way to produce a picking list for the tapes produced by the monthly backups? I am not experienced with Vault, but from looking at it: I don't want to duplicate, I do want to make a catalog backup, I don't want to eject (or do I?), but I DO want to produce the reports.

I MUST perform one of the Duplication, Catalog or Eject functions in order to have a valid profile.

On Windows Masters I have triggered a batch file running 'bpimagelist -media -hoursago xx -idonly' and not used Vault at all.  My master servers here are both 5240 appliances.

Best approach ?

AJ

 

1 ACCEPTED SOLUTION

Accepted Solutions

This issue was finally resolved after many weeks with support.

Issue seems to be related to DNS, but could not directly locate the problem.

To fix we did two things:

Edit the main.cf file to put square brackets [xxx] around the relayhost entry. Apparently this stops Postfix from doing an MX lookup on the relay host name.

Add the domain where the email servers reside to the resolv.conf file as a search domain.

After the above changes I restarted postfix and did a postfix flush - everything then burst into life.
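
For anyone wanting to script the relayhost change, here is a minimal sketch rather than the exact steps support took. It wraps the relayhost value in square brackets and leaves an optional :port suffix outside them. The function name bracket_relayhost is made up for illustration, and the /etc/postfix/main.cf location is an assumption based on the path quoted elsewhere in this thread. Try it on a copy of the file first.

```shell
# Hypothetical helper: wrap the relayhost value in main.cf in square brackets.
# A value that is already bracketed is left alone, and a :port suffix is
# kept outside the brackets. A backup is written next to the file as <file>.bak.
bracket_relayhost() {
    # $1: path to a main.cf file
    sed -E -i.bak \
        '/^relayhost[[:space:]]*=[[:space:]]*\[/!s/^(relayhost[[:space:]]*=[[:space:]]*)([^:[:space:]]+)(:[0-9]+)?[[:space:]]*$/\1[\2]\3/' \
        "$1"
}
```

Usage would be something like `bracket_relayhost /etc/postfix/main.cf`, followed by the Postfix restart and `postfix flush` described above.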

The emails are now sending, but the nslookup of the mail server with the type set to mx still fails !!

We can (and always could) ping the email server, and do an nslookup of it with type a.

Strange one but the problem is no longer with us so 'sort of' happy.

AJ


11 REPLIES 11

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

You could just run this 

Media Written Report
The Media Written report identifies volumes that were used for backups within the specified time period.  
# cd /usr/openv/netbackup/bin/admincmd
# ./bpimagelist -A -media [-d <start_date> <start_time> -e <end_date> <end_time>]
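
If you want the same picking list from a script on the appliance, a minimal sketch along these lines might work. It only wraps the bpimagelist options already quoted in this thread (-media, -hoursago, -idonly). The function name make_picking_list, the BPIMAGELIST variable and the output file are made up for illustration, and the layout of the -idonly output on a 5240 is an assumption, so check the result by hand the first time.

```shell
#!/bin/sh
# Hypothetical wrapper: write a dated picking list of media IDs to a file.
# BPIMAGELIST may be overridden; the default is the admincmd path quoted above.
BPIMAGELIST=${BPIMAGELIST:-/usr/openv/netbackup/bin/admincmd/bpimagelist}

make_picking_list() {
    # $1: how many hours back to look; $2: output file
    hours=$1
    outfile=$2
    {
        echo "Picking list generated $(date '+%Y-%m-%d %H:%M')"
        echo "Media written in the last $hours hours:"
        # Same options as the Windows batch file mentioned in the question.
        "$BPIMAGELIST" -media -hoursago "$hours" -idonly
    } > "$outfile"
}
```

For example, `make_picking_list 72 /tmp/monthly_picking_list.txt` after the month-end duplications have finished.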

 

As for Vault, from what I know it will report on the Vault sessions, i.e. Vault should have duplicated something. So you'd probably need to get Vault to do the duplication and not the SLP.

 

I think @Michael_G_Ander once commented on a way to do this but I'm not 100% sure it was him. You could search in more detail about Vault and Duplicating SLPs, which will lead you down the right track I think.

watsons
Level 6

Commenting from the view of Vault option...

You can create a Vault profile, skip duplication, enable catalog backup (also optional), and enable both eject and report. To track media offsite you have to eject, because that is the purpose of Vault. Whether you eject immediately or defer the eject is up to your operational procedure.

In your "select backup" tab, select the schedule or retention so that Vault knows to pick up only the monthly tapes. In the report tab, select which reports you want and whether to output them to a file or email them to someone.

Many thanks for the input.

If the library has no I/E ports defined, will the eject still complete OK from Vault's perspective? (Obviously we will have to pull the tapes out manually.)

I am simulating a month end this weekend, so I will configure Vault to perform a catalog backup and eject all the tapes produced by the monthly schedule.

Thanks again,

AJ

Marianne
Level 6
Partner    VIP    Accredited Certified

Best to choose 'deferred eject' if you need to remove tapes manually.

Thanks Marianne.

I have looked at all the docs referring to Immediate & Deferred Eject and am a little confused.

The tape libraries here do not have I/E slots / Mailslots defined.  However during a brief test last week I did run a Vault test with an eject and the tape was moved to the offsite Volume Group and disappeared from the Media/Robots tape list on the admin console.  When I re-inventoried specifying Empty media access port prior to update the tape re-appeared........

Assuming I have no I/E ports, what will happen if I select Immediate Eject ?  Will the job fail or will it just assume they have been removed from the library (which we will do on Monday) ?

If I select Deferred Eject, then I guess on Monday I will select the Deferred Option from the Admin Console / Actions menu - but what does that do ?  Does it simply mark them as ejected, move volume group - and then we manually take them out (or do we take them out first and then run the deferred eject option) ?

Thanks,

AJ

Marianne
Level 6
Partner    VIP    Accredited Certified

I don't know what will happen if you select immediate eject.
My guess is that the job will fail.

I would configure deferred eject as per the recommendation in the manual, run the profile manually or automatically from a Vault policy, and then eject.

Extract from the manual:

If you use a library that does not have a MAP, you must remove the media from the library slots manually. You also have to perform the eject operation in Vault so that the appropriate database entries are completed. Although you can use automatic eject, Veritas recommends that you use deferred eject to avoid resource contention with other NetBackup activity and you do not neglect to remove the media from the robot. The manual eject operation serves as a reminder to remove the media.

To use deferred eject for a library that does not have a MAP, do the following:

■ Configure the profiles for deferred eject.
■ Eject the media manually. See “About ejecting media” on page 121.
■ Remove the media from the library slots.

Do not inventory the robot until you remove the media from the MAP or library slots. If you do, you have to revault the media.

I don't have an env. to simulate your scenario - i.e. no mailslot for eject.

But from my experience of ejecting when there are not enough mailslots available: if you eject 10 tapes and the library only has 4 mailslots, the eject job first ejects 4 tapes, then waits indefinitely for a free mailslot until the operator takes those 4 tapes out, and then moves on.

So if there is a timeout on the job, it will eventually end in failure, because the eject is considered incomplete.

As Marianne suggested, use deferred eject in this situation and use a manual (or scripted) command to assign the tapes to the offsite volume group and location.

OK - Update on this one.

I defined 12 Mailslots on each of the libraries (they have 96 slots each so no issues there), and we produce around 6 LTO5 tapes each month end - so we should be good to go there.

All duplications (via SLP) were complete by early Saturday morning.

The Vault schedule kicked in at 9am this morning and performed as follows:

1. Secured the catalog to tape - OK

2. Performed the Immediate Eject, and moved all of the data tapes and the catalog tape to the mailslots - OK

3. The tapes were marked as having the expected Vault Name / Volume Group - and moved to the volume group

4. The Vault policy completed status 0

All looked good, except no reports were emailed.

Whilst the robotic volume group is 000_00000_TLD (default) I have my offsite volume group set to xx_Offsite_Vault.  Is this a valid name or do I have to stick with the 000_00000_TLD format ?  It has moved the volumes to this off-site volume group but just not issued any reports ?

In summary, all looks good to me - tapes are in the expected volume group, vault name, mailslots etc., only issue is that no reports were emailed !

I have looked into the report location on the appliance (/usr/openv/netbackup/logs/user_ops/vault/sub-dir-name) and the reports are there and correct !!

Any thoughts......

AJ

After looking into a previous post I saw a similar error, and the poster stated the following:

Solved it myself, had to edit the main.cf file in /etc/postfix and add smtp_sasl_security_options = noanonymous since it wasn't specified. Once I did that I edited /etc/postfix/generic to use the correct sender name.

He stated the following:

There was no 'error' message per se, the emails just weren't being sent, but the tapes were still ejecting. In order to see the 'errors' I had to look at /var/log/maillog and /root/dead.letter

Feb 14 09:17:44 masterServer postfix/smtp[278258]: 1393240E52: to=<myEmail>, relay=relayIP[relayIP]:25, delay=176713, delays=176713/0.26/0.05/0, dsn=4.7.0, status=deferred (SASL authentication failed; cannot authenticate to server relayIP[relayIP]: no mechanism available)

I have no /root/dead.letter file, but I do have the /var/log/maillog file, and the following entries are repeated throughout it at approx 5 min intervals:

May 15 11:11:52 irlclin018 postfix/qmgr[208301]: BD50C1C073D: from=<root@irlclin018.domain>, size=1187, nrcpt=3 (queue active)

May 15 11:11:52 irlclin018 postfix/error[105273]: BD50C1C073D: to=<my-email@company.com>, relay=none, delay=6907, delays=6907/0.21/0/0.06, dsn=4.4.3, status=deferred (delivery temporarily suspended: Host or domain name not found. Name service error for name=smtp-server type=MX: Host not found, try again)

May 15 11:11:52 irlclin018 postfix/error[105273]: BD50C1C073D: to=<group1-email@company.com>, relay=none, delay=6907, delays=6907/0.21/0/0.09, dsn=4.4.3, status=deferred (delivery temporarily suspended: Host or domain name not found. Name service error for name=smtp-server type=MX: Host not found, try again)

May 15 11:11:52 irlclin018 postfix/error[105273]: BD50C1C073D: to=<group2-email@company.com>, relay=none, delay=6907, delays=6907/0.21/0/0.14, dsn=4.4.3, status=deferred (delivery temporarily suspended: Host or domain name not found. Name service error for name=smtp-server type=MX: Host not found, try again)

Why have I got relay=none in the above ??  Surely this should contain the SMTP server.....

I can ping the smtp-server from the appliance OK.

As I can't see the reference anywhere to SASL Authentication failed I am unsure as to whether or not to insert the entry into the main.cf file ? 
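
To check whether the SASL error (or anything other than the MX lookup failure) appears anywhere in the log, a rough sketch like the one below could summarise the reasons Postfix gives for deferring mail. It assumes the log lines end with the reason in parentheses, as in the extracts above. The function name deferred_summary is made up for illustration.

```shell
# Hypothetical helper: count each distinct "status=deferred (...)" reason
# in a Postfix log, most frequent first.
deferred_summary() {
    # $1: path to a Postfix log such as /var/log/maillog
    grep 'status=deferred' "$1" |
        sed -E 's/.*status=deferred \((.*)\)$/\1/' |
        sort | uniq -c | sort -rn
}
```

Running `deferred_summary /var/log/maillog` prints one line per distinct reason with a count in front, so a SASL failure would show up as its own line if it were there.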

AJ

This has got to be something outside NetBackup. What I would do first is check the name resolution:

Run nslookup 

> set type=MX
> smtp.xxx.net     <== put your SMTP server next

Does it resolve? If not, fix that first. Then go on to check if it can connect to the SMTP port of mail server.
