include_lists not treated recursively by NetBackup

Holger_SCC
Level 3

Hello,

I need to exclude everything and then include a directory, say /some/directory/thisone

Of course I could simply add it to the include list; NetBackup will then back up the directory itself and all files in it. However, it will not back up the content of any subdirectories in it, e.g. /some/directory/thisone/anotherone/some_file

Includes don't seem to be recursive, while excludes are.

If you put /some/filesystem in your exclude_list, everything below (files and directories) will be excluded.

If you put /some/filesystem in your include_list, only files directly in /some/filesystem will be included. Files in subdirectories won't be included.

Is there any way to change this behaviour?

 

Let me explain what I want to achieve. Maybe someone has a better idea of how to implement this.

I have a cluster that consists of 2 nodes, nodeA and nodeB. Both are NetBackup clients and back up their local filesystems normally.

However, there are so-called shared filesystems that can be mounted on both nodes (though only on one node at a time, never on both simultaneously).

Say, one shared filesystem is /shared.

When the clustered application is running on nodeA, /shared is also mounted on nodeA.

When the clustered application switches to nodeB, /shared gets mounted on nodeB.

Now the problem is: backup and restore of /shared must work at any time, regardless of where the application is running (nodeA or nodeB).

I cannot back up /shared normally. In that case, when nodeA has backed up /shared and the application (and filesystem /shared) then switches to nodeB, nodeB won't have access to the backups of /shared.

I resolve it this way:

I define a third backup client on the netbackup master server, say nodeS (for nodeShared).

Its IP address is configured on nodeA (as an IP alias) when the app runs on nodeA, and on nodeB when the app runs on nodeB.

nodeA and nodeB have an exclude list containing /shared, as they shall no longer back up /shared.

nodeA and nodeB have an exclude list of / (so, everything) for the policy of nodeS.

nodeA and nodeB have an include list of /shared for the policy of nodeS.
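The three client-side lists described above can be sketched as follows. The policy name POL_NODES is made up for illustration, and NBU_CONF points at a scratch directory here; on a real client the lists live under /usr/openv/netbackup:

```shell
# Sketch of the client-side lists described above (assumptions: the
# policy name "POL_NODES" is invented; NBU_CONF points at a scratch
# dir here instead of the real /usr/openv/netbackup).
NBU_CONF="${NBU_CONF:-./nbu_conf_demo}"
mkdir -p "$NBU_CONF"

# On nodeA and nodeB, for their own policies: stop backing up /shared.
printf '/shared\n' > "$NBU_CONF/exclude_list"

# Per-policy lists for the nodeS policy: exclude everything,
# then include only /shared.
printf '/\n'       > "$NBU_CONF/exclude_list.POL_NODES"
printf '/shared\n' > "$NBU_CONF/include_list.POL_NODES"
```

The `.POL_NODES` suffix is what scopes a list to one policy, so the nodeS policy sees different rules than the nodes' own policies.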

 

Works fine. nodeA and nodeB don't back up /shared to their own client/policy anymore.

And /shared gets backed up (and is restorable) on nodeA or nodeB, wherever the IP alias nodeS is currently configured.

However, when someone creates /shared/somedir and puts files there, they won't get backed up.

 

How can I tell NetBackup to back up everything in /shared, including subdirectories, using include lists?

Or is there any other way to back up /shared? How is backup on cluster nodes (e.g. AIX HACMP / PowerHA) normally done regarding shared filesystems?

 

13 REPLIES

Nicolai
Moderator
Partner    VIP   

Is "Cross mount points" selected in the policy?

/share - is that an NFS mount? Otherwise you may need to select "Backup network drives".

Running bpmount will show whether NetBackup thinks the folder is local or not.

I don't recall include_list not being recursive ..

Holger_SCC
Level 3

Cross mount points is not selected (as recommended). But there are not several filesystems involved - only subdirectories within one filesystem (/shared).

/share is not an NFS mount. It is a normal filesystem located on disks accessible from both nodes, and it gets mounted on whichever node the application runs.

>I don't recall include_list not being recursive ..

I was quite surprised finding out that it is not recursive. Try it out ;)

 

sdo
Moderator
Partner    VIP    Certified

Why can't you back up /shared via a cluster VIP name/address which goes hand-in-hand with the resource group containing the resource named /shared?

I.e. whenever the resource group fails over between nodes, the cluster VIP name/address and the /shared resource mount point fail over together at the same time, so the /shared resource is always reachable via a backup policy whose client name is the cluster VIP name.

a_la_carte
Level 5

Absolutely, agree with Sdo.

This is how we have been backing up file systems on many cluster nodes in our environment.

 

A VIP name/address is the ideal recommendation for this concern. The same applies to filesystem backups, databases in a cluster, etc.

 

 

Holger_SCC
Level 3

sdo, as you can read in my first post, I am doing exactly what you are suggesting. Maybe I did not explain it well...

 

Now, how should I setup the in/exclude lists?

I want to exclude everything besides /shared.

 

Putting / into the exclude list and /shared into the include list won't back up subdirectories in /shared.

Putting all currently existing filesystems under / into the exclude list (besides /shared) and nothing into the include list will also back up filesystems added under / later. That's not a big problem, but if there is a cleaner solution I'd like to know it.

 

Nicolai
Moderator
Partner    VIP   

To fully understand what is going on do the following:

To bp.conf on the client, add:

VERBOSE = 5

create the directory /usr/openv/netbackup/logs/bpbkar

touch /usr/openv/netbackup/bpbkar_path_tr

Re-run the backup and attach the debug log file to a post - do NOT bulk-paste the debug text. Also check that subdirectories under /share are not mount points or symbolic links.

Background :

http://www.veritas.com/docs/000026979

Remember to remove the touch file and the VERBOSE statement in bp.conf.
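The debug steps above can be sketched as a script. NBU_DIR defaults to a scratch directory here so the sketch is side-effect free; on a real client it would be /usr/openv/netbackup:

```shell
# Sketch of the debug setup above (assumption: NBU_DIR is pointed at a
# scratch dir; use /usr/openv/netbackup on a real client).
NBU_DIR="${NBU_DIR:-./nbu_debug_demo}"
mkdir -p "$NBU_DIR/logs/bpbkar"

# Raise client verbosity in bp.conf (append only if not already set).
grep -q '^VERBOSE' "$NBU_DIR/bp.conf" 2>/dev/null || echo 'VERBOSE = 5' >> "$NBU_DIR/bp.conf"

# Touch file that makes bpbkar trace its path evaluation.
touch "$NBU_DIR/bpbkar_path_tr"

# After re-running the backup and collecting $NBU_DIR/logs/bpbkar/*,
# undo the changes again:
#   rm -f "$NBU_DIR/bpbkar_path_tr"
#   (and remove the VERBOSE = 5 line from bp.conf)
```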

Holger_SCC
Level 3

Nicolai,

I could do this; however, then we would have a log of the current (fine-running) backup.

I do not want to solve a problem; what I want is to know how to configure this shared-fs backup.

My current solution has the problem that include statements are not recursive.

Should you want to recreate:

Create policy XX

put / into its exclude file

create the directory /some/directory

put /some/directory into the include file of policy XX

create a file /some/directory/myfile, run the backup, and see that it gets backed up

create an empty dir /some/directory/mydir, run the backup, and see that the empty dir gets backed up

create a file /some/directory/mydir/myfile, run the backup, and notice that it does not get backed up.
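The reproduction steps above can be sketched as shell. The policy setup and the actual backup runs are elided; ROOT is a scratch directory standing in for "/", so the layout can be built anywhere:

```shell
# Build the directory layout from the reproduction steps (assumption:
# ROOT is a scratch dir standing in for "/"; policy "XX" and the
# backup runs themselves are elided).
ROOT="${ROOT:-./repro_demo}"
mkdir -p "$ROOT/some/directory/mydir"

touch "$ROOT/some/directory/myfile"        # reportedly backed up
touch "$ROOT/some/directory/mydir/myfile"  # reportedly skipped

# exclude file for policy XX would contain:  /
# include file for policy XX would contain:  /some/directory
```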

 

This behaviour causes problems with my current cluster backup, which has a VIP and a separate policy (thus separate include and exclude files), with / (so, everything) in its exclude file and /shared in its include file.

When someone creates /shared/mydir/myfile, it will not get backed up.

 

How do you do cluster filesystem backup (on IBM PowerHA)?

 

Holger_SCC
Level 3

sdo,

nice possible solution.

I see 2 drawbacks:

1) Whenever a filesystem gets created, the unix group has to contact the backup group to tell them to put the new fs into the policy (or the fs won't be backed up...)

2) I cannot easily see on the unix node what gets backed up. When I use all_local_drives, I have control of what gets backed up on the client (by using include and exclude), so monitoring scripts can check what filesystems I have and what gets backed up, compare these lists, and alert when something is misconfigured.
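The comparison described in point 2) could be sketched like this. The two input lists are stubbed here; on a real node they would come from e.g. `mount` output and the include lists or policy selections:

```shell
# Minimal sketch of the monitoring gap check (assumption: the input
# lists are stubbed; a real script would generate them from mount
# output and the backup configuration).
printf '/\n/home\n/shared\n' | sort > fs_on_node.txt    # what the node has
printf '/\n/home\n'          | sort > fs_backed_up.txt  # what gets backed up

# Lines only in the first file = filesystems with no backup coverage.
comm -23 fs_on_node.txt fs_backed_up.txt > fs_missing.txt
cat fs_missing.txt   # prints: /shared
```

An alerting script would then mail or page whenever fs_missing.txt is non-empty.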

When the filesystem list is contained in the policy, it is more difficult to query it on the node. I might use bppllist for it, but on our systems the /usr/openv/netbackup/bin/admincmd/ stuff is only installed on nodes using LAN-free backup. Is that normal? Should it be included in network/LAN-based backup, too?

In any case, thanks for the possible solution. Now I have one more possibility to think about.

sdo
Moderator
Partner    VIP    Certified

IMO - don't use exclude+include lists for this.  It would seem that you are trying to use ALL_LOCAL_DRIVES for your backup policies.  Most of us wouldn't do this for backups of clustered clients.  Instead, what I would typically do for a clustered client(s) config is:

policyM1 - clientM1 - selection: only those file-systems / mount-points which are private to clientM1

policyM2 - clientM2 - selection: only those file-systems / mount-points which are private to clientM2

policyR1 - clientR1 - selection: only those file-systems / mount-points which are in resource group R1

...

policyRn - clientRn - selection: only those file-systems / mount-points which are in resource group Rn

(where M1/M2 means member nodes 1/2/...n,  and R1/R2 means resource groups 1/2/...n)

...but usually both member nodes are very similar with the same names for their private file-systems/mount-points, so what you should be able to do for the member nodes is:

policyM - clientM1 and clientM2 - selection: only those file-systems / mount-points which are private to both clients

...i.e. if your cluster has multiple resource groups, then one policy per resource group per cluster...

...and there shouldn't be any need to exclude/include anything that is cluster related.

Yes, we end up with a few more policies, but this is much easier to administer and control than excludes+includes, and it is visually self-documenting in the backup policy configuration, with nothing hidden away (in exclude/include lists) on the client.
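For illustration only, the layout above could be represented as per-policy selection lists. In NetBackup the selections actually live in the policy definition on the master server; all file names and paths here are invented:

```shell
# Illustration of the per-policy backup selections described above
# (assumption: these files are a stand-in for the real policy
# attributes on the master server; names and paths are invented).
mkdir -p ./policy_demo

# Member-node policy: only the private filesystems of the nodes.
printf '/\n/usr\n/var\n' > ./policy_demo/policyM.selections

# Resource-group policy: only the shared filesystem(s) of group R1.
printf '/shared\n'       > ./policy_demo/policyR1.selections
```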

sdo
Moderator
Partner    VIP    Certified

HC - 1) Whenever a filesystem gets created, the unix group has to contact the backup group to tell them to put the new fs into the policy (or the fs won't be backed up...)

sdo - Yes.  The way I see it is that this is just another one of those facts of life.  OS admins and DBA admins and backup admins and network admins are all very used to working together to create custom firewall rules for what we need to do - and, to me, working together to create custom backup configs is just the same thing.  One way to mitigate a fs being missed, or avoid a backup gap, is to force through "backup config awareness" by never implementing a new backup client until a questionnaire/request form has been filled in, i.e. as part of your ITIL change or "request a backup" workflow.  And the questionnaire can/would/should ask several questions around clustering and resource groups.

.

HC - 2) I cannot easily see on the unix node what gets backed up. When I use all_local_drives, I have control of what gets backed up on the client (by using include and exclude), so monitoring scripts can check what filesystems I have and what gets backed up, compare these lists, and alert when something is misconfigured.

sdo - Sounds like you already have a mechanism in place to spot any new gaps being introduced.  What I have done in the past is to have a weekly script which called "bpcoverage" for every client name and also collected a detailed policy listing of all policies.  I also kept a semi-static information list (which I would edit as necessary) of which client names were members of which cluster names, plus a separate list of deliberately excluded paths/volumes/mounts.  The script would process all this and spit out two (hopefully empty each week) lists... i) a list of new file systems missing from backups, and ii) a list of file systems listed in backup policies which no longer existed on clients... and an email would be sent to the platform teams highlighting these gaps, so that they could then raise requests for backups to be amended.  80% of all clients are straightforward.  20% of backup clients generate 80% of our administrative workload.  So I tried to reduce this by automating the gap detection.

.

HC - When the filesystem list is contained in the policy, it is more difficult to query it on the node. I might use bppllist for it, but on our systems the /usr/openv/netbackup/bin/admincmd/ stuff is only installed on nodes using LAN-free backup. Is that normal? Should it be included in network/LAN-based backup, too?

sdo - Yes, /admincmd is only present on NetBackup servers, not on clients.  Here's a tip for any particularly complex backup configurations - I make sure I let the teams know, and when I'm able to I will leave a text file at the top of volumes/mount-points saying something like "_WARNING_ - this server has a custom backup configuration.txt".  But really, your first two points above address this third point.

Nicolai
Moderator
Partner    VIP   

I want to see the bpbkar log from your system, to see how bpbkar evaluates the include/exclude lists.

I will test locally if time permits

 

Nicolai
Moderator
Partner    VIP   

I just tested exclude_list / include_list and don't see the behavior on Linux.

Backup selection : /some

more exclude_list.test
/some

more include_list.test
/some/directory

bplist -C nbukvm01 -l -R /some
drwxr-xr-x root      root                0 May 30 12:38 /some/directory/
drwxr-xr-x root      root                0 May 30 12:45 /some/directory/mydir/
-rw-r--r-- root      root               10 May 30 12:45 /some/directory/mydir/mydirfile
-rw-r--r-- root      root                0 May 30 12:38 /some/directory/testfile

From bpbkar debug log:

12:50:48.739 [25541] <2> bpbkar SelectFile: cwd=NULL path=/some
12:50:48.739 [25541] <2> bpbkar SelectFile: INF - Resolved_path = /some
12:50:48.739 [25541] <2> bpbkar resolve_path: INF - Actual mount point of /some is /some
12:50:48.739 [25541] <4> is_excluded: Excluded /some by exclude_list entry /some
12:50:48.739 [25541] <2> bpbkar SelectFile: cwd=/some path=directory
12:50:48.739 [25541] <4> is_excluded: Included /some/directory by include_list entry /some/directory
12:50:48.739 [25541] <2> bpbkar PrintFile: /
12:50:48.739 [25541] <2> bpbkar PrintFile: /some/
12:50:48.739 [25541] <2> bpbkar PrintFile: /some/directory/
12:50:48.739 [25541] <2> bpbkar SelectFile: cwd=/some/directory path=testfile
12:50:48.739 [25541] <4> check_file_sparseness: Device changing from 0 to 64768
12:50:48.740 [25541] <2> bpbkar PrintFile: /some/directory/testfile
12:50:48.740 [25541] <2> bpbkar SelectFile: cwd=/some/directory path=mydir
12:50:48.740 [25541] <2> bpbkar PrintFile: /some/directory/mydir/
12:50:48.740 [25541] <2> bpbkar SelectFile: cwd=/some/directory/mydir path=mydirfile
12:50:48.740 [25541] <2> fscp_is_tracked: disabled tla_init
12:50:48.740 [25541] <2> bpbkar PrintFile: /some/directory/mydir/mydirfile
12:50:48.740 [25541] <2> bpbkar resolve_path: INF - Actual mount point of / is /
12:50:48.740 [25541] <4> bpbkar expand_wildcards: end backup for filelist /some
12:50:48.740 [25541] <2> bpbkar PrintFile: /VRTS_IMAGE_SIZE_RECORD
12:50:48.740 [25541] <4> bpbkar main: INF - Client completed sending data for backup
12:50:48.740 [25541] <2> bpbkar write_eot: JBD - bpbkar waited 0 times for empty buffer, delayed 0 times
12:50:48.740 [25541] <2> bpbkar main: INF - Total Size:10
 

Holger, do you have both exclude_list and exclude_list.some_policy defined?

Best Regards

Nicolai

 

Holger_SCC
Level 3

It was a misunderstanding between me and the guy who reported the problem to me.

Say, you exclude / and include /some/directory.

Any file or directory in /some/directory will be backed up including all subdirectories.

However, if you create a filesystem /some/directory/another_dir/mountpoint, only the empty directory /some/directory/another_dir/mountpoint will be backed up - but no files or directories below it.

However, simply adding /some/directory/another_dir/mountpoint  to the include list fixes this.

Then you have

/some/directory/

/some/directory/another_dir/mountpoint

in your include list, which might look strange at first sight.

 

You might also enable cross mount points (which we prefer not to enable); I could not test this.

 

But for me, one entry in the include list for every filesystem is fine.
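So the resolution boils down to one include-list entry per filesystem. A sketch of the resulting list (the file location here is a scratch stand-in for the real client config under /usr/openv/netbackup):

```shell
# Sketch of the final include list from the resolution above
# (assumption: LIST is a scratch path, not the real client config).
LIST="${LIST:-./include_list.demo}"
cat > "$LIST" <<'EOF'
/some/directory
/some/directory/another_dir/mountpoint
EOF
```

With both entries present, files below the nested mount point are picked up even though "cross mount points" stays disabled.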