How to identify the parent backup job of a SLP duplicate

Dear colleagues

I have a policy using an SLP whose primary backup goes to a storage unit group containing two MSDP STUs, configured for failover, followed by one opt-dup. The hardware is two appliances in a cross-backup scenario (appliance 1 backs up to its local MSDP STU and opt-dups to the second appliance, and vice versa).

Sporadically the backup job fails over from the first-defined (local) STU to the second-defined one (the remote appliance's MSDP STU). In that case the secondary opt-dup fails because source and target are now on the same appliance/MSDP.

Now I need to identify the primary Backup Job ID so I can troubleshoot why the failover occurs. The same SLP is used by multiple Policies, so no chance there...

Any ideas?

Bert

Accepted Solutions

Re: How to identify the parent backup job of a SLP duplicate

Update:

I am using the backup image's client name and converted ctime to pinpoint the original backup job. With vxlogview I managed to find the *one* line in the log files that shows why: it roughly says the originally targeted STU is "full or down", and then it switches to the failover STU. But as umpteen jobs run to the original STU at the same time without any problem, it's neither full nor down.
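For anyone following the same route: a backup ID normally has the form client_ctime, so the client name and the timestamp can be split off and the ctime turned into a readable time to compare against job start times. A rough sketch only; the function names are made up for illustration, and the date call assumes GNU date:

```shell
# Hypothetical helpers, not part of NetBackup.
# Split at the LAST underscore, since client names may contain underscores.
backupid_client() {
    printf '%s\n' "${1%_*}"
}

backupid_ctime() {
    printf '%s\n' "${1##*_}"
}

# Convert the ctime (seconds since the epoch) to a UTC timestamp.
# Assumes GNU date, which accepts "@<seconds>" with -d.
backupid_time_utc() {
    date -u -d "@$(backupid_ctime "$1")" '+%Y-%m-%d %H:%M:%S'
}
```

For example, backupid_time_utc client_a_1400000000 prints 2014-05-13 16:53:20. That is UTC, so it has to be shifted to the server's local time zone before comparing it with the start times shown in the Activity Monitor.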

Up to now I haven't gotten an explanation, but the problem persists: As the opt-dup job is part of an SLP, the duplicate job (from STU1 to STU1) is retried and retried and retried... Manual duplicate of the image is not allowed, so it seems there is no way out. But the image is now only on the source and I have no redundancy... Just getting it off my chest...

Regards,

Bert

3 Replies

Re: How to identify the parent backup job of a SLP duplicate

I have an idea... but it might come to nothing... but let's try anyway...

Does the activity monitor job text have any "error text" to help us identify the backup image name which fails to be replicated?  Could you show the text please?

If you are unable to show the text then here's what I'm thinking:

1) in activity monitor you see the job ID of the SLP duplication job that fails

2) bpdbjobs -jobid N -all_columns -file job.txt

3) grep "some regular expression to cut the backup image name" job.txt

4) get the failing backup image name

5) bpdbjobs -all_columns -file jobs.txt

6) loop through all records of jobs.txt searching for the backup job that created the image

7) you now have the job ID of the backup job that created the image which is failing to be replicated
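The steps above could be sketched roughly like this. It rests on a few assumptions that need checking against real output: that the backup ID (client_ctime with a 10-digit ctime) appears somewhere in the saved job records, and that the job ID is the first comma-separated field of each record. The function names are made up for illustration.

```shell
# Hypothetical helpers; they only parse files previously written with e.g.
#   bpdbjobs -jobid N -all_columns -file job.txt
#   bpdbjobs -all_columns -file jobs.txt

# Steps 3 and 4: pull anything that looks like a backup ID out of a job record.
# Assumption: backup IDs look like <client>_<10-digit ctime>.
extract_backup_ids() {
    grep -oE '[A-Za-z0-9._-]+_[0-9]{10}' "$1" | sort -u
}

# Step 6: list the job IDs of all records that mention the given backup ID.
# Assumption: the job ID is the first comma-separated field of a record.
# The failing duplication job will show up here as well as the backup job.
find_jobs_for_backup_id() {
    grep -F -- "$1" "$2" | cut -d, -f1 | sort -un
}
```

Usage would be along the lines of extract_backup_ids job.txt to get the backup ID, then find_jobs_for_backup_id <backupid> jobs.txt; whichever job ID in the result is not a duplication job should be the backup job.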

HTH.

Re: How to identify the parent backup job of a SLP duplicate

I know you have marked the issue as resolved but just a thought..

If you get hold of the failing backup IDs, try running bpimagelist -backupid <backupid> -L and search the output for Job ID. I believe this should give you the ID of the original backup job that took the backup.
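As a small sketch of that, the listing could be piped through a filter that prints only the job ID. This assumes the output has a line starting with the label "Job ID:" followed by the number; the exact layout should be checked on your version, and the function name is made up for illustration.

```shell
# Hypothetical helper: reads bpimagelist -L output on stdin and prints the
# value of the "Job ID:" line. Assumes the label is spelled exactly "Job ID:".
job_id_from_image_listing() {
    sed -n 's/^[[:space:]]*Job ID:[[:space:]]*//p'
}

# Intended use (not run here):
#   bpimagelist -backupid <backupid> -L | job_id_from_image_listing
```

If nothing is printed, the label in your output probably differs from the one assumed here.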