Synthetic fail. 671 (query for list of component ...

sshagent · ‎01-03-2014

NetBackup 7.5.0.6

Linux Master(and client...its backing itself up)

1388761662 1 2 32 nbmaster 0 0 0 *NULL* bpdbm File: ../nbe_cat_image.cpp Line: 3191: File record not found in synthetic backup,

1388761662 1 2 16 nbmaster 1002 0 0 *NULL* DBImageQuery::getNextRecord unable to receive the response to the image query, error=the database contains conflicting or erroneous entries(238)

1388761662 1 2 16 nbmaster 1002 0 0 *NULL* SynthJob::queryImages caught synth EXCEPTION, error=the database contains conflicting or erroneous entries(238)

1388761662 1 2 16 nbmaster 1002 0 0 *NULL* SynthJob::doStart preparation for generating synthetic image failed, error = query for list of component images failed(671)

1388761662 1 2 16 nbmaster 1002 0 0 *NULL* SynthJob::cleanup end bpsynth with status = query for list of component images failed(671)

1388761663 1 68 4 nbmaster 1002 1002 0 nbmaster nbpem CLIENT nbmaster POLICY isilon-mpc-pit-1-ext1-1 SCHED synth EXIT STATUS 671 (query for list of component images failed)

I've seen other mentions of this in forum posts here but they suggested upgrading to 7.1 ...which we're well past that :)

I've got verbose to 5, and still can't find any real details anywhere of what exactly the conflicting/erroneous entity is. Hardly helpful.

In case it matters, the policy is sometimes being updated with additional includes each night before the regular incremental. But the incremental runs after the update, and then the Synthetic follows. But presumably that won't cause problems.

Any pointers to what might be the problem, or where i can find what database entity is conflicting/erroneous would be helpful. Maybe even some Synthetic gotcha's your aware of. Despite having worked with NetBackup for many years i've only recently been playing with Synthetics.

Thanks for your time.

sri_vani · ‎01-03-2014

bpsynth log can give us more details.

Please verify below link and try the backup:

http://www.symantec.com/business/support/index?page=content&id=TECH67048

execute a traditional FULL backup and then an incremental backup before attempting to synthesize another FULL backup.

sshagent · ‎01-03-2014

I know i can run a full backup, incremental and then Synth(hopefully) but that just a workaround.

I'm wanting to know the cause or what is causing this. Having to regularly run Fulls again defeats the purpose of Synth's.

Activity Log shows no useful info either...

01/03/2014 10:17:46 - Info nbjm (pid=6572) starting backup job (jobid=1006) for client nbmaster, policy isilon-mpc-pit-3-ext2-1, schedule synth

01/03/2014 10:17:46 - Info nbjm (pid=6572) requesting NO_STORAGE_UNIT resources from RB for backup job (jobid=1006, request id:{32F57FB0-748A-11E3-9BA0-5BD6067DA271})

01/03/2014 10:17:46 - requesting resource nbmaster.NBU_CLIENT.MAXJOBS.nbmaster

01/03/2014 10:17:46 - requesting resource nbmaster.NBU_POLICY.MAXJOBS.isilon-mpc-pit-3-ext2-1

01/03/2014 10:17:46 - granted resource nbmaster.NBU_CLIENT.MAXJOBS.nbmaster

01/03/2014 10:17:46 - granted resource nbmaster.NBU_POLICY.MAXJOBS.isilon-mpc-pit-3-ext2-1

01/03/2014 10:17:46 - estimated 13957635963 kbytes needed

01/03/2014 10:17:46 - begin Parent Job

01/03/2014 10:17:46 - begin Synthetic: Start Notify Script

01/03/2014 10:17:46 - Info RUNCMD (pid=32558) started

01/03/2014 10:17:46 - Info RUNCMD (pid=32558) exiting with status: 0

Operation Status: 0

01/03/2014 10:17:46 - end Synthetic: Start Notify Script; elapsed time 0:00:00

01/03/2014 10:17:46 - begin Synthetic: Execute Script

01/03/2014 10:17:46 - started process bpbrm (pid=32566)

Operation Status: 671

01/03/2014 10:17:50 - end Synthetic: Execute Script; elapsed time 0:00:04

01/03/2014 10:17:50 - begin Synthetic: Stop On Error

Operation Status: 0

01/03/2014 10:17:50 - end Synthetic: Stop On Error; elapsed time 0:00:00

01/03/2014 10:17:50 - begin Synthetic: End Notify Script

01/03/2014 10:17:50 - Info RUNCMD (pid=32604) started

01/03/2014 10:17:50 - Info RUNCMD (pid=32604) exiting with status: 0

Operation Status: 0

01/03/2014 10:17:50 - end Synthetic: End Notify Script; elapsed time 0:00:00

Operation Status: 671

01/03/2014 10:17:50 - end Parent Job; elapsed time 0:00:04

query for list of component images failed (671)

sri_vani · ‎01-03-2014

This error sequence is noted when a traditional FULL backup no longer exists in the catalog and a synthetic backup job is initiated. The FULL backup has most likely expired.

Check what is the retention period for the full backup.

The reason might be traditional full backup was nt availble during the synthetic backup triggered.

sshagent · ‎01-03-2014

Not in this case. I've checked the obvious stuff like this.

root@nbmaster: bpimagelist -l -d 12/27/2013 -policy isilon-mpc-pit-3-ext2-1 -U

Backed Up Expires Files KB C Sched Type On Hold Index Status Policy

---------------- ---------- -------- -------- - ------------ ------- ------------ ------------

01/02/2014 22:09 01/16/2014 144016 195312342 N Differential 0 0 isilon-mpc-pit-3-ext2-1

01/01/2014 22:35 01/15/2014 110664 39154387 N Differential 0 0 isilon-mpc-pit-3-ext2-1

12/31/2013 22:34 01/14/2014 149299 284390969 N Differential 0 0 isilon-mpc-pit-3-ext2-1

12/30/2013 22:41 01/13/2014 619389 2075683546 N Differential 0 0 isilon-mpc-pit-3-ext2-1

12/27/2013 10:49 01/10/2014 1838465 11849412426 N Full Backup 0 0 isilon-mpc-pit-3-ext2-1

..see a full backup and subsequent incrementals. plenty of expiration room

Nicolai · ‎01-03-2014

Please review this tech note: http://www.symantec.com/docs/TECH68187 and verify if you follow the practice outlined.

One way of getting around the classic full before synthetic full take over is to configure a schedule as (traditional) full with a frequency of 3000 weeks.That schedule will only run the first time a client is added to the policy. The incremental and weekly synthetic schedules of cause also need to be located in the same policy

Best Regards

Nicolai

PS: Please post the policy configuration and attach as a file.

sri_vani · ‎01-03-2014

did you verify the bpsynth log may give us some clue

sshagent · ‎01-03-2014

Sri Vani,

What you mean "verify the bpsynth log". Here is the error section from the bpsynth log

07:49:10.766 [7431] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2

07:49:20.764 [7431] <2> set_job_details: Tfile (991): LOG 1388753360 16 DBImageQuery::getNextRecord 7431 unable to receive the response to the image query, error=the database contains conflicting or erroneous entries(238)

07:49:20.764 [7431] <2> send_job_file: job ID 991, ftype = 3 msg len = 168, msg = LOG 1388753360 16 DBImageQuery::getNextRecord 7431 unable to receive the response to the image query, error=the database contains conflicting or erroneous entries(238)

07:49:20.764 [7431] <16> DBImageQuery::getNextRecord: unable to receive the response to the image query, error=the database contains conflicting or erroneous entries(238)

07:49:20.764 [7431] <2> set_job_details: Tfile (991): LOG 1388753360 16 SynthJob::queryImages 7431 caught synth EXCEPTION, error=the database contains conflicting or erroneous entries(238)

07:49:20.764 [7431] <2> send_job_file: job ID 991, ftype = 3 msg len = 135, msg = LOG 1388753360 16 SynthJob::queryImages 7431 caught synth EXCEPTION, error=the database contains conflicting or erroneous entries(238)

07:49:20.764 [7431] <16> SynthJob::queryImages: caught synth EXCEPTION, error=the database contains conflicting or erroneous entries(238)

07:49:20.764 [7431] <2> SynthJob::queryImages: returning error code = 671(SynthJob.cpp:1010)

07:49:20.764 [7431] <2> SynthJob::prepareImages: failed to find images, error = 671(SynthJob.cpp:608)

07:49:20.765 [7431] <2> set_job_details: Tfile (991): LOG 1388753360 16 SynthJob::doStart 7431 preparation for generating synthetic image failed, error = query for list of component images failed(671)

07:49:20.765 [7431] <2> send_job_file: job ID 991, ftype = 3 msg len = 147, msg = LOG 1388753360 16 SynthJob::doStart 7431 preparation for generating synthetic image failed, error = query for list of component images failed(671)

07:49:20.765 [7431] <16> SynthJob::doStart: preparation for generating synthetic image failed, error = query for list of component images failed(671)

07:49:20.765 [7431] <2> set_job_details: Tfile (991): LOG 1388753360 16 SynthJob::cleanup 7431 end bpsynth with status = query for list of component images failed(671)

07:49:20.765 [7431] <2> send_job_file: job ID 991, ftype = 3 msg len = 114, msg = LOG 1388753360 16 SynthJob::cleanup 7431 end bpsynth with status = query for list of component images failed(671)

07:49:20.765 [7431] <16> SynthJob::cleanup: end bpsynth with status = query for list of component images failed(671)

07:49:20.765 [7431] <2> set_job_details: Tfile (991): END_OPERATION 1388753360

07:49:20.765 [7431] <2> send_job_file: job ID 991, ftype = 3 msg len = 25, msg = END_OPERATION 1388753360

07:49:20.765 [7431] <2> JobMonProxy::sendTryMsg: sent try msg to jobd: END_OPERATION 1388753360

(JobMonProxy.cpp:584)

07:49:20.797 [7431] <2> job_monitoring_exex: ACK disconnect

07:49:20.797 [7431] <2> job_disconnect: Disconnected

07:49:20.797 [7431] <2> NBIORInterceptor::destroy: Destroying Interceptor: EMMlib_Orb_EMMlib_Orb(NBIORInterceptor.cpp:61)

07:49:20.798 [7431] <2> Orb::destroyOrb: orb has reference count: 0(Orb.cpp:1676)

07:49:20.798 [7431] <2> exitSynth: exiting with status=query for list of component images failed(671)(bpsynth.cpp:578)

I've also attached the policy.

That technote listed mentions about details appearing in the bpdbm log. I don't get any thing like that.

The only logs that populate with this job are

bpbackup

bpdbjobs

bpstsinfo

bpsynth

nbproxy

sshagent · ‎01-03-2014

Oh Nicolai the full backup has run before, and its schedule is still present. Both the Synth and Full have no days set, I've been running them manually as this setup is semi-new and wanted to make sure it all went smoothly before leaving it to the scheduler.

mph999 · ‎01-03-2014

There is usually some clue somewhere for issues like this, finding it however ..

Ideas ...

cat_export for the client will 'recreate' the header files for this clients backups, these will be located under /usr/openv/netbackup/db.export

If we consider :

client_full

client_inc

client_ful synth

client_inc

The last client inc backup header file will show the previous inc backup, and that header file will show the previous backup etc ... so you can follow back to the last full or synth full.

I would then make sure that the backups all exist, in particular, I would check that the .f files for each of the backups are there, and readable (in /usr/openv/netbackup/db/images/client/<ctime> )

This can be done with:

cat_convert -dump <name of .f file>

This may or may not all check out. Clearly if it does, the issue is elsewhere.

Could be worth running bpdbm -consistency 2

Does this show any issues with any DB entries for this client ?

The other log that might show something would be bpdbm (needs to be at VERBOSE = 5 ) - hopefully this will show a reference to an image.

Regards,

Martin

sshagent · ‎01-04-2014

I was going through that kinda thing last week. I've ran the image cleanup and -consistency (without the 2) which didn't really show anything.

I already have verbose = 5 and no bpdbm log.

I'll run that bpdbm -consistency 2 now and see what that brings up.

Thanks for your time, much appreciated.

Deb_Wilmot · ‎01-05-2014

My two cents...

I have seen this happen under the following scenarios:

- When paths have been added to the policy when allow multiple datastreams is enabled.

**** Doesn't look like that's the case as it isn't enabled per your policy config.

- When corruption occurs on the client due to disk related issues.

- When the network is not 100% stable - dbm may get corrupted information.

- When the STREAMS file is corupt.

The only way to recover from this event is to rerun the 'regular' full.

The bpdbm log at VERBOSE 5 will tell you which image is problematic. The bpsynth log is useful too(bpdbm gets huge, bpsynth allows us to split up the bpdbm log and quickly find the applicable section(s)).

Something seems not quite right with the information you pasted in about the last backups.

Taking out the columns that aren't needed - the output looks like:

Backed Up             Files         KB           Sched Type
01/02/2014 22:09   144016     195312342   Differential
01/01/2014 22:35 110664      39154387    Differential
12/31/2013 22:34   149299     284390969   Differential
12/30/2013 22:41 619389    2075683546     Differential
12/27/2013 10:49 1838465 11849412426    Full Backup

Looking at the above there seems to be a much larger amount of data backed up for the diff on 12/30 then would be expected. *** even considering that most likely the full data is from prior to the 27th if it was a synth full. Did something happen between the full and the diff on the 30th? Filesystem issues? Network outages?

The short story is that I have more questions than answers...

sshagent · ‎01-13-2014

Apologies for the delay, it never rains it pours :) Had other NetBackup stuff to work around.

Anyway i haven't had a chance to run the bpdbm -consistency 2 yet...been busy and the server is too. Will try and get a chance to do this, this week sometime.

Deb,

Firstly thanks for your time :) On to your questions...

Your right no multistreaming is on. We are adding paths to this policy a fair amount, and there are a lot of includes *sigh*.

Its highly unlikely there is disk corruption this data is being worked on, and when it isn't its archived away.

Network seems fine, certainly have noticed anything weird going on. Most work is done over NFS so again, networking issues would likely be spotted fairly quickly.

I've attached my bpsynth log and below is the entirety of my bpdbm log in case it helps. I purged the logs before starting the synth so it should all be just for this one attempt. Thanks for your time folks!

09:40:40.225 [424] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:495] VxSS magic 1041669 0xfe505

09:40:40.225 [424] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:496] remote_vxss 185 0xb9

09:40:40.225 [424] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:538] Ignoring VxSS authentication 2 0x2

09:40:40.266 [424] <4> TZ: Send Timezone offset=18000

09:40:56.260 [453] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:495] VxSS magic 1041669 0xfe505

09:40:56.260 [453] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:496] remote_vxss 200 0xc8

09:40:56.260 [453] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:538] Ignoring VxSS authentication 2 0x2

09:40:56.318 [453] <4> TZ: Send Timezone offset=18000

09:40:56.530 [454] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:495] VxSS magic 1041669 0xfe505

09:40:56.530 [454] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:496] remote_vxss 192 0xc0

09:40:56.530 [454] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:538] Ignoring VxSS authentication 2 0x2

09:40:56.573 [454] <4> TZ: Send Timezone offset=18000

09:40:56.577 [455] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:495] VxSS magic 1041669 0xfe505

09:40:56.577 [455] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:496] remote_vxss 148 0x94

09:40:56.577 [455] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:538] Ignoring VxSS authentication 2 0x2

09:40:56.654 [455] <4> FileEnumerator: EMM_Config entry LIST_FS_IMAGE_HEADERS = NO - skipping flat files

09:40:56.656 [455] <4> DbmOdbcView::fetchUnsignedBigint: (.2) No more data for query

09:41:00.929 [455] <4> get_DBM_Image: SQL_NO_DATA_FOUND fetching DBM_Image

09:41:00.929 [455] <4> get_DBM_Image: MasterServerKey = < 1000002 > ClientMachineName = < nbmaster > BackupTime = < 1389323086 > ViewID = < 0 >

09:41:00.953 [455] <4> DbmOdbcView::fetchUnsignedBigint: (.8) No more data for query

09:41:00.953 [455] <4> DbmImage::get_DBM_ImageKey: MasterServerKey = < 1000002 > ClientMachineName = < nbmaster > BackupTime = < 1389323086 > ViewID = < 0 >

09:41:00.958 [455] <16> close_CatLock: Unable to unlink (/usr/openv/netbackup/db/images/nbmaster/1389000000/isilon-mpc-pit-1-ext2-1_1389323086_FULL.lck): No such file or directory (2)

09:41:00.961 [455] <4> db_error_add_to_file: File: ../nbe_cat_image.cpp Line: 3191: File record not found in synthetic backup,

09:41:00.961 [455] <32> Function: NBE_CatImage::getRecordListFromSyntheticsTir:image=/usr/openv/netbackup/db/images/nbmaster/1389000000/catstore/isilon-mpc-pit-1-ext2-1_1389323081_INCR.f_imgRecord0 index1=9 index2(filenum)=0 path=/mpc/pit-1-ext2-1/Library_elements/blood_and: File: ../nbe_cat_image.cpp Line: 3191: File record not found in synthetic backup,

09:41:00.963 [455] <8> DbmOdbcConnect::rollbackTransaction: (3360) success

09:41:20.688 [513] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:495] VxSS magic 1041669 0xfe505

09:41:20.688 [513] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:496] remote_vxss 98 0x62

09:41:20.688 [513] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:538] Ignoring VxSS authentication 2 0x2

09:41:30.256 [562] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:495] VxSS magic 1041669 0xfe505

09:41:30.256 [562] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:496] remote_vxss 274 0x112

09:41:30.256 [562] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:538] Ignoring VxSS authentication 2 0x2

09:41:30.297 [562] <4> fetch_DBM_SLP: About to fetch DBM_SLP

09:41:30.297 [562] <4> DbmSLP::read_and_send_SLP_list: SLP list is empty

09:41:46.439 [564] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:495] VxSS magic 1041669 0xfe505

09:41:46.439 [564] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:496] remote_vxss 91 0x5b

09:41:46.439 [564] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:538] Ignoring VxSS authentication 2 0x2

09:41:46.498 [565] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:495] VxSS magic 1041669 0xfe505

09:41:46.499 [565] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:496] remote_vxss 275 0x113

09:41:46.499 [565] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:538] Ignoring VxSS authentication 2 0x2

09:41:46.537 [565] <4> fetch_DBM_SLP: About to fetch DBM_SLP

09:41:46.538 [565] <4> DbmSLP::read_and_send_SLP_list: SLP list is empty

09:42:01.027 [568] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:495] VxSS magic 1041669 0xfe505

09:42:01.027 [568] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:496] remote_vxss 82 0x52

09:42:01.027 [568] <8> vnet_check_vxss_server_magic: [vnet_vxss_helper.c:538] Ignoring VxSS authentication 2 0x2

09:42:01.104 [568] <4> FileEnumerator: EMM_Config entry LIST_FS_IMAGE_HEADERS = NO - skipping flat files

09:42:01.106 [568] <4> DbmOdbcView::fetchUnsignedBigint: (.2) No more data for query

09:42:01.114 [568] <8> marshal_DBM_Combined_Image_Copy_Fragment: num_fragments 618 wrong. should be 0

sshagent · ‎01-13-2014

whoops double post

VOX

Synthetic fail. 671 (query for list of component images failed)