vxrsync finds differences after VVR sync using network

tunisGharbi
Level 4
Partner Accredited

Split off from https://www-secure.symantec.com/connect/forums/vvr-block-level-backup

I have a lot of data: 2 RVGs, each RVG has 2 volumes, about 400 GB each.

The network bandwidth is about 10 to 40 mb/s.

When I synchronise over the network it seems to be OK,

but when I check the synchronisation with this command:

vradmin -g DG_x -verify syncrvg myRvg Mvvr_secondary

I get the following message:

VxVM VVR vxrsync INFO V-5-52-10190 Verification of the remote volumes found differences.

Even if I ignore this message, the remote site seems to be OK when I test it, but when I launch RMAN

I get the following error:

Corrupt Block Found

         TSN = 2, TSNAME = SYSAUX

         RFN = 3, BLK = 1043, RDBA = 12583955

         OBJN = -1, OBJD = 3748, OBJECT = , SUBOBJECT =

         SEGMENT OWNER = , SEGMENT TYPE =

Hex dump of (file 3, block 1043) in trace file ....

Corrupt block relative dba: 0x00c00413 (file 3, block 1043)

Bad header found during buffer read
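
As a cross-check of the RMAN output above, the decimal RDBA can be decoded by hand. This is a minimal sketch, assuming the usual smallfile tablespace layout (upper 10 bits = relative file number, lower 22 bits = block number); the function name rdba_decode is hypothetical and only for illustration.

```sh
# Decode an Oracle relative data block address (RDBA).
# Assumption: smallfile tablespace, 10-bit file number + 22-bit block number.
rdba_decode() {
    rdba=$1
    file=$(( rdba >> 22 ))          # upper 10 bits: relative file number
    block=$(( rdba & 4194303 ))     # lower 22 bits (4194303 = 2^22 - 1)
    echo "file=$file block=$block"
}

rdba_decode 12583955    # RDBA value taken from the RMAN output in this post
```

For the value in this post the result is file=3 block=1043, which matches the "(file 3, block 1043)" reported next to 0x00c00413.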

mikebounds
Level 6
Partner Accredited

Can you give more information:

Did you do the initial sync using autosync (startrep with the "-a" flag)? How long did this take?

Was autosync using smartmove? This is turned on by default for 6.1 and only syncs used filesystem blocks (use vxtune to see if it is on or off), so syncrvg would then show differences.

Did you then run "vradmin -verify syncrvg", and did you umount the volumes before running it? If you didn't, that is another reason why it would show differences.

Mike

tunisGharbi
Level 4
Partner Accredited

Hi Mike

To start replication over the network I used the 'vradmin -a startrep' command:

primary# vradmin -g DG_x -a startrep myRvg

Note: the file systems on the primary are mounted and the file system on the DR site is not mounted.

The synchronisation takes about 6 to 7 hours for the 2 RVGs (400 GB each).

 

To test the synchronisation I used this procedure:

1. Before verification can begin, all applications using data volumes within the primary RVG should be stopped ==> OK

2. If using a file system for data storage, all data volumes within the primary RVG should be unmounted ==> OK

primary# umount /global/X
primary# umount /global/Y

3. Before verification can begin, the primary rlink must be reported as up to date, meaning that the SRL is empty and there is no outstanding data on the primary node waiting to be replicated to the secondary. This can be verified with the 'vxrlink status' command as shown below:

primary# vxrlink -g DG_x  status ...
8 January 2010 11:25:10 GMT
VxVM VVR vxrlink INFO V-5-1-4467 Rlink is up to date

4. Detach primary and secondary rlinks:

primary# vradmin -g DG_x -f stoprep myRvg
VxVM VVR vradmin WARNING V-5-52-92  Secondary data volumes will  become out-of-date.
vradmin: Continue with stoprep (y/n)? y

primary# vxprint -qtrg DG_x myRvg | grep ^rl
rl ...         DG_x      DETACHED STALE    .....

5. Start data verification using the 'vradmin syncrvg' command. Note that because the '-verify' flag is used, data consistency will be verified but there will be no attempt to synchronise the nodes. While running, verification reports its progress: which volume is currently being checked, the current difference (as a percentage of the volume) between primary and secondary for the sections already checked, and the total amount of the volume checked so far:

primary# vradmin -g DG_x -verify syncrvg myRvg ....
VxVM VVR vradmin WARNING V-5-52-126 Make sure applications using Primary data volumes are stopped. The result of the volume verification will be invalid if applications  using Primary data volumes are not stopped.
vradmin: Continue with syncrvg -verify (y/n)? y
Message from Primary:

...here I get: VxVM VVR vxrsync INFO V-5-52-10190 Verification of the remote volumes found differences!!!
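
The two checks in steps 3 and 5 can be scripted by looking for the message IDs quoted in this post. This is only a sketch: the helper names are hypothetical, and it assumes the message IDs V-5-1-4467 and V-5-52-10190 appear in the command output exactly as shown above. The helpers read captured output on stdin and do not run any VVR command themselves.

```sh
# Succeeds if the captured 'vxrlink status' output reports "Rlink is up to date".
rlink_up_to_date() {
    grep -q 'V-5-1-4467'
}

# Succeeds if the captured 'syncrvg -verify' output reports differences.
verify_found_differences() {
    grep -q 'V-5-52-10190'
}

# Demo with the two message lines quoted in this thread.
status_out='VxVM VVR vxrlink INFO V-5-1-4467 Rlink is up to date'
verify_out='VxVM VVR vxrsync INFO V-5-52-10190 Verification of the remote volumes found differences.'

if printf '%s\n' "$status_out" | rlink_up_to_date; then
    echo "rlink: up to date"
fi
if printf '%s\n' "$verify_out" | verify_found_differences; then
    echo "verify: differences found"
fi
```

In practice the real command output would be piped into these helpers instead of the sample strings.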

 

Regards.

mikebounds
Level 6
Partner Accredited

What version of SF are you using, and is smartmove enabled? Run:

vxtune | grep smart

This should show the value of "usefssmartmove" if this attribute is valid for the version of SF you are using.

Mike

tunisGharbi
Level 4
Partner Accredited

Hi Mike

 

the output of vxtune | grep smart shows the following:

fssmartmovethreshold     100     100     N
usefssmartmove           all     all     N
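
To pick the smartmove setting out of this output in a script, something like the following could be used. It is a sketch under an assumption: that the second column of the vxtune listing is the current value (the column headers are not shown in this post). The function name smartmove_setting is hypothetical.

```sh
# Print the second column of the 'usefssmartmove' line from vxtune output on stdin.
# Assumption: column 2 is the current value of the tunable.
smartmove_setting() {
    awk '$1 == "usefssmartmove" { print $2 }'
}

# Demo using the two lines posted above.
printf '%s\n' \
    'fssmartmovethreshold     100     100     N' \
    'usefssmartmove           all     all     N' | smartmove_setting
```

On a real system the input would come from "vxtune | smartmove_setting"; a result other than "none" means smartmove is in use.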

And I use SFHA 6.0.3.

Thanks in advance for your help and comments.

Regards.

mikebounds
Level 6
Partner Accredited

You have smartmove on, but the VVR admin guide says:

The vradmin verifydata command has also been enhanced to leverage VxFS
knowledge of file system blocks in use for verification.

So it looks as though the vradmin verifydata command has been enhanced to only verify vxfs "used" blocks, so "vradmin verifydata" should not show differences.

You seem to be doing everything correctly, so I would log a call with Symantec as there may be an issue with the product.

You could try using vradmin syncrvg. In particular, a differential syncrvg should check which blocks are different (which is what verifydata is doing) and then sync those blocks, so this should correct the mistake of the startrep, which seems to be missing some blocks in your case.

Mike

tunisGharbi
Level 4
Partner Accredited

Hi Mike

I understand that I can try the following:

1. I verify the data with the following command:

# vradmin -g diskgroup verifydata local_rvgname sec_hostname

2. Then I run syncrvg:

# vradmin -g diskgroup -verify syncrvg myRvg ....

mikebounds
Level 6
Partner Accredited

verifydata verifies data by taking a space-optimized snapshot of the volumes, which means you don't have to umount the volumes to verify.

-verify syncrvg verifies the actual data (not a snapshot of the data) and therefore requires you to umount the volumes so they are not changing during the verify.

-c checkpoint syncrvg synchronises only data that is different, i.e. it effectively verifies whether blocks are the same and, if they are not, replicates those blocks that are different (i.e. if -a startrep has messed up then "-c checkpoint syncrvg" MAY correct this).
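
To make the idea of a block-by-block comparison concrete, here is a small sketch that works on two ordinary files rather than on VVR volumes. It is not how vxrsync is implemented: it uses the POSIX cksum utility as a stand-in checksum, assumes both files are the same size, and the function name count_differing_blocks is hypothetical.

```sh
# count_differing_blocks FILE_A FILE_B [BLOCK_SIZE]
# Prints how many fixed-size blocks differ between two equally sized files.
count_differing_blocks() {
    a=$1
    b=$2
    bs=${3:-8192}
    size=$(wc -c < "$a")
    blocks=$(( (size + bs - 1) / bs ))   # round up to whole blocks
    i=0
    ndiff=0
    while [ "$i" -lt "$blocks" ]; do
        # Checksum block i of each file and compare the checksums.
        sum_a=$(dd if="$a" bs="$bs" skip="$i" count=1 2>/dev/null | cksum)
        sum_b=$(dd if="$b" bs="$bs" skip="$i" count=1 2>/dev/null | cksum)
        [ "$sum_a" = "$sum_b" ] || ndiff=$(( ndiff + 1 ))
        i=$(( i + 1 ))
    done
    echo "$ndiff"
}

# Demo: three 4-byte blocks, only the middle one differs.
tmp=${TMPDIR:-/tmp}/blockcmp.$$
printf 'aaaabbbbcccc' > "$tmp.a"
printf 'aaaaXbbbcccc' > "$tmp.b"
count_differing_blocks "$tmp.a" "$tmp.b" 4
rm -f "$tmp.a" "$tmp.b"
```

A verify-only pass stops at reporting the count, while a differential sync would go on to copy each block whose checksum differs. If the source is still being written to during the pass, blocks can change between the two reads, which is why the volumes must be quiet for "-verify syncrvg".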

The VVR admin guide says verifydata has "knowledge of file system blocks in use for verification", so it should work when "-a startrep" used smartmove. I cannot find in the VVR admin guide whether "-verify syncrvg" also has "knowledge of file system blocks in use for verification", so it is possible that "-verify syncrvg" verifies ALL blocks, not just "used" filesystem blocks. But I would guess that "-verify syncrvg" also just verifies "used" filesystem blocks, as the VVR admin guide says:

The commands that use SmartMove during initial synchronization are vradmin syncrvg/syncvol/startrep.

 

So if "syncrvg", when used with "-c checkpoint" for initial synchronization, uses smartmove then you would think that  "syncrvg", when used with "-verify", would be "smartmove" aware and only verify "used" filesystem blocks.

So I was suggesting you use "-c checkpoint syncrvg", but you could try "verifydata", to see if "verifydata" reports volumes are the same.  Note as "-c checkpoint syncrvg" uses smartmove too, if  "-verify syncrvg" IS verifying ALL blocks, then it is still going to report differences after "-c checkpoint syncrvg".

You could also try turning smartmove off:

vxdefault set usefssmartmove none

But then sync is going to take even longer.

As rman is reporting a corrupt block, this suggests that it is a "used" filesystem block that is different, rather than an unused block. That means whether the verify is checking unused blocks or not is irrelevant and something else is wrong, which is why I would log a call with Symantec.

Mike

tunisGharbi
Level 4
Partner Accredited

Thanks Mike for these clarifications.

Please note that I should do a zero initialization of data from primary to secondary:

No changes were made on the primary; there is no I/O on the Oracle database.

I don't have space for snapshots.

The only method in my case is over the network, with the file system mounted on the primary and not mounted on the secondary.

I will try to start a full synchronisation, then start data verification!

If I find differences on the volumes, I will try -c checkpoint.

I'm also waiting for you for other details if you have feedback from Symantec Log.

 

Thanks a lot. 

 

mikebounds
Level 6
Partner Accredited

Zero initialization is a good idea when you create volumes, as then used blocks will be the same (all zeros). In fact, if you start replication before creating the filesystem, you can just start replication with "-f" to say both sides are already synchronised (both all zeros).

Not sure what you mean by "I'm also waiting for you for other details if you have feedback from Symantec Log" - what Log are you referring to?

Mike

tunisGharbi
Level 4
Partner Accredited

Hi Mike 

You said "I would log a call with Symantec".

I mean that I'm waiting to hear if they have any news about this :)

Thanks.

rsharma1
Level 5
Employee Accredited Certified

Please see: http://www.symantec.com/business/support/index?page=content&id=TECH212211

avsrini
Level 4
Employee Accredited Certified

Hi Tunis,

Just want to reiterate what Mike mentioned earlier.

If the data volumes on the Primary were mounted during the syncrvg -verify operation, then the contents of those volumes are changed. Thus syncrvg will report differences.

Below is an excerpt from the vradmin manpage:

 

               Use the -verify option to verify and report the data
               differences between Primary data volumes and the
               corresponding Secondary data volumes. When used with
               -verify option, the syncrvg command only reports the
               differences between the Primary and Secondary data
               volumes; it does not synchronize the Secondary volumes
               with the Primary volumes. An MD5 checksum is used to
               calculate the difference between the Primary and the
               Secondary data volumes.

               All applications using the Primary data volumes must be
               stopped (or quiesced) before running the syncrvg command
               with the -verify option. At the start of the -verify
               syncrvg command, you are asked to confirm that the
               Primary data volumes are not in use. You can use the -s
               option to skip this confirmation step.

Thanks

Srini

tunisGharbi
Level 4
Partner Accredited

Thanks rsharma1

Thanks Srini

 

Based on this information, I should:

1- apply the patch

2- stop the Oracle database and listener (and the application), leaving just the file system mounted

3- apply the sync

rsharma1
Level 5
Employee Accredited Certified

Agree with Srini's post above (and as pointed out by Mike earlier): retry the 'vradmin -verify syncrvg' with the FS unmounted on the primary (I'd missed the info that the FS was mounted on the primary).

That said, it is very highly recommended to have this patch installed to avoid this potential corruption issue.


 

tunisGharbi
Level 4
Partner Accredited

Hi All

 

I have applied the patch and then started replication.

When I check the RVG (vradmin -g diskgroup -verify syncrvg myRvg..) the volumes are OK and identical.

Thanks for your help.

 

Just one question.

I have configured Cluster GEO between 2 sites, then I converted my application service group with VVR (local hard link). The global service group is configured to switch manually.

When I switch to the remote cluster I get the following error: V-16-5-51068

Please give me your feedback.

 

Marianne
Level 6
Partner    VIP    Accredited Certified

Please start a new discussion for your new issue....