
VVR -DCM log fails to completely drain

Marius_Gordon
Level 4
Certified

Hi 

My replication runs fine for a day or two. Then replication stops.

The DCM log gets stuck at 2% and then grows. According to Article TECH55550, this is because of an SRL overflow.

To resolve it, we need to disable access to the volume. We cannot do this every second day.

This is a very critical system, we cannot afford any down time.

Can it be that VVR is not suitable for our environment? 

How can I determine the best Data transfer solution between production DC and DR DC?

Using SAN mirroring is an option, but how will it handle regions that get changed all the time?

Our environment setup looks like this:

10GB link between DC's

DB size 1.6TB

Data change daily +- 300GB

Thanks

Marius

1 ACCEPTED SOLUTION

Accepted Solutions

Wally_Heim
Level 6
Employee

Hi Marius,

There are some DCM replication issues resolved in the latest CPs for 5.1 SP2.  I think they are included in the 6.x product line but it would still be good to upgrade to the latest CP for the version that you are using.

If your 10 GB network link is not being utilized fully, you might need to open a case with Symantec Technical Support to see if we can tune it for better throughput.  The Windows product does have some limits on the bandwidth when running in TCP mode.  The newer versions of the product allow for higher throughput when using TCP mode.  When using UDP mode there are some tunables that we can adjust to help with throughput, depending on the replication stats that your environment is showing.

You might also want to see if enabling Compression will help with this situation.  But again, I would recommend that you upgrade to the latest SP and CP if you are running the 5.1 product.

Thank you,

Wally


2 REPLIES 2

mikebounds
Level 6
Partner Accredited

300GB is a lot of data changes if your database is 1.6TB - this is nearly 20%, so it must be mostly changes rather than additional rows in the database. I would therefore guess that you are replicating the temporary database area and that this is generating most of your writes.  If your temporary spaces are on a separate volume, you can use vxstat to see what you are writing per volume. In any case, if you use vxstat to collect data, you can use vradvisor to analyse that data and tell you what bandwidth you require.
If there are a lot of writes to temp, then you should put the temp database spaces on a separate volume and exclude it from VVR.

Some simple calculations are below:

Unit     Per day    Per hour   Per min    Per sec
GB       300        12.5       0.20833    0.00347
MB       307200     12800      213.333    3.55556
Gbits    2400       100        1.66667    0.02778
Mbits    2457600    102400     1706.67    28.4444

So this shows you only need on AVERAGE 28.4 MBits/s - it will be a bit more than this as you need to account for SRL headers, but even if you only have 100Mbit NICs, this is more than enough for average writes, so even if VVR gets behind (SRL starts to fill) during peak writes (vradvisor tool will show you this), your SRL should not get full.
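The arithmetic behind the table can be reproduced with a few lines of Python. This is only a sketch: the function names are made up for illustration, it uses binary units (1 GB = 1024 MB) as the table does, and it ignores SRL header overhead.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rates(gb_per_day):
    """Return average per-second change rates for a daily change volume in GB."""
    mb_per_day = gb_per_day * 1024      # binary units, matching the table
    mbits_per_day = mb_per_day * 8      # 8 bits per byte
    return {
        "MB/s": mb_per_day / SECONDS_PER_DAY,
        "Mbits/s": mbits_per_day / SECONDS_PER_DAY,
    }


def link_utilisation(gb_per_day, link_mbits_per_sec):
    """Fraction of the link used by the average change rate (no SRL headers)."""
    return average_rates(gb_per_day)["Mbits/s"] / link_mbits_per_sec


if __name__ == "__main__":
    rates = average_rates(300)
    print(f"{rates['MB/s']:.5f} MB/s, {rates['Mbits/s']:.4f} Mbits/s")
    print(f"Share of a 100 Mbit link: {link_utilisation(300, 100):.1%}")
```

Remember this is an average only. Peak write rates can be much higher, which is what vradvisor is for.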

You need to monitor your SRL to see what is happening - use vxrlink status for this.
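To get a rough feel for why peaks matter even when the average is low, the sketch below estimates how long the SRL can absorb a burst of writes. The function name and all the numbers are hypothetical and are not taken from this environment; it simply assumes the SRL fills at the difference between the write rate and the rate at which VVR sends data to the secondary.

```python
def seconds_until_srl_full(srl_free_mb, write_mb_per_sec, drain_mb_per_sec):
    """Return seconds until the SRL fills, or None if it does not fill.

    srl_free_mb      -- free space in the SRL (hypothetical input)
    write_mb_per_sec -- application write rate during the peak
    drain_mb_per_sec -- rate at which data is replicated to the secondary
    """
    backlog_growth = write_mb_per_sec - drain_mb_per_sec
    if backlog_growth <= 0:
        return None  # the link keeps up, so the SRL does not grow
    return srl_free_mb / backlog_growth


if __name__ == "__main__":
    # Illustrative values only: 10 GB of free SRL, 50 MB/s peak, 10 MB/s drain.
    print(seconds_until_srl_full(10 * 1024, 50, 10))
    # Writes at the long-term average with the same drain rate never fill it.
    print(seconds_until_srl_full(10 * 1024, 3.56, 10))
```

If the real figures from vxstat and vxrlink show the SRL filling faster than you expect, that points to either a much higher peak write rate or a lower replication throughput than the link should give.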

If you are not familiar with using vxstat and vxrlink, then you can use something like:

vxstat -g diskgroup -i 60

 

to show statistics every minute. The first stat will be the I/O since boot time, and subsequent stats will be the additional I/O for each minute. You should see that writes to the SRL volumes are slightly more than the total of writes to all other volumes.  Writes are shown in blocks, so divide by 2048 to get MBytes (2048 blocks of 512 bytes make 1 MByte).
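As a sketch of that conversion, assuming the blocks reported by vxstat are 512 bytes each (which is what the factor of 2048 implies), the per-interval write counts can be turned into MBytes and an average rate like this. The function names and the sample block count are made up for illustration.

```python
BLOCK_SIZE_BYTES = 512              # assumed vxstat block size
BYTES_PER_MBYTE = 1024 * 1024


def blocks_to_mbytes(blocks, block_size=BLOCK_SIZE_BYTES):
    """Convert a block count to MBytes (2048 blocks of 512 bytes = 1 MByte)."""
    return blocks * block_size / BYTES_PER_MBYTE


def write_rate_mbytes_per_sec(blocks_written, interval_seconds=60):
    """Average write rate over one sampling interval (vxstat -i 60 gives 60 s)."""
    return blocks_to_mbytes(blocks_written) / interval_seconds


if __name__ == "__main__":
    sample_blocks = 436907          # hypothetical blocks written in one minute
    print(f"{blocks_to_mbytes(sample_blocks):.1f} MB written")
    print(f"{write_rate_mbytes_per_sec(sample_blocks):.2f} MB/s average")
```

Skip the first sample when doing this, since it covers everything since boot rather than one interval.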

For SRL/DCM usage use:

vxrlink -g diskgroup status rlink_name -i 60 (use vxprint -P to get rlink_name)

 

In terms of the DCM not draining, this was fixed for UNIX in 5.1, so I would have thought it would be fixed in Windows too. What version of SFW are you using?

If your sites are close together and SAN connected, and latency is acceptable for your app, then SAN mirroring is a better solution.

Mike

 
