01-22-2015 05:51 AM
Hi all.
I have a global cluster made up of two mini-cluster systems running Solaris 10 (SPARC):
primary: MIVDB01S - 172.22.8.132
secondary: MILDB08S - 10.66.11.148
I shut down the secondary (init 0) for 3 days, then started it up again (boot from the ok prompt). One day later I checked the state of the replication and found the following:
MIVDB01S
root
vradmin -g datadg printrvg datarvg
Replicated Data Set: datarvg
Primary:
HostName: 172.22.8.132 <localhost>
RvgName: datarvg
DgName: datadg
Secondary:
HostName: 10.66.11.148
RvgName: datarvg
DgName: datadg
vxrlink -g datadg status datarlk
Wed Jan 21 09:51:08 2015
VxVM VVR vxrlink INFO V-5-1-12887 DCM is in use on rlink datarlk. DCM contains 874432 Kbytes (1%) of the Data Volume(s).
vradmin -g datadg repstatus datarvg
Replicated Data Set: datarvg
Primary:
Host name: 172.22.8.132
RVG name: datarvg
DG name: datadg
RVG state: enabled for I/O
Data volumes: 1
VSets: 0
SRL name: srl_vol
SRL size: 1.00 G
Total secondaries: 1
Secondary:
Host name: 10.66.11.148
RVG name: datarvg
DG name: datadg
Data status: consistent, behind
Replication status: paused due to network disconnection (dcm resynchronization)
Current mode: asynchronous
Logging to: DCM (contains 874432 Kbytes) (SRL protection logging)
Timestamp Information: N/A
vxprint -Pl | grep flags
flags: write enabled attached consistent disconnected asynchronous dcm_logging resync_paused
-----
1st solution
MIVDB01S
root
vradmin -g datadg resync datarvg
-----
2nd solution
Stop vradmin on secondary then on primary
# /usr/sbin/vxstart_vvr stop
Start vradmin on secondary then on primary
# /usr/sbin/vxstart_vvr start
Can you help me?
01-28-2015 02:26 AM
Hi,
One thing is certain: the SRL has overflowed, so DCM logging is in effect. A DCM resynchronization will be required in any case.
However, before that, you need to confirm that the network connection is back:
# vxprint -qthg <diskgroup> | egrep "^rl"
The rlink should be in the "CONNECT ACTIVE" state on both primary and secondary.
If that is the case, I would suggest running vradmin -g datadg resync datarvg. This should start the DCM resync, and you can monitor it with the vxrlink status command.
However, if the rlink is in the "ENABLED ACTIVE" state, you will need to start replication again using autosync (vradmin startrep OR vxrlink attach), which will do a full resync.
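The decision above can be sketched as a small shell helper. This is only an illustration, not part of VVR: the function name rlink_next_step is hypothetical, and it assumes the rlink records look like the ones printed by the vxprint command above (KSTATE in field 4, STATE in field 5).

```shell
# Hypothetical helper (not a VVR tool): reads rlink records on stdin in the
# format printed by `vxprint -qthg <diskgroup> | egrep "^rl"`.
# Assumption: field 2 is the rlink name, field 4 is KSTATE, field 5 is STATE.
rlink_next_step() {
    awk '$1 == "rl" {
        if ($4 == "CONNECT" && $5 == "ACTIVE") {
            print $2 ": connected, DCM resync can be started"
        } else if ($4 == "ENABLED" && $5 == "ACTIVE") {
            print $2 ": not connected, replication must be restarted"
        } else {
            print $2 ": unexpected state " $4 " " $5
        }
    }'
}
```

It would be used on each host as: vxprint -qthg datadg | egrep "^rl" | rlink_next_step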
G
01-28-2015 05:38 AM
Hi G.
So, I am in the following situation:
root@MIVDB01S # vxprint -qthrg datadg | egrep "^rl"
rl datarlk datarvg ENABLED ACTIVE 10.66.11.148 datadg datarlk
root@MILDB08S # vxprint -qthrg datadg | egrep "^rl"
rl datarlk datarvg ENABLED ACTIVE 172.22.8.132 datadg datarlk
where MIVDB01S is the primary
root@MIVDB01S # vradmin -g datadg printrvg datarvg
Replicated Data Set: datarvg
Primary:
HostName: 172.22.8.132 <localhost>
RvgName: datarvg
DgName: datadg
Secondary:
HostName: 10.66.11.148
RvgName: datarvg
DgName: datadg
The correct solution should be:
root@MIVDB01S # vradmin -g datadg -a startrep datarvg
I only have a small doubt about the -a flag.
Can you confirm the command above?
BR.
Tiziano
01-28-2015 06:40 AM
Hi,
As the rlinks are not in the CONNECT ACTIVE state, replication is currently broken.
At this point, first check whether your secondary is reachable from the primary (a ping test).
Confirm that your /etc/hosts file correctly reflects the hostname/IP address mapping, or that DNS is resolving correctly.
Once you find the secondary is reachable, use the startrep command (your command is right). The -a flag is for autosync. If the -a flag gives any error, run the command without -a, which will be a full sync.
After you run the command with -a, wait a few moments and double-check whether the rlink has come into the CONNECT ACTIVE state. If it has, start checking # vxrlink -g datadg -i5 status <rlink_to_secondary> to make sure the amount of data still to be replicated to the secondary is going down.
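To watch that reduction without reading every message by hand, something like the following sketch could be used. It is only an illustration: the function name pending_kbytes is hypothetical, and it assumes the status messages have the same wording as the vxrlink status output shown earlier in this thread ("... contains N Kbytes ...", "... occupying N Kbytes ...", "... is up to date").

```shell
# Hypothetical helper (not a VVR tool): reads `vxrlink status` output on stdin
# and prints the number of Kbytes still pending, one value per status message.
# Assumption: the message wording matches the samples shown in this thread.
pending_kbytes() {
    awk '/is up to date/ { print 0; next }
         {
             # Print the number that immediately precedes the word "Kbytes".
             for (i = 2; i <= NF; i++) {
                 if ($i == "Kbytes") { print $(i - 1); next }
             }
         }'
}
```

It would be used as: vxrlink -g datadg -i5 status datarlk | pending_kbytes
The printed values should keep decreasing while the resync is running.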
G
01-28-2015 07:12 AM
Hi.
root@MIVDB01S # vxprint -P
Disk group: datadg
TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
rl datarlk datarvg ENABLED - - ACTIVE - -
root@MIVDB01S # vradmin -g datadg startrep datarvg
VxVM VVR vradmin ERROR V-5-52-268 One of the options -a, -c, -f or -b must be used.
VxVM VVR vradmin INFO V-5-52-258
Usage: vradmin [-g diskgroup] {-a | -c checkpoint | -f | -b} startrep rvg [sechost]
root@MIVDB01S # vradmin -g datadg -a startrep datarvg
Message from Primary:
VxVM VVR vxrlink ERROR V-5-1-3531 Rlink datarlk is already attached
root@MIVDB01S # vxprint -P
Disk group: datadg
TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
rl datarlk datarvg ENABLED - - ACTIVE - -
So I suppose I need something like this (from the primary):
root@MIVDB01S # vxrlink -g datadg -f det datarlk
root@MIVDB01S # vxrlink -g datadg att datarlk
What do you think?
BR
Tiziano
01-28-2015 07:37 AM
Yep, you are right. It seems the rlink is still attached in the kernel, so a detach and attach will be required.
This is going to start a complete replication, though.
G
01-29-2015 12:19 AM
Hi.
I noticed something strange in this situation, so before running the last 2 commands (detach and attach), I decided to check for drops on the firewall. My colleagues who manage the firewall answered:
we have drops on port 4145 UDP between primary and secondary.
They are investigating now, because I had requested a correct rule for that port.
Please wait.
BR
Tiziano
01-29-2015 01:07 AM
Hi.
I can confirm that after the work on our firewall (the rules were correct, but they were not working properly), the situation is OK now.
The KSTATE immediately switched from ENABLED to CONNECT and the sync completed fine.
Thank you very much for your support.
BR
Tiziano
vxprint -P
Disk group: datadg
TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
rl datarlk datarvg CONNECT - - ACTIVE - -
vxprint -P
Disk group: datadg
TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
rl datarlk datarvg CONNECT - - ACTIVE - -
- and then:
vradmin -g datadg repstatus datarvg
Replicated Data Set: datarvg
Primary:
Host name: 172.22.8.132
RVG name: datarvg
DG name: datadg
RVG state: enabled for I/O
Data volumes: 1
VSets: 0
SRL name: srl_vol
SRL size: 1.00 G
Total secondaries: 1
Secondary:
Host name: 10.66.11.148
RVG name: datarvg
DG name: datadg
Data status: consistent, up-to-date
Replication status: replicating (connected)
Current mode: asynchronous
Logging to: SRL
Timestamp Information: behind by 0h 0m 0s
vxrlink -g datadg status datarlk
Thu Jan 29 09:29:02 2015
VxVM VVR vxrlink INFO V-5-1-4639 Rlink datarlk has 1 outstanding write, occupying 33 Kbytes (0%) on the SRL
vxrlink -g datadg status datarlk
Thu Jan 29 09:29:07 2015
VxVM VVR vxrlink INFO V-5-1-4467 Rlink datarlk is up to date
01-29-2015 01:22 AM
excellent ...
G