VVR paused due to network disconnection

tgenova
Level 4

Hi all.

I have a global cluster made up of 2 minicluster systems running Solaris 10 (SPARC):

primary: MIVDB01S - 172.22.8.132

secondary: MILDB08S - 10.66.11.148               

I shut down the secondary (init 0) for 3 days, then started it up again (boot from the ok prompt). One day later I checked the state of the replication and found this:

MIVDB01S
root
  vradmin -g datadg printrvg datarvg
    Replicated Data Set: datarvg
    Primary:
        HostName: 172.22.8.132  <localhost>
        RvgName: datarvg
        DgName: datadg
    Secondary:
        HostName: 10.66.11.148
        RvgName: datarvg
        DgName: datadg

  vxrlink -g datadg status datarlk
    Wed Jan 21 09:51:08 2015
    VxVM VVR vxrlink INFO V-5-1-12887 DCM is in use on rlink datarlk. DCM contains 874432 Kbytes (1%) of the Data Volume(s).

  vradmin -g datadg repstatus datarvg 
    Replicated Data Set: datarvg
    Primary:
      Host name:                  172.22.8.132
      RVG name:                   datarvg
      DG name:                    datadg
      RVG state:                  enabled for I/O
      Data volumes:               1
      VSets:                      0
      SRL name:                   srl_vol
      SRL size:                   1.00 G
      Total secondaries:          1

    Secondary:
      Host name:                  10.66.11.148
      RVG name:                   datarvg
      DG name:                    datadg
      Data status:                consistent, behind
      Replication status:         paused due to network disconnection (dcm resynchronization)
      Current mode:               asynchronous
      Logging to:                 DCM (contains 874432 Kbytes) (SRL protection logging)
      Timestamp Information:      N/A

  vxprint -Pl | grep flags
    flags:    write enabled attached consistent disconnected asynchronous dcm_logging resync_paused

-----

1st solution (on MIVDB01S, as root):

vradmin -g datadg resync datarvg

-----

2nd solution

Stop vradmin on the secondary, then on the primary:

# /usr/sbin/vxstart_vvr stop

Start vradmin on the secondary, then on the primary:

# /usr/sbin/vxstart_vvr start

Can you help me?

 


Gaurav_S
Moderator

Hi,

One thing is certain: the SRL has overflowed, so DCM logging is in effect. A DCM resynchronization will be required in any case.

Before that, however, you need to confirm that the network connection is back:

# vxprint -qthg <diskgroup> | egrep "^rl"

The rlink should be in the "CONNECT ACTIVE" state on both the primary and the secondary.

If that is the case, I would suggest running vradmin -g datadg resync datarvg. This should start the DCM resync, and you can monitor it with the vxrlink status command.

However, if the rlink is in the "ENABLED ACTIVE" state, you will need to start replication again using autosync (vradmin startrep or vxrlink attach), which will do a full resync.
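
The decision above can be sketched as a small shell helper. This is only an illustration: `rlink_next_step` is a hypothetical name, it assumes the "rl" record layout posted later in this thread (KSTATE in column 4, STATE in column 5), and the hints simply restate the advice above rather than any official procedure.

```shell
# rlink_next_step: read `vxprint -qthg <diskgroup>` output on stdin and, for
# every rlink record (lines starting with "rl"), print the rlink name, its
# KSTATE/STATE pair and a hint. Column positions are an assumption based on
# the output shown in this thread.
rlink_next_step() {
    awk '$1 == "rl" {
        if ($4 == "CONNECT" && $5 == "ACTIVE")
            hint = "connected: a DCM resync can be started"
        else if ($4 == "ENABLED" && $5 == "ACTIVE")
            hint = "not connected: check the network path, then restart replication"
        else
            hint = "unexpected state: investigate before changing anything"
        print $2 ": " $4 "/" $5 " -> " hint
    }'
}
```

It would be run on both hosts, for example as `vxprint -qthg datadg | rlink_next_step`; it only reads the output and changes nothing.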

 

G

tgenova
Level 4

Hi G.

So, I am in the following situation:

root@MIVDB01S # vxprint -qthrg datadg | egrep "^rl"
  rl datarlk      datarvg      ENABLED  ACTIVE   10.66.11.148 datadg datarlk

root@MILDB08S # vxprint -qthrg datadg | egrep "^rl"
  rl datarlk      datarvg      ENABLED  ACTIVE   172.22.8.132 datadg datarlk

where MIVDB01S is the primary:

root@MIVDB01S # vradmin -g datadg printrvg datarvg
Replicated Data Set: datarvg
Primary:
        HostName: 172.22.8.132  <localhost>
        RvgName: datarvg
        DgName: datadg
Secondary:
        HostName: 10.66.11.148
        RvgName: datarvg
        DgName: datadg

The correct solution should be:

root@MIVDB01S # vradmin -g datadg -a startrep datarvg

My only doubt is about the -a flag.

Can you confirm the command above?

 

BR.

Tiziano

 

 

 

Gaurav_S
Moderator

Hi,

As the rlinks are not in the CONNECT ACTIVE state, replication is currently broken.

At this point, first check whether your secondary is reachable from the primary (a ping test).

Confirm that your /etc/hosts file correctly reflects the hostname/IP address mapping, or that DNS is resolving the names correctly.

Once you have confirmed that the secondary is reachable, use the startrep command (your command is right). The -a flag is for autosync; if the command gives any error with -a, run it without -a, which will be a full sync.

Once you have run the command with -a, wait a few moments and check whether the rlink has come up in the CONNECT ACTIVE state. If it has, start checking

# vxrlink -g datadg -i5 status <rlink_to_secondary>

to make sure the amount of data still to be replicated to the secondary is going down.
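
To watch that reduction as a plain number, the status message can be filtered with a small helper. This is a rough sketch, not part of VVR: `outstanding_kbytes` is a hypothetical name, and it assumes the message wording seen in this thread, where the amount is printed immediately before the word "Kbytes" and a fully synchronized rlink reports "is up to date".

```shell
# outstanding_kbytes: read `vxrlink status` output on stdin and print the
# number that precedes the word "Kbytes" (DCM or SRL backlog), or 0 when the
# rlink reports that it is up to date. Lines without either pattern, such as
# the timestamp line, print nothing.
outstanding_kbytes() {
    awk '
        /is up to date/ { print 0; next }
        {
            for (i = 2; i <= NF; i++)
                if ($i == "Kbytes") { print $(i - 1); next }
        }'
}
```

For a single reading: `vxrlink -g datadg status datarlk | outstanding_kbytes`. Repeating that every few seconds should show the value falling while the resync is making progress.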

 

G

tgenova
Level 4

Hi.

 

root@MIVDB01S # vxprint -P
Disk group: datadg

TY NAME         ASSOC        KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0
rl datarlk      datarvg      ENABLED  -        -        ACTIVE   -       -
 

root@MIVDB01S # vradmin -g datadg startrep datarvg
VxVM VVR vradmin ERROR V-5-52-268 One of the options -a, -c, -f or -b must be used.
VxVM VVR vradmin INFO V-5-52-258
    Usage: vradmin [-g diskgroup] {-a | -c checkpoint | -f | -b} startrep rvg [sechost]
 

root@MIVDB01S # vradmin -g datadg -a startrep datarvg
Message from Primary:
VxVM VVR vxrlink ERROR V-5-1-3531 Rlink datarlk is already attached


root@MIVDB01S # vxprint -P
Disk group: datadg

TY NAME         ASSOC        KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0
rl datarlk      datarvg      ENABLED  -        -        ACTIVE   -       -

So I suppose I need something like this (from the primary):

root@MIVDB01S # vxrlink -g datadg -f det datarlk    

root@MIVDB01S # vxrlink -g datadg att datarlk   

What do you think?

BR

Tiziano

Gaurav_S
Moderator

Yep, you are right. It seems the rlink is still attached in the kernel, so a detach and attach will be required.

This is going to start a complete replication, though.

G

tgenova
Level 4

Hi.

I noticed something strange in this situation, so before running the last 2 commands (detach and attach) I decided to check for drops on the firewall. My colleagues who manage the firewall answered:

packets are being dropped on UDP port 4145 between the primary and the secondary.

They are investigating now, because I had planned a correct rule for that port.
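
A ping test alone would not reveal this kind of drop, since ICMP can pass while UDP port 4145 is blocked. One way to see it from the hosts themselves is to capture that traffic with the Solaris snoop utility on both ends. The wrapper below is only a sketch: `watch_vvr_udp` and the `SNOOP` override are hypothetical names, and the interface name in the usage note is an assumption that has to be replaced with the real one.

```shell
# watch_vvr_udp IFACE PEER: capture UDP port 4145 packets exchanged with PEER
# on network interface IFACE. The capture program defaults to Solaris snoop
# (which normally needs root); SNOOP can be set to another command, for
# example echo, to print the arguments instead of capturing.
watch_vvr_udp() {
    "${SNOOP:-snoop}" -d "$1" host "$2" and udp port 4145
}
```

On the primary this might be run as `watch_vvr_udp e1000g0 10.66.11.148`, and on the secondary with the primary's address. If packets leave one host but never show up on the other, something in between is dropping them.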

Please wait.

BR

Tiziano

 

tgenova
Level 4

Hi.

I can confirm that after the work on our firewall (the rules were OK, but they were not working properly), the situation is now OK.

The KSTATE immediately switched from ENABLED to CONNECT and the sync went fine.

Thank you very much for your support.

BR

Tiziano

 

 

  vxprint -P
    Disk group: datadg

    TY NAME         ASSOC        KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0
    rl datarlk      datarvg      CONNECT  -        -        ACTIVE   -       -

  vxprint -P
    Disk group: datadg

    TY NAME         ASSOC        KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0
    rl datarlk      datarvg      CONNECT  -        -        ACTIVE   -       -

- and then:

  vradmin -g datadg repstatus datarvg
    Replicated Data Set: datarvg
    Primary:
      Host name:                  172.22.8.132
      RVG name:                   datarvg
      DG name:                    datadg
      RVG state:                  enabled for I/O
      Data volumes:               1
      VSets:                      0
      SRL name:                   srl_vol
      SRL size:                   1.00 G
      Total secondaries:          1

    Secondary:
      Host name:                  10.66.11.148
      RVG name:                   datarvg
      DG name:                    datadg
      Data status:                consistent, up-to-date
      Replication status:         replicating (connected)
      Current mode:               asynchronous
      Logging to:                 SRL
      Timestamp Information:      behind by 0h 0m 0s

  vxrlink -g datadg status datarlk
    Thu Jan 29 09:29:02 2015
    VxVM VVR vxrlink INFO V-5-1-4639 Rlink datarlk has 1 outstanding write, occupying 33 Kbytes (0%) on the SRL

  vxrlink -g datadg status datarlk
    Thu Jan 29 09:29:07 2015
    VxVM VVR vxrlink INFO V-5-1-4467 Rlink datarlk is up to date
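
For a quick check in a script, a single field can be pulled out of the repstatus output above. This is a minimal sketch under the assumption that the output keeps the "label: value" layout shown in this thread; `repstatus_field` is a hypothetical helper, not a VVR command.

```shell
# repstatus_field LABEL: read `vradmin repstatus` output on stdin and print
# the value of every line whose label (the text before the first colon,
# without leading blanks) is exactly LABEL.
repstatus_field() {
    awk -v want="$1" '{
        line = $0
        sub(/^[ \t]+/, "", line)            # drop the indentation
        n = index(line, ":")                # position of the first colon
        if (n > 0 && substr(line, 1, n - 1) == want) {
            val = substr(line, n + 1)
            sub(/^[ \t]+/, "", val)         # drop the padding before the value
            print val
        }
    }'
}
```

For example, `vradmin -g datadg repstatus datarvg | repstatus_field "Replication status"` prints only the status text. On Solaris 10 the default /usr/bin/awk does not accept `-v`, so nawk or /usr/xpg4/bin/awk would have to be used in its place.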

 


Gaurav_S
Moderator

Excellent.

 

G