VCS VVR RLINK IN RECOVER STALE STATE

Ayes
Level 4

I have a VCS VVR issue (see below). I have tried recovering the RVG and reattaching the RLINK, but it is not working:

root@ojnbu2 # vradmin -g catalog_dg -l repstatus rvg_ojota
Replicated Data Set: rvg_ojota
Primary:
  Host name:                  10.1.230.172
  RVG name:                   rvg_ojota
  DG name:                    catalog_dg
  RVG state:                  enabled for I/O (passthru)
  Data volumes:               1
  VSets:                      0
  SRL name:                   srlvol
  SRL size:                   150.00 G
  Total secondaries:          1
Secondary:
  Host name:                  10.1.110.132
  RVG name:                   rvg_ojota
  DG name:                    catalog_dg
  Rlink from Primary:         rlk_10.1.110.132_rvg_ojota
  Rlink to Primary:           rlk_10.1.230.172_rvg_ojota
  Configured mode:            asynchronous
  Latency protection:         off
  SRL protection:             autodcm
  Data status:                consistent, stale
  Replication status:         not replicating (primary needs recovery)
  Current mode:               N/A
  Logging to:                 N/A
  Timestamp Information:      N/A
  Bandwidth Limit:            N/A
  Compression Mode:           Off
 

mikebounds
Level 6
Partner Accredited

Your output shows the RVG state is in PASSTHRU mode, so you need to dissociate and reassociate the SRL. See this extract from the VVR admin guide:

 

 

When a Primary SRL header error occurs, writes to the RVG continue; however,
all RLINKs are put in the STALE state. The RVG is operating in PASSTHRU mode.
 
To recover from an SRL header error

 

 

1 Stop the RVG.
# vxrvg -g hrdg stop hr_rvg
 
2 Dissociate the SRL from the RVG.
# vxvol -g hrdg dis hr_srl
 
3 Repair or restore the SRL. Even if the problem can be fixed by repairing the
underlying subdisks, the SRL must still be dissociated and reassociated to
initialize the SRL header.
 
4 Make sure the SRL is started, and then reassociate the SRL:
# vxvol -g hrdg start hr_srl
# vxvol -g hrdg aslog hr_rvg hr_srl
 
5 Start the RVG:
# vxrvg -g hrdg start hr_rvg
 
6 Restore the data volumes from backup if needed. Synchronize all the RLINKs.
 
Mike
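
The extract uses the guide's sample names (hrdg, hr_rvg, hr_srl). As a rough sketch only, the same sequence with this thread's names (catalog_dg, rvg_ojota, srlvol) could be printed as a dry run like this. The helper name srl_reinit_plan is made up for illustration, nothing is executed, and steps 3 and 6 (repairing the SRL and resynchronising the RLINKs) are left out because they depend on the situation.

```shell
# Dry-run sketch: print the SRL header re-initialisation commands from the
# admin guide extract for a given disk group, RVG and SRL. Nothing is executed.
# srl_reinit_plan is a hypothetical helper name, not a Veritas tool.
srl_reinit_plan() {
    dg=$1; rvg=$2; srl=$3
    if [ -z "$dg" ] || [ -z "$rvg" ] || [ -z "$srl" ]; then
        echo "usage: srl_reinit_plan <diskgroup> <rvg> <srl>" >&2
        return 1
    fi
    echo "vxrvg -g $dg stop $rvg"         # step 1: stop the RVG
    echo "vxvol -g $dg dis $srl"          # step 2: dissociate the SRL
    echo "vxvol -g $dg start $srl"        # step 4: make sure the SRL is started
    echo "vxvol -g $dg aslog $rvg $srl"   # step 4: reassociate the SRL
    echo "vxrvg -g $dg start $rvg"        # step 5: start the RVG
}

srl_reinit_plan catalog_dg rvg_ojota srlvol
```

Printing the commands first makes it easy to review the object names before anything is run by hand on the Primary.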

Ayes
Level 4

Thanks Mike.

 

Can I do this online, dynamically?

What is the impact, please?

 

Ayes
Level 4

But there is no device error on the RVG or the SRL.

See the vxprint and vxinfo status below:

 

 

root@ojnbu2 # vxprint -ht
Disk group: catalog_dg
 
DG NAME         NCONFIG      NLOG     MINORS   GROUP-ID
ST NAME         STATE        DM_CNT   SPARE_CNT         APPVOL_CNT
DM NAME         DEVICE       TYPE     PRIVLEN  PUBLEN   STATE
RV NAME         RLINK_CNT    KSTATE   STATE    PRIMARY  DATAVOLS  SRL
RL NAME         RVG          KSTATE   STATE    REM_HOST REM_DG    REM_RLNK
CO NAME         CACHEVOL     KSTATE   STATE
VT NAME         RVG          KSTATE   STATE    NVOLUME
V  NAME         RVG/VSET/CO  KSTATE   STATE    LENGTH   READPOL   PREFPLEX UTYPE
PL NAME         VOLUME       KSTATE   STATE    LENGTH   LAYOUT    NCOL/WID MODE
SD NAME         PLEX         DISK     DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
SV NAME         PLEX         VOLNAME  NVOLLAYR LENGTH   [COL/]OFF AM/NM    MODE
SC NAME         PLEX         CACHE    DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
DC NAME         PARENTVOL    LOGVOL
SP NAME         SNAPVOL      DCO
EX NAME         ASSOC        VC                       PERMS    MODE     STATE
SR NAME         KSTATE
 
dg catalog_dg   default      default  5000     1362131292.32.ojc2b7c
 
dm catalog_dg01 hitachi_vsp0_049d auto 65536   207172704 -
dm catalog_dg02 hitachi_vsp0_049f auto 65536   207172704 -
dm catalog_dg03 hitachi_vsp0_04f6 auto 65536   207294208 -
dm catalog_dg04 hitachi_vsp0_04f7 auto 65536   207294208 -
dm catalog_dg05 hitachi_vsp0_04f8 auto 65536   207294208 -
dm catalog_dg06 hitachi_vsp0_0100 auto 65536   207294208 -
 
rv rvg_ojota    1            ENABLED  ACTIVE   primary  1         srlvol
rl rlk_10.1.110.132_rvg_ojota rvg_ojota RECOVER STALE 10.1.110.132 catalog_dg rlk_10.1.230.172_rvg_ojota
v  catvol       rvg_ojota    ENABLED  ACTIVE   928944128 SELECT   -        fsgen
pl catvol-01    catvol       ENABLED  ACTIVE   928944128 CONCAT   -        RW
sd catalog_dg03-01 catvol-01 catalog_dg03 0    207294208 0        hitachi_vsp0_04f6 ENA
sd catalog_dg04-01 catvol-01 catalog_dg04 0    107278592 207294208 hitachi_vsp0_04f7 ENA
sd catalog_dg02-01 catvol-01 catalog_dg02 0    207172704 314572800 hitachi_vsp0_049f ENA
sd catalog_dg04-03 catvol-01 catalog_dg04 107279104 100015104 521745504 hitachi_vsp0_04f7 ENA
sd catalog_dg01-03 catvol-01 catalog_dg01 107279104 99893600 621760608 hitachi_vsp0_049d ENA
sd catalog_dg06-01 catvol-01 catalog_dg06 0    207289920 721654208 hitachi_vsp0_0100 ENA
pl catvol-02    catvol       ENABLED  ACTIVE   LOGONLY  CONCAT    -        RW
sd catalog_dg01-02 catvol-02 catalog_dg01 107278592 512 LOG       hitachi_vsp0_049d ENA
pl catvol-03    catvol       ENABLED  ACTIVE   LOGONLY  CONCAT    -        RW
sd catalog_dg04-02 catvol-03 catalog_dg04 107278592 512 LOG       hitachi_vsp0_04f7 ENA
v  srlvol       rvg_ojota    ENABLED  ACTIVE   314572800 SELECT   -        SRL
pl srlvol-01    srlvol       ENABLED  ACTIVE   314572800 CONCAT   -        RW
sd catalog_dg05-01 srlvol-01 catalog_dg05 0    207294208 0        hitachi_vsp0_04f8 ENA
sd catalog_dg01-01 srlvol-01 catalog_dg01 0    107278592 207294208 hitachi_vsp0_049d ENA
root@ojnbu2 # vxingo -g catalog_dg
bash: vxingo: command not found
root@ojnbu2 # vxinfo -g catalog_dg
catvol         fsgen    Started
srlvol         fsgen    Started
 

Ayes
Level 4

So how am I going to repair it if the status of the SRL and RVG is OK? You said this in step 3:

 

3 Repair or restore the SRL. Even if the problem can be fixed by repairing the

underlying subdisks, the SRL must still be dissociated and reassociated to

initialize the SRL header.

mikebounds
Level 6
Partner Accredited

The RVG state being in PASSTHRU means there has been a problem accessing the SRL at some point, but it looks OK now, so you just need to follow the steps to reinitialise the SRL header.

I THINK you can't stop the RVG online, which is a bit rubbish if I'm right, as you can normally delete the RVG and rlink online, so you may end up having to do that and re-create the RDS.

Mike
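
For anyone scripting this check, here is a minimal sketch that reads the repstatus output on standard input and reports whether the Primary RVG state mentions passthru. It assumes the "RVG state:" line looks like the output posted in this thread; rvg_in_passthru is a hypothetical helper name, not part of VVR.

```shell
# Sketch: succeed (exit 0) if `vradmin ... repstatus` output read on stdin
# shows the RVG in passthru mode. rvg_in_passthru is a hypothetical helper.
rvg_in_passthru() {
    # Matches e.g. "  RVG state:                  enabled for I/O (passthru)"
    grep -i 'RVG state:' | grep -qi 'passthru'
}

# Example with the state line posted at the top of the thread:
sample='  RVG state:                  enabled for I/O (passthru)'
if printf '%s\n' "$sample" | rvg_in_passthru; then
    echo "RVG is in passthru: SRL header needs re-initialising"
else
    echo "RVG is not in passthru"
fi
```

On a live Primary the input would come from the command already used in this thread, for example `vradmin -g catalog_dg -l repstatus rvg_ojota | rvg_in_passthru`.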

Ayes
Level 4

This implies there will be no downtime. The configuration is a GCO with VVR replicating the NetBackup catalog from prod (one site) to DR (the remote site).

That is to say, after recreating the RVG and SRL, replication has to be started.

Correct me if I am wrong.

I am grateful.

mikebounds
Level 6
Partner Accredited

You should be able to do this without downtime. You can see if you can dissociate the SRL without stopping the RVG, as I think you can do this if the RVG is in PASSTHRU mode, but if you are having issues dissociating the SRL while volumes are mounted, you can delete the RDS and re-create it online.

You should freeze all service groups in VCS while you are doing this work.

Whichever method you use, you will need to restart replication.

Mike

Ayes
Level 4

Hello House.

I have executed the plan of action (POA) below, but the issue remains:

 

1. vxvol -g catalog_dg -f dis srlvol

2. vxvol -g catalog_dg aslog rvg_ojota srlvol

3. vradmin -g catalog_dg -a startrep rvg_ojota

 

After these, the VVR status is still the same as above. See below:

 

root@ojnbu2 # vradmin -g catalog_dg -l repstatus rvg_ojota
Replicated Data Set: rvg_ojota
Primary:
  Host name:                  10.1.230.172
  RVG name:                   rvg_ojota
  DG name:                    catalog_dg
  RVG state:                  enabled for I/O (passthru)
  Data volumes:               1
  VSets:                      0
  SRL name:                   srlvol
  SRL size:                   150.00 G
  Total secondaries:          1
Secondary:
  Host name:                  10.1.110.132
  RVG name:                   rvg_ojota
  DG name:                    catalog_dg
  Rlink from Primary:         rlk_10.1.110.132_rvg_ojota
  Rlink to Primary:           rlk_10.1.230.172_rvg_ojota
  Configured mode:            asynchronous
  Latency protection:         off
  SRL protection:             autodcm
  Data status:                consistent, stale
  Replication status:         not replicating (primary needs recovery)
  Current mode:               N/A
  Logging to:                 N/A
  Timestamp Information:      N/A
  Bandwidth Limit:            N/A
  Compression Mode:           Off
 

 

Please advise.

mikebounds
Level 6
Partner Accredited

When did the RVG state show passthru? Was this after you associated the SRL volume, or did it change to passthru when you started replication?

Were the rlinks detached before you ran "vxvol -g catalog_dg aslog rvg_ojota srlvol", i.e. had you run "vradmin stoprep" or "vxrlink det"? When you run "vradmin startrep" this attaches the rlinks, so I would think they need to be detached first.

Mike

Ayes
Level 4

Sorry, it was after starting replication that the passthru state cleared, as shown below.

I think I need to recover now:

 

 

root@ojnbu2 # vradmin -g catalog_dg -l repstatus rvg_ojota
Replicated Data Set: rvg_ojota
Primary:
  Host name:                  10.1.230.172
  RVG name:                   rvg_ojota
  DG name:                    catalog_dg
  RVG state:                  enabled for I/O
  Data volumes:               1
  VSets:                      0
  SRL name:                   srlvol
  SRL size:                   150.00 G
  Total secondaries:          1
Secondary:
  Host name:                  10.1.110.132
  RVG name:                   rvg_ojota
  DG name:                    catalog_dg
  Rlink from Primary:         rlk_10.1.110.132_rvg_ojota
  Rlink to Primary:           rlk_10.1.230.172_rvg_ojota
  Configured mode:            asynchronous
  Latency protection:         off
  SRL protection:             autodcm
  Data status:                consistent, stale
  Replication status:         not replicating (primary needs recovery)
  Current mode:               N/A
  Logging to:                 SRL
  Timestamp Information:      N/A
  Bandwidth Limit:            N/A
  Compression Mode:           Off
 

mikebounds
Level 6
Partner Accredited

Yes, hopefully a vxrvg or vxrlink recover will now fix the issue.

Mike
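
As a sketch of how the recover step could be prepared, the helper below reads `vxprint -ht` output on standard input and prints a suggested `vxrlink recover` command for each RLINK whose KSTATE column is RECOVER. It assumes the rl record layout shown in the vxprint header earlier in the thread; suggest_rlink_recover is a hypothetical name, and the commands are only printed, not run.

```shell
# Sketch: read `vxprint -ht` output on stdin and print a suggested
# `vxrlink recover` command for every RLINK whose KSTATE is RECOVER.
# Nothing is executed. suggest_rlink_recover is a hypothetical helper name.
suggest_rlink_recover() {
    dg=$1
    # rl records: rl NAME RVG KSTATE STATE REM_HOST REM_DG REM_RLNK
    awk -v dg="$dg" '$1 == "rl" && $4 == "RECOVER" {
        printf "vxrlink -g %s recover %s\n", dg, $2
    }'
}

# Example with the rl record posted earlier in the thread:
printf '%s\n' 'rl rlk_10.1.110.132_rvg_ojota rvg_ojota RECOVER STALE 10.1.110.132 catalog_dg rlk_10.1.230.172_rvg_ojota' |
    suggest_rlink_recover catalog_dg
# prints: vxrlink -g catalog_dg recover rlk_10.1.110.132_rvg_ojota
```

The RVG-level equivalent mentioned in the reply would be `vxrvg -g catalog_dg recover rvg_ojota`.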

Ayes
Level 4

It fixed it, but there is another issue surfacing:

 

The catalog_dg on the secondary host gets deported and throws the error below.

I imported the replicated disk group manually and replication starts, but after a few minutes it stops again because the disk group gets deported again.

See below:

 

 

 

 

root@ojnbu2 # vradmin -g catalog_dg -l repstatus rvg_ojota
Replicated Data Set: rvg_ojota
Primary:
  Host name:                  10.1.230.172
  RVG name:                   rvg_ojota
  DG name:                    catalog_dg
  RVG state:                  enabled for I/O
  Data volumes:               1
  VSets:                      0
  SRL name:                   srlvol
  SRL size:                   150.00 G
  Total secondaries:          1
Config Errors:
 10.1.110.132:               disk group missing
 
 
Also, see the logs below from the secondary host:
 
root@iknbu3 # tail -f /var/VRTSvcs/log/engine_A.log
 
==============================================
 
2013/09/24 13:12:13 VCS WARNING V-16-10001-1038 (iknbu3) DiskGroup:nbu_dg:monitor:Disk attribute 'autoimport' is set to 'yes' for the disk group catalog_dg. Setting it to 'no'.
2013/09/24 13:12:13 VCS ERROR V-16-10001-1036 (iknbu3) DiskGroup:nbu_dg:monitor:No SCSI3 reservations found on diskgroup catalog_dg
2013/09/24 13:12:14 VCS WARNING V-16-10001-1032 (iknbu3) DiskGroup:nbu_dg:monitor:Offlining Group nbu_group to which DiskGroup resource catalog_dg belongs
2013/09/24 13:12:14 VCS INFO V-16-1-50135 User root fired command: hagrp -offline -propagate nbu_group  iknbu3  from localhost
2013/09/24 13:12:14 VCS INFO V-16-1-10299 Resource nbu_dg (Owner: Unspecified, Group: nbu_group) is online on iknbu3 (Not initiated by VCS)
2013/09/24 13:12:14 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group nbu_group on all nodes
2013/09/24 13:12:14 VCS ERROR V-16-1-50921 CONCURRENCY VIOLATION:Group nbu_group is online on the following clusters [ojnbuclu, iknbucl
 
Please advise.
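
To pick the failures out of a busy engine log, a small filter like the one below may help. It is only a sketch that assumes the record layout seen in the excerpt above (date, time, "VCS", severity, message ID, text); vcs_errors is a hypothetical helper name.

```shell
# Sketch: keep only ERROR-severity records from a VCS engine log read on stdin.
# Assumes the layout seen in the excerpt: date, time, "VCS", severity, ID, text.
# vcs_errors is a hypothetical helper name.
vcs_errors() {
    awk '$3 == "VCS" && $4 == "ERROR"'
}

# Typical use (log path as in the thread):
#   vcs_errors < /var/VRTSvcs/log/engine_A.log
```

Run against the excerpt above, this would leave the "No SCSI3 reservations found" and "CONCURRENCY VIOLATION" records.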
 

Ayes
Level 4

How do we stop the cluster from deporting catalog_dg after it has been imported manually? Maybe by setting the attributes to manual, outside of the cluster?

mikebounds
Level 6
Partner Accredited

To stop VCS taking action on resources, freeze the service group.

Mike
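
A minimal sketch of what freezing could look like from the command line follows. It assumes the group to freeze is nbu_group, the name that appears in the engine log in this thread, and it only prints the commands as a dry run. freeze_plan is a hypothetical helper name; the persistent variant is shown because a persistent freeze needs the cluster configuration opened read-write first.

```shell
# Dry-run sketch: print the VCS commands to freeze or unfreeze a service group.
# freeze_plan is a hypothetical helper; it prints commands and runs nothing.
freeze_plan() {
    action=$1; group=$2; persistent=$3
    case $action in
        freeze|unfreeze) ;;
        *) echo "usage: freeze_plan freeze|unfreeze <group> [persistent]" >&2
           return 1 ;;
    esac
    [ -n "$group" ] || { echo "missing group name" >&2; return 1; }
    if [ "$persistent" = "persistent" ]; then
        echo "haconf -makerw"                     # open the configuration read-write
        echo "hagrp -$action $group -persistent"  # freeze state is kept in the configuration
        echo "haconf -dump -makero"               # save and close the configuration
    else
        echo "hagrp -$action $group"              # temporary freeze or unfreeze
    fi
}

# Group name taken from the engine log in this thread (an assumption):
freeze_plan freeze nbu_group
```

Once the replication work is finished, `freeze_plan unfreeze nbu_group` prints the matching command to hand control back to VCS.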

Ayes
Level 4

Many thanks,

 

I have frozen the service group containing catalog_dg and imported the disk group manually, so VVR has started as shown below. The cluster has not deported it for 5 minutes now; I am still monitoring:

 

 

root@ojnbu2 # vradmin -g catalog_dg -l repstatus rvg_ojota
Replicated Data Set: rvg_ojota
Primary:
  Host name:                  10.1.230.172
  RVG name:                   rvg_ojota
  DG name:                    catalog_dg
  RVG state:                  enabled for I/O
  Data volumes:               1
  VSets:                      0
  SRL name:                   srlvol
  SRL size:                   150.00 G
  Total secondaries:          1
Secondary:
  Host name:                  10.1.110.132
  RVG name:                   rvg_ojota
  DG name:                    catalog_dg
  Rlink from Primary:         rlk_10.1.110.132_rvg_ojota
  Rlink to Primary:           rlk_10.1.230.172_rvg_ojota
  Configured mode:            asynchronous
  Latency protection:         off
  SRL protection:             autodcm
  Data status:                inconsistent
  Replication status:         resync in progress (smartsync autosync)
  Current mode:               asynchronous
  Logging to:                 DCM (contains  0  Kbytes) (autosync)
  Timestamp Information:      N/A
  Bandwidth Limit:            N/A
  Compression Mode:           Off
 
 
Thanks for everything.
I guess you are an ASC, as your skill so far has proven to me beyond reasonable doubt.