
VCS with Oracle ASM - ASMDG resource won't go online on one node

jstucki
Level 4

I have a 4-node SFHA 5.1 RP1 cluster on Solaris 10 that uses the agents for Oracle ASM (ASMInst and ASMDG).  The cluster has been working fine for several months.  I recently added a new node and found that the procedure didn't work as documented: I should have been able to add the node on the fly, but I had to stop HAD/GAB/LLT and restart them.  No worries, though; I got through that part, and all is well.

 

The problem I'm having now is with the ASMDG agent.  I recently added a new database.  The ASMDG resource for this database goes online fine on 3 of the 4 nodes, but on the 4th node it won't go online.  I know the agent files are identical on all 4 nodes, so I don't really suspect a problem with VCS.  I suspect a problem with ASM, possibly a configuration problem.

 

I'm writing to see if anyone has seen an issue like this before and learned the cause.  What I'd like to do is dig through the Perl code in the ASMDG agent and figure out which ASM CLI commands the agent executes in its 'online' script.  Better yet, if someone can tell me the commands that are executed, it would save me a lot of time.  Once I know the commands, I will run them manually and watch for the errors they generate.  Here's the engine log output of the event as it occurred:

 

2011/07/14 20:25:19 VCS NOTICE V-16-1-10301 Initiating Online of Resource db1_dg (Group: DB1_DB_SG) on System fred

2011/07/14 20:25:22 VCS WARNING V-16-10001-1014 (fred) DiskGroup:db1_dg:online:Diskgroups will be imported without reservations

2011/07/14 20:25:27 VCS NOTICE V-16-10001-1009 (fred) DiskGroup:db1_dg:online:vxdg import succeeded on Disk Group db1-dg

2011/07/14 20:25:27 VCS NOTICE V-16-10001-1010 (fred) DiskGroup:db1_dg:online:Volumes in Disk Group db1-dg are started. Any mirrors are updated in background

2011/07/14 20:25:33 VCS INFO V-16-1-10298 Resource db1_dg (Group: DB1_DB_SG) is online on fred (VCS initiated)

2011/07/14 20:26:34 VCS NOTICE V-16-1-10301 Initiating Online of Resource var_opt_oracle_db1 (Group: DB1_DB_SG) on System fred

2011/07/14 20:26:35 VCS INFO V-16-1-10298 Resource var_opt_oracle_db1 (Group: DB1_DB_SG) is online on fred (VCS initiated)

2011/07/14 20:26:41 VCS NOTICE V-16-1-10301 Initiating Online of Resource opt_oracle_db1 (Group: DB1_DB_SG) on System fred

2011/07/14 20:26:42 VCS INFO V-16-1-10298 Resource opt_oracle_db1 (Group: DB1_DB_SG) is online on fred (VCS initiated)

2011/07/14 20:31:08 VCS NOTICE V-16-1-10301 Initiating Online of Resource db1_asmdg (Group: DB1_DB_SG) on System fred

2011/07/14 20:33:11 VCS ERROR V-16-2-13066 (fred) Agent is calling clean for resource (db1_asmdg) because the resource is not up even after online completed.

2011/07/14 20:33:11 VCS NOTICE V-16-20002-256 (fred) ASMDG:db1_asmdg:clean:ASM Diskgroup shutdown returned the output

+--------------------------------------------------------------------+

LD_LIBRARY_PATH  - /usr/lib:/opt/oracl_asm ....blah.....blah....blah

ALTER DISKGROUP DB1_ARCHIVE, DB1_DATA, DB1_REDO DISMOUNT force

*

ERROR at line 1:

ORA-15032: not all alterations performed

ORA-15206: duplicate diskgroup DB1_REDO specified

ORA-15206: duplicate diskgroup DB1_DATA specified

ORA-15001: diskgroup "DB1_ARCHIVE" does not exist or is not mounted

+====================================================================+

2011/07/14 20:33:12 VCS INFO V-16-2-13068 (fred) Resource(db1_asmdg) - clean completed successfully.

2011/07/14 20:33:12 VCS INFO V-16-2-13071 (fred) Resource(db1_asmdg): reached OnlineRetryLimit(0).

2011/07/14 20:33:13 VCS ERROR V-16-1-10303 Resource db1_asmdg (Group: DB1_DB_SG) is FAULTED (timed out) on sys fred

2011/07/14 20:37:04 VCS INFO V-16-1-10307 Resource db1_asmdg (Group: DB1_DB_SG) is offline on fred (Not initiated by VCS)

 

Thanks,

-John

Accepted Solution

jstucki
Level 4

This appears to be a problem with the 'asm_diskstring' parameter in the Oracle ASM 'init+ASM.ora' file.  The string defined for this parameter only allows specific devices to be mounted by ASM, so the new ASM devices cannot be mounted.  The DBA is working on a change that will allow the devices (VxVM raw volumes) to be mounted by ASM.
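
To see how a restrictive 'asm_diskstring' can hide new devices, it can help to test a device path against the pattern list by hand. The following is a minimal sketch, not what ASM does internally: it assumes the parameter is a comma-separated list of wildcard patterns with no spaces around the commas, and it approximates ASM discovery with shell `case` matching, which may differ from ASM's own OS-dependent matching. The function name `matches_diskstring` and the volume name `vol01` are hypothetical.

```shell
# Sketch: return 0 if device path $1 matches any pattern in the
# comma-separated list $2, and 1 otherwise.
# Assumption: shell glob rules stand in for ASM's discovery matching.
matches_diskstring() {
    path=$1
    old_ifs=$IFS
    IFS=,
    set -f                  # do not expand patterns against the filesystem
    result=1
    for pattern in $2; do
        case $path in
            $pattern) result=0 ;;
        esac
    done
    set +f
    IFS=$old_ifs
    return $result
}

# Example: a VxVM raw volume is not covered by a string that only lists /dev/rdsk.
if matches_diskstring /dev/vx/rdsk/db1-dg/vol01 '/dev/rdsk/*'; then
    echo "covered by asm_diskstring"
else
    echo "not covered by asm_diskstring"
fi
```

On the ASM instance itself, the current value can be checked with `show parameter asm_diskstring` at the sqlplus prompt, and the devices ASM has discovered are listed in `v$asm_disk`.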

 

I also learned the commands that the ASMDG agent executes when mounting (onlining) and monitoring the devices.  Here they are:

First, set the environment variables ORACLE_HOME and ORACLE_SID to the Oracle home and SID of the ASM instance, respectively, and export them.

(For 11g databases, set ORACLE_HOME to the Grid Infrastructure home.)

Then run one of the following commands to connect to the ASM instance:

$ sqlplus "[user]/[pword] as sysdba"   (for 9i, 10g databases)

$ sqlplus "[user]/[pword] as sysasm"   (for 11g databases)

 

([user] and [pword] are optional.)

The following commands are used at the sqlplus prompt for the online and monitor operations:

1.  Online:   ALTER DISKGROUP $DiskGroups MOUNT;   ($DiskGroups is the comma-separated list of disk groups.)

2.  Monitor:  select inst.state from v$asm_diskgroup inst where name = '<dg_name1>' [or name = '<dg_name2>'] [or name = '<dg_name3>'];
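
The manual check described above can be scripted roughly as follows. This is a sketch under stated assumptions, not the agent's actual code: the function names `join_dgs`, `build_mount_sql`, `build_monitor_sql`, and `run_asm_sql` are hypothetical, the ORACLE_HOME path and the SID in the usage comment are placeholders, and it assumes OS authentication (connecting as "/" without a user and password).

```shell
#!/bin/sh
# Sketch only: rebuilds the online and monitor statements listed above.
# None of these function names come from the ASMDG agent.

# Join the disk group names given as arguments into "A, B, C".
join_dgs() {
    out=""
    for dg in "$@"; do
        if [ -z "$out" ]; then out="$dg"; else out="$out, $dg"; fi
    done
    printf '%s' "$out"
}

# Print the online (mount) statement for the given disk groups.
build_mount_sql() {
    printf 'ALTER DISKGROUP %s MOUNT;\n' "$(join_dgs "$@")"
}

# Print the monitor query for the given disk groups.
build_monitor_sql() {
    where=""
    for dg in "$@"; do
        if [ -z "$where" ]; then
            where="name = '$dg'"
        else
            where="$where or name = '$dg'"
        fi
    done
    printf 'select inst.state from v$asm_diskgroup inst where %s;\n' "$where"
}

# Feed SQL from stdin to sqlplus. $1 is the privilege: sysasm or sysdba.
# Assumes ORACLE_HOME and ORACLE_SID are already exported for the ASM instance.
run_asm_sql() {
    "$ORACLE_HOME/bin/sqlplus" -S "/ as $1"
}

# Usage on the failing node, as the ASM owner (placeholder values):
#   ORACLE_HOME=/path/to/asm_home; ORACLE_SID=+ASM; export ORACLE_HOME ORACLE_SID
#   build_mount_sql DB1_ARCHIVE DB1_DATA DB1_REDO | run_asm_sql sysasm
#   build_monitor_sql DB1_ARCHIVE DB1_DATA DB1_REDO | run_asm_sql sysasm
```

Running the mount statement by hand this way should surface the ORA- errors directly, instead of only the output of the later 'clean' step that appears in the engine log.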


2 REPLIES

joseph_dangelo
Level 6
Employee Accredited

Jstucki,

Can you confirm that the LUNs associated with the ASM DGs are mapped correctly to the 4th node?

You can run the following to verify:

#> fcinfo hba-port | grep "HBA Port WWN:"

#> fcinfo remote-port -sl -p "the WWN output from the above command"

Assuming that there aren't any dedicated devices, you should see the exact same output on all 4 nodes (in terms of LUN count).
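
The comparison can be made less error-prone by counting LUNs per HBA port on each node and comparing the numbers. This is a rough sketch: the function names `count_luns` and `lun_counts_per_port` are hypothetical, and it assumes that `fcinfo remote-port -sl` reports each LUN on a line whose first field is "LUN:" and that the WWN is the fourth field of each "HBA Port WWN:" line.

```shell
# Count LUN entries in `fcinfo remote-port -sl` output read from stdin.
# Assumption: each LUN is reported on a line whose first field is "LUN:".
count_luns() {
    awk '$1 == "LUN:" { n++ } END { print n + 0 }'
}

# Print "<HBA port WWN> <LUN count>" for every local HBA port.
# Run this on each node and compare the results.
lun_counts_per_port() {
    for wwn in $(fcinfo hba-port | awk '/HBA Port WWN:/ { print $4 }'); do
        printf '%s %s\n' "$wwn" "$(fcinfo remote-port -sl -p "$wwn" | count_luns)"
    done
}
```

A node that reports a lower count than the others on the same fabric is a candidate for a missing LUN mapping.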

Hope this helps,

Joe D
