Forum Discussion

Nimish_Agarwal
12 years ago

VCS Error Codes for all platforms

Hello Gents,

Do we have a list of all error codes for VCS?

Also, are the error codes generic, i.e. common to all platforms (Linux, Solaris, Windows, AIX)?

I need this confirmed urgently, as we are planning to design a common monitoring agent.

Best Regards,

Nimish

  • You would hope that one error code can't mean different things on different platforms, but the reverse is not true, i.e. the same error condition can have different codes on different platforms. In fact, if you look at /opt/VRTSvcs/bin/ag_i18n_inc.sh you see:

     

    case ${SYSTEM} in
        AIX)
            VCS_LOG_CATEGORY=10011;;
        HP-UX)
            VCS_LOG_CATEGORY=10021;;
        Linux)
            VCS_LOG_CATEGORY=10031;;
        SunOS)
            PATH=/usr/xpg4/bin:$PATH; # for correct grep
            VCS_LOG_CATEGORY=10001;;
    esac
     
    And therefore for example if you search for "No disk group name specified in the resource" at https://sort.symantec.com/ecls you get:
     
    UMI               Message                                                     Severity  Component
    V-16-10011-9503   No disk group name specified in the resource definition...  n/a       Veritas Cluster Server
    V-16-10001-11504  No disk group name specified in the resource definition...  n/a       Veritas Cluster Server
    V-16-10031-12511  No disk group name specified in the resource definition...  n/a       Veritas Cluster Server
     
     
    Note that not only is VCS_LOG_CATEGORY different, but the last part of the code is different in all cases, and there is no match for HP-UX, so HP-UX may not have this error code, or it might use different wording.
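To illustrate the point about VCS_LOG_CATEGORY, here is a minimal sketch that works backwards from a UMI to the platform that would have produced it. It only knows the four category values from the ag_i18n_inc.sh excerpt above; the function name umi_platform is made up for this example and is not part of VCS.

```shell
# Map a bundled-agent UMI back to the platform whose category ID it carries.
# The four values come from the ag_i18n_inc.sh excerpt; any other category
# (for example the engine's own) is reported as "unknown".
umi_platform() {
    # $1 is a full UMI such as V-16-10031-1531
    category=${1#V-*-}        # drop the leading "V-<ProductID>-"
    category=${category%%-*}  # keep only "<CategoryID>"
    case $category in
        10001) echo SunOS ;;
        10011) echo AIX ;;
        10021) echo HP-UX ;;
        10031) echo Linux ;;
        *)     echo unknown ;;
    esac
}
```

For example, umi_platform V-16-10011-9503 prints AIX, while a UMI with a category outside the four listed values prints unknown.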
     
    I have had several customers ask for a list of codes and I was never able to get one. Clearly a list exists (even if not complete) at https://sort.symantec.com/ecls, but it seems you can only get 100 codes at a time and there is not even a "next" to list the next 100, so I guess there is no way for an end user to get the full list held there. Even if you could, it would take a lot of work to correlate all the different error codes. For instance, if you search for "No disk group specified" at https://sort.symantec.com/ecls you get:
     
    UMI              Message                   Severity  Component
    V-16-10031-1531  No disk group specified.  n/a       Veritas Cluster Server
    V-16-10011-916   No disk group specified   n/a       Veritas Cluster Server
     
    So I would guess these mean the same thing. If I search for "No disk group" in /opt/VRTSvcs/bin/DiskGroup on VCS 5.1 on Linux, I get:
    # grep "No disk group" *
    monitor:    VCSAG_LOG_MSG "E" "No disk group specified." 1531
     
    So it looks as if I ONLY have error code V-16-10031-1531 on VCS 5.1 on Linux, so it may be that you get error code V-16-10031-12511 on another version of VCS on Linux.
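As a rough idea of what that correlation work could look like in a monitoring script, the sketch below folds the UMIs from the two searches above into one logical event each. The event names are hypothetical labels invented for this example, the table would have to be built by hand, and treating the two "No disk group specified" codes as the same condition is only the guess stated above.

```shell
# Collapse platform-specific UMIs into one logical event name.
# Event names are hypothetical; the UMIs are the ones from the searches above.
umi_event() {
    case $1 in
        V-16-10011-9503|V-16-10001-11504|V-16-10031-12511)
            echo DG_NAME_MISSING_IN_RESOURCE ;;
        V-16-10031-1531|V-16-10011-916)
            echo DG_NOT_SPECIFIED ;;
        *)
            echo UNMAPPED ;;
    esac
}
```

A monitoring agent could then alert on the event name and not care which platform's code it saw.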
     
    Mike

     

  • Hi All,

    The UMI message IDs for a given message differ between platforms for the VCS bundled agents, as Mike said. The Category ID is different on each platform for bundled agents, and for each Category ID the message numbers start from 1.

    Message format: V-<ProductID>-<CategoryID>-<MessageID>.

    While the UMI for a message differs between platforms, it is guaranteed that the UMI stays the same for that message across VCS releases.

    For example, if a message "xyz" has the UMI V-16-10001-1 in the VCS 5.1 release on Solaris, it keeps the UMI V-16-10001-1 in the VCS 6.0 release. If the message text is changed, say from "xyz" to "xyzw", a different UMI is used.
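A small sketch of how a script might take a UMI apart according to the V-<ProductID>-<CategoryID>-<MessageID> format. The function name umi_split is made up for illustration, and it assumes all three IDs are purely numeric, which holds for every example in this thread but is not something the format description itself promises.

```shell
# Split V-<ProductID>-<CategoryID>-<MessageID> into its three IDs.
# Prints "product category message"; returns 1 if the shape is wrong.
# Assumption: each ID is a non-empty string of digits.
umi_split() {
    case $1 in
        V-*-*-*) ;;
        *) return 1 ;;
    esac
    rest=${1#V-}           # e.g. 16-10001-1
    product=${rest%%-*}    # 16
    rest=${rest#*-}        # 10001-1
    category=${rest%%-*}   # 10001
    message=${rest#*-}     # 1
    for field in "$product" "$category" "$message"; do
        case $field in
            ''|*[!0-9]*) return 1 ;;
        esac
    done
    echo "$product $category $message"
}
```

Called as umi_split V-16-10001-1 it prints the three IDs separated by spaces, so the category can be compared per platform while the message ID is tracked across releases.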

     

    Thanks,

    Venkat

     

  • I'm at a loss as to what you're trying to achieve / what "error codes" you're looking for...

    if you really wanted to be thorough, you could go through every agent entry point (and decompile every binary) - for every platform and check the various codes being returned. (not forgetting any custom agents that could be configured)

    or you could look here:

    https://sort.symantec.com/ecls

    and use lookup terms such as "vcs", "llt", "gab" as a start and filter on component Veritas Cluster Server (although this still won't be a comprehensive/complete list as not every error contains those terms).

    or a slightly more sensible approach might be to look at the errors you're actually getting in the logs, and use that as a starting point.

    OR if you're just looking for the log/message syntax/format, you could look at the documentation and check, eg:

    VCS 6.0.1 AIX Administrator's Guide -> Troubleshooting and recovery for VCS -> VCS message logging:

    https://sort.symantec.com/public/documents/sfha/6.0.1/aix/productguides/html/vcs_admin/ch22s01.htm

    The format of engine log messages is:
    
    Timestamp (Year/MM/DD) | Mnemonic | Severity | UMI | Message Text
    
    • Timestamp: the date and time the message was generated.
    • Mnemonic: the string ID that represents the product (for example, VCS).
    • Severity: levels include CRITICAL, ERROR, WARNING, NOTICE, and INFO (most to least severe, respectively).
    • UMI: a unique message ID.
    • Message Text: the actual message generated by VCS.
    
    A typical engine log resembles:
    
    2011/07/10 16:08:09 VCS INFO V-16-1-10077 Received new cluster membership
    
    The agent log is located at /var/VRTSvcs/log/<agent>.log. The format of agent log messages resembles:
    
    Timestamp (Year/MM/DD) | Mnemonic | Severity | UMI | Agent Type | Resource Name | Entry Point | Message Text
    
    A typical agent log resembles:
    
    2011/07/10 10:38:23 VCS WARNING V-16-2-23331 Oracle:VRT:monitor:Open for ora_lgwr failed, setting cookie to null.

    and then the agent could be written to filter on fields such as severity and so on.
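A minimal sketch of such a severity filter, assuming the fields arrive space-separated exactly as in the two sample lines above (date, time, mnemonic, severity, UMI, then the rest). The function names are invented for this example. For agent log lines the agent type, resource name and entry point simply stay inside the message text.

```shell
# Rank a severity name; a lower number is more severe (order from the guide).
severity_rank() {
    case $1 in
        CRITICAL) echo 1 ;;
        ERROR)    echo 2 ;;
        WARNING)  echo 3 ;;
        NOTICE)   echo 4 ;;
        INFO)     echo 5 ;;
        *)        echo 9 ;;   # unknown severity sorts last
    esac
}

# Read log lines on stdin and print "SEVERITY UMI message text" for every
# line that is at least as severe as the threshold given in $1.
filter_log() {
    threshold=$(severity_rank "$1")
    while read -r _date _time _mnemonic severity umi text; do
        [ "$(severity_rank "$severity")" -le "$threshold" ] || continue
        echo "$severity $umi $text"
    done
}
```

Fed the two sample lines above with filter_log WARNING, only the Oracle agent line is printed, because INFO ranks below WARNING.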

  • Thank you for the vital information.

    I agree that we would need to go through every agent on each platform.

    But this is a lengthy approach :-( .

    What we are trying to confirm is whether, when these bundled agents were designed (for all platforms), a common error code across platforms was used (I hope it was)?

    Further elaborating my query:

    We are trying to write a script that monitors the engine logs and parses them for specific error codes.

    e.g.:

    GabHandle::open failed errno = V-16-1-10116

    Here we are filtering on error code "V-16-1-10116", so is the meaning of this error code the same on all platforms, so that our parsing (for V-16-1-10116) works on RHEL, Solaris, Windows and AIX?

    Or does this error code have a different meaning on different platforms?

    Note: "V-16-1-10116" is just an example; it could be any error code.

    Best Regards,

    Nimish

  • On the detailed page for the example given:

    https://sort.symantec.com/ecls/umi/V-16-1-10116

    Platform: Generic

    So that implies that code would be the same for all platforms.

    Picking UMIs at random from the page, they do appear to be fairly generic unless the message is specific to a platform (e.g. if a config is specific to Solaris/AIX etc., then that UMI would only apply to that platform since it wouldn't occur on others).

    If you know the error codes you're interested in, check them on the lookup site to confirm (ie: check the UMI, then check the string to see if there are additional entries/codes for the same string) - bear in mind many do not have detailed explanations yet so you may not be able to confirm platform(s) for all.
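Once a UMI has been confirmed as generic this way, the log filter itself could be as simple as the sketch below. It assumes the documented engine log layout, where the UMI is the fifth space-separated field, and the function name count_umi is made up for this example. Comparing the whole field is deliberate: a plain substring search for V-16-1-10116 would also hit a longer code that merely starts with the same digits.

```shell
# Count log lines on stdin whose UMI field is exactly $1.
# Assumes the layout: date time mnemonic severity UMI message...
count_umi() {
    wanted=$1
    count=0
    while read -r _date _time _mnemonic _severity umi _text; do
        if [ "$umi" = "$wanted" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

The same script would then work unchanged on every platform for generic UMIs, and only the bundled-agent codes would need a per-platform table.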

  • Thank you very much for the description.

    That makes it clear to me :)

    So, any suggestion on what the best practice would be when designing our custom agents for monitoring the VCS component status?

    We had opted to monitor the engine logs (by adding filters for error messages/numbers).

    Or, if the error message text is common to all platforms, we could filter on the message text instead of the error number.

    Thanks in advance.

    Best Regards,

    Nimish