Forum Discussion

Alberto_Colombo
12 years ago

SPA Authorization Service NOT Starting

 

Hi,

The PureDisk SPA Authorization Service is not starting on our N5020, so the appliance is unusable.

 

Last night at 4:00 AM NetBackup lost contact with the N5020; the error was 213.

Trying to update the storage server credentials now returns this error:

RDSM has encountered an STS error: getStorageServerInfo

 

On the N5020, we tried to stop the PureDisk services (all of them were listed as running), without success, and got one specific error:

 

Starting Puredisk SPA Authorization Service: java /opt/pdspa/tomcat/bin failed

 

I can also see a lot of "webservice call" errors in the N5020 logs.

I can also see a lot of execfail*** files under /tmp, like this one:

 

cat execfailDF2SPu
Unable to run operation, please see troubleshooting logs for more error information\nExitcode was 1
array (
)Command was /opt/pdctrl/bin/pdmanage --agent 496000000 --server nba-5020.client.it  --wait --command "/opt/pdag/bin/php /opt/pdmbe/cli/MBEDUSummary.php" 2>/Storage/tmp/STDEPDMQh98zbError output is Sun Nov  4 2012 12:30:01.876879 ERROR (47814416437568): A remote error occurred: Authentication failed. (1)
nba-5020:/tmp #
 
and, under /Storage/tmp:
 
nba-5020:/Storage/tmp # cat STDEPDMQh98zb
Sun Nov  4 2012 12:30:01.876879 ERROR (47814416437568): A remote error occurred: Authentication failed. (1)
nba-5020:/Storage/tmp #

Moreover, the N5020 appliance web console is unreachable.

I've rebooted the N5020 twice, but the result is the same: a lot of PureDisk services are not running.

Can you please help us?

Thank you

  • This does sound like a disk-full condition. Check disk space, and check that any logging is set up to clear down old log files and free up space. Once there is plenty of free disk space you should be able to restart the services or reboot successfully, but let us know first whether that is the case.

    Also check that queue processing has been running to keep things clean. If the queue has blocked up, it will stop the appliance from working.

    Path:   /opt/pdcr/bin/

    To check the disk pool status:   crcontrol --dsstat

    To check when the last successful CRQP ran:   crcontrol --queueinfo

    To check whether CRQP is in progress:   crcontrol --processqueueinfo

    To start CRQP manually:   crcontrol --processqueue

    If --processqueueinfo / --queueinfo shows that it has not run for a long time and the queue is large, kick it off with --processqueue and it will bring things back to life.

    If that was the case, you need to know why. The 1.4.1 patch removed the cron job that runs this every 12 hours, so it needs to be added back in manually, or the appliance needs patching to 1.4.2.

    Hope this helps
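For convenience, the three read-only checks listed above could be wrapped in a small script, roughly like the sketch below. It assumes crcontrol lives at the path given in the reply; the helper name pd_status and the CRCONTROL override variable are made up for illustration and are not part of PureDisk. It deliberately does not run --processqueue, since that should be a manual decision after looking at the output.

```shell
#!/bin/sh
# Path to crcontrol as given in the reply above; override CRCONTROL if yours differs.
CRCONTROL="${CRCONTROL:-/opt/pdcr/bin/crcontrol}"

# pd_status is a hypothetical helper name, not a PureDisk command.
pd_status() {
    "$CRCONTROL" --dsstat            # disk pool status
    "$CRCONTROL" --queueinfo         # when the last successful CRQP ran
    "$CRCONTROL" --processqueueinfo  # whether CRQP is in progress
}

# Only run the checks on a host where crcontrol is actually installed.
if [ -x "$CRCONTROL" ]; then
    pd_status
fi
```

If the output suggests the queue has not been processed for a long time, crcontrol --processqueue can then be started by hand as described above.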

  • Maybe a disk-full condition?

    I had this problem some time ago.

    Please try:

    df -h

    from the root prompt.
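To pick out full filesystems quickly instead of reading the whole df listing, something like the following sketch might help. It parses the POSIX-format output of df -P and prints any mount point at or above a chosen usage percentage. The helper name over_threshold and the 90% limit are assumptions for illustration only, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# over_threshold is a hypothetical helper: it reads `df -P` output on stdin
# and prints "mountpoint usage%" for every filesystem at or above the limit.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        use = $5              # capacity column, e.g. "95%"
        sub(/%/, "", use)     # strip the percent sign
        if (use + 0 >= limit) print $6, $5
    }'
}

# 90 is an arbitrary example threshold, not a vendor recommendation.
df -P | over_threshold 90
```

No output means nothing is at or above the threshold; any line printed names a filesystem worth cleaning up before restarting the services.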