After upgrading to PureDisk 6.6.4, backups are stuck at 52 percent complete

Moab_baom
Not applicable

Hi,

I upgraded my PureDisk server from 6.6.3 to 6.6.4 because some backups were ending in error, but now all my backups get stuck at 52 percent complete.

Job step: "Import PO-List on the MetaBase Engine"

The end of the job log:

 *** Supportability Summary ***
jobid                 = 20172
jobstepid             = 81567
agentid               = 27
hostname              = SMXP002
starttimejobstep      = 2012-Dec-26 11:09:32 CET
endtimejobstep        = 2012-Dec-26 11:10:05 CET
workflowstepname      = Data Backup
status                = SUCCESS


[2012-Dec-26 11:10:06 CET] *** Start: MBImport ***

???
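
For anyone comparing job logs like the one above, here is a minimal sketch for pulling a single field out of the "Supportability Summary" block. It assumes the job log has been saved to a local text file and that the fields keep the `name = value` layout shown above. The function name `summary_field` is made up for illustration; it only uses awk and does not call any PureDisk tool.

```shell
#!/bin/sh
# Print the value of one "name = value" field from a saved job log.
# summary_field is a hypothetical helper name, not a PureDisk command.
summary_field() {
    # $1 = field name (for example: status), $2 = path to the saved log text
    awk -F'[ \t]*=[ \t]*' -v key="$1" '
        { name = $1; sub(/^[ \t]+/, "", name) }   # drop leading whitespace
        name == key { print $2; exit }            # first match wins
    ' "$2"
}

# Example: summary_field endtimejobstep joblog.txt
```

Comparing `endtimejobstep` with the timestamp on the `Start: MBImport` line shows how quickly the job moved from the data backup step into the import step where it hangs.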

6 REPLIES

ikonseeu
Not applicable

Hi,

I am having the exact same issue. Ever since we upgraded from 6.6.3 to 6.6.4, our backups have been getting stuck at 52%.

The CR Queue Processing and Data Mining policies all seem to take hours now, instead of minutes.

 

Any help would be appreciated.

f25
Level 4

Hi,

52% means there is a problem with the MBE (MetaBase Engine) import.

1. Please check MetaBase Engine service.

Should work for PureDisk:
# /etc/init.d/puredisk status

Start the service if it has failed.

2. Check for any running jobs of the following kinds (personally, I don't know whether they exist at all on 6.6.4):

MetaBase Garbage collection
Data Removal
Import Replicated Data Selection (only for PDRoE)

They may cause deadlocks. Consider stopping them gracefully and rerunning manually when backups are done.

3. Contact Symantec Support and share the result here :)
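
Step 1 above can be sketched roughly as follows. This is only a sketch under assumptions: it assumes the init script returns a zero exit code from `status` when everything is running, and that it accepts a `start` action like most init scripts do. Neither is confirmed in this thread, and `ensure_running` is a made-up helper name.

```shell
#!/bin/sh
# Run "<init script> status" and start the service only if the check fails.
# Assumptions: exit code 0 from "status" means running, and a "start"
# action exists. ensure_running is a hypothetical helper name.
ensure_running() {
    # $1 = init script to call, for example /etc/init.d/puredisk
    if "$1" status >/dev/null 2>&1; then
        echo "running"
    else
        echo "not running, attempting start"
        "$1" start
    fi
}

# Example (as root on the PureDisk node):
# ensure_running /etc/init.d/puredisk
```

It is still worth reading the full output of the status command by hand, since an overall exit code may not show which individual service is down.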

Thanks and good luck!

gdwfkd
Level 2

I just had a similar thing happen.  Some details:

I built a two-node SPA from the 6.6.3a media kit, then upgraded to 6.6.4 and applied the latest patch prior to adding any clients to the SPA.  The SPA had been running fine for about 2 months.  Current client load is about 100.  We will grow this to about 400. 

A couple of days ago, I noticed that about 25 jobs were stuck at 52% and holding in a Running_Hold state.  Some other backup jobs were successful, so not all backup jobs were affected.  I attempted to restart one of the Running_Hold jobs, but it got to 52% and went to Running_Hold again.

Opened a case with support.  After I provided them with some requested log files, they sent me the following link:

http://www.symantec.com/business/support/index?page=content&id=TECH50590

I never attempted the procedure outlined in the KB article because I decided to restart the PureDisk services before I heard back from support.  Note that the support tech I was working with was very responsive, but I had a window of time where I could restart the server without production impact, so I took the opportunity.

I attempted to restart the services via "puredisk service stop", but the MetaBase Engine (MBE) wouldn't stop.  In lieu of killing that process, I rebooted the SPA node and things came up fine.
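
For anyone in the same spot, here is a generic sketch of waiting a fixed time for a process to exit before deciding to kill it or reboot. It is plain shell, not a PureDisk tool: `wait_for_exit` is a made-up name, and finding the PID of the MBE process is left to the reader because the process name is not given in this thread.

```shell
#!/bin/sh
# Wait up to $2 seconds for process $1 to exit.
# Returns 0 if the process is gone, 1 if it is still alive at the deadline.
# wait_for_exit is a hypothetical helper; the caller supplies the PID.
wait_for_exit() {
    pid=$1
    remaining=$2
    # kill -0 sends no signal; it only tests whether the PID can be signalled
    while kill -0 "$pid" 2>/dev/null; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Example: wait_for_exit "$mbe_pid" 300 || echo "still running after 5 minutes"
```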

I've asked Symantec to review the log files to see if there's some sort of bug.  I also told them about this forum post to see if anyone submitted a case for this and it has already been fixed.

I haven't seen this issue again, but it has only been two days since it happened.

 

Brook_Humphrey
Level 4
Employee Accredited Certified

Just curious: did you reboot the PureDisk server after the upgrade?

gdwfkd
Level 2

I'm pretty sure I had rebooted after the upgrade.  Again, I went from 6.6.3a media kit -> 6.6.4 +patch before I even added a client.  I'm guessing I probably did reboot after the build was finished because the installation instructions tell you not to reboot until you upgrade to 6.6.4 or your storage will disappear (known issue in the release notes). 

There's also an issue with the Content Router segfaulting at installation if the box has too much RAM, so I've taken to installing only 16 or 32GB of RAM, completing the build and making the necessary config changes for this issue, then adding the RAM back, which would have required a reboot.

This SPA has been backing up clients for probably about 2 months or more now.  The issue hasn't happened again since I posted, but it's only been a week or two.

 

f25
Level 4

Hi,

Did you notice any other jobs that failed because of the SPA restart? See point 2 in my previous post.

Anybody trying 6.6.5?

Cheers!