I just had a similar thing happen. Some details:
I built a two-node SPA from the 6.6.3a media kit, then upgraded to 6.6.4 and applied the latest patch prior to adding any clients to the SPA. The SPA had been running fine for about 2 months. Current client load is about 100. We will grow this to about 400.
A couple of days ago, I noticed that about 25 jobs were stuck at 52% in a Running_Hold state. Other backup jobs completed successfully, so not all backup jobs were affected. I restarted one of the Running_Hold jobs, but it reached 52% and went into Running_Hold again.
Opened a case with support. After I provided them with some requested log files, they sent me the following link:
http://www.symantec.com/business/support/index?page=content&id=TECH50590
I never attempted the procedure outlined in the KB article because I decided to restart the PureDisk services before I heard back from support. Note that the support tech I was working with was very responsive, but I had a window of time where I could restart the server without production impact, so I took the opportunity.
I attempted to stop the services via "puredisk service stop", but the metabase engine (MBE) wouldn't stop. Rather than kill that process, I rebooted the SPA node and everything came back up fine.
I've asked Symantec to review the log files to see if there's some sort of bug. I also told them about this forum post so they can check whether anyone else has submitted a case for this and whether it has already been fixed.
I haven't seen this issue again, but it has only been two days since it happened.