
Creating Recovery Point gets stuck at 1%

Aatif
Level 2

Hi,

 

I am using BESR 8, and every day it gets stuck at 1% while creating a recovery point. It stays there for hours, with continuous network activity (the network lights are continuously on), until I cancel it from the Progress and Performance tab.

 

Is there any solution to this?

 

Thanks,

Aatif 

19 REPLIES 19

Joseph_L_
Level 5
Employee

If it's getting stuck before the first 5%, this may be a symptom of issues with VSS.

 

 http://support.veritas.com/docs/293558

BTX
Level 2

I have been using BESR for a few years, including prior versions, and continue to have trouble getting a stable update. What happens in every version is that Symantec releases a patch, update, hot fix or whatever you want to call it, and it affects something else. I would disable LiveUpdate, but most updates address other issues and are needed (virus signatures, ...).

Anyway, BESR... What I noticed is that when you start the backup process (Run Backup Now), BESR starts and stops the Virtual Disk service and then immediately starts the Volume Shadow Copy (vssvc.exe) service. Once vssvc is started, BESR hangs (or at least does not process the image). If I try to stop the vssvc service, it will not stop; however, if I end it via Task Manager, the BESR process resumes. If you do not restart the vssvc service, the BESR process may fail to create certain images on volumes that have Exchange, MSSQL... running. So I immediately restart the vssvc service and BESR runs and completes fine, but I have to do this manually.

Symantec, look at this process and see if the order or timing is an issue.

Now the history of my server (everything is at the latest SP/updates...): Windows Server 2003 Standard SP2, all eggs in one basket... Exchange 2003, MS SQL Server 2005 Express (many instances), Symantec Endpoint Protection with Mail Security, and a bunch of other applications that all say they should not run on an Exchange server or MS SQL server (but they do). The system runs excellently, with no issues, until updates that require reboots. The only outstanding and unresolved issue is BESR 8.5.

Thanks,

BTX

Joseph_L_
Level 5
Employee

The first step in making sure there are no issues with VSS is to run VSSADMIN LIST WRITERS within a CMD prompt.

If there is a problem, the output should show writers in a failed state, assuming you have not rebooted since.

If no writers appear to be in failure and you still have issues during the snapshot phase, please install MS update 940349:

 

http://support.microsoft.com/kb/940349
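
For anyone who wants to script the writer check described above, here is a minimal Python sketch. It assumes the usual English `vssadmin list writers` text layout (a `Writer name:` line followed by indented `State:` and `Last error:` lines). The function names are made up for illustration, and a localized Windows install may print different labels.

```python
import re
import subprocess


def parse_vss_writers(text):
    """Parse the text printed by `vssadmin list writers` into a list of dicts."""
    writers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Writer name:\s*'(.*)'", line)
        if match:
            # A new writer section starts here.
            current = {"name": match.group(1), "state": "", "last_error": ""}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return writers


def failed_writers(writers):
    """Return the writers that are not Stable or that report a last error."""
    return [
        w for w in writers
        if "Stable" not in w["state"] or w["last_error"].lower() != "no error"
    ]


def list_writers_output():
    """Run vssadmin (Windows only, elevated prompt) and return its output text."""
    result = subprocess.run(
        ["vssadmin", "list", "writers"], capture_output=True, text=True
    )
    return result.stdout
```

On the affected server, `failed_writers(parse_vss_writers(list_writers_output()))` would return an empty list when every writer is stable with no error, which is the situation several people in this thread report.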

 

 

Aatif
Level 2
Hi,

Sorry for the late reply. I tried VSSADMIN LIST WRITERS, and none of the writers shows an error. I also tried installing MS update 940349; it didn't make any difference.

I tried stopping the Volume Shadow Copy service and it stopped without any problem. I also tried stopping SymSnapService (Symantec Volume Snapshot Service); it gave an error that the service could not be stopped, but when I open it, its status is Stopped.

That didn't make any difference either; it just remains at 1%, although there is very high network activity (I am backing up to a network drive). All other computers take around half an hour to complete (they are scheduled to run a daily backup).

Any other ideas?

Thanks,
Aatif

GregC2
Level 4
I am running the 8.5.2 client on a Windows 2003 x32 R2 SP2 system.
I am also running 8.5.3 on a Vista x32 SP1 system.

Both of these systems worked great with BESR; the Win 2003 server had for several years (starting with LiveState 6.5).
The Vista machine was recently rebuilt due to another issue. BESR has been failing on it ever since a LiveUpdate 3 days ago.

I have not tried an uninstall just yet.

Both are stuck at 1% forever.

Russ_Clark
Level 3
I get up to 5% while trying to back up a Windows 2008 cluster node running Hyper-V. I ran the vssadmin command above and everything comes back with no error. When I move all the cluster services off that node, System Recovery finishes just fine, but it does not finish with clustered resources on the node. I used the same BESR version on a Server 2003 cluster running MS Virtual Server with no problems in the past, but since we recently upgraded I can't get a backup job to finish without first moving all the cluster resources off the node being backed up.

GregC2
Level 4
I guess we can collectively wait till the year 2026 before Symantec's wonderful outsourced tech support replies here to help out. The last time I found a major memory leak for Symantec, they DEMANDED I pay for support in order to report a fatal flaw I had found in their product. It wasn't until I used an old direct email link to one of their U.S.-based tech support teams (Tier 3?) that someone took notice, actually agreed with the findings, and appreciated the help (which was patched 60 days later, mind you).

Yup, good old Symantec. Too big to be effective or trusted these days and all about the money.

Bravo Symantec. Way to take your previous "Rolls Royce" reputation and turn it into "Made in Mexico" in less than 7 years.

marcogsp
Level 6
I've worked with this product ever since version 1.0, when PowerQuest owned it, so I remember the days of backing up systems with databases before VSS was widely used. Of course this meant stopping the appropriate services with a pre-imaging command file before taking the snapshot, and then running another command file after the snapshot or imaging process in order to restart the services. This was by no means as efficient as VSS, but it did put the SQL databases and mailstores in a consistent state to be backed up. As long as the services were set to automatic startup, the recovered server never balked in my experience. Up to about two years ago, I was still backing up my Exchange 2003 servers in this manner, because I just didn't trust this newfangled VSS thing. I now leave the Exchange services running and let VSS do its thing.

The one drawback I found to stopping the Exchange services and imaging was that it rendered the Exchange Retrieve option useless. I'm not sure if this would be the case with the Granular Restore Option.

If any of you folks are in the same boat as I am, then you have to report on the organization's DR capabilities to the powers that be. Stopping the affected mail and database services for DR imaging is not the ideal solution. However, it is better than telling your bosses, "I can't do a DR backup of this server because an advertised feature is not working correctly with our setup." Regardless of who is at fault (Symantec, Microsoft, us...), my boss wouldn't care whether the DR image was made with the advertised feature or a workaround. Once the DR image is made, there would be plenty of time to place blame and either reconfigure the server or replace the DR solution with something that works better.
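
The stop-before, restart-after approach described above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the helper names are invented, the service list is whatever applies to your server, and the real work is done by the standard Windows `net stop` / `net start` commands. Services are restarted in the reverse of the order they were stopped.

```python
import subprocess


def net_command(action, service):
    """Run `net stop <service>` or `net start <service>` (Windows only)."""
    subprocess.run(["net", action, service], check=True)


def stop_services(services, run=net_command):
    """Stop each service in the order given (pre-snapshot step)."""
    for name in services:
        run("stop", name)


def start_services(services, run=net_command):
    """Restart the same services in reverse order (post-snapshot step)."""
    for name in reversed(services):
        run("start", name)
```

The idea would be to call `stop_services([...])` from the command file that runs before the snapshot and `start_services([...])` with the same list from the command file that runs afterwards. The `run` parameter exists only so the ordering can be checked without touching real services.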

BTX
Level 2

Since my last comment above I have been able to get some successful images on all volumes. I was still having issues, with a few failures pointing to VSS and some unknown. However, in the last few days, I have had NO successful images. I have a semi-large partition (about 1TB) and hate watching it fail at 100%, so I have a tiny partition, only a few MBs, for testing the initial image process, but it hangs consistently at various percentages, with no error messages or events. (Not sure if there are any SECRET messages.) I can only stop the BESR service to end the hung process. All VSS writers show Stable - No error.

I can successfully back up by file with BESR, but that is not fun when you have TBs of data. The thousands of backup files look ugly and... yes, I tried a restore of some files and they look worse.

At this point we have no support, since the product did not function successfully during that time, and we probably won't renew.

I did see a post regarding a cold image, which I will have to try since we have nothing... Our Backup Exec software was removed since that did not work either and the tape drive was maxed out.

Does anyone have any solutions or hints for researching the image process when it hangs? Also, is there a way to stop LiveUpdate from updating BESR only? If I ever get the software to function again I would like to use it... I may also need to stop MS updates.

Thanks and good luck to all!

BTX

GregC2
Level 4
We all have given up hope here for Symantec "Tech Support" helping out. Notice they never reply?

A couple of things I noticed myself which may help.

If you use the pathetically poor Symantec Corp AV, such as Endpoint, this will update itself via Liveupdate and will update your BESR as well on that machine. That caused a ton of problems for me! It would notice that I have the BESR Manager install package which I have deployed but it would also install the latest BESR Client, thus making two installs and breaking both!

We bailed Symantec AV completely and since then all systems with the BESR Manager Client install have worked great.

The other thing is that I had one machine continuously fail within a week. The service would just stop. I created a batch file script to run 30 minutes prior to backup, problem solved. Here is the batch file:

rem Stop BESR and the shadow copy services
net stop "Backup Exec System Recovery"
net stop "Volume Shadow Copy"
net stop "Microsoft Software Shadow Copy Provider"
rem Force-kill any BESR processes that are still running
taskkill /IM VProSvc.exe /F
taskkill /IM vprotray.exe /F

rem Start the services again
net start "Backup Exec System Recovery"
net start "Volume Shadow Copy"
net start "Microsoft Software Shadow Copy Provider"

Sometimes I also add a stop/start of the Symantec VSS service too, but that didn't seem to make a difference at all. The others above did. You may just want to start with this first:

net stop "Backup Exec System Recovery"
taskkill /IM VProSvc.exe /F
taskkill /IM vprotray.exe /F

net start "Backup Exec System Recovery"


Aatif
Level 2
Thanks Greg, I will try the batch file today. I am not using Symantec Corp AV, so I don't think that caused the problem. A Windows update might have done something.

Also, I've got SharePoint Services installed on the machine; it has a service called "Windows SharePoint Services VSS Writer", which I will include in the batch file as well.

BTX
Level 2

My recent backup failures were very difficult to trace since there were no event messages or errors. The backup process would randomly fail at any percentage. But recently I noticed that the process appeared to complete but did not successfully terminate the backup cycle (it never ended with a Succeeded message).

Here is my clue...

http://support.veritas.com/docs/307406

The link above directed me to the correct solution. However, the problem I had was not corrupt files; somehow my server now has compression enabled, and that is the cause of ALL of my new issues. I'm not sure how the option was enabled, but it is. Compressing the files of applications like BESR and MS SQL is not a good thing. I selected all the directories listed in the doc above, removed compression, and wahooo!

Thanks for all the help out there!

BTX
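
To check whether compression has crept onto a folder tree, as in the post above, something like this Python sketch may help. It assumes Windows, where `os.stat()` exposes `st_file_attributes`; the function names are invented for illustration, and which directories to scan is up to you (the thread does not list the ones from the linked document). On other platforms the attribute is missing and nothing is reported.

```python
import os

# Win32 FILE_ATTRIBUTE_COMPRESSED bit, as found in os.stat().st_file_attributes.
FILE_ATTRIBUTE_COMPRESSED = 0x800


def is_compressed(attributes):
    """True if the NTFS 'compressed' bit is set in a file-attribute value."""
    return bool(attributes & FILE_ATTRIBUTE_COMPRESSED)


def find_compressed(root):
    """Return paths below root (files and folders) marked as compressed."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # st_file_attributes only exists on Windows; default to 0 elsewhere.
                attrs = getattr(os.stat(path), "st_file_attributes", 0)
            except OSError:
                continue  # skip entries that cannot be read
            if is_compressed(attrs):
                hits.append(path)
    return hits
```

Calling `find_compressed(r"D:\SomeFolder")` (a placeholder path) lists everything under that folder with the compressed attribute set. The root folder itself is not checked, only its contents.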

JWK
Level 4
Have you tried any of these articles?

http://seer.entsupport.symantec.com/docs/294502.htm
http://seer.entsupport.symantec.com/docs/308047.htm
http://seer.entsupport.symantec.com/docs/294386.htm
http://seer.entsupport.symantec.com/docs/316174.htm

marcogsp
Level 6
Thanks for posting these, JWK. I've been trying to build a list of "stalled imaging process" junctures and their probable causes, and this really helps. It also reinforces some of the things that I think have helped the server imaging process succeed more often than fail in my environment.

Running chkdsk with a minimum of the /f option at least once per month to detect problems with the imaged drives before they get too serious.

Ensuring that there is plenty of contiguous space to write the images to. Both the imaged drives and the image storage drives are defragmented automatically with a commercial-grade defragmentation program. I also either disable the defragmentation program or schedule it not to run during imaging or backup processes. Also, any drive/array that is approaching less than 10 to 15 percent free space is targeted for resizing or replacement.

No shadow copies of the drives that images are stored on.

The images are written to a shared drive that exists on a local array of the server that holds the backup images.  The images are later offloaded via scheduled script to NAS and external USB drives for safekeeping and off site storage.  Usually, only two weeks worth of backups are stored on the local array in order to keep enough space available.

Using good quality network switches, NICs and cabling. The servers are on a Gigabit switch while the ordinary users are on 10/100 switches. This ensures that the servers get the bandwidth they need for imaging and backup processes.

Imaging jobs are spaced out so that no more than two servers are imaging at one time. I only have six servers and two of them image in under 10 minutes. There is plenty of time in my backup window to get it all done without rushing it. I could attempt more than two server imaging jobs, but there is no rush in my environment.

No Adaptec HostRAID Serial ATA HBAs in my environment, so no need to have the latest aarich.sys driver. However, I do make sure that the storage drivers I use are adequate for the task. I don't update drivers unless it is absolutely necessary. If it ain't broke, don't fix it.
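
The free-space rule of thumb in the list above (flag any drive that drops below roughly 10 to 15 percent free) is easy to automate. Here is a minimal Python sketch using the standard `shutil.disk_usage`; the function names and the 15 percent default are assumptions for illustration, not anything BESR provides.

```python
import shutil


def free_fraction(total_bytes, free_bytes):
    """Fraction of the volume that is free, between 0.0 and 1.0."""
    return free_bytes / total_bytes


def is_low_on_space(total_bytes, free_bytes, threshold=0.15):
    """True if the free fraction is below the threshold (15% by default)."""
    return free_fraction(total_bytes, free_bytes) < threshold


def check_drive(path, threshold=0.15):
    """Check the volume that holds `path` against the threshold."""
    usage = shutil.disk_usage(path)
    return is_low_on_space(usage.total, usage.free, threshold)
```

Running `check_drive("E:\\")` (a placeholder drive letter) before the imaging window would tell you whether the image destination is already in the danger zone.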

 


GregC2
Level 4
I completely failed to mention this to you folks, and it may or may not help.

Over the years I had several clients with servers running BESR whose image would never verify and would thus be deleted at the end of the backup. I tried all the things that have been mentioned here over the years: chkdsk, network throttling, etc.

It turned out one was a bad switch (replaced the 10/100 switch with a 100/1000 one, problem solved).
The other was the network card itself. I used the secondary NIC and voilà, the backups were fine again.

So this may or may not help those with the 5% issue, but it might be something to consider, especially if you have a spare NIC or a simple switch lying around. Just a thought!


Aatif
Level 2

As my problem is that it gets stuck at 1%, I assume that it is a network problem, although all the other computers on the same network don't have any problem. I will try using an external disk in a few days and will update this thread.

Thanks all for your help.

Aatif

P.S.
Greg,
That batch file didn't work for me.


marcogsp
Level 6
Aatif -- The possible networking problem in your environment could be a chatty network card/interface or a faulty port on the switch the server is attached to. The management interface of the switch should be able to help you monitor extremes in activity on that particular port. If the port is overactive or reporting a lot of errors, and switching to another port produces the same results, then the NIC is most likely the cause. If switching ports clears up the issue, then the original port on the switch is most likely to blame. This of course assumes that the cabling is in good working order. It would be a good idea to swap out cables too, to see if that helps.

GregC2
Level 4
Yes, but let's not take this subject off course. Your statements are true, but we did not have high-end, expensive switches. Either it was the port, the switch, or the NIC, but replacing the NIC was easy and worked for us. It could have been a setting on the NIC, but the settings on both NICs (the old one is still in the system) look the same.

Thanks for the tips though.

JWK
Level 4
Here is another one that I find unacceptable
http://seer.entsupport.symantec.com/docs/301838.htm

How can they not provide an explanation for this or provide something to say they are working on fixing this? Instead they provide a manual workaround for a product that should be fully automated.