
File servers hanging during backups

moniker131
Level 4

I have two data centers, each with an installation of Backup Exec 20.3 with CASO. Since the last Microsoft updates, one of our file servers in each data center hangs during backups. One server becomes completely unresponsive and we cannot log in to it. We can log in to the other server, and it looks fairly idle, but its shares are unreachable. There have been no changes other than the recent Windows updates. I'm wondering whether the issue is related to them, but I haven't seen any other posts describing similar problems. Has anyone else run into this?

5 REPLIES

CraigV
Moderator
Partner    VIP    Accredited

Does this happen at any time, or only during backups?

We started having this issue with a third file server. It happens only during backups. I got paged the weekend before last because the server was down. Someone on site power cycled it, and the backups restarted. It hung again. I put the job on hold, and when they power cycled it again it came up and stayed up. I still have the job on hold and the server hasn't crashed since. I need to restart the jobs and monitor them, but I don't really have time to babysit a backup job right now.

Backup Exec runs on .NET, right? Does the remote agent as well? My gut says a recent .NET update is causing this, specifically on 2008 R2. We upgraded two of the file servers to 2012 R2 (in-place upgrade), and those servers haven't hung since. I could be wrong, but nothing has changed other than Windows updates.

I think I have a finding. We upgraded the OS from 2008 R2 to 2012 R2 hoping that would help, and 2008 R2 will be retired soon anyway. After we upgraded the OS on one of the servers, the jobs ran fine for about two weeks. Or so I thought. It turns out I had missed taking one of the jobs off hold. I re-enabled that job and ran a full backup, and shortly thereafter the shares became unavailable. I could RDP to the server, and the resources looked fine. BERemote.exe was idle, with no CPU usage at all. I tried running some vssadmin commands to list shadows and shadowstorage, but those commands just hung. I ended up rebooting the server.
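
For anyone who wants to repeat that vssadmin check without tying up a console when the commands hang, here is a minimal Python sketch. It assumes Python 3.7+ is available on the file server and that it is run from an elevated prompt; the helper names (`run_with_timeout`, `check_vss`) and the 60-second timeout are illustrative choices, not anything from Backup Exec or Microsoft. The only external commands used are the two mentioned above, `vssadmin list shadows` and `vssadmin list shadowstorage`.

```python
import subprocess

# The two read-only VSS queries mentioned in the post.
VSS_CHECKS = [
    ["vssadmin", "list", "shadows"],
    ["vssadmin", "list", "shadowstorage"],
]


def run_with_timeout(cmd, timeout_s=60):
    """Run cmd and return (status, stdout).

    status is one of:
      "ok"      - command exited with code 0
      "failed"  - command exited with a non-zero code
      "timeout" - command did not finish within timeout_s seconds
      "missing" - the executable could not be found
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "timeout", ""
    except FileNotFoundError:
        return "missing", ""
    status = "ok" if result.returncode == 0 else "failed"
    return status, result.stdout


def check_vss(timeout_s=60):
    """Run each VSS query and map the command line to its status string."""
    return {
        " ".join(cmd): run_with_timeout(cmd, timeout_s)[0]
        for cmd in VSS_CHECKS
    }
```

Calling `check_vss()` while a backup is running returns a small dictionary of statuses; a `"timeout"` entry would correspond to the hang described above. If VSS is wedged badly enough, the child process may not terminate cleanly either, so treat this as a convenience for logging and not as a fix.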

Now that I know this job is the problem, I compared it to another job (bigger in terms of data) that runs against this server. Both of these jobs back up DFSR shares. However, the one that hung the server had the shares selected instead of the shadow copy components. I don't see why this would cause the server to hang, but it's the difference I saw, so I changed it to the shadow copy components. The backup has been running for about a day now without hanging the shares. The throughput isn't great (the disk can't keep up: it is 100% busy, with a queue length spiking over 100), but at least it's running now. Does this make any sense to you?
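
To put numbers on the disk queue observation, one option is to sample the counter with the built-in `typeperf` tool, for example `typeperf "\PhysicalDisk(_Total)\Avg. Disk Queue Length" -si 5 -sc 12`, and summarize the captured output. The Python sketch below is an assumption-laden illustration: it assumes the usual typeperf CSV layout (a quoted timestamp followed by the quoted sample value), and the function names and the threshold of 100 are hypothetical choices made here, not recommended values.

```python
import csv
import io


def parse_typeperf_csv(text):
    """Return the numeric samples found in the second column of typeperf CSV output."""
    samples = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # blank lines and status messages
        try:
            samples.append(float(row[1]))
        except ValueError:
            continue  # header row or an empty sample
    return samples


def summarize(samples, threshold=100.0):
    """Summarize queue-length samples; threshold is an arbitrary illustrative cutoff."""
    if not samples:
        return None
    return {
        "count": len(samples),
        "avg": sum(samples) / len(samples),
        "max": max(samples),
        "over_threshold": sum(1 for s in samples if s > threshold),
    }
```

Redirecting the typeperf output to a file and passing its contents to `parse_typeperf_csv` and then `summarize` gives the average and peak queue length for the sampling window, which makes it easier to compare the share-based job against the shadow copy components job.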

Are these Dell servers, R640 or R740?

Two are virtual, and one is an HP ProLiant DL380p Gen8.