Windows 2003 client backing up to 6.5.2 Solaris 10 master server randomly hangs during backup, yet does not time out and fail

rjrumfelt
Level 6
I did a search and could not find any threads that seem relevant to this specific issue.  

We have a 2K3 client backing up to a 6.5.2 Solaris master server, and it hangs more often than it backs up successfully.  We are backing up the C: drive, system state, and several other locally attached drives.  The policy is currently set to the "All Local Drives" directive.  After playing around with it for a while, we were able to get successful backups by backing up each drive individually; however, I do not think this is actually a fix. I think we were just lucky.

We've checked the bpbrm logs on the master, and the bpbkar logs on the client.  At the point where the backup hangs, we see the following messages repeated in the bpbkar log on the client until we kill the backup (I do not currently have access to the client, so I cut and pasted the logs from one of the technotes and modified them for my situation):

5:33:26.637 PM: [4988.5860] <2> dtcp_write: TCP - success: send socket (1828),  1 of 1 bytes
5:33:26.637 PM: [4988.4604] <2> dtcp_read: TCP - success: recv socket (520), 4  of 4 bytes
5:33:26.668 PM: [4988.4604] <2> dtcp_read: TCP - success: recv socket (520), 4 of 4 bytes
5:34:26.637 PM: [4988.5860] <2> dtcp_write: TCP - success: send socket (1828),  1 of 1 bytes

When checking the bpbrm logs on the master, the logs stop updating the minute we start seeing the messages above.

Now I found a technote that seems very similar to this situation, the link to the technote pasted below:

http://seer.entsupport.symantec.com/docs/325624.htm

The technote, however, appears to discuss interactions between a Windows master and a Windows client, whereas we have a Solaris master.

Does anyone have any suggestions as to what might be causing the backups to hang?   Can the bug described in the above technote also apply to a Solaris master/Windows client environment?

We have also verified that VSS is enabled, and that we can manually create snapshots using both Windows commands and NBU commands (bpfis) without any issues.

Android
Level 6
Partner Accredited Certified
In reading the technote, it would seem to me that it does in fact apply.  The technote states it relates to Windows 32-bit and 64-bit clients communicating with media servers in general, not only Windows media servers (that is the case in their example, but I don't think it applies strictly to Windows media servers).

It goes on to state that it affects all versions of the product including 6.5.4, and that the fix will be available in the next product release, 6.5.5.

rjrumfelt
Level 6
Below is the actual log file I was able to obtain - it is slightly different from what I had pasted earlier.

8:57:57.915 PM: [5696.5636] <2> dtcp_write: TCP - success: send socket (840), 172 of 172 bytes

8:57:57.915 PM: [5696.5636] <4> tar_backup_tfi::backup_finishfile_state: INF - catalog message: Fil - 5873 34670 10782435 -1 85 33216 root root 5693 1251322523 1251322481 1251322523 /C/WINDOWS/file1

8:57:57.915 PM: [5696.5636] <2> tar_backup_tfi::backup_startfile_state: TAR - Backup: C:\WINDOWS\file2

8:58:21.930 PM: [5696.5768] <2> dtcp_read: TCP - success: recv socket (568), 4 of 4 bytes

8:58:21.930 PM: [5696.5768] <4> bpio::read_string: INF - read non-blocking message of length 1

8:58:21.930 PM: [5696.5768] <2> dtcp_read: TCP - success: recv socket (568), 1 of 1 bytes

8:58:21.930 PM: [5696.5768] <4> tar_backup::readServerMessage: INF - keepalive message received

8:58:21.930 PM: [5696.5768] <4> tar_base::keepaliveThread: INF - sending keepalive

8:58:21.930 PM: [5696.5768] <2> dtcp_write: TCP - success: send socket (840), 1 of 1 bytes

8:59:21.930 PM: [5696.5768] <2> dtcp_read: TCP - success: recv socket (568), 4 of 4 bytes

8:59:21.930 PM: [5696.5768] <4> bpio::read_string: INF - read non-blocking message of length 1

8:59:21.930 PM: [5696.5768] <2> dtcp_read: TCP - success: recv socket (568), 1 of 1 bytes

8:59:21.930 PM: [5696.5768] <4> tar_backup::readServerMessage: INF - keepalive message received

8:59:21.930 PM: [5696.5768] <4> tar_base::keepaliveThread: INF - sending keepalive

8:59:21.930 PM: [5696.5768] <2> dtcp_write: TCP - success: send socket (840), 1 of 1 bytes

9:00:21.930 PM: [5696.5768] <2> dtcp_read: TCP - success: recv socket (568), 4 of 4 bytes

9:00:21.930 PM: [5696.5768] <4> bpio::read_string: INF - read non-blocking message of length 1
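For anyone who wants to locate the stall point in a long bpbkar log without scrolling, here is a minimal Python sketch. It assumes the log lines look like the excerpt above and that the log is verbose enough to include the `tar_base::keepaliveThread` entries; the function names `parse_line` and `find_stall` are my own, not anything that ships with NetBackup. It treats any thread that logs `keepaliveThread` as the keepalive thread, then reports the last entry from any other thread and how many keepalives were sent after it.

```python
import re

# Matches lines shaped like the excerpt above, e.g.
# "8:57:57.915 PM: [5696.5636] <2> dtcp_write: TCP - success: ..."
LINE_RE = re.compile(
    r"^(?P<time>\d{1,2}:\d{2}:\d{2}\.\d{3} [AP]M): "
    r"\[(?P<pid>\d+)\.(?P<tid>\d+)\] <(?P<level>\d+)> "
    r"(?P<func>\w+(?:::\w+)*): (?P<msg>.*)$"
)

KEEPALIVE_FUNC = "tar_base::keepaliveThread"


def parse_line(line):
    """Return a dict for a recognised log line, or None for anything else."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None  # blank lines and wrapped continuation lines are skipped
    return m.groupdict()


def find_stall(lines):
    """Return (last non-keepalive-thread entry, keepalives sent after it)."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    # Any thread that ever logs from keepaliveThread is treated as the
    # keepalive thread; this is an assumption based on the excerpt above.
    keepalive_tids = {e["tid"] for e in entries if e["func"] == KEEPALIVE_FUNC}

    last_work = None
    keepalives_after = 0
    for e in entries:
        if e["tid"] in keepalive_tids:
            if e["func"] == KEEPALIVE_FUNC and last_work is not None:
                keepalives_after += 1
        else:
            last_work = e
            keepalives_after = 0
    return last_work, keepalives_after
```

To use it, read the log with something like `open(path, errors="replace").readlines()` (the path depends on your install) and pass the lines to `find_stall`. Against the excerpt above it should point at the `backup_startfile_state` entry for `C:\WINDOWS\file2`, which is the last thing the data thread logged before only keepalives remained.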

rjrumfelt
Level 6
If anyone else out there has come across this issue, I'm wondering what, if anything, has been done to fix it.  This behavior seems rather random, meaning that nothing specific to a client appears to cause it.  The client in question was backing up just fine before it was reimaged, and we've checked everything we could possibly check that could cause issues.

I'm wondering if an uninstall/reinstall of the NBU agent might solve this problem.

CRZ
Level 6
Employee Accredited Certified
I don't have specifics, but I do know you can request an EEB replacement for bpbkar32 from us which will fix the issue if you're experiencing it.

Note that the TechNote says the fix won't be available until 6.5.6 (not 6.5.5), so if and when you DO upgrade, you will probably have to ask us for another EEB for your new version if it's below 6.5.6.

rjrumfelt
Level 6
Would it be possible to request the EEB replacement through this forum?  Or do I need to open a request through Symantec Support?

CRZ
Level 6
Employee Accredited Certified
For this one, you'll need to open a Support case as it isn't externally available.

rjrumfelt
Level 6
I've got a support case open, and should be getting the file I need sometime in the near future.  They said they did not have a v6.5.2a version readily available.

rjrumfelt
Level 6
I haven't been able to install the EEB yet, but I was wondering if there might be anything else out there causing these backups to hang.

This is the oddest thing.  There's no rhyme or reason to when the backup just stops writing data.  I've tried to see if it is a particular file that it hangs on, and it isn't.  I've tried to see if there's a certain amount of data that is pushed before it hangs, but there isn't.

Does anyone know of any other problems that can cause a backup to hang after writing a random amount of data?  On average the most I'm able to write is about 800 MB before it hangs.  I've also let it run over the weekend, but it just does not error out.