Constant errors 21, 23, and 25
Master + Media servers: NetBackup 7.0.1, Windows Server 2003 R2
Clients: NetBackup 7.0.1, Windows Server 2008 R2 (VMs running IIS)
Getting intermittent failures on 2 clients, both of which are IIS boxes.
Failures are typically status 21, 23, or 25 (it changes each time; one drive's job might fail with a 21 while another fails with a 23 or 25).
Happens with Full and Cumulative Incremental backups.
Have tried the following:
Rerun the jobs during the day (normally run at night) - Still fails
Created a brand new policy (not copying the old one) - Also fails
Telnet to BPCD port - Successful response from all servers to the clients
Changing the connection timeouts - Still fails
Pinging/NSlookup/bptestbpcd from the master/media servers - Everything appears to be fine
Modified connect back under client properties - Still fails
Running bpps while the backups are starting, I can see vnetd and bpcd start on the clients, but the jobs still fail with the status codes listed above.
What makes this more confusing is that the errors are intermittent. Sometimes one of the clients will back up without issue, or a drive or two will back up without issue while the rest fail (i.e. C: will back up, but D: and Shadow Copy Components:\ will fail, while another night D: and C: will back up, but Shadow Copy Components:\ will fail). Other times everything will fail.
The jobs fail more often than they succeed. Reinstalling the software/rebooting the clients doesn't help.
No amount of tweaking settings appears to make any difference in success/failure.
Another odd issue with these clients is that when the jobs start up, they take a very long time to begin attempting to back up data (compared with jobs that are not failing).
Also, when I try to load host properties for either of the clients, it can take upwards of 15 minutes to bring up the information, whereas our other clients load within a minute or two.
Has anyone run into anything similar and might be able to shed some light on how to resolve this?
Example of one of the failed jobs:
19-Oct-2011 2:55:10 AM - requesting resource sg-mslb-ms02-ms04
19-Oct-2011 2:55:10 AM - requesting resource masterserver.NBU_CLIENT.MAXJOBS.client01
19-Oct-2011 2:55:10 AM - requesting resource masterserver.NBU_POLICY.MAXJOBS.VmwareGuests3
19-Oct-2011 2:55:11 AM - granted resource masterserver.NBU_CLIENT.MAXJOBS.client01
19-Oct-2011 2:55:11 AM - granted resource masterserver.NBU_POLICY.MAXJOBS.VmwareGuests3
19-Oct-2011 2:55:11 AM - granted resource V71611
19-Oct-2011 2:55:11 AM - granted resource Drive001
19-Oct-2011 2:55:11 AM - granted resource mediaserver02-hcart-robot-tld-0
19-Oct-2011 2:55:11 AM - estimated 4059396 Kbytes needed
19-Oct-2011 2:55:26 AM - started process bpbrm (26264)
19-Oct-2011 3:15:19 AM - mounting V71611
19-Oct-2011 3:15:19 AM - mounted; mount time: 00:00:00
19-Oct-2011 3:18:39 AM - positioning V71611 to file 120
19-Oct-2011 3:18:39 AM - positioned V71611; position time: 00:00:00
19-Oct-2011 3:18:40 AM - connecting
19-Oct-2011 3:22:34 AM - Error bpbrm(pid=25228) cannot create data socket, The operation completed successfully. (0)
19-Oct-2011 3:31:48 AM - mounted
19-Oct-2011 3:35:10 AM - positioning V71611 to file 120
19-Oct-2011 3:35:10 AM - positioned V71611; position time: 00:00:00
19-Oct-2011 3:48:25 AM - mounted
19-Oct-2011 3:48:25 AM - positioning V71611 to file 120
19-Oct-2011 3:55:00 AM - positioned V71611; position time: 00:06:35
19-Oct-2011 4:11:34 AM - mounted
19-Oct-2011 4:11:34 AM - positioning V71611 to file 120
19-Oct-2011 4:11:34 AM - positioned V71611; position time: 00:00:00
19-Oct-2011 4:11:34 AM - end writing
cannot connect on socket(25)
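To make the delays in the job details above easier to see, here is a minimal Python sketch that parses a few of the quoted lines and reports the time between consecutive entries. The function names (`parse`, `gaps`) and the 60-second threshold are only illustrative assumptions, and the timestamp format is assumed to match the lines exactly as pasted (English month names and AM/PM).

```python
from datetime import datetime

# A few lines copied from the job details quoted above.
LOG = """\
19-Oct-2011 2:55:26 AM - started process bpbrm (26264)
19-Oct-2011 3:15:19 AM - mounting V71611
19-Oct-2011 3:18:40 AM - connecting
19-Oct-2011 3:22:34 AM - Error bpbrm(pid=25228) cannot create data socket, The operation completed successfully. (0)
19-Oct-2011 3:31:48 AM - mounted
"""

def parse(line):
    """Split 'timestamp - message' and parse the timestamp.

    Assumes the format shown in the pasted log; %b and %p depend on an
    English locale.
    """
    stamp, _, message = line.partition(" - ")
    return datetime.strptime(stamp, "%d-%b-%Y %I:%M:%S %p"), message

def gaps(text, threshold_seconds=60):
    """Return (seconds, previous message, next message) for each long gap."""
    events = [parse(line) for line in text.splitlines() if " - " in line]
    found = []
    for (t0, m0), (t1, m1) in zip(events, events[1:]):
        delta = int((t1 - t0).total_seconds())
        if delta >= threshold_seconds:
            found.append((delta, m0, m1))
    return found

for seconds, before, after in gaps(LOG):
    print(f"{seconds:5d}s between '{before[:30]}' and '{after[:30]}'")
```

On these lines the largest gap is the roughly 20 minutes between bpbrm starting and the tape mount, which is consistent with the slow job startup described in the post.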
- You should enable debug logging on both the servers and the clients. For details, check the Troubleshooting Guide. In addition, check the following points:
* DNS is stable
* UAC is disabled
* The firewall is disabled, or port 1556 is open
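The DNS and port checks above can be roughly scripted. The following is a minimal Python sketch, not a NetBackup tool: it times a forward and reverse name lookup and attempts a TCP connection to the standard NetBackup ports (PBX 1556, vnetd 13724, bpcd 13782). The helper names (`check_port`, `time_dns`, `report`) are hypothetical, and `client01` is simply the client name that appears in the job log; run it from the master/media server against the client, and from the client against the servers.

```python
import socket
import time

# Standard NetBackup listener ports.
PORTS = {"pbx": 1556, "vnetd": 13724, "bpcd": 13782}

def check_port(host, port, timeout=5.0):
    """Attempt a TCP connection; return (connected, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            connected = True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        connected = False
    return connected, time.monotonic() - start

def time_dns(host):
    """Time a forward then reverse lookup; return (ip, reverse_name, seconds)."""
    start = time.monotonic()
    try:
        ip = socket.gethostbyname(host)
    except OSError:
        return None, None, time.monotonic() - start
    try:
        reverse_name = socket.gethostbyaddr(ip)[0]
    except OSError:
        reverse_name = None
    return ip, reverse_name, time.monotonic() - start

def report(host):
    """Print DNS timing and port reachability for one host."""
    ip, reverse_name, seconds = time_dns(host)
    print(f"{host}: ip={ip} reverse={reverse_name} lookup={seconds:.2f}s")
    for name, port in PORTS.items():
        connected, seconds = check_port(host, port)
        state = "open" if connected else "unreachable"
        print(f"  {name} ({port}): {state} in {seconds:.2f}s")
```

Calling `report("client01")` several times in a row may help here, since the failures are intermittent: a lookup that is occasionally slow, or a reverse name that does not match the forward name, would point toward DNS rather than the backup policy.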