06-12-2014 04:29 AM
Hi All,
We recently upgraded one of our appliances from 2.5.3 directly to 2.6.0.2. Since the upgrade, backups have been failing with status 24 and 14, mainly socket errors, and we can see Winsock error 10060 in bpbkar.
The interesting thing is that if we move the backups to an appliance that has not yet been upgraded to 2.6.0.2 and is still at 2.5.3, the backups run fine.
We have opened a case with Symantec and have been struggling with this for more than a week.
Has anyone had similar issues after the upgrade? Kindly share and advise.
08-10-2014 05:13 AM
This was tried and did not help. The issue is persistent whether WAN optimization is ON or OFF.
Another update: upgrading to 2.6.0.3 did not help, and the issue is still there.
08-10-2014 05:30 AM
Same here
08-11-2014 11:59 AM
08-12-2014 06:20 PM
We have been chasing this issue for over a week. Same here: we were told it was our network, ran the AppCritical test, and so on. Still no fix. Backups going to basic disk on the same appliance do not experience the slow speeds or socket errors. When we go to PureDisk, the issue starts right away. We still have a ticket open with Symantec, but so far no fix. Has anyone figured out a fix for this?
08-13-2014 01:24 AM
Hi. There is an ETRACK created for this. I will let you guys know once it is out.
08-14-2014 03:30 PM
Our status 24's started on a freshly installed 5320 (2.6.0.2) without any upgrade or catalog import... just taking over for the old Windows master. When we flipped back to the Windows master (7.5.0.4), there were no issues at all. We did this several times and it is consistent.
Our original client version was 6.5.x. In the course of troubleshooting, the clients were upgraded to 7.6.0.2, which had no effect on the issue.
08-21-2014 10:49 AM
Hi all,
If you have a case open with support regarding this issue, could you please pass on the case number so I can look into it?
Thanks,
frank.
08-22-2014 04:52 AM
Same problem here.
Updated to 2.6.0.3 on a 5230. My Exchange backups end incomplete with status 42 and 24:
24 on the backup, 42 on the snapshot. I already tried increasing timeouts and disabling WAN optimization, with no effect. I am reverting to backing up to virtual tape so that I at least have a disaster copy (no granular restore possible).
I will open a case as well, because this seems to be a real problem.
Regards,
Heino
08-26-2014 09:20 AM
We have seen customers who are experiencing this issue as well. In several of our cases we made the changes below and the issue immediately went away.
Symptoms:
Network-related errors (status codes 14, 24, 40, and/or 42) are occurring in large numbers on Symantec NetBackup Appliances running 2.6.0.1, 2.6.0.2, or 2.6.0.3 code.
Cause:
Symantec attempted to tune the TCP stack on the appliances in the 2.6.0.x releases, and one of the settings that changed, the net.ipv4.tcp_timestamps kernel setting, is resulting in these errors. In prior versions of the appliances (2.5.x), this setting had a value of 1, but in the 2.6.0.x code it was changed to 0 (zero).
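As a quick way to see which state a given host is in, the live value can also be read from /proc. This is only a minimal sketch: the function name tcp_timestamps_state and its optional path argument are illustrative assumptions (the argument exists so the function can be tried against a test file), and it assumes a Linux host that exposes /proc/sys/net/ipv4/tcp_timestamps.

```shell
#!/bin/sh
# Hypothetical helper: report whether TCP timestamps are on or off.
# $1 (optional) is the file to read; it defaults to the kernel's /proc entry.
tcp_timestamps_state() {
    proc_file="${1:-/proc/sys/net/ipv4/tcp_timestamps}"
    if [ ! -r "$proc_file" ]; then
        # Not a Linux host, or /proc is not readable here.
        echo "unknown"
        return 1
    fi
    value=$(cat "$proc_file")
    case "$value" in
        0) echo "disabled" ;;   # the 2.6.0.x setting described above
        *) echo "enabled" ;;    # any non-zero value, e.g. the 2.5.x setting of 1
    esac
}
```

Calling tcp_timestamps_state with no argument on the appliance should print "disabled" on an affected 2.6.0.x system, if the description above applies to it.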
Resolution:
1. Using Putty, another SSH utility, or via the IPMI/remote console connection, log into the appliance as the admin user and become root in the maintenance mode.
A. Go to the Main Menu->Support->Maintenance option and enter the maintenance password.
B. run "/opt/Symantec/scspagent/IPS/sisipsoverride.sh" to temporarily override the security policy. This should take the same maintenance password as entered above.
C. type "elevate" to become root
2. Confirm the current setting is equal to zero by running
sysctl -a | grep tcp_timestamps
3. As the root user, make a backup copy of the file to edit:
cp /etc/sysctl.conf /etc/sysctl.conf.tcptimestampsoff
4. Then if comfortable using vi to edit files run "vi /etc/sysctl.conf" and change the line that reads:
net.ipv4.tcp_timestamps = 0
To instead read:
net.ipv4.tcp_timestamps = 1
Save the file using either ESC followed by :wq, or ESC followed by ZZ (capital Z twice).
Otherwise, run the following from the command line:
sed -i -e 's/^net\.ipv4\.tcp_timestamps =.*/net.ipv4.tcp_timestamps = 1/' /etc/sysctl.conf
5. Verify the tcp_timestamps now equals 1 in the file by running
grep timestamps /etc/sysctl.conf
6. Run "sysctl -p" at the command line.
Note: This change does not require a reboot.
7. Confirm the setting change by running the below again
sysctl -a | grep tcp_timestamps
Note: If jobs still end with status 14 or 24 after making the above change, make sure the TCP Chimney setting is disabled on the clients experiencing the errors. On Windows 2008 and later servers, this can be done by running "netsh int tcp set global chimney=disabled" from an elevated command prompt; this also does not require a reboot. A client can also get a status 24 error for other reasons, for example firewalls or virus scanners, but this KB article does not cover those troubleshooting steps.
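Steps 3 to 5 above (back up the file, change the line, verify it) can be sketched as one small shell function. This is a hedged sketch, not the KB's own script: the function name enable_tcp_timestamps is hypothetical, it assumes GNU sed (for -i), and the demonstration runs against a scratch file rather than /etc/sysctl.conf. Unlike the one-line sed above, it also appends the setting if the file has no such line.

```shell
#!/bin/sh
# Hypothetical helper: back up FILE, then force net.ipv4.tcp_timestamps = 1 in it.
enable_tcp_timestamps() {
    conf="$1"
    # Step 3: keep a copy of the original, using the suffix from the steps above.
    cp -p "$conf" "$conf.tcptimestampsoff" || return 1
    if grep -q '^net\.ipv4\.tcp_timestamps' "$conf"; then
        # Step 4: rewrite the existing line in place (GNU sed assumed).
        sed -i -e 's/^net\.ipv4\.tcp_timestamps[[:space:]]*=.*/net.ipv4.tcp_timestamps = 1/' "$conf"
    else
        # No such line yet: append the setting instead.
        echo 'net.ipv4.tcp_timestamps = 1' >> "$conf"
    fi
    # Step 5: show the resulting line for verification.
    grep '^net\.ipv4\.tcp_timestamps' "$conf"
}

# Demonstration on a scratch file, so nothing under /etc is touched.
scratch=$(mktemp)
printf 'net.ipv4.tcp_syncookies = 1\nnet.ipv4.tcp_timestamps = 0\n' > "$scratch"
enable_tcp_timestamps "$scratch"
```

On the appliance itself, the equivalent would be to run the function as root against /etc/sysctl.conf and then run "sysctl -p" as in step 6 to load the new value.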
08-26-2014 11:42 AM
I have seen this before. The steps that should help are the same Symptoms, Cause, and Resolution given in the previous post.
08-27-2014 01:28 AM
ChetteB .. very useful information - thanks
08-28-2014 05:53 AM
Thanks ChetteB, we just received this solution from backline as well and can confirm it works.
We have two issues going on. The first, the socket errors, is now resolved. The second is that when we do tape-outs (disk to tape for month-end needs), backup performance drops to the point where no backups can run because their speeds are so slow. I'm talking about a huge difference: from 50,000 KB per second down to 200 KB per second on one of our appliances that houses Windows MS-Standard backups and VMware snapshot backups.
Symantec advised us to limit the hours when we run duplication jobs and to limit concurrent write drives. The problem is that we have so much to duplicate in a small window that, using 2-4 tape drives, it will never finish. Symantec agreed, but unfortunately they cannot do any more; the duplication jobs have a direct, huge performance hit on the backups.
We are strongly considering backing up to AdvancedDisk and then using an SLP to MSDP/tape, just like kwachtler mentioned above, or else we won't be able to fulfill our month-end requirements. When we back up directly to AdvancedDisk with the duplication jobs running, we see no performance hit (no inline duplication).
It's just very disappointing. We refreshed our entire NetBackup environment counting on these appliances and have spent a lot of money, and the performance of these appliances has been very disappointing. I never thought I would see the day when I wished we had our FalconStor VTL back...
09-02-2014 11:07 AM
I am having the same issue. We are getting not only the status 24 and 14 errors, but also 13 and 87. I made the changes that ChetteB mentions above, including disabling TCP Chimney, but that did not fix the issue for us. I also made the following changes per several documents I found:
- Per TID TECH222380 (also mentioned above), disabled WANOptimization on the appliance under "Network-WANOptimization" using the CLISH. ***Did not help.
- Per TID TECH206337, changed the setting in the NET_BUFFER_SZ file on the appliance to "0". This disables auto-tuning for TCP stacks. ***Did not help.
- Changed the "Client Read Timeout" and "File Browse Timeout" from 300 to 2400 on the media servers (appliances) under "Host Properties-Media Servers-MediaServerName-Timeouts". ***Did not help.
- Per TID TECH55653, ran the following commands to disable other TCP Chimney parameters: "netsh int tcp set global rss=disabled" and "netsh int tcp set global netdma=disabled". The backup finished successfully, although it ran much slower than normal.
I will make the same changes to a few other clients and see what happens. Any thoughts?
Thanks,
Larry
09-04-2014 09:49 PM
Hello,
I am experiencing a similar issue: Windows file system backups are failing with status code 14.
I tried the solutions below, but no luck:
1) Changed the setting in the NET_BUFFER_SZ file on the appliance to "0".
2) Increased the timeouts.
3) Applied the solution given by ChetteB.
Please help me fix the issue. Also, I would like to know how to download the ETRACK for this.
Thanks
Pavan
09-08-2014 12:30 PM
If the above fix does not work, I have found that these steps also help.
Symptoms:
Recommended for networking/socket errors, e.g. status 24, 23, 14, 40, etc.
Cause:
No network is perfect. The following steps can help make things a bit more resilient to minor socket issues.
Resolution:
TCP Timed Wait Delay
TCP Max Data Retransmissions
TCP Max Connect Retransmissions
Keep Alive Time
5. Quit Registry Editor and reboot for the changes to take effect.
3. Next look to see the state of RSS, chimney and autotuning: (they all should be disabled)
For windows 2008 and above:
netsh int tcp show global
To disable:
For 2003:
netsh int ip set chimney DISABLED
How to disable receive-side scaling in Windows Server 2003
To disable receive-side scaling in the network adapter driver in Windows Server 2003, follow these steps:
NOTE: There is no auto-tuning for 2003.
09-11-2014 03:19 PM
Hello All,
For us, enabling TCP timestamps worked :)
No status 14/24 errors for the last 6 days. It looks like the issue is resolved.
Anyone else with the same result?
09-17-2014 05:06 PM
Problem resolved here, by enabling TCP_TIMESTAMPS on the 5230 Media server: