
Need urgent solution: Backup failed with status 24

2013
Level 5

Hello Team,

NBU Master Server: 7.5.0.7

Client: 7.5.0.7 / Client name: test.123

Policy: policy.xxx

I have gone through all the Tech Notes and found that 2 mount points are creating the issue. I have set logging to level 5, created the touch file bpbkar_path_tr, and run the backup. It fails with status 24.

Can someone please check the logs and suggest what needs to be excluded? So far there is no exclude list. I have attached the bpbkar logs, and they seem to show a problem.

 

1 ACCEPTED SOLUTION


RonCaplinger
Level 6

Another source of these problems could be the TCP "ring buffers" and TCP offloading. We tried using some Cisco UCS blades as media servers last year and had to switch back to physical hardware when we had intermittent status 24s and 2074s. Two of the steps we tried were increasing the TCP ring buffers to their maximum value, 4096, and disabling all TCP offloading.

Here's a link describing how the ring buffers in Linux work:

http://www.linuxjournal.com/content/queueing-linux-network-stack
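
For reference, ring buffer and offload changes like the ones described above are usually made with ethtool on Linux. This is only a minimal sketch, not the exact commands used on those UCS blades: the interface name eth0 is an assumption, the 4096 maximum is the value quoted above and may differ on other NICs (check the "Pre-set maximums" in the -g output first), and which offload features exist depends on the driver.

```shell
# Assumed interface name; replace with the NIC that carries backup traffic.
IFACE=eth0

# Show the current and maximum ring buffer sizes.
ethtool -g "$IFACE"

# Raise the RX/TX ring buffers to the maximum value mentioned above.
ethtool -G "$IFACE" rx 4096 tx 4096

# Show the current offload settings, then disable the common offloads.
ethtool -k "$IFACE"
ethtool -K "$IFACE" tso off gso off gro off
```

Changes made this way are lost at reboot unless they are also added to the distribution's network configuration.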

 


36 REPLIES

2013
Level 5

Uploaded BPBKAR Logs

Marianne
Level 6
Partner    VIP    Accredited Certified

With this kind of issue we need logs on the media server as well: bptm and bpbrm.

We can see this error in bpbkar log:

ERR - Cannot write to STDOUT. Errno = 110: Connection timed out

What are the Client Connect and Client Read timeouts on the media server?

There is about 15 minutes between the last file entry and the failure:

10:21:35.971 [24473] <2> bpbkar SelectFile: INF - cwd = /export/muse/home/OvidSPUS_template/backup
10:21:35.971 [24473] <2> bpbkar SelectFile: INF - path = PUBMED.1215558944584.bak
10:36:29.828 [23897] <16> flush_archive(): ERR - Cannot write to STDOUT. Errno = 110: Connection timed out
10:36:29.828 [23897] <16> bpbkar Exit: ERR - bpbkar FATAL exit status = 24: socket write failed
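
The gap can be checked quickly from the two timestamps. A small sketch follows; the function names are made up for illustration, fractions of a second are ignored, and both timestamps are assumed to fall on the same day:

```shell
# Convert an HH:MM:SS(.mmm) log timestamp to whole seconds since midnight.
to_seconds() {
    echo "$1" | awk -F'[:.]' '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Print the number of seconds between two timestamps (start, end).
gap_seconds() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

# Last file entry vs. the flush_archive() error in the log above.
gap_seconds 10:21:35.971 10:36:29.828   # prints 894
```

894 seconds is just under 15 minutes (900 seconds).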

What is the size of these .bak files?

Similar issues here:

http://www.symantec.com/docs/TECH74690 

https://www-secure.symantec.com/connect/forums/failed-error-24

https://www-secure.symantec.com/connect/forums/backup-failed-error-24cannot-write-stdout-errno-110-c...

 

 

2013
Level 5

Thanks for the reply.

What is the size of these .bak files?

-rw-r--r-- 1 webman webman 31583 Aug 4 2008 /export/muse/home/OvidSPUS_template/backup/PUBMED.1215558944584.bak

31K

-----------------------------------------------------------------------------------
Ipv6 is disabled on Linux box

-----------------------------------------------------------------------------------

Media Server:

Client connect timeout :9600
Client read Timeout    :9600

-------------------------------------------------------------------------------------

Client :

Client read timeout :9600

-------------------------------------------------------------------------------------

The backup is going to disk. I can also assure you that there is no issue with the media server, because many other backups run on the same media server without problems. There are other media servers too, and when the backup runs on them it also fails with 24. So the issue seems to be with this client.

Also i found in bpbkar logs :

10:04:32.065 [7762] <2> bpbkar SelectFile: INF - path = failed
10:04:32.065 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.065 [7762] <2> bpbkar SelectFile: INF - path = class@input@input0
10:04:32.065 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.065 [7762] <2> bpbkar SelectFile: INF - path = class@input@input1
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = class@input@input2
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = class@input@input3
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = class@input@input4
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = class@misc@nvram
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:00.0
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:10.0
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:10.1
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:10.2
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:11.0
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:13.0
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:15.0
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:16.0
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1d.0@usb1@1-0:1.0
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1d.1@usb2@2-0:1.0
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1d.2@usb3@3-0:1.0
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1d.3@usb4@4-0:1.0
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1d.7@usb6@6-0:1.0
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1e.0
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1e.0@0000:01:03.0
10:04:32.066 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1e.0@0000:01:04.0
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1e.0@0000:01:04.2
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1e.0@0000:01:04.4@usb5@5-0:1.0
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1e.0@0000:01:04.4@usb5@5-1@5-1:1.0
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1e.0@0000:01:04.4@usb5@5-1@5-1:1.1
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1e.0@0000:01:04.4@usb5@5-2@5-2:1.0
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1e.0@0000:01:04.6
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pci0000:00@0000:00:1f.0
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@platform@i8042
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@platform@i8042@serio0
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@platform@ipmi_bmc.0000.17
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@platform@serial8250
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@platform@vesafb.0
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pnp0@00:00
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pnp0@00:01
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pnp0@00:02
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pnp0@00:03
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pnp0@00:04
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pnp0@00:05
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.067 [7762] <2> bpbkar SelectFile: INF - path = devices@pnp0@00:06
10:04:32.068 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.068 [7762] <2> bpbkar SelectFile: INF - path = devices@pnp0@00:07
10:04:32.068 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed
10:04:32.068 [7762] <2> bpbkar SelectFile: INF - path = devices@pnp0@00:08
10:04:32.068 [7762] <2> bpbkar SelectFile: INF - cwd = /dev/.udev/failed

------------------------------------------------------------------------------------------------------------------------------

I even excluded the path:

cat /usr/openv/netbackup/exclude_list
/export/muse/home/OvidSPUS_template/backup/

There is no permission issue, but it still failed with 24.

 

2013
Level 5

Please find the latest bpbkar logs, taken after excluding the path /export/muse/home/OvidSPUS_template/backup


rookie11
Moderator
   VIP   

Could you please try MP1 and MP2 in separate policies and increase the client connect timeout setting to the maximum? Give it a try. Even if it fails, the bpbkar logs will be easier to diagnose.

2013
Level 5

I did not follow. What do you mean by MP1 and MP2? Can you please explain?

2013
Level 5

Currently I am backing up the 2 different mount points by enabling multiple streams in a single policy. Do you want me to create 2 policies and run the mount points one at a time?

Marianne
Level 6
Partner    VIP    Accredited Certified

...... found 2 mount points is creating issue

Let's go back to your original post...

Have you tried to create separate policies or streams for these 2 mount points?
Before starting separate backup, rename existing bpbkar log to ensure new log is created for each mount point.

/dev should be excluded from backups - device files are automatically created when OS is installed.
So, there will never be a reason to restore /dev.
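
A minimal sketch of adding /dev to the client's exclude list, assuming the file /usr/openv/netbackup/exclude_list shown earlier in this thread with one entry per line. The helper name add_exclude is hypothetical; it appends the entry only if an identical line is not already present.

```shell
# Append an entry to an exclude list file unless the exact line already exists.
add_exclude() {
    exclude_file=$1
    exclude_entry=$2
    touch "$exclude_file"
    grep -qxF -- "$exclude_entry" "$exclude_file" ||
        printf '%s\n' "$exclude_entry" >> "$exclude_file"
}

# Example call on the client (run as root):
#   add_exclude /usr/openv/netbackup/exclude_list /dev
```

Running it twice leaves a single /dev line in the file, so it is safe to repeat.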

Anything 'different' about /export?
What type of filesystem?

I seem to remember other forum posts about issues with /export....
Let me see if I can find it....

We still see 15 minutes between last entry in bpbkar and the failure:

02:54:16.596 [16463] <2> fscp_is_tracked: disabled tla_init
03:10:17.822 [16463] <16> flush_archive(): ERR - Cannot write to STDOUT. Errno = 110: Connection timed out 

So - there is 15 minute timeout 'somewhere'. 

Have you enabled bptm and bpbrm logs on the media server? 
We may need to look at those as well.

Is there a firewall anywhere in the picture?

What is OS KeepAlive setting on the client?

Mark_Solutions has previously posted these recommendations. Please implement them and see if it makes a difference:

# echo 510 > /proc/sys/net/ipv4/tcp_keepalive_time

# echo 3 > /proc/sys/net/ipv4/tcp_keepalive_intvl

# echo 3 > /proc/sys/net/ipv4/tcp_keepalive_probes

The changes can be made persistent by adding the following to /etc/sysctl.conf:

 ## Keepalive at 8.5 minutes

 # start probing for heartbeat after 8.5 idle minutes (default 7200 sec)

net.ipv4.tcp_keepalive_time=510

 # close connection after 3 unanswered probes (default 9)

net.ipv4.tcp_keepalive_probes=3

 # wait 3 seconds for response to each probe (default 75)

net.ipv4.tcp_keepalive_intvl=3

and then run: chkconfig boot.sysctl on
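
To see why these values matter against a 15 minute (900 second) timeout, the arithmetic can be sketched as below. The function name is made up for illustration. Note that kernel keepalive only applies to sockets that have SO_KEEPALIVE enabled; whether that is true for the NetBackup data socket is not confirmed in this thread.

```shell
# Seconds until an idle connection whose peer never answers is dropped:
# tcp_keepalive_time + tcp_keepalive_probes * tcp_keepalive_intvl
keepalive_drop_after() {
    echo $(( $1 + $2 * $3 ))
}

keepalive_drop_after 510 3 3      # recommended values: prints 519
keepalive_drop_after 7200 9 75    # Linux defaults:     prints 7875

# On the client, the live values can be read with:
#   sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_probes net.ipv4.tcp_keepalive_intvl
# and /etc/sysctl.conf can be reloaded without a reboot with:
#   sysctl -p
```

With the recommended values the first probe goes out after 510 idle seconds, well inside a 900 second idle timeout. With the defaults the first probe is only sent after 7200 seconds, long after such a timeout would have closed the connection.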

2013
Level 5

Currently I am backing up the 2 different mount points by enabling multiple streams in a single policy. However, I will create 2 policies, rename the bpbkar log, and send the logs.

This is a prod server, and it will be rebooted in another 3 days. Should I wait for the reboot, or make the changes you suggested now?

What is OS KeepAlive setting on the client?

net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 7200

Firewall is not there.

Full backup data: 29 GB in /export and 7.5 GB in /.

Marianne
Level 6
Partner    VIP    Accredited Certified

You can go ahead and make the changes.

The /etc/sysctl.conf entries are needed to make the changes persistent across reboots.

2013
Level 5

Marianne,

To implement the keepalive changes you quoted from Mark_Solutions (the three echo commands for tcp_keepalive_time, tcp_keepalive_intvl and tcp_keepalive_probes, the matching /etc/sysctl.conf entries, and chkconfig boot.sysctl on), the customer is asking for a Symantec document.

Can you please help me get one?

2013
Level 5

Marianne,

Can you please point me to any Symantec document on the OS keepalive settings so that we can implement the changes? This is a customer requirement.

Mark_Solutions
Level 6
Partner Accredited Certified

2013 - these settings have come from experience and from contact with Symantec backline support over many years of using NetBackup. I am not sure whether they appear in a document anywhere.

Your backup clearly fails after a 15 minute timeout, so something is closing down the ports after that period of time (15 minutes = 900 seconds).

The usual cause is a firewall, a hardware one that sits on the network between the client and the media server, but it can also be NetBackup timeout settings.

As the timeouts are generally governed by the media server settings, you have already exceeded all of these, but out of interest, if you connect to the client's host properties, what are its timeouts?

The most likely cause is a network or operating system problem, and as other jobs work, it is likely to be on this client or between this client and the media server.

Try the keepalive settings. For your test you need nothing other than the commands below, and you can change them back if they don't help:

# echo 510 > /proc/sys/net/ipv4/tcp_keepalive_time

# echo 3 > /proc/sys/net/ipv4/tcp_keepalive_intvl

# echo 3 > /proc/sys/net/ipv4/tcp_keepalive_probes

mph999
Level 6
Employee Accredited

I'm not aware that Symantec has such a document -  these are not NBU settings, they are OS settings so really you should address and investigate using your OS support vendor - NBU is the casualty in this issue, not the cause.

After all, if you had a toothache, you would go to the dentist, not the doctor ...

The examples I have for these types of settings have been customer specific, that is set to an individual environment.

I have attached some notes for you on these sort of settings.

 

Mark_Solutions
Level 6
Partner Accredited Certified

And to be doubly sure, can we see the output of this command on your client please:

# iptables -L -n

2013
Level 5

I did it but no luck.

# echo 510 > /proc/sys/net/ipv4/tcp_keepalive_time

# echo 3 > /proc/sys/net/ipv4/tcp_keepalive_intvl

# echo 3 > /proc/sys/net/ipv4/tcp_keepalive_probes

Again it hangs after 15 minutes.

iptables is disabled everywhere in the Linux environment. We are relying on the network firewall.

mph999
Level 6
Employee Accredited

You need to investigate this issue then with your OS support and Network admins.

In the same way that Symantec would provide details for NetBackup tuning settings, the OS provider should assist you with OS settings.

Mark_Solutions
Level 6
Partner Accredited Certified

"we are relying on network firewall...."

Does that lie between the client and the media server?

Did you check the client host properties timeout settings?