03-09-2017 06:19 AM
I have a backup issue:
read from input socket failed(636)
8.3.2017. 22:37:48 - Error bpbrm(pid=9572) could not write FILE ADDED message to OUTSOCK
8.3.2017. 22:37:50 - Info bptm(pid=12836) waited for full buffer 22367 times, delayed 111543 times
8.3.2017. 22:37:51 - Error bpbrm(pid=9572) could not write FILE ADDED message to OUTSOCK
8.3.2017. 22:37:51 - Info bptm(pid=12836) EXITING with status 0 <----------
This solution did not help:
https://www.veritas.com/support/en_US/article.000018102
03-09-2017 06:25 AM
03-09-2017 06:34 AM
What information do you need?
03-09-2017 06:47 AM
Details about failing backup -
Policy type
If database - is this the parent or child job? Status of other jobs in this policy run?
How long before backup fails?
Please show us ALL text in Details tab of failed job.
OS and NBU versions on Master, Media server and client.
Is there a firewall anywhere in the picture?
If so, what is the firewall timeout value?
Have you modified KeepAliveTime on Master, Media server and Client?
Were the systems restarted after the change?
When starting a new discussion, please keep point (B) of this blog in mind (bearing in mind that we don't know anything about your environment):
https://vox.veritas.com/t5/NetBackup/NetBackup-Basics-and-how-to-make-YOUR-life-easier/td-p/491393
03-09-2017 06:57 AM
Policy type: MS-Windows
No firewall between master, media server, and client
Windows 2008 R2 SP1, NetBackup 7.6.0.1 on master and media server
Windows 2008 R2 SP1, NetBackup 7.5.0.5 on client
KeepAliveTime is set to 900000 milliseconds on the master and media server (both were restarted after the change)
8.3.2017. 22:00:25 - Info nbjm(pid=6596) starting backup job (jobid=1262303) for client hrs000201.sitsremote.net, policy EBC_Data_Hrs000201.sitsremote.net, schedule DataDB_Inc
8.3.2017. 22:02:41 - estimated 28271390 Kbytes needed
8.3.2017. 22:02:41 - Info nbjm(pid=6596) started backup (backupid=hrs000201.sitsremote.net_1489006961) job for client hrs000201.sitsremote.net, policy EBC_Data_Hrs000201.sitsremote.net, schedule DataDB_Inc on storage unit dd860bj-Data_SQL
8.3.2017. 22:02:42 - started process bpbrm (9572)
8.3.2017. 22:02:48 - Info bpbrm(pid=9572) hrs000201.sitsremote.net is the host to backup data from
8.3.2017. 22:02:48 - Info bpbrm(pid=9572) reading file list for client
8.3.2017. 22:02:49 - connecting
8.3.2017. 22:02:53 - Info bpbrm(pid=9572) starting bpbkar32 on client
8.3.2017. 22:02:53 - connected; connect time: 0:00:04
8.3.2017. 22:02:55 - Info bpbkar32(pid=2160) Backup started
8.3.2017. 22:02:55 - Info bptm(pid=12836) start
8.3.2017. 22:02:55 - Info bptm(pid=12836) using 1048576 data buffer size
8.3.2017. 22:02:55 - Info bptm(pid=12836) setting receive network buffer to 4195328 bytes
8.3.2017. 22:02:55 - Info bptm(pid=12836) using 32 data buffers
8.3.2017. 22:02:59 - Info bptm(pid=12836) start backup
8.3.2017. 22:03:01 - Info bptm(pid=12836) backup child process is pid 11448.10496
8.3.2017. 22:03:01 - Info bptm(pid=11448) start
8.3.2017. 22:03:01 - begin writing
8.3.2017. 22:03:05 - Info bpbkar32(pid=2160) change journal NOT enabled for <E:\WindowsImageBackup>
read from input socket failed(636)
8.3.2017. 22:37:48 - Error bpbrm(pid=9572) could not write FILE ADDED message to OUTSOCK
8.3.2017. 22:37:50 - Info bptm(pid=12836) waited for full buffer 22367 times, delayed 111543 times
8.3.2017. 22:37:51 - Error bpbrm(pid=9572) could not write FILE ADDED message to OUTSOCK
8.3.2017. 22:37:51 - Info bptm(pid=12836) EXITING with status 0 <----------
8.3.2017. 22:37:51 - Info bpbrm(pid=9572) validating image for client hrs000201.sitsremote.net
8.3.2017. 22:37:57 - Error bpbrm(pid=9572) could not write EXIT STATUS to OUTSOCK
8.3.2017. 22:37:57 - Info bpbkar32(pid=2160) done. status: 0: the requested operation was successfully completed
03-09-2017 07:48 AM
Is this a new issue? Were the backups running successfully before?
Is this the only client that is seeing the status 636 error during a backup?
03-09-2017 11:24 PM
Yes, the backups were running successfully before.
No, this happens on random clients.
03-09-2017 11:27 PM
It would be nice to see a reply to @GeForce123's questions.
We need to determine if the issue is with one or many clients, this specific media server, or more than one media server.
The job details seem to indicate that data is backed up successfully to the media server, but that the media server is having problems updating the master server with catalog info:
...bpbrm... could not write FILE ADDED message to OUTSOCK
Do you have bpbrm log on the media server? Level 3 should be fine.
03-09-2017 11:33 PM
I tried changing the media server, but the problem persists.
In this job the media server is also the master server.
03-09-2017 11:44 PM
I will download the file and have a look when time permits.
I found similar issue in this forum in the meantime:
https://vox.veritas.com/t5/NetBackup/status-636-status-42-errors-backing-up-to-MSDP-pool/m-p/568773
Unfortunately the post was never updated by the OP.
Please have a look at all of the suggestions (including the other posts that I have referenced).
The last post by TJ_Henning seems like a good idea.
03-10-2017 12:04 AM
The bpbrm logging level is too low. There is nothing else other than what we see in Job Details.
Please ensure that bpdbm log on master server exists as well.
(Restart for increased logging level will be needed.)
You will need to trace comms between bpbrm and bpdbm.
PS:
Are you aware that all NBU levels up to 7.6.1.x reached EOSL on 1 Feb?
03-10-2017 07:20 AM
@Vjeksa wrote: Yes, the backups were running successfully before.
No, this happens on random clients.
Before what? What changes were made?
Are these backups going to tape or disk?
03-13-2017 04:03 AM
No changes were made (except Windows updates).
Backups are going to disk.
03-13-2017 09:35 PM
vjeksa, you really need to look more closely at what Marianne has suggested so far. She mentioned plenty of links to check, not to mention looking into the logs.
8.3.2017. 22:37:48 - Error bpbrm(pid=9572) could not write FILE ADDED message to OUTSOCK
8.3.2017. 22:37:50 - Info bptm(pid=12836) waited for full buffer 22367 times, delayed 111543 times
8.3.2017. 22:37:51 - Error bpbrm(pid=9572) could not write FILE ADDED message to OUTSOCK
The bptm "waited for full buffer" message indicates a performance issue reading from your client: either a slow network or very slow read I/O on the client. That would explain the timeout (error 636).
I see that you increased the TCP keepalive time, which was a good step but didn't help. What about client_read_timeout on both the master server and the client? Have you increased that? If that doesn't help either, you should focus on the network or the client's read I/O.
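The two counters in the bptm line quoted above can be extracted from Job Details so they can be compared between jobs. The following is a minimal Python sketch that assumes the line format shown in this thread; the function name `parse_buffer_stats` is hypothetical and is not part of NetBackup.

```python
import re

# Pattern for the bptm summary line as it appears in the Job Details above.
BUFFER_LINE = re.compile(
    r"bptm\(pid=(?P<pid>\d+)\) waited for full buffer (?P<waits>\d+) times, "
    r"delayed (?P<delays>\d+) times"
)

def parse_buffer_stats(line):
    """Return (pid, waits, delays) from a bptm summary line, or None if absent."""
    m = BUFFER_LINE.search(line)
    if m is None:
        return None
    return int(m.group("pid")), int(m.group("waits")), int(m.group("delays"))

# Line copied from the failed job in this thread.
sample = ("8.3.2017. 22:37:50 - Info bptm(pid=12836) waited for full buffer "
          "22367 times, delayed 111543 times")
print(parse_buffer_stats(sample))
```

Running this over the Job Details of a good job and a failed job for the same client gives a rough way to see whether the wait and delay counts differ noticeably between them.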
03-17-2017 01:04 AM
Client read timeout is set to 1800 s on both sides.
Here is a new example with logs:
16.3.2017. 22:31:42 - Info nbjm(pid=6592) starting backup job (jobid=1277462) for client HRSNBEBCM1, policy EBC_Data_HRSNBEBCM.erste.hr1-2, schedule DataDB_Inc
16.3.2017. 22:31:42 - Info nbjm(pid=6592) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=1277462, request id:{B1D64C47-6B05-4D5F-BEFB-80930DC18A50})
16.3.2017. 22:31:42 - requesting resource dd860bj-Data_SQ-Hrs00076
16.3.2017. 22:31:42 - requesting resource filebackup.NBU_CLIENT.MAXJOBS.HRSNBEBCM1
16.3.2017. 22:31:42 - requesting resource filebackup.NBU_POLICY.MAXJOBS.EBC_Data_HRSNBEBCM.erste.hr1-2
16.3.2017. 22:31:43 - granted resource filebackup.NBU_CLIENT.MAXJOBS.HRSNBEBCM1
16.3.2017. 22:31:43 - granted resource filebackup.NBU_POLICY.MAXJOBS.EBC_Data_HRSNBEBCM.erste.hr1-2
16.3.2017. 22:31:43 - granted resource MediaID=@aaab7;DiskVolume=Data&SQL;DiskPool=dd860bj-Data_SQL;Path=Data&SQL;StorageServer=dd860bj.erste.hr;MediaServer=filebackup
16.3.2017. 22:31:43 - granted resource dd860bj-Data_SQ-Hrs00076
16.3.2017. 22:31:48 - estimated 13505382 Kbytes needed
16.3.2017. 22:31:48 - Info nbjm(pid=6592) started backup (backupid=HRSNBEBCM1_1489699908) job for client HRSNBEBCM1, policy EBC_Data_HRSNBEBCM.erste.hr1-2, schedule DataDB_Inc on storage unit dd860bj-Data_SQ-Hrs00076
16.3.2017. 22:31:48 - started process bpbrm (10804)
16.3.2017. 22:31:48 - started
16.3.2017. 22:32:35 - Info bpbrm(pid=10804) HRSNBEBCM1 is the host to backup data from
16.3.2017. 22:32:43 - Info bpbrm(pid=10804) reading file list for client
16.3.2017. 22:32:43 - connecting
16.3.2017. 22:35:06 - Info bpbrm(pid=10804) starting bpbkar32 on client
16.3.2017. 22:35:06 - connected; connect time: 0:02:23
16.3.2017. 22:35:07 - Info bpbkar32(pid=45844) Backup started
16.3.2017. 22:35:07 - Info bpbkar32(pid=45844) change time comparison:<disabled>
16.3.2017. 22:35:07 - Info bpbkar32(pid=45844) archive bit processing:<enabled>
16.3.2017. 22:35:08 - Info bptm(pid=14940) start
16.3.2017. 22:35:08 - Info bptm(pid=14940) using 1048576 data buffer size
16.3.2017. 22:35:08 - Info bptm(pid=14940) setting receive network buffer to 4195328 bytes
16.3.2017. 22:35:08 - Info bptm(pid=14940) using 32 data buffers
16.3.2017. 22:35:08 - Info bpbkar32(pid=45844) not using change journal data for <D:\Websites>: not enabled
16.3.2017. 22:35:10 - Info bptm(pid=14940) start backup
16.3.2017. 22:35:11 - Info bptm(pid=14940) backup child process is pid 7284.12560
16.3.2017. 22:35:11 - begin writing
16.3.2017. 22:35:11 - Info bptm(pid=7284) start
16.3.2017. 22:40:06 - Error bpbrm(pid=8248) could not write FILE ADDED message to OUTSOCK
read from input socket failed(636)
16.3.2017. 22:50:59 - Error bpbrm(pid=8248) could not write FILE ADDED message to OUTSOCK
16.3.2017. 22:58:04 - Error bpbrm(pid=10804) could not write FILE ADDED message to OUTSOCK
16.3.2017. 22:58:16 - Error bpbrm(pid=10804) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:00:16 - Error bpbrm(pid=8248) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:02:16 - Error bpbrm(pid=8248) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:02:45 - Error bpbrm(pid=10804) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:04:16 - Error bpbrm(pid=8248) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:04:21 - Error bpbrm(pid=8248) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:04:46 - Error bpbrm(pid=10804) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:05:04 - Error bpbrm(pid=8248) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:06:08 - Error bpbrm(pid=8248) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:06:38 - Error bpbrm(pid=10804) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:06:43 - Error bpbrm(pid=10804) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:07:29 - Error bpbrm(pid=10804) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:08:01 - Error bpbrm(pid=8248) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:08:35 - Error bpbrm(pid=10804) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:10:17 - Error bpbrm(pid=10804) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:10:23 - Error bpbrm(pid=10804) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:12:07 - Info bptm(pid=6508) waited for full buffer 6512 times, delayed 173675 times
16.3.2017. 23:12:07 - Error bpbrm(pid=8248) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:12:07 - Info bptm(pid=6508) EXITING with status 0 <----------
16.3.2017. 23:12:07 - Info bpbrm(pid=8248) validating image for client HRSNBEBCM1
16.3.2017. 23:12:33 - Error bpbrm(pid=8248) could not write EXIT STATUS to OUTSOCK
16.3.2017. 23:12:33 - Info bpbkar32(pid=37620) done. status: 0: the requested operation was successfully completed
16.3.2017. 23:13:22 - Info bptm(pid=14940) waited for full buffer 6502 times, delayed 129879 times
16.3.2017. 23:13:22 - Error bpbrm(pid=10804) could not write FILE ADDED message to OUTSOCK
16.3.2017. 23:13:24 - Info bptm(pid=14940) EXITING with status 0 <----------
16.3.2017. 23:13:24 - Info bpbrm(pid=10804) validating image for client HRSNBEBCM1
16.3.2017. 23:13:50 - Error bpbrm(pid=10804) could not write EXIT STATUS to OUTSOCK
16.3.2017. 23:13:50 - Info bpbkar32(pid=45844) done. status: 0: the requested operation was successfully completed
03-17-2017 02:56 AM
@Vjeksa It seems you chose to completely ignore my previous posts.
There is nothing wrong with media server <-> client comms and data transfer - both bptm and bpbkar exited with status 0.
So, Client Read Timeout is not going to fix anything.
This error is about catalog updates between bpbrm (on media server or master/media server) and bpdbm (on master server):
Error bpbrm(pid=10804) could not write FILE ADDED message to OUTSOCK
If nothing that I posted previously is helping, feel free to paste this into Google and see what comes up:
Netbackup Error bpbrm could not write FILE ADDED message to OUTSOCK
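The distinction made above (bptm and bpbkar32 exit with status 0 while bpbrm reports OUTSOCK write errors) can be checked mechanically on any Job Details text. This is a rough Python sketch that assumes the message formats quoted in this thread; `summarize` is a hypothetical helper, not a NetBackup command.

```python
import re
from collections import Counter

# Patterns copied from the Job Details lines quoted in this thread.
OUTSOCK = re.compile(r"Error bpbrm\(pid=(\d+)\) could not write (.+?) to OUTSOCK")
BPTM_EXIT = re.compile(r"Info bptm\(pid=(\d+)\) EXITING with status (\d+)")
BPBKAR_DONE = re.compile(r"Info bpbkar32\(pid=(\d+)\) done\. status: (\d+)")

def summarize(lines):
    """Count bpbrm OUTSOCK errors and collect bptm/bpbkar32 exit statuses."""
    outsock_errors = Counter()
    exit_status = {}
    for line in lines:
        m = OUTSOCK.search(line)
        if m:
            outsock_errors[m.group(2)] += 1
            continue
        m = BPTM_EXIT.search(line)
        if m:
            exit_status["bptm pid=" + m.group(1)] = int(m.group(2))
            continue
        m = BPBKAR_DONE.search(line)
        if m:
            exit_status["bpbkar32 pid=" + m.group(1)] = int(m.group(2))
    return outsock_errors, exit_status

# Tail of the second job posted above.
job_details = [
    "16.3.2017. 23:13:22 - Info bptm(pid=14940) waited for full buffer 6502 times, delayed 129879 times",
    "16.3.2017. 23:13:22 - Error bpbrm(pid=10804) could not write FILE ADDED message to OUTSOCK",
    "16.3.2017. 23:13:24 - Info bptm(pid=14940) EXITING with status 0 <----------",
    "16.3.2017. 23:13:24 - Info bpbrm(pid=10804) validating image for client HRSNBEBCM1",
    "16.3.2017. 23:13:50 - Error bpbrm(pid=10804) could not write EXIT STATUS to OUTSOCK",
    "16.3.2017. 23:13:50 - Info bpbkar32(pid=45844) done. status: 0: the requested operation was successfully completed",
]
errors, statuses = summarize(job_details)
print(dict(errors))
print(statuses)
```

If every data-path process reports status 0 and only the bpbrm OUTSOCK counters are non-zero, that matches the pattern described above and points the investigation at the bpbrm and bpdbm logs.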
03-17-2017 03:19 AM
From one of your examples:
16.3.2017. 22:35:11 - Info bptm(pid=7284) start
16.3.2017. 22:40:06 - Error bpbrm(pid=8248) could not write FILE ADDED message to OUTSOCK
read from input socket failed(636)
So the failure comes after almost exactly 5 minutes, which looks like a timeout to me.
Perhaps check what the OS TCP keepalive is. If it is 5 minutes, double it to 10 and see whether that either solves the issue or moves the error to the 10-minute mark. Either way, if it does not solve the issue, change it back to the original value (so we don't leave in changes that don't fix the issue but may introduce other issues).
Is there a firewall between the media server and the master?
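The gap between the two quoted timestamps can be computed rather than estimated, and set beside the timeout values already mentioned in this thread (KeepAliveTime 900000 ms, Client Read Timeout 1800 s). This is a small Python sketch; the timestamp format string is an assumption based on how Job Details renders dates in these posts, and `elapsed_seconds` is a hypothetical helper.

```python
from datetime import datetime

# Day.month.year. hour:minute:second, as printed in the Job Details above.
FMT = "%d.%m.%Y. %H:%M:%S"

def elapsed_seconds(start, end):
    """Return whole seconds between two Job Details timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())

# bptm child start and first OUTSOCK error from the quoted example.
gap = elapsed_seconds("16.3.2017. 22:35:11", "16.3.2017. 22:40:06")

# Timeout values reported earlier in the thread, converted to seconds.
keepalive_s = 900000 // 1000   # KeepAliveTime in milliseconds
client_read_timeout_s = 1800   # Client Read Timeout

print(gap, keepalive_s, client_read_timeout_s)
```

Repeating the calculation for several failed jobs would show whether the first error consistently appears at the same offset, which is what a fixed timeout would produce.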
03-17-2017 01:37 PM
Please check the status of that media server; it should be in the ONLINE state. In my experience, this error can occur if the server has gone into the OFFLINE state.
Check the status with the vmoprcmd command.
03-18-2017 12:33 PM
The media server is in the ONLINE state.
Here is an example where the backup does not end successfully:
18.3.2017. 20:09:37 - Error bpbrm(pid=2300) could not write FILE ADDED message to OUTSOCK
18.3.2017. 20:12:30 - Error bpbrm(pid=2300) db_FLISTsend failed: no entity was found (227)
18.3.2017. 20:12:33 - Error bpbrm(pid=2300) cannot send mail to kralj@esb.hr
18.3.2017. 20:12:39 - Error bpbrm(pid=2300) could not write EXIT STATUS to OUTSOCK
18.3.2017. 20:12:39 - Info bpbkar32(pid=5876) done. status: 227: no entity was found
03-18-2017 12:39 PM
bpbrm log from the media server