snapshot error encountered(156) and network read failed(42)

soeheinhtut
Level 4

Hello all,

Please help me resolve the two backup failures below.

==========

8/26/2020 11:00:00 PM - Info nbjm(pid=3856) starting backup job (jobid=26898) for client cbmoai-flsv01p, policy SYSTEM1-cbmoai-flsv01-SYS, schedule SYS_Differential
8/26/2020 11:00:00 PM - Info nbjm(pid=3856) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=26898, request id:{8ADB56C3-D76C-4F7C-920A-7BBC691D9B9B})
8/26/2020 11:00:00 PM - requesting resource SYSTEM1-cbmoai-flsv01
8/26/2020 11:00:00 PM - requesting resource cbmoai-bkmg01p.NBU_CLIENT.MAXJOBS.cbmoai-flsv01p
8/26/2020 11:00:00 PM - requesting resource cbmoai-bkmg01p.NBU_POLICY.MAXJOBS.SYSTEM1-cbmoai-flsv01-SYS
8/26/2020 11:00:00 PM - granted resource cbmoai-bkmg01p.NBU_CLIENT.MAXJOBS.cbmoai-flsv01p
8/26/2020 11:00:00 PM - granted resource cbmoai-bkmg01p.NBU_POLICY.MAXJOBS.SYSTEM1-cbmoai-flsv01-SYS
8/26/2020 11:00:00 PM - granted resource MediaID=@aaac1;Path=\\cbmoai-nasc01p03\sys_backup\cbmoai-flsv01-DIR;MediaServer=cbmoai-bkup01p
8/26/2020 11:00:00 PM - granted resource SYSTEM1-cbmoai-flsv01
8/26/2020 11:00:00 PM - estimated 780933 Kbytes needed
8/26/2020 11:00:00 PM - begin Parent Job
8/26/2020 11:00:00 PM - begin VMware, Start Notify Script
8/26/2020 11:00:00 PM - Info RUNCMD(pid=5616) started
8/26/2020 11:00:00 PM - Info RUNCMD(pid=5616) exiting with status: 0
Status 0
8/26/2020 11:00:00 PM - end VMware, Start Notify Script; elapsed time: 0:00:00
8/26/2020 11:00:00 PM - begin VMware, Step By Condition
Status 0
8/26/2020 11:00:00 PM - end VMware, Step By Condition; elapsed time: 0:00:00
8/26/2020 11:00:00 PM - begin VMware, Read File List
Status 0
8/26/2020 11:00:00 PM - end VMware, Read File List; elapsed time: 0:00:00
8/26/2020 11:00:00 PM - begin VMware, Create Snapshot
8/26/2020 11:00:00 PM - started
8/26/2020 11:00:09 PM - started process bpbrm (1216)
8/26/2020 11:00:23 PM - Info bpbrm(pid=1216) cbmoai-flsv01p is the host to backup data from
8/26/2020 11:00:26 PM - Info bpbrm(pid=1216) reading file list for client
8/26/2020 11:00:26 PM - Info bpbrm(pid=1216) start bpfis on client
8/26/2020 11:00:31 PM - Info bpbrm(pid=1216) Starting create snapshot processing
8/26/2020 11:00:36 PM - Info bpfis(pid=7952) Backup started
8/26/2020 11:00:39 PM - snapshot backup of client cbmoai-flsv01p using method VMware_v2
8/26/2020 11:06:55 PM - Critical bpbrm(pid=1216) from client cbmoai-flsv01p: FTL - VMware_freeze: VIXAPI freeze (VMware snapshot) failed with -1: Unrecognized error
8/26/2020 11:07:00 PM - Critical bpbrm(pid=1216) from client cbmoai-flsv01p: FTL - VMware error received: An error occurred while saving the snapshot: Failed to quiesce the virtual machine.
8/26/2020 11:07:04 PM - Critical bpbrm(pid=1216) from client cbmoai-flsv01p: FTL - vfm_freeze: method: VMware_v2, type: FIM, function: VMware_v2_freeze
8/26/2020 11:07:09 PM - Critical bpbrm(pid=1216) from client cbmoai-flsv01p: FTL -
8/26/2020 11:07:14 PM - Critical bpbrm(pid=1216) from client cbmoai-flsv01p: FTL - vfm_freeze: method: VMware_v2, type: FIM, function: VMware_v2_freeze
8/26/2020 11:07:19 PM - Critical bpbrm(pid=1216) from client cbmoai-flsv01p: FTL -
8/26/2020 11:07:23 PM - Critical bpbrm(pid=1216) from client cbmoai-flsv01p: FTL - snapshot processing failed, status 156
8/26/2020 11:07:28 PM - Critical bpbrm(pid=1216) from client cbmoai-flsv01p: FTL - snapshot creation failed, status 156
8/26/2020 11:07:33 PM - Warning bpbrm(pid=1216) from client cbmoai-flsv01p: WRN - ALL_LOCAL_DRIVES is not frozen
8/26/2020 11:07:37 PM - Info bpfis(pid=7952) done. status: 156
8/26/2020 11:07:37 PM - end VMware, Create Snapshot; elapsed time: 0:07:37
8/26/2020 11:07:37 PM - Info bpfis(pid=7952) done. status: 156: snapshot error encountered
8/26/2020 11:07:37 PM - end writing
Status 156
8/26/2020 11:07:37 PM - end Parent Job; elapsed time: 0:07:37
8/26/2020 11:07:37 PM - begin VMware, Stop On Error
Status 0
8/26/2020 11:07:37 PM - end VMware, Stop On Error; elapsed time: 0:00:00
8/26/2020 11:07:37 PM - begin VMware, Delete Snapshot
8/26/2020 11:07:50 PM - started process bpbrm (7872)
8/26/2020 11:08:13 PM - Info bpbrm(pid=7872) Starting delete snapshot processing
8/26/2020 11:08:23 PM - Info bpfis(pid=5584) Backup started
8/26/2020 11:08:23 PM - Critical bpbrm(pid=7872) from client cbmoai-flsv01p: cannot open C:\Program Files\Veritas\NetBackup\online_util\fi_cntl\bpfis.fim.cbmoai-flsv01p_1598459400.1.0
8/26/2020 11:08:28 PM - Info bpfis(pid=5584) done. status: 4207
8/26/2020 11:08:28 PM - end VMware, Delete Snapshot; elapsed time: 0:00:51
8/26/2020 11:08:28 PM - Info bpfis(pid=5584) done. status: 4207: Could not fetch snapshot metadata or state files
8/26/2020 11:08:28 PM - end writing
Status 4207
8/26/2020 11:08:28 PM - end operation
Status 156
8/26/2020 11:08:28 PM - end operation
snapshot error encountered(156)

============

8/26/2020 11:00:00 PM - Info nbjm(pid=3856) starting backup job (jobid=26899) for client cbmoai-flsv01p00, policy SYSTEM1-cbmoai-flsv01-DATA, schedule DATA_Differential
8/26/2020 11:00:00 PM - Info nbjm(pid=3856) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=26899, request id:{70DCFE59-C5CA-4A11-831C-9DA88550778F})
8/26/2020 11:00:00 PM - requesting resource SYSTEM1-cbmoai-flsv01-DATA
8/26/2020 11:00:00 PM - requesting resource cbmoai-bkmg01p.NBU_CLIENT.MAXJOBS.cbmoai-flsv01p00
8/26/2020 11:00:00 PM - requesting resource cbmoai-bkmg01p.NBU_POLICY.MAXJOBS.SYSTEM1-cbmoai-flsv01-DATA
8/26/2020 11:00:00 PM - granted resource cbmoai-bkmg01p.NBU_CLIENT.MAXJOBS.cbmoai-flsv01p00
8/26/2020 11:00:00 PM - granted resource cbmoai-bkmg01p.NBU_POLICY.MAXJOBS.SYSTEM1-cbmoai-flsv01-DATA
8/26/2020 11:00:00 PM - granted resource MediaID=@aaacK;Path=\\cbmoai-nasc01p03\data_backup\flsv;MediaServer=cbmoai-bkup01p
8/26/2020 11:00:00 PM - granted resource SYSTEM1-cbmoai-flsv01-DATA
8/26/2020 11:00:01 PM - estimated 941169 Kbytes needed
8/26/2020 11:00:01 PM - Info nbjm(pid=3856) started backup (backupid=cbmoai-flsv01p00_1598459401) job for client cbmoai-flsv01p00, policy SYSTEM1-cbmoai-flsv01-DATA, schedule DATA_Differential on storage unit SYSTEM1-cbmoai-flsv01-DATA
8/26/2020 11:00:11 PM - started process bpbrm (7552)
8/26/2020 11:00:31 PM - Info bpbrm(pid=7552) cbmoai-flsv01p00 is the host to backup data from
8/26/2020 11:00:35 PM - connecting
8/26/2020 11:00:36 PM - Info bpbrm(pid=7552) reading file list for client
8/26/2020 11:00:46 PM - Info bpbrm(pid=7552) starting bpbkar32 on client
8/26/2020 11:00:46 PM - connected; connect time: 0:00:11
8/26/2020 11:00:49 PM - Info bpbkar32(pid=3328) Backup started
8/26/2020 11:00:49 PM - Info bpbkar32(pid=3328) change time comparison:<disabled>
8/26/2020 11:00:49 PM - Info bpbkar32(pid=3328) archive bit processing:<enabled>
8/26/2020 11:00:49 PM - Info bpbkar32(pid=3328) not using change journal data for <D:\>: not enabled
8/26/2020 11:00:49 PM - Info bptm(pid=6228) start
8/26/2020 11:01:15 PM - Info bptm(pid=6228) using 262144 data buffer size
8/26/2020 11:01:15 PM - Info bptm(pid=6228) setting receive network buffer to 1049600 bytes
8/26/2020 11:01:15 PM - Info bptm(pid=6228) using 30 data buffers
8/26/2020 11:01:29 PM - Info bptm(pid=6228) start backup
8/26/2020 11:01:39 PM - Info bptm(pid=6228) backup child process is pid 6300.6324
8/26/2020 11:01:39 PM - Info bptm(pid=6300) start
8/26/2020 11:01:39 PM - begin writing
8/26/2020 11:02:03 PM - Error bptm(pid=6300) socket operation failed - 10054 (at child.c.1304)
8/26/2020 11:02:08 PM - Error bptm(pid=6300) unable to perform read from client socket, connection may have been broken
8/26/2020 11:02:44 PM - Info bptm(pid=6228) EXITING with status 42 <----------
8/26/2020 11:02:46 PM - Error bpbrm(pid=7552) could not send server status message
8/26/2020 11:04:18 PM - Info bpbkar32(pid=3328) done. status: 42: network read failed
8/26/2020 11:04:18 PM - end writing; write time: 0:02:39
network read failed(42)


Marianne
Moderator
Partner    VIP    Accredited Certified

@soeheinhtut 

2 different issues for 2 different clients.

Please start a new discussion for the status 42 issue.

About the snapshot error for the VMware policy: this is normally a VMware issue.
Possible causes: the VM is too busy to be quiesced, there is no space to create the VMware snapshot, an RDM disk is present, permissions are insufficient, VMware Tools is not installed, etc.
Check Tasks and Events in the vCenter console for errors.

NBU 8.1 manual: 
https://www.veritas.com/content/support/en_US/doc/21902280-127283730-0/v37089113-127283730

PS:
Please remember to post NetBackup issues in the NetBackup forum. 
I found your posts by chance in the General Veritas forum and moved them here. 

Dear Marianne,

Thanks for your reply and advice.

Yesterday I faced network read failed (42) again.

Please check and advise which logs I need to collect.

======================================

9/10/2020 11:00:00 PM - Info nbjm(pid=3856) starting backup job (jobid=27102) for client cbmoai-flsv01p00, policy SYSTEM1-cbmoai-flsv01-DATA, schedule DATA_Differential
9/10/2020 11:00:00 PM - Info nbjm(pid=3856) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=27102, request id:{DA0349DB-D968-42B7-8CEE-175146894082})
9/10/2020 11:00:00 PM - requesting resource SYSTEM1-cbmoai-flsv01-DATA
9/10/2020 11:00:00 PM - requesting resource cbmoai-bkmg01p.NBU_CLIENT.MAXJOBS.cbmoai-flsv01p00
9/10/2020 11:00:00 PM - requesting resource cbmoai-bkmg01p.NBU_POLICY.MAXJOBS.SYSTEM1-cbmoai-flsv01-DATA
9/10/2020 11:00:00 PM - granted resource cbmoai-bkmg01p.NBU_CLIENT.MAXJOBS.cbmoai-flsv01p00
9/10/2020 11:00:00 PM - granted resource cbmoai-bkmg01p.NBU_POLICY.MAXJOBS.SYSTEM1-cbmoai-flsv01-DATA
9/10/2020 11:00:00 PM - granted resource MediaID=@aaacK;Path=\\cbmoai-nasc01p03\data_backup\flsv;MediaServer=cbmoai-bkup01p
9/10/2020 11:00:00 PM - granted resource SYSTEM1-cbmoai-flsv01-DATA
9/10/2020 11:00:00 PM - estimated 1192522 Kbytes needed
9/10/2020 11:00:00 PM - Info nbjm(pid=3856) started backup (backupid=cbmoai-flsv01p00_1599755400) job for client cbmoai-flsv01p00, policy SYSTEM1-cbmoai-flsv01-DATA, schedule DATA_Differential on storage unit SYSTEM1-cbmoai-flsv01-DATA
9/10/2020 11:00:10 PM - started process bpbrm (5680)
9/10/2020 11:00:28 PM - Info bpbrm(pid=5680) cbmoai-flsv01p00 is the host to backup data from
9/10/2020 11:00:36 PM - Info bpbrm(pid=5680) reading file list for client
9/10/2020 11:00:36 PM - connecting
9/10/2020 11:00:46 PM - Info bpbrm(pid=5680) starting bpbkar32 on client
9/10/2020 11:00:46 PM - connected; connect time: 0:00:10
9/10/2020 11:00:49 PM - Info bpbkar32(pid=3920) Backup started
9/10/2020 11:00:49 PM - Info bpbkar32(pid=3920) change time comparison:<disabled>
9/10/2020 11:00:49 PM - Info bpbkar32(pid=3920) archive bit processing:<enabled>
9/10/2020 11:00:49 PM - Info bpbkar32(pid=3920) not using change journal data for <D:\>: not enabled
9/10/2020 11:00:49 PM - Info bptm(pid=7812) start
9/10/2020 11:01:15 PM - Info bptm(pid=7812) using 262144 data buffer size
9/10/2020 11:01:15 PM - Info bptm(pid=7812) setting receive network buffer to 1049600 bytes
9/10/2020 11:01:15 PM - Info bptm(pid=7812) using 30 data buffers
9/10/2020 11:01:29 PM - Info bptm(pid=7812) start backup
9/10/2020 11:01:39 PM - Info bptm(pid=7812) backup child process is pid 6896.7632
9/10/2020 11:01:39 PM - begin writing
9/10/2020 11:01:39 PM - Info bptm(pid=6896) start
9/10/2020 11:02:00 PM - Error bptm(pid=6896) socket operation failed - 10054 (at child.c.1304)
9/10/2020 11:02:04 PM - Error bptm(pid=6896) unable to perform read from client socket, connection may have been broken
9/10/2020 11:02:41 PM - Info bptm(pid=7812) EXITING with status 42 <----------
9/10/2020 11:02:44 PM - Error bpbrm(pid=5680) could not send server status message
9/10/2020 11:04:37 PM - Info bpbkar32(pid=3920) done. status: 42: network read failed
9/10/2020 11:04:37 PM - end writing; write time: 0:02:58
network read failed(42)

Hamza_H
Moderator
   VIP   

For status 42, check whether a firewall is enabled on the client. The error below shows that the connection was reset:

socket operation failed - 10054

If a firewall is enabled, disable it and re-run the backup (check the global network firewall first, then the local firewall on the client machine).
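
Before disabling a firewall outright, a quick reachability test from the media server to the client can show whether the NetBackup ports are open at all. The following is a minimal sketch in Python, not a NetBackup tool. It assumes the commonly documented default ports (1556 for PBX, 13724 for vnetd, 13782 for bpcd), and the helper names `port_is_open` and `check_client` are hypothetical. A successful connect only proves that the port accepts connections; it does not rule out a connection that is reset later in mid-transfer, which is what 10054 reports.

```python
import socket

# Default NetBackup ports (an assumption; confirm against your own configuration).
NBU_PORTS = {"pbx": 1556, "vnetd": 13724, "bpcd": 13782}


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_client(host: str) -> dict:
    """Try each NetBackup port on the client and report the result per service."""
    return {name: port_is_open(host, port) for name, port in NBU_PORTS.items()}
```

Running `check_client("cbmoai-flsv01p00")` on the media server returns one True/False entry per service. Any False entry points toward a firewall or routing problem rather than a NetBackup setting.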

quebek
Moderator
   VIP    Certified

Hello,

For errors like this one

socket operation failed - 10054

I would check the media server NIC drivers and firmware.

Do the same on the ESXi hosts and possibly the master server. It won't hurt to make sure you have the latest versions; if you don't, upgrade.

Marianne
Moderator

@soeheinhtut 

You need these logs:

On media server: bptm and bpbrm
On the backup host (normally the media server as well): bpbkar.

Best to increase logging levels to 3. 

The logs will help us understand where in the backup path the network failure occurred.
You then need to take this info to the OS, VMware and network admins to investigate.
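
While the debug logs are being collected, the job details already posted can be narrowed down to the lines that matter. The sketch below is only an illustration in Python. The line format is inferred from the job details pasted in this thread, not from any NetBackup specification, and the function name `problem_lines` is hypothetical. It keeps only Error and Critical entries and records which process reported each one.

```python
import re
from datetime import datetime

# Matches job-detail lines such as:
# 9/10/2020 11:02:00 PM - Error bptm(pid=6896) socket operation failed - 10054 (at child.c.1304)
LINE_RE = re.compile(
    r"^(?P<ts>\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M) - "
    r"(?P<severity>Info|Warning|Error|Critical) "
    r"(?P<process>\w+)\(pid=(?P<pid>\d+)\) (?P<message>.*)$"
)


def problem_lines(job_details: str):
    """Yield (timestamp, severity, process, pid, message) for Error/Critical lines."""
    for line in job_details.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match["severity"] in ("Error", "Critical"):
            ts = datetime.strptime(match["ts"], "%m/%d/%Y %I:%M:%S %p")
            yield ts, match["severity"], match["process"], int(match["pid"]), match["message"]


# Excerpt copied from the job details of job 27102 in this thread.
SAMPLE = """\
9/10/2020 11:01:39 PM - begin writing
9/10/2020 11:01:39 PM - Info bptm(pid=6896) start
9/10/2020 11:02:00 PM - Error bptm(pid=6896) socket operation failed - 10054 (at child.c.1304)
9/10/2020 11:02:04 PM - Error bptm(pid=6896) unable to perform read from client socket, connection may have been broken
9/10/2020 11:02:44 PM - Error bpbrm(pid=5680) could not send server status message
"""

for ts, severity, process, pid, message in problem_lines(SAMPLE):
    print(f"{ts:%H:%M:%S} {severity:<8} {process}({pid}): {message}")
```

The process names and PIDs from this output are the ones to search for in the bptm and bpbrm debug logs on the media server.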

Dear Marianne,

Please check the attached bptm, bpbrm and bpbkar logs.

Marianne
Moderator

The logging level seems quite low and the bpbkar log does not correspond with the Job Details.
There is nothing in bpbkar log after 11pm with PID 3920. 
Is the Media server the Backup host, or do you have a different backup host? 

What I have noticed is that you seem to be using NBU 7.6.0.3.
If your NBU version is that old, does it mean that everything else in your server and network infrastructure is also old and outdated? 

All logs seem to indicate an external network failure:
bpbrm: An existing connection was forcibly closed by the remote host.
bptm: unable to perform read from client socket, connection may have been broken
bpbkar (at 6am) dtcp_read: TCP - failure: recv socket (1004) (TCP 10058: Can't send after socket shutdown)

As per @quebek's post, check NIC drivers and firmware on all servers.
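
The numeric codes in these messages are standard Windows Sockets error codes, and the bpbrm text "An existing connection was forcibly closed by the remote host" is the Windows message for 10054. The small lookup below is a sketch in Python that covers only the two codes seen in this thread. The helper name `explain` is hypothetical, and the regular expression is a rough heuristic that could also match an unrelated five-digit number starting with 100.

```python
import re

# Winsock error codes that appear in the logs quoted in this thread.
WINSOCK_ERRORS = {
    10054: ("WSAECONNRESET", "Connection reset by peer"),
    10058: ("WSAESHUTDOWN", "Cannot send after socket shutdown"),
}

# Heuristic: a standalone five-digit number of the form 100xx.
CODE_RE = re.compile(r"\b(100\d{2})\b")


def explain(log_line: str) -> list:
    """Return a readable description for each Winsock code found in a log line."""
    results = []
    for code in CODE_RE.findall(log_line):
        name, text = WINSOCK_ERRORS.get(int(code), ("unknown", "not in this table"))
        results.append(f"{code} {name}: {text}")
    return results


print(explain("socket operation failed - 10054 (at child.c.1304)"))
print(explain("dtcp_read: TCP - failure: recv socket (1004) (TCP 10058: Can't send after socket shutdown)"))
```

Both codes describe a connection that was torn down by the peer or by something in between, which fits the reading that the failure is outside NetBackup itself.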

Hello Marianne,

Thanks for your time and explanation.

I will increase the bpbkar log level to get more information.

The media server is the backup host, and the infrastructure is outdated.

NetBackup is 7.6.0.3.

VMware is 5.5.

Hi Marianne,

On my media server (the backup host), the GUI logging settings show only bptm and bpbrm.

Could you please advise how I can increase the bpbkar log level? Thanks.

While reading in the community about network read failed (42), I saw that some suggest increasing the client read timeout setting on the media server and the client, and adding a resilient network if it still happens. Should I increase it, and can it impact other jobs?

Marianne
Moderator

Log folders need to be created manually.

If you only see bptm and bpbrm, it's because other log folders (like bpbkar) were never created.

Logging levels are adjusted in Host Properties.

I doubt that the problem is with timeouts. We would have seen evidence in the bpbrm log.
Client Read and Client Connect timeouts are also set in Host Properties  -> Media Server.

hello Marianne,

The bpbkar folder is already created.

If I change the client read timeout to 3600, I am worried that a job could take one hour to time out and impact other jobs. Is that the case?

The current read timeout is 300.

Please suggest.

Marianne
Moderator

I did not see any evidence in the logs that backups were terminated due to 5-min timeout.

You are welcome to increase timeouts (I would not change it to more than 900 or 1800 max).
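
The point about the 5-minute timeout can be cross-checked against the job details posted earlier in this thread. The sketch below (Python, for illustration only) measures the time between "begin writing" and the first socket error for the two failed jobs and compares it with the 300-second client read timeout. Using "begin writing" as the starting point is an assumption; the job details do not show exactly when the last data was received.

```python
from datetime import datetime

FMT = "%m/%d/%Y %I:%M:%S %p"
CLIENT_READ_TIMEOUT = 300  # seconds, the current value reported in this thread

# (begin writing, first socket error) copied from the two failed jobs above.
JOBS = {
    26899: ("8/26/2020 11:01:39 PM", "8/26/2020 11:02:03 PM"),
    27102: ("9/10/2020 11:01:39 PM", "9/10/2020 11:02:00 PM"),
}


def seconds_until_failure(begin: str, error: str) -> int:
    """Seconds elapsed between two job-detail timestamps."""
    delta = datetime.strptime(error, FMT) - datetime.strptime(begin, FMT)
    return int(delta.total_seconds())


for jobid, (begin, error) in JOBS.items():
    elapsed = seconds_until_failure(begin, error)
    verdict = "below" if elapsed < CLIENT_READ_TIMEOUT else "at or above"
    print(f"job {jobid}: socket error after {elapsed}s, {verdict} the {CLIENT_READ_TIMEOUT}s timeout")
```

Both jobs hit the socket error within about half a minute of starting to write, so under this assumption a longer client read timeout would not have changed the outcome.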