
Error code 24 on specified drives

wannawin
Level 6

Master server: Linux, 7.1 NBU

Media server : Linux, 7.1 NBU

Client: Windows server 2008 R2 standard, 7.1.0.3 NBU client version.

Backups of all drives except N:\ and M:\ complete successfully. The backups of N:\ and M:\ fail with error code 24 after writing 1112197120 KB of data.

Below are the detailed status logs:

05/15/2014 13:59:07 - Info bpbrm (pid=8456) fngwsasdbpr01-bk is the host to backup data from
05/16/2014 07:31:18 - Info nbjm (pid=30140) starting backup job (jobid=4411865) for client fngwsasdbpr01-bk, policy FNFG_PROD_WIN_fngwsasdbpr01-bk, schedule DLY_DINC
05/16/2014 07:31:18 - Info nbjm (pid=30140) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=4411865, request id:{FAA6F9DE-DCF5-11E3-B838-A1C5900B4386})
05/16/2014 07:31:18 - requesting resource lfisnbmed_SPA008_STU
05/16/2014 07:31:18 - requesting resource fisnbmp011.NBU_CLIENT.MAXJOBS.fngwsasdbpr01-bk
05/16/2014 07:31:18 - requesting resource fisnbmp011.NBU_POLICY.MAXJOBS.FNFG_PROD_WIN_fngwsasdbpr01-bk
05/16/2014 07:31:18 - granted resource  fisnbmp011.NBU_CLIENT.MAXJOBS.fngwsasdbpr01-bk
05/16/2014 07:31:18 - granted resource  fisnbmp011.NBU_POLICY.MAXJOBS.FNFG_PROD_WIN_fngwsasdbpr01-bk
05/16/2014 07:31:18 - granted resource  MediaID=@aaaah;DiskVolume=PureDiskVolume;DiskPool=SPA008;Path=PureDiskVolume;StorageServer=nb5030-01...
05/16/2014 07:31:18 - granted resource  lfisnbmed_SPA008_STU
05/16/2014 07:31:18 - estimated 0 kbytes needed
05/16/2014 07:31:18 - Info nbjm (pid=30140) resumed backup (backupid=fngwsasdbpr01-bk_1400180341) job for client fngwsasdbpr01-bk, policy FNFG_PROD_WIN_fngwsasdbpr01-bk, schedule DLY_DINC on storage unit lfisnbmed_SPA008_STU
05/16/2014 07:31:19 - started process bpbrm (pid=29037)
05/16/2014 07:31:20 - connecting
05/16/2014 07:31:23 - connected; connect time: 0:00:00
05/16/2014 07:31:28 - Info bpbrm (pid=29037) starting bpbkar on client
05/16/2014 07:31:31 - Info bpbkar (pid=5176) Backup started
05/16/2014 07:31:31 - Info bpbrm (pid=29037) bptm pid: 29038
05/16/2014 07:31:31 - begin writing
05/16/2014 07:31:32 - Info bptm (pid=29038) start
05/16/2014 07:31:32 - Info bptm (pid=29038) using 1048576 data buffer size
05/16/2014 07:31:32 - Info bptm (pid=29038) using 256 data buffers
05/16/2014 07:31:35 - Info bptm (pid=29038) start backup
05/16/2014 07:31:37 - Info bptm (pid=29038) backup child process is pid 29043
05/16/2014 07:43:41 - Critical bpbrm (pid=29037) from client fngwsasdbpr01-bk: FTL - socket write failed
05/16/2014 07:45:03 - Error bptm (pid=29038) media manager terminated by parent process
05/16/2014 07:45:03 - end writing; write time: 0:13:32

The bpbkar logs are attached.
05/16/2014 07:45:07 - Info lfisnbmed-04 (pid=29038) StorageServer=PureDisk:nb5030-015.fnfis.com; Report=PDDO Stats for (nb5030-015.fnfis.com): scanned: 19551235 KB, CR sent: 14 KB, CR sent over FC: 0 KB, dedup: 100.0%
05/16/2014 07:45:08 - Error bpbrm (pid=29037) could not send server status message
05/16/2014 07:45:09 - Info bpbkar (pid=5176) done. status: 24: socket write failed
socket write failed  (24)
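
One way to see where the job went quiet is to measure the time between consecutive entries in the detailed status. The following is a minimal Python sketch, assuming each entry starts with a `MM/DD/YYYY HH:MM:SS - ` stamp as in the output above. The function name `largest_gap` and the `sample` list are illustrative only and are not part of NetBackup.

```python
import re
from datetime import datetime

# Matches the "MM/DD/YYYY HH:MM:SS - message" prefix used in the job details.
STAMP = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) - (.*)$")

def largest_gap(lines):
    """Return (seconds, message_before, message_after) for the longest
    interval between two consecutive timestamped lines, or None."""
    entries = []
    for line in lines:
        m = STAMP.match(line)
        if m:  # lines without a timestamp are skipped
            when = datetime.strptime(m.group(1), "%m/%d/%Y %H:%M:%S")
            entries.append((when, m.group(2)))
    best = None
    for (t0, msg0), (t1, msg1) in zip(entries, entries[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, msg0, msg1)
    return best

# A few lines copied from the detailed status above.
sample = [
    "05/16/2014 07:31:35 - Info bptm (pid=29038) start backup",
    "05/16/2014 07:31:37 - Info bptm (pid=29038) backup child process is pid 29043",
    "05/16/2014 07:43:41 - Critical bpbrm (pid=29037) from client fngwsasdbpr01-bk: FTL - socket write failed",
    "05/16/2014 07:45:03 - Error bptm (pid=29038) media manager terminated by parent process",
    "05/16/2014 07:45:03 - end writing; write time: 0:13:32",
]

seconds, before, after = largest_gap(sample)
print(f"{seconds:.0f} s between '{before}' and '{after}'")
```

On the sample lines, the longest interval is the one between the bptm child process starting and the socket write failure.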


GV89
Level 4
Partner Accredited Certified

Status 24 can have many causes. Here are some links that could help:

https://www-secure.symantec.com/connect/forums/socket-write-failed24-0

http://www.symantec.com/docs/TECH150369

http://www.symantec.com/docs/TECH34816

 

SymTerry
Level 6
Employee Accredited

GV89 mentioned TECH150369.

One of the causes listed there is what you will need to look at:

  • High network load
  • Intermittent connectivity
  • Packet reordering
  • Duplex mismatch between client and master server NICs
  • Small network buffer size 

It would help to get the bpbkar and bpbrm logs. Please make sure they are at log level 5. That will give a lot more information.

wannawin
Level 6

Only two drives are failing to back up, so I don't think the issue is any of these:

  • High network load
  • Intermittent connectivity
  • Packet reordering
  • Duplex mismatch between client and master server NICs

SymTerry
Level 6
Employee Accredited

You're getting a socket write failure from your client (fngwsasdbpr01-bk) over what I assume is a fiber connection between the client and media server. That is a communication error, which is why you're getting error 24.

The causes listed are the common ones for this error, but getting the logs will help confirm what is going on regardless. The bpbkar log is from your client and the bpbrm log is from the media server. With both logs we will get a better picture of what the conversation between the two looked like at the time of the socket write error.

Marianne
Level 6
Partner    VIP    Accredited Certified

When you post a fresh set of logs, please just copy log files to txt files. The Word docs are difficult to download and read on a mobile device.

Please also let us know what the Client Read Timeout on the media server is.

We see a checkpoint on the client, then nothing for more than 10 minutes, and then the socket errors:

8:34:20.293 AM: [7692.8276] <2> tar_backup_cpr::position: INF - checkpoint restart position - /L/SAS/SASDev/DataMart/Target/hh_accts_201110_201110.sas7bdat
8:45:23.168 AM: [5176.6264] <16> tar_tfi::processException: 
An Exception of type [SocketWriteException] has occured at:
  Module: @(#) $Source: src/ncf/tfi/lib/TransporterRemote.cpp,v $ $Revision: 1.54 $ , Function: TransporterRemote::write[2](), Line: 321
  Module: @(#) $Source: src/ncf/tfi/lib/Packer.cpp,v $ $Revision: 1.89 $ , Function: Packer::getBuffer(), Line: 656
  Module: tar_tfi::getBuffer, Function: H:\7103\src\cl\clientpc\util\tar_tfi.cpp, Line: 312
  Local Address: [::]:0
  Remote Address: [::]:0
  OS Error: 10060 (A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
)
  Expected bytes: 262144

8:45:23.168 AM: [5176.6264] <2> tar_base::V_vTarMsgW: FTL - socket write failed
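
The silence between the checkpoint entry and the exception can be measured directly from the two bpbkar timestamps. This is a minimal Python sketch, assuming both stamps fall on the same day and use the `H:MM:SS.mmm AM/PM` form shown above. The helper name `seconds_between` is illustrative only.

```python
from datetime import datetime

# bpbkar stamps in this log look like "8:34:20.293 AM".
FMT = "%I:%M:%S.%f %p"

def seconds_between(start, end):
    """Elapsed seconds between two same-day bpbkar timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()

# Checkpoint entry and SocketWriteException entry from the log above.
gap = seconds_between("8:34:20.293 AM", "8:45:23.168 AM")
print(f"{gap:.3f} s ({gap / 60:.1f} min)")
```

Note that the two entries carry different process IDs (7692 and 5176), so the figure is the gap between log entries rather than the idle time of a single process.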

wannawin
Level 6

Hello Marianne,

The client read timeout on the media servers is CLIENT_READ_TIMEOUT = 30000.