All Windows server backups failed with file read error - writing some data and failing frequently

Tekkali
Level 4

Hi All

We configured NetBackup clients in a VLAN setup. A few servers fail frequently with exit codes (EC) 25, 24, and 23.

Some policies failed with status 25; for those I disabled multiple data streams, and they now complete successfully every day.

For the status 24 failures, the detailed job log shows:

10/12/2014 02:00:00 - Info nbjm (pid=19945) starting backup job (jobid=12270850) for client XXXXX, policy BPMusic-backup-0003, schedule Client-Daily
10/12/2014 02:00:00 - Info nbjm (pid=19945) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=12270850, request id:{45BEFCCA-5157-11E4-902E-00144F68F6F6})
10/12/2014 02:00:00 - requesting resource LIBRARY-01-lto3-NBU-001
10/12/2014 02:00:00 - requesting resource MASTER.NBU_CLIENT.MAXJOBS.XXXXX
10/12/2014 02:00:00 - requesting resource MASTER.NBU_POLICY.MAXJOBS.BPMusic-backup-0003
10/12/2014 02:00:16 - awaiting resource LIBRARY-01-lto3-NBU-001. No drives are available.
10/12/2014 02:03:52 - awaiting resource LIBRARY-01-lto3-NBU-001. Waiting for resources.  
          Reason: Drives are in use, Media server: MEDIASERVER,  
          Robot Type(Number): ACS(1), Media ID: N/A, Drive Name: N/A,  
          Volume Pool: EDN-Media, Storage Unit: MEDIASERVER-r2m03L85-01-lto3-NBU-001, Drive Scan Host: N/A,  
          Disk Pool: N/A, Disk Volume: N/A  
10/12/2014 02:04:12 - awaiting resource LIBRARY-01-lto3-NBU-001. No drives are available.
10/12/2014 02:48:40 - granted resource  MASTER.NBU_CLIENT.MAXJOBS.XXXXX
10/12/2014 02:48:40 - granted resource  MASTER.NBU_POLICY.MAXJOBS.BPMusic-backup-0003
10/12/2014 02:48:40 - granted resource  A02881
10/12/2014 02:48:40 - granted resource  01_01000101_lt03
10/12/2014 02:48:40 - granted resource  MEDIASERVER-r2m03L85-01-lto3-NBU-001
10/12/2014 02:48:40 - estimated 0 kbytes needed
10/12/2014 02:48:40 - Info nbjm (pid=19945) started backup (backupid=XXXXX_1413042520) job for client XXXXX, policy BPMusic-backup-0003, schedule Client-Daily on storage unit MEDIASERVER-r2m03L85-01-lto3-NBU-001
10/12/2014 02:48:42 - Info bpbrm (pid=5121) XXXXX is the host to backup data from
10/12/2014 02:48:44 - Info bpbrm (pid=5121) telling media manager to start backup on client
10/12/2014 02:48:44 - Info bptm (pid=5136) using 262144 data buffer size
10/12/2014 02:48:44 - Info bptm (pid=5136) using 12 data buffers
10/12/2014 02:48:45 - Info bpbrm (pid=5121) spawning a brm child process
10/12/2014 02:48:45 - Info bpbrm (pid=5121) child pid: 13910
10/12/2014 02:48:45 - Info bpbrm (pid=5121) sending bpsched msg: CONNECTING TO CLIENT FOR XXXXX_1413042520
10/12/2014 02:48:45 - connecting
10/12/2014 02:48:46 - Info bpbrm (pid=5121) start bpbkar on client
10/12/2014 02:48:46 - Info bptm (pid=13909) setting receive network buffer to 262144 bytes
10/12/2014 02:48:46 - connected; connect time: 0:00:00
10/12/2014 02:48:46 - begin writing
10/12/2014 02:48:49 - Info bpbkar (pid=8132) Backup started
10/12/2014 02:48:49 - Info bpbrm (pid=5121) Sending the file list to the client
10/12/2014 02:49:46 - Info bpbkar (pid=8132) change journal NOT enabled for <C:\>
10/12/2014 03:09:13 - Critical bpbrm (pid=13910) from client XXXXX: FTL - socket write failed
10/12/2014 03:09:13 - Info bpbrm (pid=5121) sending message to media manager: STOP BACKUP XXXXX_1413042520
10/12/2014 03:09:14 - Info bpbrm (pid=5121) media manager for backup id XXXXX_1413042520 exited with status 150: termination requested by administrator
10/12/2014 03:09:14 - end writing; write time: 0:20:28
10/12/2014 03:09:17 - Info bpbrm (pid=18990) Starting delete snapshot processing
10/12/2014 03:09:17 - Info bpfis (pid=0) Snapshot will not be deleted
10/12/2014 03:13:43 - Error bpbrm (pid=18990) from client XXXXX: ERR - Get bpfis state from MASTER failed. status = 25
10/12/2014 03:13:53 - Info bpfis (pid=5376) Backup started
10/12/2014 03:13:53 - Critical bpbrm (pid=18990) from client XXXXX: FTL - cannot open C:\Program Files\VERITAS\NetBackup\online_util\fi_cntl\bpfis.fim.XXXXX_1413042520.1.0
10/12/2014 03:13:53 - Info bpfis (pid=5376) done. status: 1542
10/12/2014 03:13:53 - Info bpfis (pid=0) done. status: 1542: An existing snapshot is no longer valid and cannot be mounted for subsequent operations
socket write failed  (24)
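
To pull the failure chain out of a long job-details dump like the one above, the Critical and Error lines can be filtered with a few lines of Python. This is only a minimal sketch, not a NetBackup tool: it assumes the line layout shown in the log (`MM/DD/YYYY HH:MM:SS - Severity process (pid=N) message`), and `failure_lines` is a hypothetical helper name.

```python
import re

# Assumed layout, taken from the job details above, e.g.:
# 10/12/2014 03:09:13 - Critical bpbrm (pid=13910) from client XXXXX: FTL - socket write failed
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) - "
    r"(?P<sev>Critical|Error) (?P<proc>\w+) \(pid=(?P<pid>\d+)\) (?P<msg>.*)$"
)

def failure_lines(log_text):
    """Return (timestamp, severity, process, message) for each Critical/Error line."""
    hits = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            hits.append((m["ts"], m["sev"], m["proc"], m["msg"]))
    return hits

# Three lines copied from the log above; Info lines are skipped.
sample = """\
10/12/2014 02:48:49 - Info bpbkar (pid=8132) Backup started
10/12/2014 03:09:13 - Critical bpbrm (pid=13910) from client XXXXX: FTL - socket write failed
10/12/2014 03:13:43 - Error bpbrm (pid=18990) from client XXXXX: ERR - Get bpfis state from MASTER failed. status = 25
"""

for ts, sev, proc, msg in failure_lines(sample):
    print(ts, sev, proc, msg)
```

On this log the filter leaves the socket write failure (status 24) followed by the bpfis errors (status 25 and 1542), which is the order in which the job fell over.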

 

A few drives are having problems, but I don't think this is a drive issue. Could you please suggest a solution?

1 ACCEPTED SOLUTION

Marianne
Level 6
Partner    VIP    Accredited Certified

All of your errors are network errors. Nothing to do with drives.

The fact that backups go through when you disable multistreaming points to overload on the network and/or a problem with NIC settings on the clients.

We also see bpfis unable to update the snapshot state on the master. For Windows Open File Backup (WOFB), the clients need connectivity to the master server. No way around that. Either enable connectivity (public or backup LAN) or disable WOFB.
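
As a rough first check of the connectivity point, a script like the one below could be run on a client to see whether TCP connections to the master open at all. It is a sketch under assumptions: the port numbers are the commonly documented NetBackup defaults (PBX 1556, vnetd 13724, bpcd 13782) and may differ in your environment, `master.example.com` is a placeholder, and `can_connect` / `check_master` are hypothetical helper names. An open port only shows the network path exists; it does not prove name resolution or NetBackup authentication works.

```python
import socket

# Assumption: default NetBackup ports. Adjust if your site uses others.
DEFAULT_PORTS = {"pbx": 1556, "vnetd": 13724, "bpcd": 13782}

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False

def check_master(host, ports=DEFAULT_PORTS, timeout=5.0):
    """Map each service name to whether its port is reachable on host."""
    return {name: can_connect(host, port, timeout) for name, port in ports.items()}

# Example (placeholder hostname, run from the client):
# print(check_master("master.example.com"))
```

If any of these come back `False` from the backup VLAN, that would fit the status 25 seen when bpfis tries to reach the master.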

2 REPLIES

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

Hello

 

It's most likely a timeout issue. Your backup job waited roughly 48 minutes (02:00:16 to 02:48:40) before the drives became available. Try running the backup again when drives are free, to test.

 

10/12/2014 02:03:52 - awaiting resource LIBRARY-01-lto3-NBU-001. Waiting for resources.  
          Reason: Drives are in use, Media server: MEDIASERVER,  
          Robot Type(Number): ACS(1), Media ID: N/A, Drive Name: N/A,  
          Volume Pool: EDN-Media, Storage Unit: MEDIASERVER-r2m03L85-01-lto3-NBU-001, Drive Scan Host: N/A,  
          Disk Pool: N/A, Disk Volume: N/A  
10/12/2014 02:04:12 - awaiting resource LIBRARY-01-lto3-NBU-001. No drives are available.
10/12/2014 02:48:40 - granted resource  MASTER.NBU_CLIENT.MAXJOBS.XXXXX
10/12/2014 02:48:40 - granted resource  MASTER.NBU_POLICY.MAXJOBS.BPMusic-backup-0003
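
The queue time can be worked out directly from the timestamps in the job details. A minimal sketch, assuming the `MM/DD/YYYY HH:MM:SS` format seen in the log; `wait_minutes` is a hypothetical helper name:

```python
from datetime import datetime

# Assumed timestamp format of the job-details lines.
FMT = "%m/%d/%Y %H:%M:%S"

def wait_minutes(first_wait, granted):
    """Minutes between the first 'awaiting resource' line and the 'granted resource' line."""
    start = datetime.strptime(first_wait, FMT)
    end = datetime.strptime(granted, FMT)
    return (end - start).total_seconds() / 60

# First "awaiting resource" at 02:00:16, resources granted at 02:48:40.
print(round(wait_minutes("10/12/2014 02:00:16", "10/12/2014 02:48:40"), 1))  # 48.4
```

Note that the job only started writing after the drive was granted (02:48:46) and then ran for about 20 minutes before the socket write failure.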
