05-22-2015 01:00 AM
Hi all
I am unable to back up my server (Windows 2008 R2 64-bit, about 2 TB).
I always get error 42.
Thanks
05-22-2015 01:12 AM
NetBackup version? Master, and media, and client?
Activity monitor log to start with?
Does any data get moved?
Does it always fail at the same point?
What steps have you tried so far to resolve the issue?
Did it ever work?
Is SP1 installed on the client?
05-22-2015 02:37 AM
Is Windows server backing up itself or across the network to a media server?
Please show us all text in Details tab of failed job.
This will give indication of which logs will be required.
05-22-2015 04:37 AM
Hi all,
I tried something: increasing the client read timeout from 300 s to 3600 s.
It seems to be working. One server has already backed up; the other one, the biggest, is running. I will let you know.
zakou
06-23-2015 02:44 AM
Hi,
Once again the backup failed with error 42. See the details below:
23/06/2015 09:56:27 - Info nbjm(pid=5052) starting backup job (jobid=4013) for client bneniamfp01.bdom.ad.corp, policy Win_Diff_Full_Export, schedule Diff_Journaliere
23/06/2015 09:56:27 - Info nbjm(pid=5052) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=4013, request id:{818DA035-D99B-42BD-B289-CED048D12516})
23/06/2015 09:56:27 - requesting resource bneniamas04-hcart3-robot-tld-0
23/06/2015 09:56:27 - requesting resource bneniamas04.NBU_CLIENT.MAXJOBS.bneniamfp01.bdom.ad.corp
23/06/2015 09:56:27 - requesting resource bneniamas04.NBU_POLICY.MAXJOBS.Win_Diff_Full_Export
23/06/2015 09:56:27 - granted resource bneniamas04.NBU_CLIENT.MAXJOBS.bneniamfp01.bdom.ad.corp
23/06/2015 09:56:27 - granted resource bneniamas04.NBU_POLICY.MAXJOBS.Win_Diff_Full_Export
23/06/2015 09:56:27 - granted resource ANC064
23/06/2015 09:56:27 - granted resource HP.ULTRIUM3-SCSI.001
23/06/2015 09:56:27 - granted resource bneniamas04-hcart3-robot-tld-0
23/06/2015 09:56:28 - estimated 12604406 Kbytes needed
23/06/2015 09:56:28 - Info nbjm(pid=5052) started backup (backupid=bneniamfp01.bdom.ad.corp_1435046187) job for client bneniamfp01.bdom.ad.corp, policy Win_Diff_Full_Export, schedule Diff_Journaliere on storage unit bneniamas04-hcart3-robot-tld-0
23/06/2015 09:56:28 - started process bpbrm (6816)
23/06/2015 09:56:33 - Info bpbrm(pid=6816) bneniamfp01.bdom.ad.corp is the host to backup data from
23/06/2015 09:56:33 - Info bpbrm(pid=6816) reading file list from client
23/06/2015 09:56:33 - connecting
23/06/2015 09:56:36 - Info bpbrm(pid=6816) starting bpbkar32 on client
23/06/2015 09:56:36 - connected; connect time: 00:00:03
23/06/2015 09:56:39 - Info bpbkar32(pid=3932) Backup started
23/06/2015 09:56:39 - Info bptm(pid=3540) start
23/06/2015 09:56:39 - Info bptm(pid=3540) using 65536 data buffer size
23/06/2015 09:56:39 - Info bptm(pid=3540) setting receive network buffer to 263168 bytes
23/06/2015 09:56:39 - Info bptm(pid=3540) using 30 data buffers
23/06/2015 09:56:39 - Info bptm(pid=3540) start backup
23/06/2015 09:56:39 - Info bptm(pid=3540) backup child process is pid 7056.3592
23/06/2015 09:56:39 - Info bptm(pid=3540) Waiting for mount of media id ANC064 (copy 1) on server bneniamas04.
23/06/2015 09:56:39 - Info bptm(pid=7056) start
23/06/2015 09:56:39 - mounting ANC064
23/06/2015 09:57:36 - Info bptm(pid=3540) media id ANC064 mounted on drive index 1, drivepath {4,0,4,0}, drivename HP.ULTRIUM3-SCSI.001, copy 1
23/06/2015 09:57:36 - mounted; mount time: 00:00:57
23/06/2015 09:57:39 - positioning ANC064 to file 59
23/06/2015 09:58:34 - positioned ANC064; position time: 00:00:55
23/06/2015 09:58:34 - begin writing
23/06/2015 10:16:38 - Critical bpbrm(pid=6816) from client bneniamfp01.bdom.ad.corp: FTL - socket write failed
23/06/2015 10:17:28 - Info bptm(pid=3540) EXITING with status 42 <----------
23/06/2015 10:17:28 - Error bpbrm(pid=6816) could not send server status message
23/06/2015 10:17:31 - end writing; write time: 00:18:57
network read failed(42)
23/06/2015 10:57:05 - job 4013 was restarted as job 4015
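To answer "does it always fail at the same point?", it can help to measure how long each failed job wrote before the first error. The following is only a minimal sketch: it assumes the job details are pasted as plain text with the `DD/MM/YYYY HH:MM:SS - message` layout shown above, and the function names are made up for illustration.

```python
from datetime import datetime

def parse_line(line):
    """Split 'DD/MM/YYYY HH:MM:SS - message' into (datetime, message), or None."""
    stamp, sep, message = line.partition(" - ")
    if not sep:
        return None
    try:
        when = datetime.strptime(stamp.strip(), "%d/%m/%Y %H:%M:%S")
    except ValueError:
        return None  # not a timestamped line, e.g. "network read failed(42)"
    return when, message.strip()

def time_to_first_failure(lines):
    """Return the time from 'begin writing' to the first Critical/Error line."""
    started = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        when, message = parsed
        if message == "begin writing":
            started = when
        elif started is not None and message.startswith(("Critical", "Error")):
            return when - started
    return None

# Three lines copied from the job details above.
sample = [
    "23/06/2015 09:58:34 - begin writing",
    "23/06/2015 10:16:38 - Critical bpbrm(pid=6816) from client "
    "bneniamfp01.bdom.ad.corp: FTL - socket write failed",
    "23/06/2015 10:17:28 - Info bptm(pid=3540) EXITING with status 42 <----------",
]
print(time_to_first_failure(sample))  # 0:18:04
```

Running this over several failed jobs shows whether the failure time is roughly constant (which would point at a timeout somewhere) or random.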
06-23-2015 06:40 AM
Please, can someone help?
06-23-2015 07:10 AM
Not enough information:
Is there a firewall in the environment ?
Did this ever work? If so, has something changed?
What have you changed to try to fix this, apart from the client timeout setting?
I would get the following logs:
(Media server)
bptm
bpbrm
(Client)
bpbkar
All at VERBOSE=5 and GENERAL=2 levels.
+ the Activity Monitor details for the failing jobs. All logs MUST cover the same time period, and of course, a job that fails.
Just based on the limited details above, you appear to have a network issue, not a NetBackup issue.
Critical bpbrm(pid=6816) from client bneniamfp01.bdom.ad.corp: FTL - socket write failed
A 'network' issue includes the TCP stack in the OS, so it may not be the 'physical' network. It also includes routers and switches, firewalls and NICs, and everything in between ...
It doesn't include NetBackup ... NBU sits on the top, and only 'uses' what it is given. It, at least in the majority of cases, is simply a victim of a network issue.
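As a first sanity check outside of NBU, a quick reachability test of the NetBackup ports from the media server to the client can be sketched as below. This is only an illustration: it assumes the default ports are unchanged (PBX 1556, vnetd 13724, bpcd 13782), and the function names are hypothetical. A successful connect only proves the port is reachable at that moment; it will not catch a connection that drops mid-transfer.

```python
import socket

# Default NetBackup ports (assumed unchanged in this environment).
NBU_PORTS = {"pbx": 1556, "vnetd": 13724, "bpcd": 13782}

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False

def check_host(host, ports=NBU_PORTS):
    """Map each named port to True/False for the given host."""
    return {name: can_connect(host, port) for name, port in ports.items()}
```

For example, `check_host("bneniamfp01.bdom.ad.corp")` run on the media server returns a dict such as `{"pbx": ..., "vnetd": ..., "bpcd": ...}` with one boolean per port.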
06-23-2015 08:05 AM
I changed only the client timeout setting.
I have the same error with another server where I did not change anything.
06-23-2015 08:08 AM
Also note that the server that gets the error is a VM. There are other servers on the same ESX host for which the backup is OK, but not for this server.
06-23-2015 08:19 AM
06-23-2015 08:55 AM
@sdo:
06-23-2015 08:58 AM
@Marianne:
The backup is done through the network.
06-23-2015 02:31 PM
One thing to check - make sure that the NetBackup Client 'tracker' process/agent is not running on the client - this is a nasty little program which, if it is running, first goes off and scans the entire file system being backed-up, so that it can track file counts, before letting the backup run for real. Normally no-one enables this process - but someone may have done. This is a rare cause of timeouts.
A probably equally rare cause of status 42 is having LAN switches set in a slightly aggressive mode, whereby... if, after a certain period of time has elapsed AND the LAN switch has not seen any IP traffic on a point-to-point TCP conversation... then the LAN switch itself will close down the conversation.
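To illustrate the general mechanism (this is not how NetBackup itself is configured, just a generic sketch): TCP keepalive probes are what normally stop a quiet connection from looking idle to devices in the path. The helper name and the default values below are assumptions chosen for the example.

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive so an otherwise quiet connection still shows traffic.

    idle     - seconds of silence before the first probe
    interval - seconds between probes
    count    - unanswered probes before the connection is dropped
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket tuning options are platform specific; apply only those
    # that this Python build exposes and this OS accepts.
    for option, value in (("TCP_KEEPIDLE", idle),
                          ("TCP_KEEPINTVL", interval),
                          ("TCP_KEEPCNT", count)):
        if hasattr(socket, option):
            try:
                sock.setsockopt(socket.IPPROTO_TCP, getattr(socket, option), value)
            except OSError:
                pass  # option known to Python but rejected by this OS
    return sock
```

If the switch idle timer is shorter than the keepalive idle time in effect on the hosts, the conversation can still be closed, which is why the switch settings are worth asking about.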
.
Other things to try on really large Windows file systems:
1) Run a read-only chkdsk - is the file system error free?
2) Start / Run / CMD, then cd /d X:\, then do dir /s /b > a.lis. Does the directory walk fail or hang? How many folders and files are there?
3) How badly fragmented is the file system? defrag /v /u /h /a X: (where X: is the problem drive letter). If it's bad, maybe spend some weekends running a manual defrag, and then cancelling it on a Monday morning - it may take several weekends before it actually manages to complete normally.
4) Try setting the 'client attributes' of the problem client - on the 'Windows Open File Backup' tab, in the 'Snapshot error control' box, try setting the 'Disable Snapshot and continue' radio button.
5) Check which drive/volume the problem drive/volume is using for its VSS shadow space. Is it using itself? I hope not. Also, if the drive being backed-up is receiving a fair amount of writes during backups then maybe your 'shadow space' is too small? Try using vssadmin to set the shadow space for the problem drive/volume to use a different drive/volume as its shadow space. Also, is the shadow space 'bounded' with a limit? Is it too small?
6) Are any application or system events, which may be applicable to problems with the file system, seen in the Windows event logs?
7) Is this really big drive actually a mounted CIFS/SMB volume from somewhere else?
8) How much free space does the problem drive/volume actually have?
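Points 2 and 8 above can also be checked with a short script instead of `dir`. A minimal sketch, assuming Python is available on the client; the function name `survey` is made up for this example:

```python
import os
import shutil

def survey(root):
    """Walk root, count folders and files, and report free space on its volume."""
    dirs = files = 0
    errors = []  # OSError objects for folders that could not be listed
    for _, dirnames, filenames in os.walk(root, onerror=errors.append):
        dirs += len(dirnames)
        files += len(filenames)
    usage = shutil.disk_usage(root)
    return {
        "dirs": dirs,
        "files": files,
        "unreadable": len(errors),
        "free_bytes": usage.free,
        "total_bytes": usage.total,
    }
```

For example, `survey("X:\\")` where X: is the problem drive letter. A non-zero `unreadable` count, or a walk that never returns, is worth following up before looking at NetBackup again.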
06-24-2015 01:50 AM
@sdo
Hi, what is the real name of the "NetBackup Client 'tracker' process/agent"?
How can I know if this process is running? I checked in Task Manager but did not see anything.
06-24-2015 02:07 AM
Why not right click and properties of the 'NetBackup Client Job Tracker' in the start menu program group, to show:
"C:\Program Files\Veritas\NetBackup\bin\tracker.exe"
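If Task Manager is hard to scan, the same check can be scripted. A rough sketch only, assuming the image name is `tracker.exe` as in the path above and using the standard Windows `tasklist` command with CSV output; the function names are hypothetical.

```python
import csv
import subprocess

def image_in_tasklist(csv_output, image_name):
    """Return True if image_name appears in 'tasklist /FO CSV /NH' output."""
    for row in csv.reader(csv_output.splitlines()):
        # The first CSV column is the image name; the "no tasks" notice
        # from tasklist is a single plain-text field and never matches.
        if row and row[0].lower() == image_name.lower():
            return True
    return False

def is_running(image_name="tracker.exe"):
    """Windows only: ask tasklist whether a process with this image name exists."""
    result = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image_name}", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return image_in_tasklist(result.stdout, image_name)
```

Calling `is_running()` on the client returns `True` only if a `tracker.exe` process is currently listed.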
06-24-2015 02:24 AM
@sdo
Yes, I found it. What is the next step, please?
06-24-2015 02:28 AM
Ask your network admins whether their switches close 'apparently idle' TCP conversations?
06-24-2015 02:28 AM
So... the 'tracker' was not running?
06-24-2015 02:56 AM
.... what is the next step?
There have been many suggestions in the last 24 hours.
mph999 has asked you to create log folders. Have you done so?
He also explained that network issues must be checked outside of NBU - NBU is simply a victim of a network issue.
I have asked that you get your network team involved to monitor connections/traffic between client and media server.
Have you done so?
Apart from the 'tracker' question, sdo has also asked 8 other questions.
I do not see answers to any of them?
Honestly - there is no 'magic NBU setting' to fix network issues.
NBU is reporting the issue, not causing it.
06-25-2015 02:19 AM
Hi All,
The vmnic speed was set to 100 full duplex on the ESX side. I forced it to 1000 full duplex. The backup has now been running for two and a half hours without failing. I hope it stays that way.
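For a sense of scale, here is a back-of-the-envelope calculation of the best-case transfer time for a server of about 2 TB at each link speed. It is only a sketch: it assumes decimal units (1 TB = 10^12 bytes, speeds in Mb/s) and ignores protocol overhead, tape drive speed and client read speed, so real backups will be slower.

```python
def transfer_hours(size_bytes, link_mbps, efficiency=1.0):
    """Best-case hours to move size_bytes over a link of link_mbps megabits/s."""
    bits = size_bytes * 8
    seconds = bits / (link_mbps * 1_000_000 * efficiency)
    return seconds / 3600

TWO_TB = 2 * 10**12  # "about 2 TB", taken from the first post

for speed in (100, 1000):
    print(f"{speed:>4} Mb/s: {transfer_hours(TWO_TB, speed):.1f} h")
# Prints:
#  100 Mb/s: 44.4 h
# 1000 Mb/s: 4.4 h
```

The `efficiency` parameter is a placeholder for whatever fraction of the line rate is actually achieved; it defaults to 1.0 here because no measured figure is available.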
NetBackup version? Master, and media, and client? >>>> NBU 7.5.0.6
Activity monitor log to start with? >>>> I did not understand this question
Does any data get moved? >>>> No
Does it always fail at the same point? >>>> No
What steps have you tried so far to resolve the issue? >>>> Client timeout setting
Did it ever work? >>>> Yes
Is SP1 installed on the client? >>>> Yes