08-22-2011 12:09 AM
Hi,
I have just got the error "Error bpbrm(pid=5764) db_FLISTsend failed: premature eof encountered (233)" on 3 different NetBackup servers.
They are all NetBackup 6.5.6 on x86 Windows 2003 R2 SP2 Standard (patched up to date).
Can anybody help me with this?
Tomasz
08-22-2011 01:28 AM
If this occurred at the same time, then the connection to the master server was quite likely interrupted. bpbrm sends the list of backed-up files continuously during a backup, and if the TCP socket connection is broken for some reason, you could very well receive this error, I think.
The other scenario could be circular links in file systems:
http://www.symantec.com/business/support/index?page=content&id=TECH39252
Or NDMP backups which are huge:
http://www.symantec.com/business/support/index?page=content&id=TECH75266
This one actually applies to FlashBackup as well...
/A
08-22-2011 05:05 AM
This happened in 3 different countries over the last weekend, at different times, in 3 totally independent NetBackup domains.
This cannot be a problem with circular links (all clients are Windows servers). We don't use NDMP backups.
When I try to start new jobs, they keep failing with the 233 error. It looks like a major bug.
08-22-2011 05:34 AM
What about policy names?
STATUS CODE: 233, Backups will fail with a status 233 if the backup is run via a class or policy which has a comma sign (,) in its name.
08-22-2011 06:08 AM
There is only one underscore in the policy name, and no commas.
We had a patching session last weekend.
The following patches have been installed:
KB2570222, KB2567680, KB2566454, KB2559049, KB2555917. Maybe the patching session caused the problems.
08-22-2011 02:24 PM
Try expanding the backup window by an hour on that policy to see if that fixes the end of file error.
08-23-2011 12:27 AM
For now I have created the file <Install Drive>:\Program Files\Veritas\NetBackup\MAX_FILES_PER_ADD and put the value 15000 inside.
I have also lengthened the backup window (however, I am sceptical about how this could help here).
The job has been restarted and I am waiting...
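For anyone repeating this step on several servers, here is a minimal Python sketch of creating that file. The function name and its arguments are hypothetical, and writing the bare number without a trailing newline is an assumption about what NetBackup accepts; the file name and the value 15000 come from the post above.

```python
from pathlib import Path


def write_max_files_per_add(netbackup_dir, value=15000):
    """Create the MAX_FILES_PER_ADD file inside the given NetBackup directory.

    netbackup_dir is whatever "<Install Drive>:\\Program Files\\Veritas\\NetBackup"
    resolves to on the server; it is passed in rather than guessed.
    """
    value = int(value)
    if value <= 0:
        raise ValueError("MAX_FILES_PER_ADD must be a positive integer")
    target = Path(netbackup_dir) / "MAX_FILES_PER_ADD"
    # Assumption: the file holds just the number, as described in the post.
    target.write_text(str(value), encoding="ascii")
    return target
```

Call it with the NetBackup directory of the server in question, then restart the job as described above.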
08-23-2011 02:48 AM
No luck,
Job failed again:
Here is the relevant part of the bptm log:
11:19:15.697 [7064.1296] <2> get_long: (2) premature end of file (byte 1)
11:19:15.697 [7064.1296] <2> db_getdata: get_string() failed: Access is denied. (5) premature end of file encountered (-9) WSAGetLastError(): 0
11:19:15.697 [7064.1296] <2> db_end_sts: no DONE from db_getreply(): premature eof encountered
11:19:15.713 [7064.1296] <16> write_backup: cannot add fragment to image database, error = premature eof encountered
And part of the bpbrm log:
11:19:15.807 [7536.4640] <2> get_long: (2) premature end of file (byte 1)
11:19:15.807 [7536.4640] <2> job_monitoring_exex: Failed to get frame type
11:19:15.807 [7536.4640] <2> job_disconnect: Disconnected
11:19:15.963 [7536.4640] <2> bpbrm handle_backup: client <clientserver> EXIT STATUS = 24: socket write failed
11:19:15.963 [7536.4640] <2> inform_client_of_status: INF - Server status = 24
11:19:16.041 [7468.864] <2> bpbrm read_media_msg: read from media manager: EXIT <clientserver>_1314085738 233
11:19:16.041 [7468.864] <2> bpbrm process_media_msg: media manager for backup id <clientserver>_1314085738 exited with status 233: premature eof encountered
11:19:16.041 [7468.864] <2> bpbrm signal_bpbrm_child: sending Abnormal Exit to bpbrm child 7536
11:19:16.541 [7468.864] <2> bpbrm brm_child_done: child done, status 24
11:19:16.541 [7468.864] <2> bpbrm brm_child_done: child 7536 exited with status 24: socket write failed
I have replaced the real client server name with <clientserver>.
Any ideas?
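When comparing excerpts like these across several servers, it can help to pull the status codes out mechanically. Below is a rough Python sketch that assumes the line layout seen above ("time [pid.tid] <level> message"); the regular expressions and function names are my own and are not part of NetBackup, and the sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Layout inferred from the excerpt: "HH:MM:SS.mmm [pid.tid] <level> message"
LINE_RE = re.compile(
    r"^(?P<time>\d{1,2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<pid>\d+)\.(?P<tid>\d+)\] "
    r"<(?P<level>\d+)> "
    r"(?P<message>.*)$"
)
# Matches "EXIT STATUS = 24", "status = 24" and "status 233"
STATUS_RE = re.compile(r"status(?: =)? (\d+)", re.IGNORECASE)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("pid", "tid", "level"):
        fields[key] = int(fields[key])
    return fields


def status_codes(lines):
    """Count the status codes mentioned in the messages of matching lines."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields is None:
            continue
        for code in STATUS_RE.findall(fields["message"]):
            counts[int(code)] += 1
    return counts


SAMPLE = [
    "11:19:15.713 [7064.1296] <16> write_backup: cannot add fragment to image database, error = premature eof encountered",
    "11:19:15.963 [7536.4640] <2> bpbrm handle_backup: client <clientserver> EXIT STATUS = 24: socket write failed",
    "11:19:16.041 [7468.864] <2> bpbrm process_media_msg: media manager for backup id <clientserver>_1314085738 exited with status 233: premature eof encountered",
]
```

Running status_codes over the three sample lines counts one status 24 and one status 233, which mirrors the picture in the logs: the media manager side ends with 233 and the bpbrm child with 24.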
08-23-2011 03:33 AM
We need a few more details.
Does the 233 error appear immediately after restarting the job?
How much data are you trying to back up from that particular client?
I ask because a few months back I faced this situation, and MAX_FILES_PER_ADD, as you mentioned, also worked for me.
I was backing up almost 1.8 TB of data from a NAS (NDMP policy type). After creating the above file, the issue was almost 90% resolved; after that I also divided the backup into multiple streams (30 streams) and the issue was 98% resolved.
08-23-2011 04:55 AM
Server1 - master server which backs up itself (about 300GB). I started job today at 9:19AM. It failed after 3 hours at 12:37PM. Restarted automatically, failed again at 1:42PM and then again at 2:13PM. Always with the same 233 error.
Server2 - master server backing up another Windows file server (about 1TB in total). Partitions C and D, which are small (20GB), finish fine. The bigger partitions, E and F (650GB and 300GB respectively), fail after some time with the 233 error.
We do not use NDMP. This is a standard Windows-NT policy.
I just wanted to try MAX_FILES_PER_ADD, but I really was not expecting too much.
08-23-2011 05:27 AM
11:19:15.713 [7064.1296] <16> write_backup: cannot add fragment to image database, error = premature eof encountered
Check your db\images ... are you running out of space? Some kind of permission issue? Are you using an NFS/share for your database?
08-23-2011 05:36 AM
Can you change that MAX_FILES_PER_ADD value from 15000 to 32000? (Just for testing purposes.)
08-23-2011 05:39 AM
The NetBackup database is on a local disk. There is plenty of free space. Permissions on "D:\Program Files\VERITAS\NetBackup\db\images" look OK (nobody changed them):
administrators, creator/owner, SYSTEM - FULL access
network service/Users - read, read&execute, list folder contents access.
08-23-2011 11:35 PM
I have decided to uninstall the patches that were applied last weekend. If this does not help, then I will assume that the NetBackup db or image files have somehow become corrupted, and I will try to reinstall NetBackup from scratch.
I will update this topic later.
PS: Are there any tools for checking NetBackup db/images consistency?
08-24-2011 12:01 AM
For some consistency checking you can use bpdbm -consistency [1 | 2]
Use 1 or 2 depending on how detailed a check you want.
/A
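If you want to script the consistency check mentioned above, a small Python wrapper might look like the sketch below. The function names are hypothetical, and the location of bpdbm.exe under NetBackup\bin is an assumption about the install layout; only the -consistency option with level 1 or 2 comes from the post above.

```python
import subprocess
from pathlib import PureWindowsPath


def consistency_command(install_path, level=1):
    """Build the argument list for 'bpdbm -consistency <level>'."""
    if level not in (1, 2):
        raise ValueError("level must be 1 or 2")
    # Assumption: bpdbm.exe lives in <install_path>\NetBackup\bin
    bpdbm = PureWindowsPath(install_path) / "NetBackup" / "bin" / "bpdbm.exe"
    return [str(bpdbm), "-consistency", str(level)]


def run_consistency_check(install_path, level=1):
    """Run the check on the master server and return the completed process."""
    return subprocess.run(
        consistency_command(install_path, level),
        capture_output=True,
        text=True,
        check=False,  # inspect returncode and output instead of raising
    )
```

For the server described earlier in the thread, consistency_command(r"D:\Program Files\VERITAS", 2) would build the more detailed variant of the check.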
09-04-2011 11:33 PM
We have uninstalled the Windows patches on one of the servers and the backup finally went fine.
Here is the list of patches:
KB2570222
KB2567680
KB2566454
KB2559049
KB2507938
KB2555917
09-05-2011 01:26 AM
Thanks for updating the post with the solution.
I hope this helps others.
09-05-2011 01:57 AM
All of these are security patches and they do touch core parts and networking, and it seems KB2555917 conflicts with some other software as well (Firefox is mentioned when googling). I guess your best bet is to contact SYMC support so they can try to reproduce the problem on a lab system with the above-mentioned patches.
/A