Netbackup 6.0 MP7 Strange Behaviour

kuzack
Not applicable
Hi,

We have NetBackup 6.0 MP7 Server installed on RHEL 5.2 with the following configuration:

- one master/media server, which is also a client, connected to a Dell ML 6010 tape library
- one client in DMZ (behind firewall)

The problem is very strange. Everything works for a few days (usually 2-3), and then only the catalog backup policy fails, after all other jobs have completed successfully.
Sometimes it is enough to start the catalog backup policy manually a few hours later, and it finishes OK. Sometimes I have to restart NetBackup completely (bp.kill_all - .../goodies/netbackup).
Most of the backed-up data is on the master server (which is also a client), and there is no problem backing it up. There is also no problem with data from the client in the DMZ. The problems are always related to the catalog backup, and these errors are reported:


06/29/2009 08:00:19 - requesting resource Any
06/29/2009 08:00:19 - requesting resource thor.NBU_CLIENT.MAXJOBS.thor
06/29/2009 08:00:19 - requesting resource thor.NBU_POLICY.MAXJOBS.Catalog_Backup
06/29/2009 08:00:19 - granted resource thor.NBU_CLIENT.MAXJOBS.thor
06/29/2009 08:00:19 - granted resource thor.NBU_POLICY.MAXJOBS.Catalog_Backup
06/29/2009 08:00:19 - granted resource 0034L3
06/29/2009 08:00:19 - granted resource IBM.ULTRIUM-TD3.001
06/29/2009 08:00:19 - granted resource thor-hcart3-robot-tld-0
06/29/2009 08:00:21 - started process bpbrm (pid=18908)
06/29/2009 08:00:21 - connecting error bpbrm (pid=18908) cannot connect to thor, Cannot assign requested address (99)
06/29/2009 08:00:30 - connected; connect time: 0:00:00
06/29/2009 08:00:31 - end writing
can't connect to client (58)

It looks like the master server cannot connect to itself? And that happens every 2-3 days?



Master server bp.conf

SERVER = thor
CLIENT_NAME = thor
EMMSERVER = thor
VXDBMS_NB_DATA = /usr/openv/db/data
SERVER_SENDS_MAIL = YES
KEEP_JOBS_HOURS = 720
REQUIRED_INTERFACE = 192.168.4.10
BPSTART_TIMEOUT = 600
BPEND_TIMEOUT = 600

SERVER_RESERVED_PORT_WINDOW = 800 899
CLIENT_RESERVED_PORT_WINDOW = 900 999
SERVER_PORT_WINDOW = 5000 5500
CLIENT_PORT_WINDOW = 4800 4999
ALLOW_NON_RESERVED_PORTS
BPBRM_VERBOSE = 2
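
For reference, here is a minimal Python sketch that reads the four port-window entries from the bp.conf above and reports how many ports each one spans. The helper names (`port_windows`, `window_size`) are made up for illustration and are not part of NetBackup. The source ports shown in the bpbrm excerpts in this post (for example 4995, 4875 and 4922) all fall inside CLIENT_PORT_WINDOW; whether the size of that window has anything to do with the failure is not established here.

```python
import re

# Port-window entries copied from the bp.conf in this post.
BP_CONF = """\
SERVER_RESERVED_PORT_WINDOW = 800 899
CLIENT_RESERVED_PORT_WINDOW = 900 999
SERVER_PORT_WINDOW = 5000 5500
CLIENT_PORT_WINDOW = 4800 4999
"""


def port_windows(conf_text):
    """Return {entry_name: (low, high)} for every *_PORT_WINDOW line."""
    windows = {}
    for line in conf_text.splitlines():
        m = re.match(r"\s*(\w+_PORT_WINDOW)\s*=\s*(\d+)\s+(\d+)\s*$", line)
        if m:
            windows[m.group(1)] = (int(m.group(2)), int(m.group(3)))
    return windows


def window_size(window):
    """Number of ports in an inclusive (low, high) range."""
    low, high = window
    return high - low + 1


if __name__ == "__main__":
    for name, window in port_windows(BP_CONF).items():
        print(f"{name}: {window[0]}-{window[1]} ({window_size(window)} ports)")
```

With the values above, CLIENT_PORT_WINDOW covers 200 ports and SERVER_PORT_WINDOW covers 501.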


I looked in the Symantec knowledge base but couldn't find anything similar. If anyone has seen a problem like this and knows how to solve it, or has any hints, I would be very grateful :)

Excerpt from bpbrm log (failed):

08:00:30.844 [18908] <2> bpbrm listen_for_client: DATA_SOCK for client = 5
08:00:30.844 [18908] <2> bpbrm listen_for_client: NAME_SOCK for client = 6
08:00:30.884 [18908] <2> bpbrm main: from client thor: INF - START vxbsa_file_backup 1 bpbkar -L /usr/openv/netbackup/logs/user_ops/dbext/logs/vxbsa.1246255215.18878.prog.pcb_std -IEL -PCB /usr/openv/netbackup/logs/user_ops/dbext/logs/vxbsa.1246255215.18878.files.1 -ct 7 -dt 0 -to 0 -read_to 300 -clnt thor -class Catalog_Backup -sched Full -st FULL -bpstart_to 600 -bpend_to 600 -keyword NBDB:233:1246255202:F -bt 1246255219 -fso -b thor_1246255219 -kl 28
08:00:30.884 [18908] <2> bpbrm write_continue_backup: wrote CONTINUE BACKUP on COMM_SOCK <5>
08:00:30.884 [18908] <2> bpbrm write_filelist: wrote /usr/openv/db/staging/EMM_DATA.db on COMM_SOCK
08:00:30.884 [18908] <2> bpbrm write_filelist: wrote CONTINUE on COMM_SOCK
08:00:30.884 [18908] <2> hosts_equal: Comparing hosts <thor> and <thor>
08:00:30.884 [18908] <2> hosts_equal: names are the same
08:00:31.053 [18908] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2033: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
08:00:31.054 [18908] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2048: service: bpcd
08:00:31.054 [18908] <2> logconnections: BPCD CONNECT FROM 192.168.4.10.4995 TO 192.168.4.10.13724
08:00:31.054 [18908] <2> vnet_bind_and_connect_extra: vnet_bind.c.228: save_errno: 99 0x00000063
08:00:31.054 [18908] <2> vnet_bind_and_connect_extra: vnet_bind.c.228: save_errno: 99 0x00000063
08:00:31.054 [18908] <2> vnet_bind_and_connect_extra: vnet_bind.c.228: save_errno: 99 0x00000063
08:00:31.054 [18908] <2> vnet_bind_and_connect_extra: vnet_bind.c.228: save_errno: 99 0x00000063
08:00:31.054 [18908] <2> vnet_bind_and_connect_extra: vnet_bind.c.280: Function failed: 45 0x0000002d
08:00:31.054 [18908] <2> vnet_connect_to_vnetd_extra: vnet_vnetd.c.174: status: 45 0x0000002d
08:00:31.054 [18908] <2> vnet_connect_to_vnetd: vnet_vnetd.c.148: Function failed: 45 0x0000002d
08:00:31.054 [18908] <2> nb_vnetd_connect: comm.c.1828: vnet_connect_to_vnetd failed: 45
08:00:31.054 [18908] <2> bpcr_vnetd_connect_forward_socket_begin: nb_vnetd_connect(192.168.4.10) failed: 25
08:00:31.054 [18908] <2> bpcr_connect: bpcr_vnetd_connect_forward_socket_begin failed: 25
08:00:31.054 [18908] <2> ConnectToBPCD: bpcd_connect_and_verify(thor, thor) failed: 25
08:00:31.054 [18908] <16> bpbrm start_bpcd_stat: cannot connect to thor, Cannot assign requested address (99)
08:00:31.219 [18908] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2033: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
08:00:31.219 [18908] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2048: service: bpjobd
08:00:31.220 [18908] <2> logconnections: BPJOBD CONNECT FROM 192.168.4.10.4875 TO 192.168.4.10.13724
08:00:31.220 [18908] <2> job_authenticate_connection: ignoring VxSS authentication check for now...
08:00:31.220 [18908] <2> job_connect: Connected to the host thor contype 10 jobid <234> socket <7>
08:00:31.220 [18908] <2> job_connect: Connected on port 4875
08:00:31.260 [18908] <2> job_monitoring_exex: ACK disconnect
08:00:31.260 [18908] <2> job_disconnect: Disconnected
08:00:31.260 [18908] <2> db_error_add_to_file: dberrorq.c:midnite = 1246226400
08:00:31.297 [18908] <2> bpbrm start_bpcd: bpbrm.c.20167: start_bpcd_stat failed: 58 58 0x0000003a
08:00:31.297 [18908] <2> clear_held_signals: clearing signal mask stack, mask_stack_depth = 0
08:00:31.297 [18908] <2> bpbrm kill_child_process: start
08:00:31.370 [18908] <2> bpbrm Exit: client backup EXIT STATUS 58: can't connect to client
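
To make the failing step easier to spot in a long bpbrm log, the following Python sketch splits each line into timestamp, pid, level and message, counts the `save_errno` values, and lists the lines logged at level 16 or higher (the level used by the "cannot connect" line in the failed excerpt above). The line format is inferred only from the excerpts in this post, and the function names are illustrative. On Linux, errno 99 is EADDRNOTAVAIL, which matches the "Cannot assign requested address" text in the log.

```python
import re
from collections import Counter

# Line layout inferred from the excerpts in this post:
#   HH:MM:SS.mmm [pid] <level> message
LOG_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\] <(\d+)> (.*)$")
SAVE_ERRNO = re.compile(r"save_errno: (\d+)")

# Linux name for the errno value seen in this thread.
LINUX_ERRNO_NAMES = {99: "EADDRNOTAVAIL"}

# A few lines copied from the failed excerpt.
SAMPLE = """\
08:00:31.054 [18908] <2> logconnections: BPCD CONNECT FROM 192.168.4.10.4995 TO 192.168.4.10.13724
08:00:31.054 [18908] <2> vnet_bind_and_connect_extra: vnet_bind.c.228: save_errno: 99 0x00000063
08:00:31.054 [18908] <2> vnet_bind_and_connect_extra: vnet_bind.c.228: save_errno: 99 0x00000063
08:00:31.054 [18908] <2> vnet_bind_and_connect_extra: vnet_bind.c.228: save_errno: 99 0x00000063
08:00:31.054 [18908] <2> vnet_bind_and_connect_extra: vnet_bind.c.228: save_errno: 99 0x00000063
08:00:31.054 [18908] <2> vnet_bind_and_connect_extra: vnet_bind.c.280: Function failed: 45 0x0000002d
08:00:31.054 [18908] <16> bpbrm start_bpcd_stat: cannot connect to thor, Cannot assign requested address (99)
"""


def parse(text):
    """Return a list of (time, pid, level, message) tuples; skip other lines."""
    entries = []
    for line in text.splitlines():
        m = LOG_LINE.match(line)
        if m:
            entries.append((m.group(1), int(m.group(2)), int(m.group(3)), m.group(4)))
    return entries


def errno_counts(entries):
    """Count how often each save_errno value appears."""
    counts = Counter()
    for _time, _pid, _level, message in entries:
        m = SAVE_ERRNO.search(message)
        if m:
            counts[int(m.group(1))] += 1
    return counts


def high_level(entries, min_level=16):
    """Return entries whose <level> is at least min_level."""
    return [e for e in entries if e[2] >= min_level]


if __name__ == "__main__":
    entries = parse(SAMPLE)
    for value, count in errno_counts(entries).items():
        name = LINUX_ERRNO_NAMES.get(value, "unknown")
        print(f"save_errno {value} ({name}): {count} time(s)")
    for time, pid, level, message in high_level(entries):
        print(f"{time} [{pid}] <{level}> {message}")
```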


Excerpt from bpbrm log - manually started a few hours later (successful):

10:00:36.168 [22931] <2> bpbrm listen_for_client: DATA_SOCK for client = 5
10:00:36.168 [22931] <2> bpbrm listen_for_client: NAME_SOCK for client = 6
10:00:36.209 [22931] <2> bpbrm main: from client thor: INF - START vxbsa_file_backup 1 bpbkar -L /usr/openv/netbackup/logs/user_ops/dbext/logs/vxbsa.1246262421.22887.prog.pcb_std -IEL -PCB /usr/openv/netbackup/logs/user_ops/dbext/logs/vxbsa.1246262421.22887.files.1 -ct 7 -dt 0 -to 0 -read_to 300 -clnt thor -class Catalog_Backup -sched Full -st FULL -bpstart_to 600 -bpend_to 600 -keyword NBDB:235:1246262410:F -bt 1246262423 -fso -b thor_1246262423 -kl 28
10:00:36.209 [22931] <2> bpbrm write_continue_backup: wrote CONTINUE BACKUP on COMM_SOCK <5>
10:00:36.209 [22931] <2> bpbrm write_filelist: wrote /usr/openv/db/staging/EMM_DATA.db on COMM_SOCK
10:00:36.209 [22931] <2> bpbrm write_filelist: wrote CONTINUE on COMM_SOCK
10:00:36.209 [22931] <2> hosts_equal: Comparing hosts <thor> and <thor>
10:00:36.209 [22931] <2> hosts_equal: names are the same
10:00:36.382 [22931] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2033: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
10:00:36.382 [22931] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2048: service: bpcd
10:00:36.383 [22931] <2> logconnections: BPCD CONNECT FROM 192.168.4.10.4922 TO 192.168.4.10.13724
10:00:36.383 [22931] <2> vnet_connect_to_vnetd_extra: vnet_vnetd.c.178: msg: VNETD CONNECT FROM 192.168.4.10.4976 TO 192.168.4.10.13724 fd = 8
10:00:36.620 [22931] <2> vnet_vnetd_connect_forward_socket_end: vnet_vnetd.c.528: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
10:00:36.620 [22931] <2> vnet_vnetd_connect_forward_socket_end: vnet_vnetd.c.546: ipc_string: /tmp/vnet-22949246262436620267000000000-trHb68
10:00:36.702 [22931] <2> bpbrm start_bpcd_stat: DATA_SOCK from bpcr = 7
10:00:36.702 [22931] <2> bpbrm start_bpcd_stat: NAME_SOCK from bpcr = 8
10:00:36.702 [22931] <2> pfi_start_client: command = /usr/openv/netbackup/bin/bpbkar bpbkar -L /usr/openv/netbackup/logs/user_ops/dbext/logs/vxbsa.1246262421.22887.prog.pcb_std -IEL -PCB /usr/openv/netbackup/logs/user_ops/dbext/logs/vxbsa.1246262421.22887.files.1 -ct 7 -dt 0 -to 0 -read_to 300 -clnt thor -class Catalog_Backup -sched Full -st FULL -bpstart_to 600 -bpend_to 600 -keyword NBDB:235:1246262410:F -bt 1246262423 -fso -b thor_1246262423 -kl 28 -shm
10:00:36.702 [22931] <2> pfi_start_client: received bpcd success message
10:00:36.759 [22931] <2> pfi_start_client: read start message from thor, msg=<INF - BACKUP START>
10:00:36.759 [22931] <2> bpbrm main: from client thor: read client start message
10:00:36.759 [22931] <2> bpbrm spawn_child: /usr/openv/netbackup/bin/bptm bptm -w -c thor -den 20 -rt 8 -rn 0 -stunit thor-hcart3-robot-tld-0 -cl Catalog_Backup -bt 1246262423 -b thor_1246262423 -st 0 -cj 2 -p CatalogBackup -reqid -1245929387 -jm -brm -hostname thor -L /usr/openv/netbackup/logs/user_ops/dbext/logs/vxbsa.1246262421.22887.prog.pcb_std -ru root -rclnt thor -rclnthostname thor -rl 2 -rp 1814400 -sl Full -ct 7 -maxfrag 1048576 -v -mediasvr thor -no_callback -connect_options 0x01010100 -jobid 236 -jobgrpid 235 -masterversion 600000 -shm
10:00:36.760 [22931] <2> bpbrm write_continue_backup: wrote CONTINUE BACKUP on COMM_SOCK <7>
10:00:36.760 [22931] <2> write_file_names: buffering file name '/usr/openv/db/staging/EMM_DATA.db' for output
10:00:36.760 [22931] <2> write_file_names: successfully wrote buffer to COMM_SOCK
10:00:36.760 [22931] <2> bpbrm main: wrote CONTINUE on COMM_SOCK
10:00:36.760 [22931] <2> bpbrm main: closing DATA_SOCK
10:00:36.760 [22931] <2> bpbrm main: closing COMM_SOCK
10:00:36.760 [22931] <2> bpbrm main: ESTIMATE -1 -1 thor_1246262423
10:01:46.783 [22931] <2> bpbrm readline: retrying partial read from fgets ::
10:01:48.285 [22931] <2> bpbrm main: client thor EXIT STATUS = 0: the requested operation was successfully completed
10:01:48.465 [22931] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2033: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
10:01:48.465 [22931] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2048: service: bpdbm
10:01:48.465 [22931] <2> logconnections: BPDBM CONNECT FROM 192.168.4.10.4857 TO 192.168.4.10.13724
10:01:48.593 [22931] <2> bpbrm wait_for_child: start
10:01:53.933 [22931] <2> bpbrm wait_for_child: child exit_status = 0 signal_status = 0
10:01:53.933 [22931] <2> bpbrm wait_for_child: child exit normal
10:01:53.934 [22931] <2> bpbrm main: validating image for client thor
10:01:54.096 [22931] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2033: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
10:01:54.096 [22931] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2048: service: bpdbm
10:01:54.096 [22931] <2> logconnections: BPDBM CONNECT FROM 192.168.4.10.4853 TO 192.168.4.10.13724
10:01:54.483 [22931] <2> clear_held_signals: clearing signal mask stack, mask_stack_depth = 0
10:01:54.483 [22931] <2> bpbrm Exit: attempting to send mail to root on thor
10:01:54.483 [22931] <2> hosts_equal: Comparing hosts <thor> and <thor>
10:01:54.483 [22931] <2> hosts_equal: names are the same
10:01:54.762 [22931] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2033: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
10:01:54.762 [22931] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2048: service: bpcd
10:01:54.762 [22931] <2> logconnections: BPCD CONNECT FROM 192.168.4.10.4952 TO 192.168.4.10.13724
10:01:54.762 [22931] <2> vnet_connect_to_vnetd_extra: vnet_vnetd.c.178: msg: VNETD CONNECT FROM 192.168.4.10.4899 TO 192.168.4.10.13724 fd = 5
10:01:54.927 [22931] <2> vnet_vnetd_connect_forward_socket_end: vnet_vnetd.c.528: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
10:01:54.928 [22931] <2> vnet_vnetd_connect_forward_socket_end: vnet_vnetd.c.546: ipc_string: /tmp/vnet-23093246262514927452000000000-PxhqPJ
10:01:55.008 [22931] <2> bpbrm Exit: OUT_SOCK from bpcr = 4
10:01:55.008 [22931] <2> bpbrm Exit: IN_SOCK from bpcr = 5
10:01:55.008 [22931] <2> bpcr_get_version_rqst: bpcd version: 06000407
10:01:55.049 [22931] <2> bpcr_get_version_rqst: bpcd version: 06000407
10:01:55.053 [22931] <2> bpcr_get_version_rqst: bpcd version: 06000407
10:01:55.058 [22931] <2> bpbrm Exit: EXIT STATUS 0:
10:01:55.058 [22931] <2> bpbrm Exit: sending dbclient :<EXIT STATUS 0:>
10:01:55.065 [22931] <2> bpbrm Exit: client backup EXIT STATUS 0: the requested operation was successfully completed



1 REPLY

Android
Level 6
Partner Accredited Certified
I would start by checking the bpcd log on the master for the time period when this occurs. If the log directory does not exist, create it. Also make sure your logging is at level 5 in order to capture the most useful diagnostic info. At level 5, however, keep a close eye on your disk space, as the logs can eat it up quickly (depending on how much free space you have).
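
As a rough sketch of the two housekeeping steps above (creating the log directory and watching free space), something like the following Python could be run on the master. The directory path is an assumption based on the `/usr/openv/netbackup/logs/` prefix visible in the logs in this thread, so confirm it for your install. The 1024 MB threshold and the function names are arbitrary illustrations.

```python
import os
import shutil

# Assumed location of the bpcd log directory; verify on your system.
BPCD_LOG_DIR = "/usr/openv/netbackup/logs/bpcd"


def ensure_log_dir(path):
    """Create the directory if it is missing; return True if it exists afterwards."""
    os.makedirs(path, exist_ok=True)
    return os.path.isdir(path)


def free_megabytes(path):
    """Free space, in whole megabytes, on the filesystem that holds path."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def has_room(path, min_free_mb):
    """True if at least min_free_mb megabytes are free where path lives."""
    return free_megabytes(path) >= min_free_mb


if __name__ == "__main__":
    ensure_log_dir(BPCD_LOG_DIR)
    print(f"{BPCD_LOG_DIR}: {free_megabytes(BPCD_LOG_DIR)} MB free")
    if not has_room(BPCD_LOG_DIR, 1024):  # illustrative threshold only
        print("Warning: low disk space for high-verbosity logging")
```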