Forum Discussion

Eamonn's avatar
Eamonn
Level 3
14 years ago
Solved

Problems with backup, cannot connect on socket(25)

Master + Media servers: Netbackup 7.0.1, Windows Server 2003 R2

Clients: Netbackup 7.0.1, and 6.5.3, Windows Server 2008 (non-R2)

Getting consistent failures on 3 clients lately with error 25 on the child job, cannot connect on socket(25)

Happens with Full and Cumulative Incremental backups.

Have tried the following:

Rerun the jobs during the day (normally run at night) - Still fails

Created a brand new policy (not copying the old one) - Also fails

Telnet to BPCD port - Successful response from all servers to the clients

Adding the required network interface - Still fails

Changing the timeout connections - Still fails

Pinging/NSlookup/bptestbpcd from the master/media servers - Everything appears to be fine

Running bbps while the backups are starting I can see vnetd and bpcd start on the clients, but still it always fails with cannot connect on socket(25)

Any suggestions on things to check? I will post logs below.

  • So I ran one of the backups with the BPCD log on, and found the following (which wasn't in the bptestbpcd results one):

    <16> bpcd main: get_vnetd_forward_socket failed: 21

    Searching I found the following article:

    http://www.symantec.com/business/support/index?page=content&id=TECH156471&key=15143&actp=LIST

    Which advised the following:

      Host Properties > Master server > server > Client Attributes > client > Connect Options

     

      BPCD connect back = Random port

      Ports = Non-reserved ports

      Daemon connection port = Daemon port only

     

    Tested that, it worked.

    To test additionally I changed the setup to the following: 

      BPCD connect back = Default

      Ports = Default

      Daemon connection port = Daemon port only

    Which also worked.

    As soon as I switch Daemon connection port back to default we go back to the job failing.

    So going by that article I'll need to identify the non-NetBackup process that is randomly connecting, as leaving the setting as Daemon port only is not a desired setup for us.

    Looking further however I can't find anything connecting on the ports being requested in the vnetd log, so again I'm left with a puzzle of why this is failing on the default setting, especially considering 2 of the 3 affected clients were backing up without issue until last week, and no changes were applied to them in the time frame.

    Thank you both for the log commands, helped point me in the right direction, and find a work around until I get a solution in place.

  • By selecting the "Default connect Options" NBU uses 'no connect-back' protocol. Consider the below scenario where bpbrm wants to fork bpbkar in client.

    1. bpbrm process uses vnetd to initiate bpcd which forks new process (bpbkar)
    2. The bpbrm also provides the port number on which the bpbkar can respond to.
    3. Now when new process receives connection from bpcd, it connect to bpbrm via vnetd by specifying port number in "IPC STRING" thats is passed to vnetd.
    4. Now vnetd uses the "IPC STING" port to direct the connection to bpbrm.

    To be checked in your case:

    Please check vnetd logs for IPC Sting port during the backup and also find via "netstat -a -o" to know which 'non-netbackup' process is trying to connect to the port - that causes vnetd failure towars bpbrm.

    Hope this will shed some light on the issue.

  • 31-May-2011 11:06:46 AM - requesting resource sg-mslb-ms02-ms04
    31-May-2011 11:06:46 AM - requesting resource nbuprod.NBU_CLIENT.MAXJOBS.server01
    31-May-2011 11:06:46 AM - requesting resource nbuprod.NBU_POLICY.MAXJOBS.ExchangeHUBSystems
    31-May-2011 11:06:47 AM - granted resource nbuprod.NBU_CLIENT.MAXJOBS.server01
    31-May-2011 11:06:47 AM - granted resource nbuprod.NBU_POLICY.MAXJOBS.ExchangeHUBSystems
    31-May-2011 11:06:47 AM - granted resource V70980
    31-May-2011 11:06:47 AM - granted resource Drive000
    31-May-2011 11:06:47 AM - granted resource mediaserver02-hcart-robot-tld-0
    31-May-2011 11:06:47 AM - estimated 2569834 Kbytes needed
    31-May-2011 11:07:01 AM - started process bpbrm (21176)
    31-May-2011 11:10:06 AM - mounting V70980
    31-May-2011 11:10:37 AM - mounted; mount time: 00:00:31
    31-May-2011 11:10:37 AM - positioning V70980 to file 213
    31-May-2011 11:14:03 AM - positioned V70980; position time: 00:03:26
    31-May-2011 11:14:03 AM - connecting
    31-May-2011 11:15:11 AM - Error bpbrm(pid=22040) cannot create data socket, The operation completed successfully.  (0)  
    31-May-2011 11:16:02 AM - mounted
    31-May-2011 11:17:08 AM - positioning V70980 to file 213
    31-May-2011 11:17:08 AM - positioned V70980; position time: 00:00:00
    31-May-2011 11:19:41 AM - mounted
    31-May-2011 11:19:41 AM - positioning V70980 to file 213
    31-May-2011 11:19:41 AM - positioned V70980; position time: 00:00:00
    31-May-2011 11:20:13 AM - end writing
    cannot connect on socket(25)
     

  • 11:11:56.743 [22040.21044] <2> bpbrm main: INITIATING (VERBOSE = 0): version NetBackup 7.0 0
    11:11:56.743 [22040.21044] <2> logparams: -backup -S nbuprod -c server01 -ct 13 -ru root -cl ExchangeHUBSystems -sched full -bt 1306865207 -dt 0 -st 0 -b server01_1306865207 -mediasvr mediaserver02 -jobid 1367298 -jobgrpid 1367296 -masterversion 700000 -maxfrag 1048576 -reqid -1305862761 -mt 2 -to 0 -stunit mediaserver02-hcart-robot-tld-0 -rl 4 -rp 5356800 -cj 4 -D 6 -rt 8 -rn 0 -pool ProdData -mpx 6 -use_ofb -use_otm -WOFB_enabled 1 -WOFB_fim 1 -nbu -jm -secure 1 -brmchild 0 0 0 1 568 START BACKUP -c server01 -ct 13 -ru root -cl ExchangeHUBSystems -sched full -bt 1306865208 -dt 0 -st 0 -b server01_1306865208 -jobid 1367299 -jobgrpid 1367296 -rl 4 -rp 5356800 -use_ofb -use_otm -WOFB_enabled 1 -WOFB_fim 1 -nbu -secure 1 -kl 7 -rg other -fso -stream_count 3 -stream_number 2 -ckpt_time 900 -connect_options 16974338 31604;a2fb05ce1815da54e8697b3f796e3ecf;4;300
    11:11:56.743 [22040.21044] <2> ConnectionCache::connectAndCache: Acquiring new connection for host nbuprod, query type 223
    11:11:56.758 [22040.21044] <2> vnet_pbxConnect: pbxConnectEx Succeeded
    11:11:56.758 [22040.21044] <2> logconnections: BPDBM CONNECT FROM 10.193.0.237.2623 TO 10.193.0.247.1556 fd = 572
    11:11:56.883 [22040.21044] <2> db_CLIENTsend: reset client protocol version from 0 to 6
    11:11:57.055 [22040.21044] <2> db_end: Need to collect reply
    11:11:57.055 [22040.21044] <2> db_freeEXDB_INFO: ?
    11:12:17.087 [22040.21044] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2196: 0: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
    11:12:17.087 [22040.21044] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2210: 0: service: bpcd
    11:12:17.290 [22040.21044] <2> logconnections: BPCD CONNECT FROM 10.193.0.237.2629 TO 10.193.176.61.13724 fd = 548
    11:12:27.321 [22040.21044] <2> do_vnetd_service: ../../libvlibs/vnet_connect.c.1615: 0: connect: VNETD CONNECT FROM 10.193.0.237.2633 TO 10.193.176.61.13724 fd = 552
    11:12:27.321 [22040.21044] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.540: 0: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
    11:12:27.524 [22040.21044] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.557: 0: ipc_string: 31614
    11:12:27.946 [22040.21044] <2> local_bpcr_connect: expected reserved port, got 13724
    11:12:28.071 [22040.21044] <2> local_getHostInfo: CLIENT_CMD_SOCK from bpcr = 548
    11:12:28.290 [22040.21044] <2> bpcr_get_version_rqst: bpcd version: 06530000
    11:12:28.509 [22040.21044] <2> local_getHostInfo: hostname comparison failed
    11:12:28.509 [22040.21044] <2> local_getHostInfo: hostname comparison failed
    11:12:28.509 [22040.21044] <2> local_getHostInfo: hostname comparison failed
    11:12:28.509 [22040.21044] <2> local_getHostInfo: hostname comparison failed
    11:12:28.509 [22040.21044] <2> local_getHostInfo: hostname comparison failed
    11:12:28.509 [22040.21044] <2> local_getHostInfo: hostname comparison failed
    11:12:28.509 [22040.21044] <2> local_getHostInfo: hostname comparison failed
    11:12:28.509 [22040.21044] <2> local_getHostInfo: hostname comparison failed
    11:12:28.509 [22040.21044] <2> local_getHostInfo: hostname comparison failed
    11:12:28.509 [22040.21044] <2> local_getHostInfo: hostname comparison failed
    11:12:28.509 [22040.21044] <2> local_getHostInfo: hostname comparison failed
    11:12:28.509 [22040.21044] <2> local_getHostInfo: hostname comparison failed
    11:12:28.728 [22040.21044] <2> bpcr_get_version_rqst: bpcd version: 06530000
    11:12:28.728 [22040.21044] <2> local_getHostInfo: rqstType == BPCD_GET_NB_VERSION_RQST
    11:12:29.165 [22040.21044] <2> bpcr_get_platform_rqst: Server client platform length = 7
    11:12:29.165 [22040.21044] <2> xlatePlatformName: my version = 700000, client version = 650000, check old list first
    11:12:29.165 [22040.21044] <2> xlatePlatformName: BUILD_PLATFORM name to xlate = win_x64
    11:12:29.165 [22040.21044] <2> xlatePlatformName: win_x64 not found in the first current list, try the second list
    11:12:29.165 [22040.21044] <2> local_getHostInfo: platform of client server01 is PC-x64, WindowsXP
    11:12:29.384 [22040.21044] <2> bpcr_get_uname_rqst: bpcr.c: 3777 count=28, buffer=UNAME_SYSNAME = Windows2008
    11:12:29.384 [22040.21044] <2> bpcr_get_uname_rqst: bpcr.c: 3777 count=28, buffer=UNAME_NODENAME = server01
    11:12:29.384 [22040.21044] <2> bpcr_get_uname_rqst: bpcr.c: 3777 count=18, buffer=UNAME_RELEASE = 6
    11:12:29.384 [22040.21044] <2> bpcr_get_uname_rqst: bpcr.c: 3777 count=18, buffer=UNAME_VERSION = 0
    11:12:29.384 [22040.21044] <2> bpcr_get_uname_rqst: bpcr.c: 3777 count=52, buffer=UNAME_MACHINE = Unknown - Unknown -  Revision x170a
    11:12:29.384 [22040.21044] <2> bpcr_get_uname_rqst: bpcr.c: 3777 count=3, buffer=END
    11:12:29.384 [22040.21044] <2> local_getHostInfo: h:\701\src\nb\dblib\dbmisc.c: 1730, uname -sr=Windows2008 6
    11:12:29.493 [22040.21044] <2> bpcr_get_version_rqst: bpcd version: 06530000
    11:12:29.931 [22040.21044] <2> getPatchLevel: CLIENT_CMD_SOCK from bpcr = 548
    11:12:30.149 [22040.21044] <2> getPatchLevel: got client socket
    11:12:30.149 [22040.21044] <2> getPatchLevel: file num = 3, file size = 111
    11:12:30.149 [22040.21044] <2> bpcr_read_rqst: completed send of file number 3
    11:12:30.149 [22040.21044] <2> bpcr_read_rqst: completed send of transfer_count = 16383
    11:12:30.149 [22040.21044] <2> bpcr_read_rqst: Before readclientstatus on socket 552
    11:12:30.368 [22040.21044] <2> bpcr_read_rqst: Num bytes to be sent by client = 111
    11:12:30.368 [22040.21044] <2> getPatchLevel: total chars read = 111
    11:12:30.368 [22040.21044] <2> getPatchLevel: level = >6.5.3<
    11:12:30.712 [22040.21044] <2> bpbrm create_events: created suspend event name Global\NetBackup Suspend Event, pid: 22040
    11:12:30.712 [22040.21044] <2> bpbrm process_start_backup_msg: processing START BACKUP message.
    11:12:30.712 [22040.21044] <2> ConnectionCache::connectAndCache: Acquiring new connection for host nbuprod, query type 223
    11:12:30.712 [22040.21044] <2> vnet_pbxConnect: pbxConnectEx Succeeded
    11:12:30.712 [22040.21044] <2> logconnections: BPDBM CONNECT FROM 10.193.0.237.2637 TO 10.193.0.247.1556 fd = 528
    11:12:30.853 [22040.21044] <2> db_CLIENTsend: reset client protocol version from 0 to 6
    11:12:31.071 [22040.21044] <2> db_end: Need to collect reply
    11:12:31.071 [22040.21044] <2> db_freeEXDB_INFO: ?
    11:12:51.103 [22040.21044] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2196: 0: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
    11:12:51.103 [22040.21044] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2210: 0: service: bpcd
    11:12:51.306 [22040.21044] <2> logconnections: BPCD CONNECT FROM 10.193.0.237.2642 TO 10.193.176.61.13724 fd = 512
    11:13:01.338 [22040.21044] <2> do_vnetd_service: ../../libvlibs/vnet_connect.c.1615: 0: connect: VNETD CONNECT FROM 10.193.0.237.2645 TO 10.193.176.61.13724 fd = 572
    11:13:01.338 [22040.21044] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.540: 0: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
    11:13:01.541 [22040.21044] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.557: 0: ipc_string: 31633
    11:13:02.041 [22040.21044] <2> local_bpcr_connect: expected reserved port, got 13724
    11:13:02.088 [22040.21044] <2> bpbrm start_bpcd_stat: DATA_SOCK from bpcr = 512
    11:13:02.088 [22040.21044] <2> bpbrm start_bpcd_stat: NAME_SOCK from bpcr = 572
    11:13:02.088 [22040.21044] <2> local_getHostInfo: found TSS cached host server01 type 67
    11:13:02.088 [22040.21044] <2> bpbrm read_filelist: reading filelist.
    11:13:02.088 [22040.21044] <2> bpbrm read_filelist: filepath from bpsched: FILEPATH server01_1306865208 SET SNAP_ID=server01_1306865002
    11:13:22.135 [22040.21044] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2196: 0: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
    11:13:22.135 [22040.21044] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2210: 0: service: bpcd
    11:13:22.322 [22040.21044] <2> logconnections: BPCD CONNECT FROM 10.193.0.237.2650 TO 10.193.176.61.13724 fd = 540
    11:13:32.354 [22040.21044] <2> do_vnetd_service: ../../libvlibs/vnet_connect.c.1615: 0: connect: VNETD CONNECT FROM 10.193.0.237.2656 TO 10.193.176.61.13724 fd = 504
    11:13:32.354 [22040.21044] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.540: 0: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
    11:13:32.557 [22040.21044] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.557: 0: ipc_string: 3477
    11:13:32.916 [22040.21044] <2> local_bpcr_connect: expected reserved port, got 13724
    11:13:33.260 [22040.21044] <2> bpcr_get_version_rqst: bpcd version: 06530000
    11:13:33.260 [22040.21044] <2> bpbrm read_filelist: filepath from bpsched: FILEPATH server01_1306865208 BACKUP D:\Logs USING \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy66\Logs OPTIONS:ALT_PATH_PREFIX=C:\Program Files\Veritas\NetBackup\temp\_vrts_frzn_img_22048,FITYPE=MIRROR,MNTPOINT=D:\,FSTYPE=NTFS
    11:13:33.260 [22040.21044] <2> bpbrm read_filelist: filepath from bpsched: CONTINUE server01_1306865208
    11:13:53.292 [22040.21044] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2196: 0: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
    11:13:53.292 [22040.21044] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2210: 0: service: bpcd
    11:13:53.495 [22040.21044] <2> logconnections: BPCD CONNECT FROM 10.193.0.237.2666 TO 10.193.176.61.13724 fd = 512
    11:14:03.526 [22040.21044] <2> do_vnetd_service: ../../libvlibs/vnet_connect.c.1615: 0: connect: VNETD CONNECT FROM 10.193.0.237.2671 TO 10.193.176.61.13724 fd = 572
    11:14:03.526 [22040.21044] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.540: 0: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
    11:14:03.714 [22040.21044] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.557: 0: ipc_string: 3492
    11:14:04.089 [22040.21044] <2> local_bpcr_connect: expected reserved port, got 13724
    11:14:04.433 [22040.21044] <2> bpcr_get_version_rqst: bpcd version: 06530000
    11:14:04.433 [22040.21044] <2> bpcr_get_charset_rqst: bpcd_version 0x06530000 less than 0x07000000
    11:14:04.433 [22040.21044] <2> bpbrm send_bpsched_connecting_msg: sending bpsched msg: CONNECTING TO CLIENT FOR server01_1306865208
    11:14:24.464 [22040.21044] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2196: 0: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
    11:14:24.464 [22040.21044] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2210: 0: service: bpcd
    11:14:24.683 [22040.21044] <2> logconnections: BPCD CONNECT FROM 10.193.0.237.2683 TO 10.193.176.61.13724 fd = 540
    11:14:34.714 [22040.21044] <2> do_vnetd_service: ../../libvlibs/vnet_connect.c.1615: 0: connect: VNETD CONNECT FROM 10.193.0.237.2697 TO 10.193.176.61.13724 fd = 564
    11:14:34.714 [22040.21044] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.540: 0: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
    11:14:34.917 [22040.21044] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.557: 0: ipc_string: 3506
    11:14:35.246 [22040.21044] <2> local_bpcr_connect: expected reserved port, got 13724
    11:14:35.496 [22040.21044] <2> bpbrm start_bpcd_stat: DATA_SOCK from bpcr = 540
    11:14:35.496 [22040.21044] <2> bpbrm start_bpcd_stat: NAME_SOCK from bpcr = 564
    11:14:35.496 [22040.21044] <2> bpbrm handle_backup: calling bpcr_get_socket_rqst3.
    11:15:12.731 [22040.21044] <2> bpcr_put_vnetd_forward_socket: bpcd returned failure: 21
    11:15:12.731 [22040.21044] <2> bpcr_ipc_vnetd_socket_rqst: bpcr_put_vnetd_forward_socket failed: 21
    11:15:12.731 [22040.21044] <16> bpbrm handle_backup: cannot create data socket, The operation completed successfully.  (0)
    11:15:12.746 [22040.21044] <2> vnet_pbxConnect: pbxConnectEx Succeeded
    11:15:12.746 [22040.21044] <2> job_connect: SO_KEEPALIVE set on socket 488 for client nbuprod
    11:15:12.746 [22040.21044] <2> logconnections: BPJOBD CONNECT FROM 10.193.0.237.2711 TO 10.193.0.247.1556 fd = 488
    11:15:12.746 [22040.21044] <2> job_authenticate_connection: ignoring VxSS authentication check for now...
    11:15:12.746 [22040.21044] <2> job_connect: Connected to the host nbuprod contype 10 jobid <1367299> socket <488>
    11:15:12.746 [22040.21044] <2> job_connect: Connected on port 2711
    11:15:12.949 [22040.21044] <2> job_monitoring_exex: ACK disconnect
    11:15:12.949 [22040.21044] <2> job_disconnect: Disconnected
    11:15:12.949 [22040.21044] <2> ConnectionCache::connectAndCache: Acquiring new connection for host nbuprod, query type 1
    11:15:12.949 [22040.21044] <2> vnet_pbxConnect: pbxConnectEx Succeeded
    11:15:12.949 [22040.21044] <2> logconnections: BPDBM CONNECT FROM 10.193.0.237.2712 TO 10.193.0.247.1556 fd = 572
    11:15:13.090 [22040.21044] <2> db_end: Need to collect reply
     

  • Enable bpcd log in the client and run the command and post the output

    bptestbpcd -client <client_name> -verbose -debug

  • "Adding the required network interface"

    Does this mean that you have a separate backup network?

    Did you add separate hostnames for multiple NICs in hosts file on client itself?

    How does media server resolve client hostname and IP for backup NIC? Hosts entries? DNS? Reverse lookup enabled on DNS?

    Please create bpcd log directory on client, then post output of the following commands:

    On client:
    bpclntcmd -self
    bpclntcmd -hn <media-server-hostname>
    bpclntcmd -ip <media-server-ip>

    On media server:
    bpclntcmd -hn <client-hostname>  (use hostname and IP associated with backup NIC)
    bpclntcmd -ip <client-ip>
    bptestbpcd -client <client-hostname> -verbose
     

    Please post client's bpcd log file as attachment.

  • Sorry what I meant was I had tried setting the REQUIRED_INTERFACE registry entry (via the GUI option under host properties -> Universal Settings -> Use Specified Network Interface). Just mentioning it as another step I tried. There isn't a seperate backup network.

    DNS for resolution of hostnames.

    From Client:

    C:\>bpclntcmd -self
    gethostname() returned: server01
    host server01: server01.domain.com at 10.193.176.51 (0x33b0c10a)
    host server01: server01.domain.com at 10.193.176.31 (0x1fb0c10a)
    host server01: server01.domain.com at 10.193.168.20 (0x14a8c10a)
    host server01: server01.domain.com at 10.193.176.61 (0x3db0c10a)
    host server01: server01.domain.com at 10.193.168.30 (0x1ea8c10a)
    aliases:

    C:\>bpclntcmd -hn mediaserver02
    host mediaserver02: mediaserver02.domain.com at 10.193.0.237 (0xed00c10a)
    aliases:

    C:\>bpclntcmd -ip 10.193.0.237
    checkhaddr: host   : mediaserver02: mediaserver02.domain.com at 10.193.0.237 (0xed00c10a)

    checkhaddr: aliases:

     

    From Media Server:

    C:\>bpclntcmd -hn server01
    host server01: server01.domain.com at 10.193.176.61
    host server01: server01.domain.com at 10.193.176.51
    aliases:     server01.domain.com     server01     10.193.176.61     10.193.176.51
    51

    C:\>bpclntcmd -ip 10.193.176.61
    host 10.193.176.61: server01.domain.com at 10.193.176.61
    host 10.193.176.61: server01.domain.com at 10.193.176.51
    aliases:     server01.domain.com     10.193.176.61     10.193.176.51

    C:\>bptestbpcd -client server01 -verbose
    1 1 1
    10.193.0.237:2011 -> 10.193.176.61:13724
    10.193.0.237:2014 -> 10.193.176.61:13724
    PEER_NAME = mediaserver02.domain.com
    HOST_NAME = server01
    CLIENT_NAME = server01
    VERSION = 0x06530000
    PLATFORM = win_x64
    PATCH_VERSION = 6.5.3.0
    SERVER_PATCH_VERSION = 6.5.3.0
    MASTER_SERVER = nbuprod
    EMM_SERVER = nbuprod
    10.193.0.237:2018 -> 10.193.176.61:13724

  • So I ran one of the backups with the BPCD log on, and found the following (which wasn't in the bptestbpcd results one):

    <16> bpcd main: get_vnetd_forward_socket failed: 21

    Searching I found the following article:

    http://www.symantec.com/business/support/index?page=content&id=TECH156471&key=15143&actp=LIST

    Which advised the following:

      Host Properties > Master server > server > Client Attributes > client > Connect Options

     

      BPCD connect back = Random port

      Ports = Non-reserved ports

      Daemon connection port = Daemon port only

     

    Tested that, it worked.

    To test additionally I changed the setup to the following: 

      BPCD connect back = Default

      Ports = Default

      Daemon connection port = Daemon port only

    Which also worked.

    As soon as I switch Daemon connection port back to default we go back to the job failing.

    So going by that article I'll need to identify the non-NetBackup process that is randomly connecting, as leaving the setting as Daemon port only is not a desired setup for us.

    Looking further however I can't find anything connecting on the ports being requested in the vnetd log, so again I'm left with a puzzle of why this is failing on the default setting, especially considering 2 of the 3 affected clients were backing up without issue until last week, and no changes were applied to them in the time frame.

    Thank you both for the log commands, helped point me in the right direction, and find a work around until I get a solution in place.