bpduplicate memory errors on Media Server - advice appreciated!

Scott_Murray
Level 3
Hi all,
    In a NetBackup 5.1 (MP5) environment (all Win2003 Server), we have a number of Media Servers recently exhibiting a strange problem with scheduled DSSU de-staging (duplication) to locally-attached tape.
    When we set the DSSU "Manual Relocation" running (either manually or via a schedule), the parent process for the DSSU duplication begins quite happily, but a child process (duplication) never starts.
    If we look at the Media Server involved, it creates the correct "bidfile", but eventually generates a Windows error on the Media Server - such as (Application popup: bpduplicate.exe - Application Error : The instruction at "0x0040e260" referenced memory at "0x00000000". The memory could not be "read").
    We have investigated log files on Media and Master Server, gone through all Master Server Catalog images relating to that Media Server's clients. We have deleted and reloaded NetBackup on the Media Server, we have tried MP3, 4, 5 and 6 on the Media Server, and even changed out the physical RAM on the Media Server.
    The strange thing is that if we run the Duplication from the Master Server (via Catalog / Duplication of individual images), all works just fine. Similarly, if we take the Media Server's generated "bidfile" and run the manual "bpduplicate" command from the Master Server with this correct "bidfile", that all works just great as well!
    This seems to point to a problem on the Media Server side of things, although the de-staging/duplication had been running happily on these Media Servers for about 18 months prior, and most of the other Media Servers on this same Master Server still run perfectly!
    Has anyone seen such an issue, or have any suggestions on what additional log entries etc. we may need to investigate?
Thanks.
 
5 REPLIES

Scott_Murray
Level 3
Just some additional information in this regard.
We noted that the command "bpversion" runs on the Media Server when the DSSU de-staging is invoked.
This "bpversion" command never seems to run to completion, and may be what is holding up the whole process.
 
If we run the "bpversion" exe file from a command line on the Media Server, it sits there forever, and occasionally throws up a message:
 
cannot connect to bpcd on MASTER_SERVER_NAME: cannot connect on socket (25)
cannot connect to bpcd on MASTER_SERVER_NAME: cannot connect on socket (25)
 
This seems to indicate network issues. However, DNS forward and reverse lookups are OK, backups and restores run just fine, and "bpclntcmd" indicates network connectivity is fine.
 
Just wondering if anyone knows what this "bpversion" command might be trying to do?
 
Thanks.
 

sdo
Moderator
Did you check for duplicate DNS names and reverse IPs, i.e. do nslookup <name>, and nslookup <IP> twice, and see if you get different IPs and/or different names back?
Is the master server name the first name in the list of SERVER entries?  Try this:  "bpgetconfig -M <media-name> SERVER".
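
If it helps to script that lookup check, here is a rough Python sketch of the round trip: resolve the name to its IPs, resolve each IP back, and flag anything that doesn't come back as the same host. The function name `check_dns_round_trip` and its `forward`/`reverse` parameters are made up for this example (they are not NetBackup tools), and comparing only the short hostname, case-insensitively, is an assumption about how strict you want the match to be.

```python
import socket


def check_dns_round_trip(name, forward=socket.gethostbyname_ex,
                         reverse=socket.gethostbyaddr):
    """Resolve name -> IPs, then each IP -> names, and report mismatches.

    forward/reverse default to the standard resolver calls; they are
    parameters only so the check can be exercised without a live DNS.
    """
    _, _, addresses = forward(name)
    short = name.split(".")[0].lower()
    problems = []
    for ip in addresses:
        try:
            primary, aliases, _ = reverse(ip)
        except OSError as exc:  # socket.herror / socket.gaierror
            problems.append(f"{ip}: reverse lookup failed ({exc})")
            continue
        # Assumption: compare short hostnames only, ignoring case.
        names = {n.split(".")[0].lower() for n in [primary, *aliases]}
        if short not in names:
            problems.append(
                f"{ip}: reverse lookup returned {primary}, expected {name}"
            )
    return addresses, problems
```

Calling `check_dns_round_trip("MASTER_SERVER_NAME")` on the media server, and the equivalent with the media server's name on the master, should return an empty `problems` list if the names and IPs agree in both directions.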

Scott_Murray
Level 3
Hello and thanks for your reply.
 
Yes, we have checked all DNS - forward and reverse resolution. Backup and Restore run successfully (and bpclntcmd checks out all OK).
We have also checked the SERVER entries in the configuration for these Media Servers and all is OK as well.
We're investigating whether some other IP ports may need to be opened up for this process as well.
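
For the port side of things, a quick way to test whether a given TCP port is reachable from the Media Server is a plain connect attempt with a timeout. This is only a minimal sketch: `can_connect` is a name invented here, and it proves nothing more than that a TCP handshake completes, so it will not catch problems in the other direction (Master to Media Server) unless it is also run from the Master.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name not resolved
        return False
```

For example, `can_connect("MASTER_SERVER_NAME", 13782)` checks the bpcd port (13782/tcp). Running the same test from the Master Server back towards the Media Server would help show whether traffic is being filtered in only one direction.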
Thanks.
 

Stumpr2
Level 6
Here is a reference you may or may not want to review for ports
VERITAS NetBackup (tm) 6.0 Port Usage Guide for Windows and UNIX Platforms
 
 

Scott_Murray
Level 3
Thanks "Stumpr - guru" for that.
We're still chasing network/comms configs to see if there are certain ports being blocked as well.
Looking at the "admin" log file on the Media Server, when trying to run the "bpversion" command from a command line, we see the errors:
 
nb_getsockconnected: Connect to MASTER_SERVER_NAME on port 606
logconnections: bpcd CONNECT FROM MEDIA_SERVER_NAME .606 TO MASTER_SERVER_NAME .13782
logconnections: BPCD CONNECT FROM MEDIA_SERVER_NAME .606 TO MASTER_SERVER_NAME .13782
bpcr_connect: bpcr_connect timeout during select after 60 seconds on port 939
bpcd_version_notify: cannot connect to bpcd on MASTER_SERVER_NAME :  cannot connect on socket (25)
bpversion: bpversion.c.171: function bpcd_version_notify failed: 58
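
To keep track of which ports are involved across several of these log extracts, something like the following Python sketch can pull the endpoints out of the lines above. The regular expressions are written against the exact wording shown in this post, so treat them as an assumption rather than a description of the admin log format in general; `summarize_admin_log` is a hypothetical helper name.

```python
import re

# Patterns based only on the log lines quoted above (assumed format).
CONNECT_RE = re.compile(
    r"logconnections: (\w+) CONNECT FROM (\S+)\s*\.(\d+) TO (\S+)\s*\.(\d+)"
)
TIMEOUT_RE = re.compile(
    r"timeout during select after (\d+) seconds on port (\d+)"
)


def summarize_admin_log(lines):
    """Collect connection endpoints and timed-out ports from log lines."""
    connections = []  # (service, source host, source port, dest host, dest port)
    timeouts = []     # (seconds waited, port)
    for line in lines:
        m = CONNECT_RE.search(line)
        if m:
            svc, src, sport, dst, dport = m.groups()
            connections.append((svc, src, int(sport), dst, int(dport)))
            continue
        m = TIMEOUT_RE.search(line)
        if m:
            timeouts.append((int(m.group(1)), int(m.group(2))))
    return {"connections": connections, "timeouts": timeouts}
```

On the extract above this reports the outbound connection from port 606 to port 13782 and a 60-second timeout on port 939. Port 939 is neither of the two ports in the CONNECT lines, so it may be worth including in the firewall review.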
 
Do you know what this "bpcr_connect" is doing?
 
Thanks.