This problem happened again in our system (I reported a similar problem at https://www-secure.symantec.com/connect/forums/netbackup-client-failing-daily-incremental-ok-weekly-fulls, but there is no solution yet). We are using a NetBackup 22.214.171.124 master server running on RHEL 6.1; the media servers that have the problem are running HP-UX and Windows 2008.
This is what was happening:
My questions are:
Are you using dedupe disk pools like PureDisk or MSDP?
Were the NetBackup servers affected by the power outage?
Was there any network hardware affected by the power outage?
Hi INT_RND and mpsh999,
Thank you for your responses.
We use Data Domain dedup, not Symantec, but the affected schedule/pools are using tape, not the dedup. Our dedup is only used for VM backups, and those are fine now.
Most of our NetBackup Servers (includes the master) are affected by the power outage, but I think they are fine now. I think the problem mostly affected the clients (we have so many clients, and some of them are not really important production servers, so nobody checks whether they are OK after an outage).
Yes, network hardware was also affected, but it should be fine now.
I think the main cause of the problem is a combination of problematic clients and a wrong setting in our NetBackup system. (As I mentioned earlier, I have posted about the same thing before and still could not find the real cause. Last time my solution was just to find the problematic client and stop the backups to that client. This time there are just too many.)
Like mph99 said, I also think some timeout setting is not set correctly. Any idea which one?
I'll take a guess, since you said "...the affected schedule/pools are using tape..." and "Most of our NetBackup Servers (includes the master) are affected by the power outage..."
Assuming you are using a tape library, do you know if you are using the SCSI "persistent binding" settings with your media servers' HBAs connected to your tape drives? If not, this *could* cause the behavior you see, where a backup starts but no data ever gets transferred. Persistent binding is what tells your operating system which logical path to use when re-establishing the connection to the HBA/tape drives after the server has rebooted. This is not NetBackup; this is the OS communicating with the tape drives through the HBA. Remember that NetBackup is not the one connected to the tape drives; it's the OS, and NetBackup communicates with the OS.
Without persistent binding, the drive that was previously using one path is no longer there; it is now waiting for communication on another path. Since this happens between the OS and the tape drive, NOT NetBackup, NetBackup is still trying to send the data to a path that is no longer connected.
To fix this, use the normal set of utilities (scan, etc.) to make sure the OS still sees the tape drives. Then, delete the tape drives and robot from the NetBackup GUI and allow it to re-discover everything. If everything is connected and communicating correctly between the OS and the HBA and tape drives, NetBackup will rebuild the tape drive definitions and everything will work again.
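To make the path problem concrete, here is a minimal Python sketch of the check I am describing. It only compares two mappings of drive serial number to device path: the one NetBackup was configured with, and the one the OS reports after the reboot. The serial numbers, the device paths and the function name are all made up for illustration; you would have to fill in the two mappings yourself from your NetBackup device configuration and your OS device listing.

```python
def find_moved_drives(configured, discovered):
    """Return drives whose device path changed or disappeared.

    Both arguments map a drive serial number to a device path.
    The result maps serial -> (configured_path, discovered_path),
    where discovered_path is None if the OS no longer sees the drive.
    """
    moved = {}
    for serial, old_path in configured.items():
        new_path = discovered.get(serial)  # None when the drive is missing
        if new_path != old_path:
            moved[serial] = (old_path, new_path)
    return moved


# Hypothetical example data, not taken from any real system.
configured = {
    "HU1234": "/dev/rmt/0mnb",
    "HU5678": "/dev/rmt/1mnb",
    "HU9999": "/dev/rmt/2mnb",
}
discovered = {
    "HU1234": "/dev/rmt/1mnb",  # swapped with HU5678 after the reboot
    "HU5678": "/dev/rmt/0mnb",
}

moved = find_moved_drives(configured, discovered)
for serial, (old, new) in sorted(moved.items()):
    print(f"{serial}: configured {old}, now {new or 'not visible to the OS'}")
```

If this kind of comparison shows any drive on a different path than the one NetBackup has configured, that would be consistent with the missing persistent binding I described, and deleting and re-discovering the devices is the next step.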
I am still not sure about the hardware; I still believe some setting in my NetBackup could be incorrect. Now that we can identify the problematic clients (the clients also included some virtual IPs from the package/service of a clustered client), the problem goes away once we either stop backing up those clients or fix them.
Actually, this does not really solve the problem: problematic clients will show up again someday, so this could happen again.