
Netbackup AIR over FC consuming a lot of network bandwidth

fira_gojira
Level 3
Hello,

NBU version 8.2
Master server Linux
Appliance 5240 version 3.2

We are facing an issue with replication jobs. We followed all the instructions in the guidelines for setting up AIR over Fibre Channel. The replication jobs work well and the speeds are quite high, so we assumed that replication over FC was working as intended.

However, about a week ago the network team found that the NetBackup appliances were consuming a lot of bandwidth during replication (we are not sure when this started), so they put a cap on the bandwidth available for NetBackup replication.

The network report shows that the traffic going over the LAN uses port 10082, which is the spoold port.
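One way to see which local and remote addresses carry the port 10082 traffic is to read the kernel's TCP table on the source host. The following is a minimal sketch, assuming a Linux host with Python 3 and read access to /proc/net/tcp (the appliance's restricted shell may not allow this). The function names and the sample table are hypothetical, made up for illustration.

```python
import socket
import struct

SPOOLD_PORT = 10082  # port named in the network team's report


def parse_endpoint(hex_endpoint):
    """Convert an 'ADDR:PORT' field from /proc/net/tcp into (ip, port)."""
    hex_addr, hex_port = hex_endpoint.split(":")
    # On little-endian hosts (e.g. x86) the IPv4 address is little-endian hex.
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_addr, 16)))
    return ip, int(hex_port, 16)


def spoold_connections(proc_net_tcp_text, port=SPOOLD_PORT):
    """Return (local, remote) pairs for ESTABLISHED connections on the port."""
    found = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4 or fields[3] != "01":  # state 01 = ESTABLISHED
            continue
        local = parse_endpoint(fields[1])
        remote = parse_endpoint(fields[2])
        if port in (local[1], remote[1]):
            found.append((local, remote))
    return found


# Hypothetical sample in /proc/net/tcp layout, not taken from a real appliance.
SAMPLE = """  sl  local_address rem_address   st tx_queue rx_queue
   0: 00000000:2762 00000000:0000 0A 00000000:00000000
   1: 0A00000A:C350 1400000A:2762 01 00000000:00000000
"""

for local, remote in spoold_connections(SAMPLE):
    print(f"{local[0]}:{local[1]} -> {remote[0]}:{remote[1]}")
```

On a real host, pass the contents of /proc/net/tcp instead of SAMPLE. The local address of each match shows which interface the spoold connection is bound to.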

Logging a case with Veritas support did not help. They said that spoold traffic and data transfer should go only one way (FC or LAN), but that is clearly not the case here.

Now many replication jobs are running with no progress (0 KB), especially the large ones. Even small images replicate very slowly, so the network cap is obviously taking effect.

I would appreciate it if someone here could help me with this.

Thanks in advance
12 REPLIES

Krutons
Moderator

Are you replicating from a NetBackup appliance to another NetBackup appliance?

Yes, both source and target are the same appliance model and run the same NetBackup and appliance versions.

Hi
How did you set up the FC connections between the 5240 appliances?
Is this a new configuration, or an existing one that has been working for you for some time?

Hi Pats, this setup has been live for almost a year with no problems. The FC is connected through an IBM FCIP router with zoning configured.

We can see transactions/traffic on the FCIP router ports connected to the appliances, even now. But a large amount of traffic has also been going over the LAN since who knows when, and capping it is clearly impacting the replication jobs.

Could you post the output of this command here on the community?
Main > Settings > FibreTransport Deduplication Show

Get this output from both the source and target appliances.

Hi pats,

Source:

.Settings> FibreTransport Deduplication Show
[Info] Fibre Transport Deduplication is disabled.

 

Target:

.Settings> FibreTransport Deduplication Show
[Info] Fibre Transport Deduplication is enabled.

I've run the command on other appliances and the results are all like this. Is it normal for only the receiving end to be enabled?

This setting is correct if it's a one-way replication (source -> target).

Check the output of the command "Manage > FibreChannel > Show" to verify that the Fibre Transport Deduplication feature is configured and working. You can verify the following from the output:

The qla2x00tgt, scst and scst_user drivers are loaded.

The Configuration Type column shows Target(MSDP) for the ports that you have configured as target ports.

The Physical State column shows the Target mode for the ports with the configuration type Target(MSDP).

Under the Status column, the target mode ports should have a status of Online.

Hi pats, thanks for the detailed explanation. I have checked the properties you described and all the configuration seems to be correct.

I guess the ultimate question now is: what does spoold actually do? Is it normal in a setup like this for spoold traffic to go over a different network path? Does it actually carry replication data over the LAN as well?

Hi,
According to you, the performance impact was observed only after the cap was imposed on the network.
My question is: did you ever verify that the replication was actually going over FT?

To investigate this we need to look in some log files.
Check the bptm, bpbrm and nbftsrvr logs from the source media appliance.


Hi, which lines should we look for in the logs?

As I mentioned before, we can see traffic/transactions on the relevant ports in the FCIP router management tool. In the Activity Monitor, we can also see these lines in the detailed status:

Apr 5, 2021 12:25:48 AM - Info sourceA (pid=164219) StorageServer=PureDisk:sourceA; Report=PDDO Stats for (sourceA): scanned: 16425315 KB, CR sent: 70672 KB, CR sent over FC: 59586 KB, dedup: 99.6%, cache disabled, where dedup space saving:98.0%, compression space saving:1.5%
Apr 5, 2021 12:25:48 AM - Replicated backup id client_1617386992 successfully
Apr 5, 2021 12:25:48 AM - Info bpdm (pid=164219) EXITING with status 0

According to Veritas support, the "CR sent over FC" value is the indicator that the data is going through FC.

Hi @fira_gojira 

IIRC, only the data is transferred over FC. The catalog metadata is still sent via the network. As per the log entry, you are sending 70 MB via the network, which should be the metadata that allows the import to occur, and 60 MB via FC, which would be the backup image data (optimized duplication from the 16.4 GB source).
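The figures in that log entry can be pulled out with a short script. This is only a sketch: the regular expression is written against the single line quoted in this thread, and the function name is hypothetical. How "CR sent" splits between LAN and FC is not settled here, so the script only reports the raw values and two ratios.

```python
import re

# The PDDO stats line exactly as quoted in the thread.
LOG_LINE = (
    "Apr 5, 2021 12:25:48 AM - Info sourceA (pid=164219) "
    "StorageServer=PureDisk:sourceA; Report=PDDO Stats for (sourceA): "
    "scanned: 16425315 KB, CR sent: 70672 KB, CR sent over FC: 59586 KB, "
    "dedup: 99.6%, cache disabled, where dedup space saving:98.0%, "
    "compression space saving:1.5%"
)

PATTERN = re.compile(
    r"scanned:\s*(?P<scanned>\d+) KB, "
    r"CR sent:\s*(?P<cr_sent>\d+) KB, "
    r"CR sent over FC:\s*(?P<cr_fc>\d+) KB, "
    r"dedup:\s*(?P<dedup>[\d.]+)%"
)


def parse_pddo_stats(line):
    """Return the KB counters and dedup percentage, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    stats = {key: int(match.group(key)) for key in ("scanned", "cr_sent", "cr_fc")}
    stats["dedup_pct"] = float(match.group("dedup"))
    return stats


stats = parse_pddo_stats(LOG_LINE)
# Share of the scanned data that was actually sent after deduplication.
sent_pct = 100 * stats["cr_sent"] / stats["scanned"]
# Share of "CR sent" that the log attributes to FC.
fc_share_pct = 100 * stats["cr_fc"] / stats["cr_sent"]

print(f"scanned:          {stats['scanned']} KB")
print(f"CR sent:          {stats['cr_sent']} KB ({sent_pct:.2f}% of scanned)")
print(f"CR sent over FC:  {stats['cr_fc']} KB ({fc_share_pct:.1f}% of CR sent)")
print(f"reported dedup:   {stats['dedup_pct']}%")
```

For this line, 100 minus the "CR sent" percentage rounds to the reported dedup figure, which suggests the dedup value is derived from CR sent relative to scanned.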

Now this shouldn't really affect your network bandwidth, as the backup metadata is typically small compared to the data (the only time this may differ is for a file server with a large number of small files, but I would have thought that an exception).

Does the network capping affect the overall transfer rate?

David

Hi David,

Thanks for the explanation. That actually makes sense. To answer your question, the KB/s rate in the detailed status looks much lower, but the replication itself is not that slow. The slowest part is waiting for Current KB Written to move off 0 KB. We had one 3 TB image replication job that stayed at 0 KB for 155 hours, and once Current KB Written started going up, the job finished in less than 2 hours. The KB/s in the detailed status showed only 6478. This happened after the network capping took effect.
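A rough back-of-the-envelope check of those numbers, as a sketch only. It assumes decimal units (1 TB = 10^9 KB), treats "3 TB" as exact, and uses 2 hours as an upper bound for the active transfer. The variable and function names are made up for illustration.

```python
KB_PER_TB = 10**9  # assumption: decimal units, 1 TB = 10^9 KB

image_kb = 3 * KB_PER_TB     # "3 TB image", treated as exact
stall_hours = 155            # time spent at 0 KB written
transfer_hours = 2           # upper bound: "less than 2 hours"
reported_kb_per_s = 6478     # rate shown in the detailed status


def rate_kb_per_s(size_kb, hours):
    """Average throughput in KB/s for a transfer of size_kb over hours."""
    return size_kb / (hours * 3600)


# Throughput while data was actually moving.
active_rate = rate_kb_per_s(image_kb, transfer_hours)
# Throughput averaged over the whole job, stall included.
whole_job_rate = rate_kb_per_s(image_kb, stall_hours + transfer_hours)

print(f"active transfer rate: {active_rate:,.0f} KB/s")
print(f"whole-job average:    {whole_job_rate:,.0f} KB/s")
print(f"reported rate:        {reported_kb_per_s:,} KB/s")
```

Under these assumptions the reported 6478 KB/s is of the same order as the whole-job average and far below the active transfer rate. That would be consistent with the reported figure being averaged over the full elapsed time including the stall, though the thread does not confirm how the Activity Monitor computes it.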