03-12-2013 08:34 AM
Hello, I am facing a network issue and I suspect it is related to NetBackup. In some cases, when backups/restores are launched, we get a lot of packet drops on our network. I ran statistics on the NetBackup master server and found something unusual compared to other servers: the loopback interface lo has very high traffic. When I sniffed the packets, I saw a lot of packets related to pbx. Does anyone have any idea what is going on here?
High traffic on the loopback interface:
11:32:01 AM IFACE rxpck/s txpck/s rxbyt/s txbyt/s rxcmp/s txcmp/s rxmcst/s
11:32:02 AM lo 416.16 416.16 378125.25 378125.25 0.00 0.00 0.00
11:32:02 AM eth0 74.75 36.36 7197.98 3993.94 0.00 0.00 13.13
11:32:02 AM eth1 0.00 0.00 0.00 0.00 0.00 0.00 0.00
11:32:02 AM eth2 0.00 0.00 0.00 0.00 0.00 0.00 0.00
11:32:02 AM eth3 40.40 55.56 3507.07 23105.05 0.00 0.00 52.53
11:32:02 AM sit0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
11:32:02 AM bond0 115.15 91.92 10705.05 27098.99 0.00 0.00 65.66
11:32:02 AM bond1 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Packets are related to pbx:
11:31:42.403007 IP sesegx10.sagir.qc.veritas_pbx > sesegx10.sagir.qc.62596: P 365:389(24) ack 1 win 256 <nop,nop,timestamp 501947160 501946359>
11:31:42.403028 IP sesegx10.sagir.qc.62596 > sesegx10.sagir.qc.veritas_pbx: . ack 389 win 385 <nop,nop,timestamp 501947160 501947160>
11:31:42.403041 IP sesegx10.sagir.qc.veritas_pbx > sesegx10.sagir.qc.62596: P 389:413(24) ack 1 win 256 <nop,nop,timestamp 501947160 501947160>
11:31:42.403054 IP sesegx10.sagir.qc.62596 > sesegx10.sagir.qc.veritas_pbx: . ack 413 win 385 <nop,nop,timestamp 501947160 501947160>
11:31:42.904023 IP sesegx10.sagir.qc.52613 > sesegx10.sagir.qc.veritas_pbx: P 732956709:732956713(4) ack 733193368 win 257 <nop,nop,timestamp 501947661 501927655>
11:31:42.904033 IP sesegx10.sagir.qc.veritas_pbx > sesegx10.sagir.qc.52613: . ack 4 win 265 <nop,nop,timestamp 501947661 501947661>
11:31:42.904067 IP sesegx10.sagir.qc.52613 > sesegx10.sagir.qc.veritas_pbx: P 4:8(4) ack 1 win 257 <nop,nop,timestamp 501947661 501947661>
11:31:42.904072 IP sesegx10.sagir.qc.veritas_pbx > sesegx10.sagir.qc.52613: . ack 8 win 265 <nop,nop,timestamp 501947661 501947661>
11:31:42.904080 IP sesegx10.sagir.qc.52613 > sesegx10.sagir.qc.veritas_pbx: P 8:12(4) ack 1 win 257 <nop,nop,timestamp 501947661 501947661>
11:31:42.904084 IP sesegx10.sagir.qc.veritas_pbx > sesegx10.sagir.qc.52613: . ack 12 win 265 <nop,nop,timestamp 501947661 501947661>
11:31:42.904092 IP sesegx10.sagir.qc.52613 > sesegx10.sagir.qc.veritas_pbx: P 12:24(12) ack 1 win 257 <nop,nop,timestamp 501947661 501947661>
11:31:42.904095 IP sesegx10.sagir.qc.veritas_pbx > sesegx10.sagir.qc.52613: . ack 24 win 265 <nop,nop,timestamp 501947661 501947661>
11:31:42.923100 IP sesegx10.sagir.qc.veritas_p
03-12-2013 02:25 PM
Network packet drops can be related to bad fiber optics. If only one of the two fibers is damaged, backups may work OK, but when a restore starts, the limited throughput (because of retransmissions) impacts the entire operation.
03-13-2013 02:07 AM
It's only about 378 kB/s. Not so high, I feel.
Please check whether drops occur when you transfer a large file with other applications such as ftp, scp, or rsync.
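For reference, the 378 kB/s figure can be reproduced from the lo row of the sar sample in the first post. This is only a small arithmetic sketch; the function name `throughput` is made up for illustration, and decimal units (1 kB = 1000 bytes) are assumed.

```python
def throughput(bytes_per_sec):
    """Convert a bytes/s rate to (kB/s, Mbit/s) using decimal units."""
    kb_per_sec = bytes_per_sec / 1000
    mbit_per_sec = bytes_per_sec * 8 / 1_000_000
    return kb_per_sec, mbit_per_sec

# Values copied from the "lo" row of the sar output in the first post.
lo_rx_bytes = 378125.25   # rxbyt/s
lo_rx_packets = 416.16    # rxpck/s

kb, mbit = throughput(lo_rx_bytes)
avg_packet = lo_rx_bytes / lo_rx_packets  # average bytes per packet

print(f"{kb:.1f} kB/s, {mbit:.2f} Mbit/s, ~{avg_packet:.0f} bytes per packet")
```

At roughly 3 Mbit/s, and on an interface that never leaves the host, this traffic by itself seems unlikely to explain drops on the physical network.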
03-18-2013 11:39 AM
We noticed that during a restore/backup from the master to a specific client, packets related to the vnetd port are transmitted to ALL the clients in the same subnet. We suspect that this is flooding the network and causing packet drops. Could this be due to a NetBackup master/client setting?
Example tcpdump from a client with NO backup/restore in progress:
14:36:58.235193 IP (tos 0x0, ttl 64, id 4193, offset 0, flags [DF], proto 6, length: 1500) MASTER.SERVER.22263 > NBU.CLIENT.vnetd: . 38806400:38807848(1448) ack 1 win 46 <nop,nop,timestamp 1031455312 104386367>
14:36:58.235244 IP (tos 0x0, ttl 64, id 4194, offset 0, flags [DF], proto 6, length: 5844) MASTER.SERVER.22263 > NBU.CLIENT.vnetd: . 38807848:38813640(5792) ack 1 win 46 <nop,nop,timestamp 1031455312 104386367>
14:36:58.235289 IP (tos 0x0, ttl 64, id 4198, offset 0, flags [DF], proto 6, length: 4396) MASTER.SERVER.22263 > NBU.CLIENT.vnetd: . 38813640:38817984(4344) ack 1 win 46 <nop,nop,timestamp 1031455312 104386367>
14:36:58.235292 IP (tos 0x0, ttl 64, id 4201, offset 0, flags [DF], proto 6, length: 1500) MASTER.SERVER.22263 > NBU.CLIENT.vnetd: . 38817984:38819432(1448) ack 1 win 46 <nop,nop,timestamp 1031455312 104386367>
14:36:58.235337 IP (tos 0x0, ttl 64, id 4202, offset 0, flags [DF], proto 6, length: 4396) MASTER.SERVER.22263 > NBU.CLIENT.vnetd: . 38819432:38823776(4344) ack 1 win 46 <nop,nop,timestamp 1031455312 104386367>
14:36:58.235386 IP (tos 0x0, ttl 64, id 4205, offset 0, flags [DF], proto 6, length: 5844) MASTER.SERVER.22263 > NBU.CLIENT.vnetd: . 38823776:38829568(5792) ack 1 win 46 <nop,nop,timestamp 1031455312 104386367>
14:36:58.235433 IP (tos 0x0, ttl 64, id 4209, offset 0, flags [DF], proto 6, length: 2948) MASTER.SERVER.22263 > NBU.CLIENT.vnetd: . 38829568:38832464(2896) ack 1 win 46 <nop,nop,timestamp 1031455312 104386367>
14:36:58.235689 IP (tos 0x0, ttl 64, id 4211, offset 0, flags [DF], proto 6, length: 1500) MASTER.SERVER.22263 > NBU.CLIENT.vnetd: . 38832464:38833912(1448) ack 1 win 46 <nop,nop,timestamp 1031455313 104386367>
14:36:58.235739 IP (tos 0x0, ttl 64, id 4212, offset 0, flags [DF], proto 6, length: 5844) MASTER.SERVER.22263 > NBU.CLIENT.vnetd: . 38833912:38839704(5792) ack 1 win 46 <nop,nop,timestamp 1031455313 104386367>
14:36:58.235784 IP (tos 0x0, ttl 64, id 4216, offset 0, flags [DF], proto 6, length: 2948) MASTER.SERVER.22263 > NBU.CLIENT.vnetd: . 38839704:38842600(2896) ack 1 win 46 <nop,nop,timestamp 1031455313 104386367>
9555 packets captured
9619 packets received by filter
54 packets dropped by kernel
real 0m0.720s
user 0m0.109s
sys 0m0.138s
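To see which conversations dominate a capture like the one above, the tcpdump text can be summarized per source/destination pair. The following is a rough sketch, not a NetBackup tool: the regex and the `summarize` helper are assumptions written for the two tcpdump output styles shown in this thread, and pure ACK lines (no sequence range) are simply skipped. The two sample lines are copied from the capture above.

```python
import re
from collections import Counter

# Matches "SRC > DST: FLAGS first:last(payload)" in both tcpdump styles in this thread.
FLOW = re.compile(r"(\S+) > (\S+): \S+ \d+:\d+\((\d+)\)")

def summarize(lines):
    """Count data packets and payload bytes per (source, destination) pair."""
    packets = Counter()
    payload = Counter()
    for line in lines:
        match = FLOW.search(line)
        if match is None:
            continue  # pure ACKs and truncated lines carry no sequence range
        src, dst, size = match.group(1), match.group(2), int(match.group(3))
        packets[(src, dst)] += 1
        payload[(src, dst)] += size
    return packets, payload

sample = [
    "14:36:58.235193 IP (tos 0x0, ttl 64, id 4193, offset 0, flags [DF], proto 6, "
    "length: 1500) MASTER.SERVER.22263 > NBU.CLIENT.vnetd: . 38806400:38807848(1448) "
    "ack 1 win 46 <nop,nop,timestamp 1031455312 104386367>",
    "14:36:58.235244 IP (tos 0x0, ttl 64, id 4194, offset 0, flags [DF], proto 6, "
    "length: 5844) MASTER.SERVER.22263 > NBU.CLIENT.vnetd: . 38807848:38813640(5792) "
    "ack 1 win 46 <nop,nop,timestamp 1031455312 104386367>",
]

packets, payload = summarize(sample)
for pair, count in packets.most_common():
    print(pair, count, "packets,", payload[pair], "payload bytes")

# Approximate capture rate from the totals above (wall time includes tcpdump startup).
capture_rate = 9555 / 0.720
print(f"~{capture_rate:.0f} packets/s captured")
```

Running the same summary on a capture taken with a filter for traffic not addressed to the local host would show whether only the vnetd flow is being flooded or other flows as well.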
03-18-2013 05:23 PM
It is curious, and I suspect that these transmissions were generated by other programs. NetBackup does not care which clients are in the same subnet (except when planning and deploying BMR).
Please check which process is transmitting these packets using lsof, strace, or similar. Also check the vnetd log on the clients to determine what request the master made to them.
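As one possible way to do the lsof check, here is a small wrapper sketch. The port numbers are assumed to be the registered defaults (veritas_pbx 1556, vnetd 13724) and should be verified against /etc/services on your hosts; the helper names are made up for illustration. It normally needs to be run as root to see sockets owned by other users.

```python
import shutil
import subprocess

# Assumed default ports; confirm with /etc/services before relying on them.
NBU_PORTS = {"veritas_pbx": 1556, "vnetd": 13724}

def lsof_command(port):
    """Build an lsof invocation listing processes with TCP sockets on `port`."""
    # -n: skip DNS lookups, -P: show numeric ports, -i: select internet sockets
    return ["lsof", "-n", "-P", f"-iTCP:{port}"]

def owners_of_port(port):
    """Return lsof's raw output for `port`, or None if lsof is not installed."""
    if shutil.which("lsof") is None:
        return None
    # lsof exits non-zero when nothing matches, so do not raise on failure.
    result = subprocess.run(lsof_command(port), capture_output=True,
                            text=True, check=False)
    return result.stdout

if __name__ == "__main__":
    for name, port in NBU_PORTS.items():
        print(f"== {name} (port {port}) ==")
        print(owners_of_port(port) or "(lsof not found or no matching sockets)")
```

The COMMAND and PID columns of the output identify the process; strace can then be attached to that PID if more detail is needed.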
03-20-2013 10:58 AM
I am closing the post since we noticed that all packets are flooding the network, not just NetBackup vnetd traffic. Thank you all.