
Backup SAN VMware Slow

Juliano_Moreira
Level 4

Hello,
I have a serious question about the VMware backups that my NetBackup 5220 appliance is performing.

I recently updated to version v2.6.0.3 and noticed that my backups became significantly slower. Looking more closely, under Master Server Properties > Fibre Transport I have the Fail option selected, so that if FT fails the job ends with an error and does not fall back to the LAN.

In my VMware backup policy I selected only the SAN option and put it first in the list, so that the backup should also not fall back to the LAN.
The backups do complete successfully, with great difficulty, but when I open the Job Status Details, the transport type appears as LAN.

My question is this: is there some other way to verify whether my backups are actually going over the SAN?

I ask because the transfer rate is around 3 MB/s, and I am worried that the backups are still not going over the SAN.
When I check the connections to my storage, everything is connected and no channel is down; nothing has physically changed since deployment.

There are only two details I would add, which may be the cause.

In the appliance web console, under Settings > Network > Fibre Transport, both options are disabled:

* Enable SAN Client Fibre Transport on the Media Server [use FT for backups to this appliance].
This option requires the SAN Client license on the NetBackup master server. Click the help icon for more information. 


* Enable Fibre Transport for duplication and backups on the Deduplication Appliance. 

Keep in mind that my environment is only the 5220 plus a 30 TB set of disks. I have no tape. All backups go to disk.

I am afraid that the update disabled something, and I do not know whether I should enable these options.


Has anyone had a problem like this?

 

Thanks,

Juliano Moreira

19 REPLIES

sdo
Moderator
Partner    VIP    Certified

Apologies, but I'm wondering if we are mixing up terms here - re either:

1) VMware backups being performed; from storage, via SAN, to NB5220

...or:

2) VMware backups being performed; from storage, via SAN to another media server; via FT SAN; to NB5220

...or:

3) VMware backups being performed; from storage, via SAN to NB5220, via FT SAN to another media server

 

I'm not even sure that options 2 or 3 are possible - so I'm going to assume option 1 above.

In which case 'FT' SAN media server has nothing to do with the configuration.

 

So, assuming the above is true, can I ask;

4) What version of ESX vSphere is running on the VMware environment?

5) Are you sure that you had VMware backups running faster than 3 MB/s before the patch/upgrade to v2.6.0.3?

...and assuming that 5) yes the VM backups used to run much faster than 3 MB/s... then I'll ask:

6) Are the VM guests that you are trying to back up on local disk within the ESX host(s), or on shared SAN storage?

7) Regarding the SAN based storage array that hosts the guest VMs and which is zoned to the ESX hosts; is this same storage array also zoned to the NB5220?

8) Are the shared storage LUNs that are presented to the ESX nodes (clustered?) also presented out to the NB5220?

Regards.

chashock
Level 6
Employee Accredited Certified

Need some clarification to help with this.  There is no LAN transport type in a VADP backup.  There is SAN, NBD, NBD-SSL, and Hot-Add.

Are you showing NBD as the transport method in the Job Details?

Are all of the datastores available via SAN?  Have you checked your zoning?  If one of the datastores that is needed in a VADP policy isn't available, the backup doesn't go back and check if the others are; it will default to the lowest common denominator.

And as sdo points out, you can't do FT backups here; VMware backups use the SAN transport method.

Juliano_Moreira
Level 4

Yes, that is exactly the option I am referring to.

In the Activity Monitor the transport type really does appear as LAN.
I started noticing the slowness after the update to v2.6.0.3; backups did not take this long before.
The speed varies a lot. I have seen no more than 5 to 7 MB/s, sometimes 8 MB/s.

I have an environment of 250 VMs, and in the VMware resource limits I am running 4 backups at the same time. Is there any other parameter that can improve my speed?

Juliano_Moreira
Level 4

Hello, chashock
In the policy I put SAN first in the sequence.

This is what appears in the Job Details while the backup is running:

08/19/2014 16:32:31 - Info bpbrm (pid = 14706) bpbkar start on client 
08/19/2014 16:32:31 - Info bpbkar (pid = 15597) Backup started 
08/19/2014 16:32:31 - Info bpbrm (pid = 14706) Sending the file list to the client 
08/19/2014 16:32:31 - connected; connect time: 00:00:00 
08/19/2014 16:32:31 - begin writing

chashock
Level 6
Employee Accredited Certified

There is nothing in those job details that says it is LAN.  If you are referring to the front tab or the field above the actual job details, that's not where you want to look.  Look further in the job details.  You should see a line that says transport method = san/nbd/nbd-ssl/hot-add.  It takes a few entries being passed sometimes before that will show up.
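As a rough illustration of that check, the detailed status text can be saved to a file and scanned for the transport line. This is only a sketch: the exact wording and spacing of the line are assumptions based on the description above, and the sample text is made up for the example rather than taken from a real job.

```python
import re
from typing import Optional

# Assumption: the detailed status contains something like
# "transport method = san" or "Transport Type = san"; the exact
# wording may differ between NetBackup versions.
TRANSPORT_RE = re.compile(
    r"transport\s+(?:type|method)\s*=\s*([A-Za-z-]+)", re.IGNORECASE
)


def find_transport(job_details: str) -> Optional[str]:
    """Return the first transport value found (lower-cased), or None."""
    for line in job_details.splitlines():
        match = TRANSPORT_RE.search(line)
        if match:
            return match.group(1).lower()
    return None


# Made-up sample text, for illustration only.
sample = "begin writing\nexample line: transport method = san\n"
print(find_transport(sample))
```

If the function returns None for a finished VMware job, that only means no matching line was found with this pattern, not that the job used the LAN.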

You could look at limiting the number of backups per ESX host and datastore to make sure you're not swamping any of those resources.

Are you using Accelerator?

Juliano_Moreira
Level 4

Yes, I'm using Accelerator.

Below is the full log of one backup.

 

08/19/2014 16:31:33 - Info nbjm (pid = 9082) starting backup job (jobid = 747722) for client ORQAAPP019L_BKP, policy OR_VM_QAAPP, schedule FULL_SEMANAL
08/19/2014 16:31:33 - Estimated 0 kbytes needed
08/19/2014 16:31:33 - Info nbjm (pid = 9082) started backup (backupid = ORQAAPP019L_BKP_1408476693) job for client ORQAAPP019L_BKP, policy OR_VM_QAAPP, schedule FULL_SEMANAL on storage unit stu_disk_nb-5220-01
08/19/2014 16:31:33 - bpbrm process started (pid = 14706)
08/19/2014 16:31:34 - Info bpbrm (pid = 14706) starting bptm
08/19/2014 16:31:34 - Info bpbrm (pid = 14706) started media manager using bpcd successfully
08/19/2014 16:31:35 - Info bpbrm (pid = 14706) nb-5220-01 is the host to backup data from
08/19/2014 16:31:35 - Info bpbrm (pid = 14706) telling media manager to start backup on client
08/19/2014 16:31:35 - Info bptm (pid = 14715) using 262144 data buffer size
08/19/2014 16:31:35 - Info bptm (pid = 14715) using 30 data buffers
08/19/2014 16:31:36 - Info bpbrm (pid = 14706) spawning a brm child process
08/19/2014 16:31:36 - Info bpbrm (pid = 14706) child pid: 14756
08/19/2014 16:31:36 - Info bpbrm (pid = 14706) sending bpsched msg: CONNECTING TO CLIENT FOR ORQAAPP019L_BKP_1408476693
08/19/2014 16:31:36 - connecting
08/19/2014 16:31:37 - Info bpbrm (pid = 14706) bpbkar start on client
08/19/2014 16:31:37 - Info bptm (pid = 14715) start backup
08/19/2014 16:31:37 - Info bpbkar (pid = 14759) Backup started
08/19/2014 16:31:37 - Info bpbrm (pid = 14706) Sending the file list to the client
08/19/2014 16:31:37 - connected; connect time: 00:00:00
08/19/2014 16:31:37 - begin writing
08/19/2014 16:56:08 - Info bpbkar (pid = 14759) bpbkar waited 375073 times for empty buffer, delayed 481264 times
08/19/2014 16:56:20 - Info bpbrm (pid = 14706) media manager for backup id ORQAAPP019L_BKP_1408476693 exited with status 0: the requested operation was successfully completed
08/19/2014 16:56:20 - end writing; write time: 12:24:43
the requested operation was successfully completed (0)

sdo
Moderator
Partner    VIP    Certified

Can you post output from:

bppllist OR_VM_QAAPP

Juliano_Moreira
Level 4

SDO

 

nb-5220-01:/home/maintenance # bppllist OR_VM_QAAPP
CLASS OR_VM_QAAPP *NULL* 0 760000 0 *NULL*
NAMES
INFO 40 0 0 99999 *NULL* 0 0 6 0 0 0 0 1 0 1 0 1 0 1408447300 FDC2CC00279211E4A6E787D1A33BA8E0 1 0 0 0 0 1 0 0 -1 0 0 0 0 1 0 3 0 22 1 28800 1 0 0 0 1
KEY *NULL*
BCMD *NULL*
RCMD *NULL*
RES *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL*
POOL NetBackup NetBackup NetBackup NetBackup NetBackup NetBackup NetBackup NetBackup NetBackup NetBackup
FOE 0 0 0 0 0 0 0 0 0 0
SHAREGROUP *ANY*
DATACLASSIFICATION Platinum
ACN nb-5220-01
SSM VMware_v2
SSMARGCOUNT 16
SSMARG file_system_optimization 1
SSMARG rTO 0
SSMARG snapact 2
SSMARG drive_selection 0
SSMARG Virtual_machine_backup 2
SSMARG enable_vCloud 0
SSMARG multi_org 0
SSMARG rHz 10
SSMARG rLim 10
SSMARG disable_quiesce 0
SSMARG nameuse 1
SSMARG ignore_irvm 1
SSMARG skipnodisk 0
SSMARG exclude_swap 1
SSMARG post_events 1
SSMARG trantype san
CLIENT nb-5220-01 VMware VMware 0 1 0 0 *NULL*
INCLUDE vmware:/?filter=Displayname Contains "ORQAAPP"
SCHED FULL_SEMANAL 0 10 604800 0 0 0 0 *NULL* 0 0 0 0 0 0 -1 1 0 0
SCHEDWIN 0 0 0 0 0 0 0 0 0 0 0 0 0 0
SCHEDRES DCCotia_QA_Semanal_Full_4S *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL*
SCHEDPOOL *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL*
SCHEDRL 0 1 1 1 1 1 1 1 1 1
SCHEDFOE 0 0 0 0 0 0 0 0 0 0
SCHEDSG *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL* *NULL*
nb-5220-01:/home/maintenance #
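For anyone who wants to check the policy's transport setting without reading the whole listing by eye, the SSMARG lines can be pulled into a dictionary. This is a minimal sketch that assumes the "SSMARG name value" layout shown in the output above; the function name is made up for the example, and the sample is a shortened copy of that output.

```python
from typing import Dict


def parse_ssmargs(bppllist_output: str) -> Dict[str, str]:
    """Collect 'SSMARG <name> <value>' lines into a name -> value mapping."""
    args: Dict[str, str] = {}
    for line in bppllist_output.splitlines():
        parts = line.split()
        # Only exact "SSMARG" records count; SSMARGCOUNT and others are skipped.
        if len(parts) >= 3 and parts[0] == "SSMARG":
            args[parts[1]] = " ".join(parts[2:])
    return args


# Shortened copy of the listing posted above.
sample = """\
SSM VMware_v2
SSMARGCOUNT 16
SSMARG Virtual_machine_backup 2
SSMARG trantype san
CLIENT nb-5220-01 VMware VMware 0 1 0 0 *NULL*
"""

args = parse_ssmargs(sample)
print(args["trantype"])
```

In the listing above the value is simply "san", which matches a policy that has only the SAN transport selected.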

 

sdo
Moderator
Partner    VIP    Certified
Several elements of the job log look a bit strange to me:

- low wait counts
- no mention of bpfis calling for a vmfs snapshot
- 12 hr elapsed time for a 25 minute job
- and it looks like a log for a plain client backup

...hmmm - I'll admit I'm confused by the activity job log shown.

Juliano, if we asked for the bpfis log - do you know what to do?

Btw the policy list looks ok to me.
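The mismatch between the reported write time and the actual duration can be checked directly from the "begin writing" and "end writing" timestamps in the posted job log. A small sketch, assuming the timestamps are in MM/DD/YYYY HH:MM:SS form as they appear there:

```python
from datetime import datetime

TS_FORMAT = "%m/%d/%Y %H:%M:%S"


def parse_ts(line: str) -> datetime:
    """Parse the leading 19-character timestamp of a job-details line."""
    return datetime.strptime(line[:19], TS_FORMAT)


# The two lines as posted in the job log.
begin = "08/19/2014 16:31:37 - begin writing"
end = "08/19/2014 16:56:20 - end writing; write time: 12:24:43"

elapsed = parse_ts(end) - parse_ts(begin)
print(elapsed)  # 0:24:43
```

The minutes and seconds agree with the reported write time, so only the hour field of "12:24:43" differs from what the timestamps show.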

sdo
Moderator
Partner    VIP    Certified
And what version of vSphere are the ESX hosts running? And vCenter version? And just out of curiosity, what version was the NB5220 appliance before the patch/upgrade?

Juliano_Moreira
Level 4

Hello,
I really do not know how to get that log.
The following images show how one of the policies I created to test performance is configured. There may be something wrong in the policy too, and I do not know where the mistake is.

My ESXi is version 5.5, but I only started noticing this problem after I updated the appliance to the latest version, v2.6.0.3.

 

 

sdo
Moderator
Partner    VIP    Certified

The policy screen shots look ok to me.

You could try copying the policy to a new 'test policy name' and then disable the hotadd, nbd and nbdssl options and so force it to only use SAN transport, and try running this new test policy.  Then send us the job activity log for that test backup job.

What version was the appliance before the patch/upgrade?

Do you know how to acquire the 'bpfis' log?

Juliano_Moreira
Level 4

The test policy was set up that way; I ended up adding the other options later, but I have now left only SAN to force it.

Before applying the patch, the appliance was on version 2.6.0.2.

I have never collected these logs. Is there somewhere that explains how to collect them on the 5220?

Should I collect the log right away, while the job is running?

sdo
Moderator
Partner    VIP    Certified

The NetBackup v7.6 Troubleshooting Guide is here:

http://www.symantec.com/business/support/index?page=content&id=DOC6470

...see around page 160 onwards re how to enable logging.

 

I'll try to explain it here, but if it doesn't make sense then you'll have to read that section of the manual...

...ok...

1) In the CLIsh in the appliance:  Support > Logs > Share Open

2) If you have a Windows machine, then explore to \\nb-5220-01\logs

3) Drill in to NBU/openv/netbackup/logs/bpfis

4) Copy the most recent, or relevant, log file out

5) Don't forget to Support > Logs > Share Close

 

If the above doesn't work, then I suggest:

1) Use PuTTY or an ssh client to logon to the appliance as 'admin' to get to the CLIsh

2) Maximize the screen for the PuTTY/ssh session

3) Support > Maintenance > elevate

4) ls -lash /usr/openv/netbackup/logs/bpfis

5) cat /usr/openv/netbackup/logs/bpfis/whichever.log

6) In PuTTY/ssh - select all text

7) Save to a file on your workstation
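For the "copy the most recent log file" step, picking the newest file in a log folder can be scripted once the folder is reachable, for example through the opened logs share. This is only a sketch: it assumes Python is available on the machine that can see the folder, the function name is made up, and the demo uses a throwaway directory with invented file names rather than real bpfis logs.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def newest_log(log_dir: str) -> Optional[Path]:
    """Return the most recently modified regular file in log_dir, or None if empty."""
    files = [p for p in Path(log_dir).iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


# Demo on a temporary directory; file names here are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp, "old.log")
    new = Path(tmp, "new.log")
    old.write_text("older entries")
    new.write_text("newer entries")
    os.utime(old, (1_000, 1_000))  # force an older modification time
    os.utime(new, (2_000, 2_000))  # force a newer modification time
    picked = newest_log(tmp)
    picked_name = picked.name if picked else None

print(picked_name)  # new.log
```

Pointed at the bpfis folder from the steps above, the same function would return the file to copy out, provided the backup of interest was the last one to write a log.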

 

HTH.

sdo
Moderator
Partner    VIP    Certified
Some other ways to get files off of an appliance: https://www-secure.symantec.com/connect/forums/nbu-appliance-5220

sdo
Moderator
Partner    VIP    Certified

Hi Juliano - how are you getting on?

Juliano_Moreira
Level 4

Hello,
I'm still having problems, and I was unable to collect the logs using the steps you gave me.
I made some changes to the number of concurrent jobs, but even so a backup that would normally finish in one day is still taking three days.

From what I have been reading, some backups can reach speeds of 800 MB/s.

The most mine achieved was 100 MB/s early on; after that peak, the rate stays at 2-3 MB/s for the rest of the full VM backup.

On the storage side, I checked whether any channel was down. Apparently all of them are up.

Keep in mind that I am working with a 5220 appliance with a 30 TB set of disks. I have no backups to tape.
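To put the rates mentioned in this thread in perspective, here is a small back-of-the-envelope calculation. The 100 GiB VM size is purely hypothetical (no VM sizes were given in the thread), and the sketch treats the quoted "MB" figures as MiB per second at a sustained rate.

```python
def hours_to_transfer(size_gib: float, rate_mib_per_s: float) -> float:
    """Hours needed to move size_gib at a sustained rate of rate_mib_per_s."""
    seconds = size_gib * 1024 / rate_mib_per_s
    return seconds / 3600


VM_SIZE_GIB = 100  # hypothetical VM size, not from the thread

# Rates quoted in the thread: about 3, up to 8, and a 100 peak.
for rate in (3, 8, 100):
    hours = hours_to_transfer(VM_SIZE_GIB, rate)
    print(f"{rate:>3} MB/s -> {hours:.1f} h")
```

Under these assumptions a single full backup of such a VM takes roughly nine and a half hours at 3 MB/s and about 17 minutes at 100 MB/s. Accelerator and deduplication would change the amount of data actually moved, so this is only an upper-bound style estimate.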

maurijo
Level 6
Partner Accredited

I think language is a problem here, no offense. We would need logs to help you further and it seems that you don't know how to collect these. So try reading sdo's explanation again or read the manual he linked.

Mark_Solutions
Level 6
Partner Accredited Certified

Just picking up this thread .. how did it go from a simple question to a full-blown investigation!?

Let me try and answer your original question...

If your policy is set to use SAN with no other options and the backup works then you can rest assured that it used the SAN.

I agree that on the job details it says LAN .. they always do and it is something to do with the coding on the outside edge of the job details that has never been addressed - it has confused many in the past.

In the detailed status of the job itself .. in amongst all of the text it will say somewhere that the transport type is san .. this confirms the transport type being used.

The others are right .. Fibre Transport is a term used for a specific type of backup which does not apply to VMware backups .. but I know what you mean as the transport of data goes over the fibre .. again that confuses many people.

In summary if your backups work then they are going over the san .. you can confirm it in the detailed section of the job itself .. ignore the outside of the job window which always says LAN

Hope this helps