NetBackup VMware backups often stall for SharePoint servers

pnobels
Level 3

Hi,

We often see VMware backups stall, specifically for SharePoint servers. NetBackup reports no error; the job just seems to hang. This only happens with VMs hosting SharePoint. It does not happen every time, but it happens often.

The only workaround is to cancel the backup, kill the bpbkar32 process, and restart the backup. Sometimes it hangs again, sometimes the backup simply completes. There is no real indication of why or when this happens.

We back up a few hundred VMs. For some reason, the issue only shows up with VMs hosting SharePoint.

Does this ring a bell?

Today I took a dump of the bpbkar32 process, and it seems to hang in libfs_ntfs:

ERROR_CODE: (NTSTATUS) 0x80000003 - {EXCEPTION} Breakpoint A breakpoint has been reached.

EXCEPTION_CODE_STR: 80000003

STACK_TEXT:
00000000`00f47eb0 00007ffe`ff62f9ab : 00000000`00000007 ffffffff`fffffffe 00000000`021bead0 00007fff`008fe376 : libfs_ntfs+0x42b8
00000000`00f480d0 00007fff`008f920c : 00000000`00000000 00000000`021be7d8 00000000`00000000 00007ffe`ff630423 : libfs_ntfs!rvp_map_get_extent+0x5b
00000000`00f48120 00007fff`008fa3f3 : ffffffff`ffffffff 00000000`00f483e0 00000000`020fa2b0 00000000`020fa2b0 : libos_win!rvp_map_fini+0x102dc
00000000`00f48200 00007fff`008fa715 : 00000000`00000000 00000000`020fa2b0 00000000`00f483e0 00000000`00000000 : libos_win!rvp_map_fini+0x114c3
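
For anyone comparing several dumps, here is a minimal Python sketch that pulls the module, symbol and offset out of each STACK_TEXT line. It assumes the WinDbg layout shown above (the call site is the last field after " : "). The names `FRAME_RE` and `parse_frames` are made up for this example and are not part of any NetBackup or debugger tooling.

```python
import re

# Matches call sites such as "libfs_ntfs+0x42b8" or
# "libfs_ntfs!rvp_map_get_extent+0x5b" (symbol and offset are optional).
FRAME_RE = re.compile(
    r"^(?P<module>[^!+\s]+)"
    r"(?:!(?P<symbol>[^+\s]+))?"
    r"(?:\+0x(?P<offset>[0-9a-fA-F]+))?$"
)


def parse_frames(stack_text):
    """Return a list of (module, symbol, offset) tuples, top frame first."""
    frames = []
    for line in stack_text.splitlines():
        if " : " not in line:
            continue  # not a stack frame line
        # The call site is whatever follows the last " : " separator.
        call_site = line.rsplit(" : ", 1)[1].strip()
        match = FRAME_RE.match(call_site)
        if match is None:
            continue
        offset = match.group("offset")
        frames.append((
            match.group("module"),
            match.group("symbol"),  # None when only module+offset is shown
            int(offset, 16) if offset else None,
        ))
    return frames
```

Feeding it the four frames above gives two frames in libfs_ntfs followed by two in libos_win, which makes it easy to check whether later hangs stop in the same place.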

1 ACCEPTED SOLUTION


Hamza_H
Moderator
   VIP   

I think there is an EEB for this, since you said it gets stuck in the getExtent call.

The EEB is 3973811. You can check it here:

https://www.veritas.com/content/support/en_US/downloads/update.UPD770735

Extract:

Abstract

NetBackup 8.1.2 HotFix - VMware backups in a hung state Etrack 3973811

Description

Veritas Bug ID: ET 3973811

Version: NetBackup 8.1.2

Fix Included resolves:

VMware backups hang during getExtent call occasionally. This EEB changes assignment of a negative number to an unsigned variable, logs once every 10k iteration of a file, and will fail the extent calculation if the same offset is seen 10k times in a row.

Install on: Media Server that is the VMware backup host

It may be worth installing this EEB and monitoring the next backups.

 

Good luck :)
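
To illustrate the kind of bug the hotfix text describes, here is a small Python sketch. It is not Veritas code and only shows one plausible reading of the description: a negative value stored in an unsigned variable becomes a huge positive number, and a loop whose offset stops advancing spins forever unless a guard fails it. `to_unsigned64`, `walk_extents` and the `next_offset` callback are hypothetical names; only the 10k threshold comes from the extract.

```python
MASK64 = (1 << 64) - 1  # emulate a 64-bit unsigned variable
MAX_REPEATS = 10_000    # threshold quoted in the hotfix text


def to_unsigned64(value):
    """Show what a negative value becomes in a 64-bit unsigned variable."""
    return value & MASK64


def walk_extents(next_offset, start=0, end=1 << 20):
    """Walk offsets until `end`, failing if the offset stops advancing.

    `next_offset` is a hypothetical stand-in for the real extent lookup:
    it receives the current offset and returns the next one.
    """
    offset = start
    repeats = 0
    steps = 0
    while offset < end:
        new_offset = next_offset(offset)
        if new_offset == offset:
            # Same offset again: without a guard this loop never ends.
            repeats += 1
            if repeats >= MAX_REPEATS:
                raise RuntimeError(
                    f"offset {offset} seen {repeats} times in a row"
                )
        else:
            repeats = 0
        offset = new_offset
        steps += 1
    return steps
```

With a lookup that always advances, `walk_extents` returns normally. With one that keeps returning the same offset, it raises instead of hanging, which matches the behaviour the extract describes for the patched backup host.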


8 REPLIES 8

pats_729
Level 6
Employee
Do you see any exceptional CPU or memory spikes on the SharePoint servers during the backup, or even outside the backup window?

Is this a VMware policy protecting SharePoint, or a SharePoint policy?

Hi,

There's no archived performance logging available on these boxes. I have configured some now. I will need to check whether I can reproduce it, or wait until next week.

Not sure what you mean by the last line. This is a simple VMware backup of a Windows VM that happens to run a SharePoint environment. It's not a SharePoint-specific backup policy. The NetBackup policy is not aware that there's a SharePoint environment on the VM.

jnardello
Moderator
   VIP    Certified

At what point in the backup is it hanging? Are these VMs that happen to be running SharePoint a lot larger than the other VMs that always work? Are they all located on the same ESX host? The same datastore? Do they have a metric ton of load on them, so that they may be exceeding a timeout value somewhere while waiting to be snapshotted? Are your VM admins having to add/move these VMs around semi-regularly for reasons & interrupting your backup processes? Any errors showing up from vCenter during the problem windows?

Please post the job details to start with. 

The snapshot is taken.

Are these VMs that happen to be running SharePoint a lot larger than the other VMs that always work? No.

Are they all located on the same ESX host? No.

The same datastore? Not necessarily. Some are, others are not.

Do they have a metric ton of load on them, so that they may be exceeding a timeout value somewhere while waiting to be snapshotted? No. The snapshot seems to be okay. It looks like things go wrong at the start of the backup.

Are your VM admins having to add/move these VMs around semi-regularly for reasons & interrupting your backup processes? No.

Any errors showing up from vCenter during the problem windows? No.

20-jan-2021 15:40:14 - Info nbjm (pid=4884) starting backup job (jobid=5750954) for client SRV-BE-092, policy VMware_Sharepoint, schedule FULL
20-jan-2021 15:40:15 - estimated 58433214 kbytes needed
20-jan-2021 15:40:15 - Info nbjm (pid=4884) started backup (backupid=SRV-BE-092_1611153615) job for client SRV-BE-092, policy VMware_Sharepoint, schedule FULL on storage unit SU_SRV-080_VMware-AllDisks using backup host srv-be-080.blabla.com
20-jan-2021 15:40:16 - started process bpbrm (pid=63944)
20-jan-2021 15:40:17 - Info bpbrm (pid=63944) SRV-BE-092 is the host to backup data from
20-jan-2021 15:40:17 - Info bpbrm (pid=63944) reading file list for client
20-jan-2021 15:40:17 - Info bpbrm (pid=63944) starting bpbkar32 on client
20-jan-2021 15:40:17 - connecting
20-jan-2021 15:40:17 - connected; connect time: 0:00:00
20-jan-2021 15:40:18 - Info bpbkar32 (pid=92424) Backup started
20-jan-2021 15:40:18 - Info bpbkar32 (pid=92424) archive bit processing:<enabled>
20-jan-2021 15:40:18 - Info bptm (pid=64188) start
20-jan-2021 15:40:18 - Info bptm (pid=64188) using 1048576 data buffer size
20-jan-2021 15:40:18 - Info bptm (pid=64188) setting receive network buffer to 4195328 bytes
20-jan-2021 15:40:18 - Info bptm (pid=64188) using 512 data buffers
20-jan-2021 15:40:19 - Info bptm (pid=64188) start backup
20-jan-2021 15:40:19 - begin writing
20-jan-2021 15:40:21 - Info bpbkar32 (pid=92424) INF - Backing up vCenter server srv-be-065.blabla.com, ESX host esx-be-004.blabla.com, BIOS UUID 422760ef-5f73-aba7-04ce-b7383337602e, Instance UUID 5027bad3-080b-3656-b6c5-79c2752e2afd, Display Name SRV-BE-092, Hostname SRV-BE-092.BLABLA.com

-> The job stops here and logs nothing further.
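
Since the job reports no error, one way to spot a stalled job is to look at how long ago the last job-detail line was written. The Python sketch below assumes the timestamp format shown above with English month abbreviations (other locales would need extra entries in `MONTHS`). `last_activity` and `quiet_seconds` are names invented for this example, not NetBackup functions.

```python
import re
from datetime import datetime

# Assumption: English three-letter month abbreviations, as in "20-jan-2021".
MONTHS = {
    name: number
    for number, name in enumerate(
        ["jan", "feb", "mar", "apr", "may", "jun",
         "jul", "aug", "sep", "oct", "nov", "dec"],
        start=1,
    )
}

LINE_RE = re.compile(
    r"^(\d{1,2})-([A-Za-z]{3})-(\d{4}) (\d{2}):(\d{2}):(\d{2}) - (.*)$"
)


def last_activity(job_details):
    """Return (timestamp, message) of the last timestamped line, or None."""
    last = None
    for line in job_details.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # blank lines, notes, wrapped text
        day, mon, year, hh, mm, ss, message = match.groups()
        month = MONTHS.get(mon.lower())
        if month is None:
            continue  # unknown month abbreviation
        stamp = datetime(int(year), month, int(day),
                         int(hh), int(mm), int(ss))
        last = (stamp, message)
    return last


def quiet_seconds(job_details, now):
    """Seconds between the last logged line and `now`, or None if no lines."""
    last = last_activity(job_details)
    if last is None:
        return None
    return (now - last[0]).total_seconds()
```

For the job above, the last line is the bpbkar32 "Backing up vCenter server" message at 15:40:21. A job whose quiet time keeps growing while it still shows as active would be a candidate for the hang described in this thread; what threshold counts as "too long" depends on the environment.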

pats_729
Level 6
Employee

Not a definite answer, but did you verify whether VMware Tools is up to date on these VMs? If not, it is worth checking.

Applied the hotfix. It seems to have solved our issue. Thanks!

Hamza_H
Moderator
   VIP   

Great, glad that helped :)

Thanks for your reply