
VSS blocks the launch of backup jobs in NBU 8.1

MatBams
Level 4

Hi,

On my NBU master, every day I see one parent job with no child jobs that stays in the active state for a day or more, until I cancel the parent job.

But when I restart the job and kill the VSS process on my Windows server, the child jobs launch.

I tried modifying my VSS configuration, but without success.

Do you have any information for me, or something else that I can try?

Thanks

19 REPLIES

Lowell_Palecek
Level 6
Employee

What do you mean by killing the VSS process? There is no NetBackup process by that name.

Are you canceling a snapshot job in the Activity Monitor in the remote NetBackup console? Killing bpfis.exe in Task Manager? Something else?

Why are you killing whatever you're killing?

What kind of parent job persists in the Activity Monitor?

What kind of backup job is being blocked? How is it being blocked? Does the Activity Monitor tell you it's waiting for resources? Something else?

Sorry if I'm not clear.

When I start a manual backup in NBU for a physical server with 6 disks, only the parent job is launched. But if I kill the VSS process on the physical server, all the jobs start under the parent job.


@Lowell_Palecek wrote:

What do you mean by killing the VSS process? There is no NetBackup process by that name.

I kill the VSS process on the Windows server.

Are you canceling a snapshot job in the Activity Monitor in the remote NetBackup console? Killing bpfis.exe in Task Manager? Something else?

I cancel the parent job in the Activity Monitor because it's stuck in this weird state.

Why are you killing whatever you're killing?

Because if I kill the VSS process on my Windows server when I restart the main job, the child jobs start too.

What kind of parent job persists in the Activity Monitor?

If you ask about the job type, it's a Backup.

What kind of backup job is being blocked? How is it being blocked? Does the Activity Monitor tell you it's waiting for resources? Something else?

I don't see an error or anything else, just the "Begin writing" state.




When I run a snapshot backup my Windows servers have a process named vssvc. This is the Microsoft Volume Shadow Copy service (VSS). Is that what you're killing?

What are the policy type and backup selections?

Do you find any clues in the job details or the NetBackup logs? I would look at the bpbkar and bpfis logs on the client first, then the bpbrm and bptm logs on the media server if it looks like the client process gets stuck waiting for a response on a socket.

Sorry for the delay.

Yes, I kill this process when I run a manual backup, and then the snapshot backups start.

Policy type : MS-Windows
Backup selections : All_local_drives
On this policy, I have two clients with the same OS and the same NetBackup client version. One works and the other fails.

No clues in the job details. The last status, before "Termination requested by administrator", is "Backup started".

I created the log folders on the client. I will look at the logs tomorrow.

Are there any other shadow copy jobs scheduled by the Windows admins or application admins?

So in the bpbkar logs I have nothing. In bpfis, I have "Unable to open key Software\veritas\netbackup\currentversion\config\backup, error <2>
Updating FIM VSS to VSS:prov_type=0,snap_attr=0,max_snapshots=1"

Further down in the log, I have "WRN - VfMS error 10; see following message: Non fatal method error was reported".

To be clear, it's a physical server.

You are not giving us much to go on. What level are you logging at? Note that for Windows client host properties there are two log level settings, one under Logging and one under Windows Client.

The Registry error 2 is not an error to the workflow. It means that bpfis looked for a configuration parameter and didn't find it. Status 2 is the standard "not found" file system code. If the parameter doesn't exist, then bpfis takes the default action. I'm not familiar with this particular parameter.

The "Updating FIM VSS" line occurs in every Windows snapshot job. It's not an error.

The VfMS warning probably is nothing. The context around it would help to know what it signifies.

For Jim_MD: I ran the vssadmin list providers command in cmd, but the command hangs. With the vssadmin list writers command, all writers show as Stable with No error. I tried to open the Shadow Copies configuration under Computer Management\Shared Folders, but it does not respond.

For Lowell: okay, I understand what you want. I have configured my Windows client host in NBU to log at level 5, and I enabled critical logs.

I'll be back with more information for you.

sdo
Moderator
Partner    VIP    Certified

VSS is a cohesive set of multiple functional parts, and if any one part has failed or hung then one cannot expect VSS to work, and therefore one cannot expect backups that rely upon VSS to work.

What this means is that backup admins always have to check not only that all of these commands complete without error, but also the output from each command. Each of these five commands should always produce at least some output, and should always complete with status 0.  If any of these commands fail or hang or report errors then VSS is broken.

vssadmin list writers
vssadmin list providers
vssadmin list volumes
vssadmin list shadowstorage
vssadmin list shadows

VSS is a multi-layered approach, think of it as:  writers look for providers of storage, providers look for their volumes, and upon volumes some shadowstorage is allocated, and within shadowstorage the shadows are created.
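
For anyone who wants to script that check, here is a minimal Python sketch (my own illustration, not a NetBackup or Microsoft tool). It runs each of the five commands with a timeout, so a hung command is reported instead of blocking the prompt, and it flags a non-zero status or empty output. The function names and the 60-second timeout are assumptions; it also assumes Python is available on the client and that it is run from an elevated prompt, since vssadmin needs administrator rights.

```python
import subprocess

# The five commands from the post above.
VSSADMIN_CHECKS = [
    ["vssadmin", "list", "writers"],
    ["vssadmin", "list", "providers"],
    ["vssadmin", "list", "volumes"],
    ["vssadmin", "list", "shadowstorage"],
    ["vssadmin", "list", "shadows"],
]


def run_check(command, timeout_seconds=60):
    """Run one command and classify the result.

    Returns "ok", "hung", "not found", "no output",
    or "failed (status N)".
    """
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # assumed limit; tune as needed
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this.
        return "hung"
    except FileNotFoundError:
        return "not found"
    if result.returncode != 0:
        return f"failed (status {result.returncode})"
    if not result.stdout.strip():
        return "no output"
    return "ok"


def check_all(commands=VSSADMIN_CHECKS, timeout_seconds=60):
    """Run every check and return {command line: classification}."""
    return {" ".join(c): run_check(c, timeout_seconds) for c in commands}


if __name__ == "__main__":
    for name, status in check_all().items():
        print(f"{name}: {status}")
```

Anything other than "ok" on all five lines points at the layer to look into first.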

I understand what you're saying.

I ran vssadmin list writers and it's OK.

But vssadmin list shadows, delete shadows, and list providers all hang. I tried DiskShadow, but with the same result.

Do you have any idea how to repair VSS?

sdo
Moderator
Partner    VIP    Certified

Definitely sounds like VSS is broken.  VSS is quite sophisticated software, and there is generally no simple immediate answer.  First thing to try is a reboot, then maybe start here:

https://docs.microsoft.com/en-us/windows/win32/vss/troubleshooting-vss-applications

sdo
Moderator
Partner    VIP    Certified

So when I stop the VSS service by killing its PID, I can run all the vssadmin commands, and I don't find any problems in the results.

One thing, maybe: on the "For Volume" line, I have (C:)\\?\Volume(XXXXXXXXXX). I don't know whether the question mark is normal.

In this state, I tried to launch a manual backup in NBU and I still have the same problem: no jobs in the Activity Monitor.

sdo
Moderator
Partner    VIP    Certified

1) IMO, killing Windows' own services is not a good idea.  AFAICT, there should be no need to do that, ever.  IME, I have never had to do that, and I myself wouldn't recommend such action to anyone.

.

2) The ? in the volume output is ok.

.

3) One thing you can check (or five things that you can check), is to test the status of errorlevel after each call to "vssadmin list <something>", using :

echo %errorlevel%

...which should always return a status 0.  Any status other than zero indicates a problem with VSS.

.

4) I thought you said earlier that one, or some, of the VSS commands would hang ?

.

5) Either way, if a VSS command hangs, then you shouldn't have to kill a service process.  None of the "vssadmin list" commands should ever hang, and if one hangs, then VSS is broken, and killing the VSS service will never be an adequate solution.  I think you are going to have to google "windows how to reset VSS", and sift through all the rubbish, and see if you can find a suitable article describing how to reset VSS.

.

6) Have you tried a reboot yet ?

I saw those MS docs and others. On one of them I left a comment that nowhere does it say what the maximum number of concurrent shadow copies is.  MS replied quickly with a "not sure, but I will find out from others", which is a bit strange. I thought it was one per drive, but I could be wrong.

3) I get error status 1. I'm going to read your links.

4) The VSS commands don't hang if I kill the VSS PID before using them. Once the VSS commands respond, I restart the manual backup. Afterwards, the VSS commands hang again.

5) Yes, I'm searching for how to reset VSS.

6) The server reboots every month and we have had this problem for 5 months.

I'm reading the bpfis log at level 5.

So I got into the Shadow Copies configuration, and shadow copies are disabled on all volumes.

sdo
Moderator
Partner    VIP    Certified

@MatBams what you are looking at there is the ability of Windows to effectively schedule its own shadows.  This is something that you might find a Windows admin enabling and configuring on, say, a file server that is used to host user home drives, or maybe team / departmental file shares.

Anyway, you haven't disabled shadow copies as such; it's more that you have not scheduled additional shadow copies of Windows' own.

MS replied. It confirmed that only one VSS shadow copy creation can be in progress at any one time, although several snapshots can exist per volume.  This is for Win 2008; I assume it applies to later versions.

----- MS Reply -----------------

How many different volumes can have snapshots at the same time?

  • All of them. One or more snapshots may exist concurrently for each volume in the system.

How many snapshots can be active on a given volume at the same time?

  • There is an absolute maximum of 512. Practically the number is usually much less, depending on I/O load, available disk space, how big or how old the snapshots are, etc.

How many things can be creating a shadow copy at the same time?

  • One. The Shadow Copy Creation Process describes how creating a shadow copy works at the VSS API level. The documentation for IVssBackupComponents::StartSnapshotSet states that one of the potential error codes is VSS_E_SNAPSHOT_SET_IN_PROGRESS, meaning “The creation of a shadow copy is in progress, and only one shadow copy creation operation can be in progress at one time.”

---------------------------------------------------
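
To make the distinction in that reply concrete, here is a toy Python model (purely illustrative; it does not call the VSS API, and every class and method name in it is made up). Many snapshots can exist per volume, up to the 512 limit quoted above, but only one creation can be in progress at a time, and a second attempt is refused in the spirit of VSS_E_SNAPSHOT_SET_IN_PROGRESS.

```python
import threading


class SnapshotSetInProgress(Exception):
    """Stand-in for VSS_E_SNAPSHOT_SET_IN_PROGRESS in this toy model."""


class ToyShadowCopyService:
    """Hypothetical model of the rules in the MS reply, not real VSS."""

    MAX_SNAPSHOTS_PER_VOLUME = 512  # absolute maximum quoted in the reply

    def __init__(self):
        self._creating = threading.Lock()  # held while a creation is running
        self.snapshots = {}  # volume name -> list of snapshot ids

    def start_snapshot_set(self):
        # Only one creation may be in progress, across all volumes.
        if not self._creating.acquire(blocking=False):
            raise SnapshotSetInProgress(
                "a shadow copy creation is already in progress"
            )

    def finish_snapshot_set(self, volumes):
        # Record one new snapshot per volume, then free the creation slot.
        try:
            for volume in volumes:
                existing = self.snapshots.setdefault(volume, [])
                if len(existing) >= self.MAX_SNAPSHOTS_PER_VOLUME:
                    raise RuntimeError(f"too many snapshots on {volume}")
                existing.append(len(existing) + 1)
        finally:
            self._creating.release()


if __name__ == "__main__":
    service = ToyShadowCopyService()
    service.start_snapshot_set()
    try:
        service.start_snapshot_set()  # second creation while the first runs
    except SnapshotSetInProgress as error:
        print("refused:", error)
    service.finish_snapshot_set(["C:", "D:"])
    print(service.snapshots)
```

In this model, a creation that never finishes keeps the slot and every later request is refused. Whether that is what happens on the affected client is not something the thread establishes.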