Dear all,
I have a backup environment consisting of a Solaris master server and a Windows media server.
I have a lot of Exchange databases distributed across the 3 nodes of a DAG.
Recently I noticed that a lot of database backups fail with status code 4200:
Operation failed: Unable to acquire snapshot lock (4200)
The failures all occur on the same node, while other policies complete successfully on that node.
Most of my backup policies run from 5 pm until 4 am the next day.
I have searched for the status code. It indicates that the snapshot is in use by another process, but I
can't pinpoint the root cause.
I think the problem is caused by other jobs running against the same node at the same time,
but everything was running successfully before, and no changes were made to the environment. When I run the failed
policies during working hours, they complete successfully without any errors.
Databases on the other nodes of the cluster are backed up successfully.
You said that you can't find any problems on the 3 DAG nodes, right?
Did you check the VSS writers on all nodes? Is there another backup policy for this client?
For example, MS-Windows or VMware (if this client is a VM).
I'm asking because this status code means:
Message: Operation failed: Unable to acquire snapshot lock
Explanation: The snapshot is being accessed by another operation such as a restore, browse, or copy to storage unit.
Recommended Action: Retry the operation when the snapshot is no longer used by another operation.
Can you run the command vssadmin list shadows to check whether a snapshot is hanging on the client?
Please check this article.
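If it helps to repeat the VSS checks on all three nodes, here is a minimal Python sketch that wraps `vssadmin list writers` and `vssadmin list shadows`. The function names are made up for illustration, and the parsing assumes the usual English-language output layout (`Writer name: '...'`, `State: [1] Stable`, `Last error: No error`, `Shadow Copy ID: {...}`), so adjust it if your output looks different. It must be run from an elevated prompt on the Exchange node.

```python
import re
import subprocess


def run_vssadmin(*args):
    """Run vssadmin with the given arguments and return its text output.

    Only works on Windows, from an elevated prompt.
    """
    result = subprocess.run(["vssadmin", *args],
                            capture_output=True, text=True, check=False)
    return result.stdout


def parse_vss_writers(text):
    """Parse `vssadmin list writers` output into a list of dicts.

    Assumption: each writer block starts with "Writer name: '<name>'"
    and contains "State:" and "Last error:" lines.
    """
    writers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Writer name: '(.*)'", line)
        if match:
            current = {"name": match.group(1), "state": None, "last_error": None}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return writers


def unhealthy_writers(writers):
    """Return writers that are not Stable or that report an error."""
    return [w for w in writers
            if "Stable" not in (w["state"] or "")
            or w["last_error"] != "No error"]


def count_shadow_copies(text):
    """Count the shadow copies listed by `vssadmin list shadows`."""
    return sum(1 for line in text.splitlines()
               if line.strip().startswith("Shadow Copy ID:"))
```

On a node you would call `unhealthy_writers(parse_vss_writers(run_vssadmin("list", "writers")))` and `count_shadow_copies(run_vssadmin("list", "shadows"))`. Any writer that is returned, or a shadow copy that is still listed when no backup is running, is worth a closer look.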
Could we see a bpfis log from a node where the failure occurs? Please set the client's main (verbose) logging level to 3 or more and the general (Windows) logging level to 2. Also, please tell us the versions of the NetBackup client, Windows, and Exchange.
Hi Thiago,
Thanks for your reply. Yes, when I fail over the failed databases to other nodes of the DAG cluster, they are backed up successfully, but some database backups are taken from two nodes. I ran the commands on the 3 nodes of the cluster and found no errors. Is there any way to know which process is using the snapshot?
I didn't run a restore, browse, or copy to storage unit during the backup. I have a lot of databases, around 15,
which is why they run concurrently after working hours.
Thanks for your reply. I'm going to upload it once I'm back at my office.
But will it help to identify which process is using the snapshot? Also, do you need me to upload any other logs?
Let's start with just the bpfis logs on the Exchange servers (NetBackup clients). Again, please set the main logging level (sometimes called the "verbose" logging level) to at least 3 and the Windows logging level (a.k.a. the "general" level) to 2.
The snapshot lock is a file in NetBackup\online_util\fi_cntl. The filename includes the "fis id," which is in backup-id format <dag name>_<timestamp>. This makes the file unique to the particular snapshot job.
Various processes that create, use, or release the snapshot acquire the lock by creating the lock file or putting a read or write lock on it. I haven't studied the details of the logic. The bpfis log may help me or someone else follow what's happening to your job.
Perhaps you will figure it out yourself by studying the log. Look for the interaction of different process IDs. Look for the name of the lock file, then check whether that file exists in the NetBackup\online_util\fi_cntl folder. If you can't find the lock file name, look for the fis id near the beginning of the log of the process that fails, then look for all occurrences of that name in the log and in the fi_cntl folder.
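The search described above can be sketched in Python roughly as follows. This is only an illustration: the helper names are made up, the paths and fis id are whatever you find on your client, and reading the timestamp part of the fis id as Unix epoch seconds is an assumption based on the usual backup-id format.

```python
from datetime import datetime, timezone
from pathlib import Path


def split_fis_id(fis_id):
    """Split a fis id of the form <dag name>_<timestamp>.

    Assumption: the timestamp is Unix epoch seconds, as in a backup id.
    """
    name, _, stamp = fis_id.rpartition("_")
    if not name or not stamp.isdigit():
        raise ValueError(f"not in <name>_<timestamp> form: {fis_id!r}")
    return name, datetime.fromtimestamp(int(stamp), tz=timezone.utc)


def find_lock_files(fi_cntl_dir, fis_id):
    """Return files in the fi_cntl folder whose name contains the fis id."""
    return sorted(p for p in Path(fi_cntl_dir).iterdir()
                  if p.is_file() and fis_id in p.name)


def search_log(log_path, needle):
    """Return (line number, line) for every log line containing needle."""
    hits = []
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if needle in line:
                hits.append((number, line.rstrip("\n")))
    return hits
```

Pass `find_lock_files` the full path of your NetBackup\online_util\fi_cntl folder and the fis id of the failing job, then run `search_log` on the bpfis log with the same fis id. Comparing the process IDs on the matching log lines should show which processes touched that snapshot.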
Please provide versions of NetBackup, Windows, and Exchange.
You have 15 databases. Are they all backed up on the same client? How many concurrent backups execute on each client? Does your use of the "database source" policy option cause two snapshot jobs to execute on a client - one for active databases on the client and the other for passive databases? I guess this means I'd like to see the bpresolver log. 8-)