cancel
Showing results forΒ 
Search instead forΒ 
Did you mean:Β 

Failed SQL Backup due to <DBName> is involved in another operation

fdiaz
Level 3

Is anyone experiencing SQL 2005/2008 DB backup failures with ERR - Error <DATABASENAME> is involved in another operation.  I have been upgrading all the SQL Clients to 7.1.0.2 from 6.5.5/7.0 and then begin experiencing failures during incremental scheduled backups. Later on during the day when the failed backups are reconciled the back successfuly complete. I have tried changing the scheduling but it did not work. Anyone have any ideas on what logs to enable from NetBackup or SQL?

 

 

10/22/2011 5:30:00 AM - Info nbjm(pid=4808) starting backup job (jobid=468647) for client hdcuniprdbsv026.backup.hdc.mdsol.com, policy HDCUNIPRDBSV026-SQL-NA-PROD-SLV-D, schedule Daily_Differentials  
10/22/2011 5:30:00 AM - Info nbjm(pid=4808) requesting MEDIA_SERVER_WITH_ATTRIBUTES resources from RB for backup job (jobid=468647, request id:{23F54E2D-55E1-4E88-8221-85CB7ACE0AF8})  
10/22/2011 5:30:00 AM - requesting resource PROD_DD_STU2
10/22/2011 5:30:00 AM - requesting resource hdcbackup03.NBU_CLIENT.MAXJOBS.hdcuniprdbsv026.backup.hdc.mdsol.com
10/22/2011 5:30:00 AM - requesting resource hdcbackup03.NBU_POLICY.MAXJOBS.HDCUNIPRDBSV026-SQL-NA-PROD-SLV-D
10/22/2011 5:30:00 AM - granted resource hdcbackup03.NBU_CLIENT.MAXJOBS.hdcuniprdbsv026.backup.hdc.mdsol.com
10/22/2011 5:30:00 AM - granted resource hdcbackup03.NBU_POLICY.MAXJOBS.HDCUNIPRDBSV026-SQL-NA-PROD-SLV-D
10/22/2011 5:30:00 AM - granted resource PROD_DD_STU2
10/22/2011 5:30:00 AM - estimated 0 Kbytes needed
10/22/2011 5:30:00 AM - Info nbjm(pid=4808) started backup job for client hdcuniprdbsv026.backup.hdc.mdsol.com, policy HDCUNIPRDBSV026-SQL-NA-PROD-SLV-D, schedule Daily_Differentials on storage unit PROD_DD_STU2
10/22/2011 5:30:01 AM - started process bpbrm (2720)
10/22/2011 5:30:03 AM - Info bpbrm(pid=2720) hdcuniprdbsv026.backup.hdc.mdsol.com is the host to backup data from     
10/22/2011 5:30:03 AM - Info bpbrm(pid=2720) reading file list from client        
10/22/2011 5:30:03 AM - connecting
10/22/2011 5:30:06 AM - Info bpbrm(pid=2720) starting bphdb on client         
10/22/2011 5:30:06 AM - Info bphdb(pid=4028) Backup started           
10/22/2011 5:30:06 AM - connected; connect time: 00:00:03
10/22/2011 5:30:11 AM - Info dbclient(pid=4484) ERR - Error <hdcvcl06042> is involved in another operation.  
10/22/2011 5:30:13 AM - Info dbclient(pid=4484) INF - OPERATION #1 of batch HDCUNIPRDBSV026-SQL-NA-PROD-SLV-ALL-DB-D.bch FAILED with STATUS 3 (0 is normal). Elapsed time = 5(5) seconds.
10/22/2011 5:30:17 AM - Info dbclient(pid=4484) INF - BACKUP STARTED USING        
10/22/2011 5:30:17 AM - Info dbclient(pid=4484) Microsoft SQL Server 2008 R2 (SP1) - 10.50.2500.0 (X64)    
10/22/2011 5:30:17 AM - Info dbclient(pid=4484) Jun 17 2011 00:54:03         
10/22/2011 5:30:17 AM - Info dbclient(pid=4484) Copyright (c) Microsoft Corporation         
10/22/2011 5:30:17 AM - Info dbclient(pid=4484) Enterprise Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1) (Hypervisor) 
10/22/2011 5:30:17 AM - Info dbclient(pid=4484) Batch = HDCUNIPRDBSV026-SQL-NA-PROD-SLV-ALL-DB-D.bch, Op# = 2       
10/22/2011 5:30:17 AM - Info dbclient(pid=4484) INF - Using backup image hdcuniprdbsv026.MSSQL7.HDCUNIPRDBSV026.inc.hdcvcl07018.~.7.001of001.20111022053015..C       
10/22/2011 5:30:17 AM - Info dbclient(pid=4484) INF - backup database "hdcvcl07018" to VIRTUAL_DEVICE='VNBU0-4484-4652-1319261417' with  stats = 10, blocksize = 65536, maxtransfersize = 4194304, buffercount = 2, differential
10/22/2011 5:30:17 AM - Info dbclient(pid=4484) INF - Number of stripes: 1, Number of buffers per stripe 2. 
10/22/2011 5:30:18 AM - Info dbclient(pid=4484) INF - Created VDI object for SQL Server instance <hdcuniprdbsv026>. Connection timeout is <300> seconds.
10/22/2011 5:30:52 AM - Info dbclient(pid=4484) INF - Thread has been closed for stripe #0    
10/22/2011 5:31:04 AM - Info dbclient(pid=4484) INF - OPERATION #2 of batch HDCUNIPRDBSV026-SQL-NA-PROD-SLV-ALL-DB-D.bch SUCCEEDED with STATUS 0 (0 is normal). Elapsed time = 49(0) seconds
1 ACCEPTED SOLUTION

Accepted Solutions

Mark_Solutions
Level 6
Partner Accredited Certified

I dont believe it has changed that much so i would say it is a coincidence

If theyt do transaction logs all day then as far as your NetBackup jobs go you may as well just do your nightly full backups

Hope this has resolved things for you

View solution in original post

9 REPLIES 9

Marianne
Level 6
Partner    VIP    Accredited Certified

NBU: dbclient (under ...\veritas\netbackup\logs) and progress logs in ...\veritas\netbackup\logs\user_ops\mssql\logs.

SQL: SQL error logs (ERRORLOG usually located in <install_path>\mssql\log), the VDI log  (vdi.log usually in  <install_path>\mssql\log), and the Application Event Viewer log.

Indjev
Level 4

I see this when TX and DB backup runs in the same time.

I got this error when I had an old vnetd (maybe hung) process.

 

Restart  Netbackup services and related processes and try again.

If you have SQL admin, ask them about some internal backups that runs in the same timeframe

Mark_Solutions
Level 6
Partner Accredited Certified

Check that your SQL admins are not running SQL Maintenace schedules and doing their own backups which can conflict.

Also, if there been any problems with backups, check that there are no any orphaned dbbacked.exe processes running when no backups are running.

fdiaz
Level 3

I need to check with the DBA's to make sure that they are not running those maintenance schedules. If they were orphaned dbbacked.exe processes my ad-hoc backup would also fail. So it leads me to believe that the DBA's are doing something. Since have changed scheduling for some policies and they no longer fail.

 

I will also look at the vdi and error logs like suggested.

Marianne
Level 6
Partner    VIP    Accredited Certified

The SQL logs and Application log will report "other operations".

You can then go to SQL dba with the proof and negotiate schedules that will suite all parties.

Mark_Solutions
Level 6
Partner Accredited Certified

If your SQL Admins are running their own schedules then bear in mind that if you run transaction log backups during the day that for them to work the NetBackup Full backup must be the last one to have run.

If the Admins run a Full backup after NetBackup then your transaction log backups will fail as they do not have a valid Full backup to work from.

If you find VSS errors in the Windows event logs and they refer to VDI issues then check for VSS Rollup Packages from Microsoft as these exist for most systems and are not part of automatic Windows updates but do help on many occasions

fdiaz
Level 3

As everyone predicted I found a TLOG backup in the ERRORLOG almost at the same time as the differential backup was executed.

 

2011-10-22 05:30:01.11 spid98      Setting database option RECOVERY to FULL for database hdcvcl06042 .

 

 

2011-10-22 05:30:23.00 Backup      Log was backed up. Database: hdcvcl06042, creation date(time): 2011/10/08(01:14:13), first LSN: 7450143:1704:1, last LSN: 7450186:28:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'\\PATH\TRN_LOGSShared\HDCUNIPRDBSV026\TRN\hdcvcl06042_TRN_2011-10-22 05.30.00.497.bak'}). This is an informational message only. No user action is required.

 

 

What I do not understand is why only after upgrading to 7.1 from 6.5.5 has this been occuring. The DBA's have always perform the TLOG backups every 15 mins. Has the mechanism on how backup are queued changed from 655/71 or is it a coincidence?

Mark_Solutions
Level 6
Partner Accredited Certified

I dont believe it has changed that much so i would say it is a coincidence

If theyt do transaction logs all day then as far as your NetBackup jobs go you may as well just do your nightly full backups

Hope this has resolved things for you

fdiaz
Level 3

Talked to my DBA today and I mentioned the situation to him...He still does not understand why is it that a NetBackup SQL backup job does not wait for the TLOG backup to complete before its deamed a failure. He did tell me that he can stop/suspend TLOGs from running but only at a scheduled time or suspend if a backups is already running. Below is the log that I'm using to explain the situation...

1st - The DBA's have TLOG backups every 15 minutes

2nd - I perform fulls once a week and differentials daily except when a full runs. The reason is because we are constantly performing restore for RND purposes so it will not make sense to restore a full plus hundreds of TLOGs.

3rd - I still have quite a few SQL Clients that are running pre-7.0 Clients (which none of those backups fail)

 

What I'm noticing is that the TLOG backup runs at 05.30.00 and completes 23Sec later at 05:30:23. But  at 5:30:08 a NBU backup of the same DB is executed per policy and 3Sec later is fails. Is there a way to  increase the wait period before the backup is declared busy? For test purposes I'm going to uninstall the 7102 client and install a 655 client to see if it fails without making any changes to the policy of schedule.
 
NETBACKUP LOG
 
 
10/22/2011 5:30:11 AM - Info dbclient(pid=4484) ERR - Error <hdcvcl06042> is involved in another operation.  
 
10/22/2011 5:30:13 AM - Info dbclient(pid=4484) INF - OPERATION #1 of batch HDCUNIPRDBSV026-SQL-NA-PROD-SLV-ALL-DB-D.bch FAILED with STATUS 3 (0 is normal). Elapsed time = 5(5) seconds.
 
 
SQL ERROLOG
2011-10-22 05:30:01.11 spid98      Setting database option RECOVERY to FULL for database hdcvcl06042 .
 
2011-10-22 05:30:23.00 Backup      Log was backed up. Database: hdcvcl06042, creation date(time): 2011/10/08(01:14:13), first LSN: 7450143:1704:1, last LSN: 7450186:28:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'\\SHARE_DIRECTORY\TRN_LOGSShared\HDCUNIPRDBSV026\TRN\hdcvcl06042_TRN_2011-10-22 05.30.00.497.bak'}). This is an informational message only. No user action is required.