Solved: Talked to my DBA today and I

fdiaz · ‎10-22-2011

Is anyone experiencing SQL 2005/2008 DB backup failures with ERR - Error <DATABASENAME> is involved in another operation. I have been upgrading all the SQL Clients to 7.1.0.2 from 6.5.5/7.0 and then begin experiencing failures during incremental scheduled backups. Later on during the day when the failed backups are reconciled the back successfuly complete. I have tried changing the scheduling but it did not work. Anyone have any ideas on what logs to enable from NetBackup or SQL?

10/22/2011 5:30:00 AM - Info nbjm(pid=4808) starting backup job (jobid=468647) for client hdcuniprdbsv026.backup.hdc.mdsol.com, policy HDCUNIPRDBSV026-SQL-NA-PROD-SLV-D, schedule Daily_Differentials

10/22/2011 5:30:00 AM - Info nbjm(pid=4808) requesting MEDIA_SERVER_WITH_ATTRIBUTES resources from RB for backup job (jobid=468647, request id:{23F54E2D-55E1-4E88-8221-85CB7ACE0AF8})

10/22/2011 5:30:00 AM - requesting resource PROD_DD_STU2

10/22/2011 5:30:00 AM - requesting resource hdcbackup03.NBU_CLIENT.MAXJOBS.hdcuniprdbsv026.backup.hdc.mdsol.com

10/22/2011 5:30:00 AM - requesting resource hdcbackup03.NBU_POLICY.MAXJOBS.HDCUNIPRDBSV026-SQL-NA-PROD-SLV-D

10/22/2011 5:30:00 AM - granted resource hdcbackup03.NBU_CLIENT.MAXJOBS.hdcuniprdbsv026.backup.hdc.mdsol.com

10/22/2011 5:30:00 AM - granted resource hdcbackup03.NBU_POLICY.MAXJOBS.HDCUNIPRDBSV026-SQL-NA-PROD-SLV-D

10/22/2011 5:30:00 AM - granted resource PROD_DD_STU2

10/22/2011 5:30:00 AM - estimated 0 Kbytes needed

10/22/2011 5:30:00 AM - Info nbjm(pid=4808) started backup job for client hdcuniprdbsv026.backup.hdc.mdsol.com, policy HDCUNIPRDBSV026-SQL-NA-PROD-SLV-D, schedule Daily_Differentials on storage unit PROD_DD_STU2

10/22/2011 5:30:01 AM - started process bpbrm (2720)

10/22/2011 5:30:03 AM - Info bpbrm(pid=2720) hdcuniprdbsv026.backup.hdc.mdsol.com is the host to backup data from

10/22/2011 5:30:03 AM - Info bpbrm(pid=2720) reading file list from client

10/22/2011 5:30:03 AM - connecting

10/22/2011 5:30:06 AM - Info bpbrm(pid=2720) starting bphdb on client

10/22/2011 5:30:06 AM - Info bphdb(pid=4028) Backup started

10/22/2011 5:30:06 AM - connected; connect time: 00:00:03

10/22/2011 5:30:11 AM - Info dbclient(pid=4484) ERR - Error <hdcvcl06042> is involved in another operation.

10/22/2011 5:30:13 AM - Info dbclient(pid=4484) INF - OPERATION #1 of batch HDCUNIPRDBSV026-SQL-NA-PROD-SLV-ALL-DB-D.bch FAILED with STATUS 3 (0 is normal). Elapsed time = 5(5) seconds.

10/22/2011 5:30:17 AM - Info dbclient(pid=4484) INF - BACKUP STARTED USING

10/22/2011 5:30:17 AM - Info dbclient(pid=4484) Microsoft SQL Server 2008 R2 (SP1) - 10.50.2500.0 (X64)

10/22/2011 5:30:17 AM - Info dbclient(pid=4484) Jun 17 2011 00:54:03

10/22/2011 5:30:17 AM - Info dbclient(pid=4484) Copyright (c) Microsoft Corporation

10/22/2011 5:30:17 AM - Info dbclient(pid=4484) Enterprise Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1) (Hypervisor)

10/22/2011 5:30:17 AM - Info dbclient(pid=4484) Batch = HDCUNIPRDBSV026-SQL-NA-PROD-SLV-ALL-DB-D.bch, Op# = 2

10/22/2011 5:30:17 AM - Info dbclient(pid=4484) INF - Using backup image hdcuniprdbsv026.MSSQL7.HDCUNIPRDBSV026.inc.hdcvcl07018.~.7.001of001.20111022053015..C

10/22/2011 5:30:17 AM - Info dbclient(pid=4484) INF - backup database "hdcvcl07018" to VIRTUAL_DEVICE='VNBU0-4484-4652-1319261417' with stats = 10, blocksize = 65536, maxtransfersize = 4194304, buffercount = 2, differential

10/22/2011 5:30:17 AM - Info dbclient(pid=4484) INF - Number of stripes: 1, Number of buffers per stripe 2.

10/22/2011 5:30:18 AM - Info dbclient(pid=4484) INF - Created VDI object for SQL Server instance <hdcuniprdbsv026>. Connection timeout is <300> seconds.

10/22/2011 5:30:52 AM - Info dbclient(pid=4484) INF - Thread has been closed for stripe #0

10/22/2011 5:31:04 AM - Info dbclient(pid=4484) INF - OPERATION #2 of batch HDCUNIPRDBSV026-SQL-NA-PROD-SLV-ALL-DB-D.bch SUCCEEDED with STATUS 0 (0 is normal). Elapsed time = 49(0) seconds

Mark_Solutions · ‎10-25-2011

I dont believe it has changed that much so i would say it is a coincidence

If theyt do transaction logs all day then as far as your NetBackup jobs go you may as well just do your nightly full backups

Hope this has resolved things for you

View solution in original post

Marianne · ‎10-23-2011

NBU: dbclient (under ...\veritas\netbackup\logs) and progress logs in ...\veritas\netbackup\logs\user_ops\mssql\logs.

SQL: SQL error logs (ERRORLOG usually located in <install_path>\mssql\log), the VDI log (vdi.log usually in <install_path>\mssql\log), and the Application Event Viewer log.

Handy NetBackup Links

Indjev · ‎10-23-2011

I see this when TX and DB backup runs in the same time.

I got this error when I had an old vnetd (maybe hung) process.

Restart Netbackup services and related processes and try again.

If you have SQL admin, ask them about some internal backups that runs in the same timeframe

Mark_Solutions · ‎10-23-2011

Check that your SQL admins are not running SQL Maintenace schedules and doing their own backups which can conflict.

Also, if there been any problems with backups, check that there are no any orphaned dbbacked.exe processes running when no backups are running.

fdiaz · ‎10-23-2011

I need to check with the DBA's to make sure that they are not running those maintenance schedules. If they were orphaned dbbacked.exe processes my ad-hoc backup would also fail. So it leads me to believe that the DBA's are doing something. Since have changed scheduling for some policies and they no longer fail.

I will also look at the vdi and error logs like suggested.

Marianne · ‎10-23-2011

The SQL logs and Application log will report "other operations".

You can then go to SQL dba with the proof and negotiate schedules that will suite all parties.

Handy NetBackup Links

Mark_Solutions · ‎10-24-2011

If your SQL Admins are running their own schedules then bear in mind that if you run transaction log backups during the day that for them to work the NetBackup Full backup must be the last one to have run.

If the Admins run a Full backup after NetBackup then your transaction log backups will fail as they do not have a valid Full backup to work from.

If you find VSS errors in the Windows event logs and they refer to VDI issues then check for VSS Rollup Packages from Microsoft as these exist for most systems and are not part of automatic Windows updates but do help on many occasions

fdiaz · ‎10-24-2011

As everyone predicted I found a TLOG backup in the ERRORLOG almost at the same time as the differential backup was executed.

2011-10-22 05:30:01.11 spid98 Setting database option RECOVERY to FULL for database hdcvcl06042 .

2011-10-22 05:30:23.00 Backup Log was backed up. Database: hdcvcl06042, creation date(time): 2011/10/08(01:14:13), first LSN: 7450143:1704:1, last LSN: 7450186:28:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'\\PATH\TRN_LOGSShared\HDCUNIPRDBSV026\TRN\hdcvcl06042_TRN_2011-10-22 05.30.00.497.bak'}). This is an informational message only. No user action is required.

What I do not understand is why only after upgrading to 7.1 from 6.5.5 has this been occuring. The DBA's have always perform the TLOG backups every 15 mins. Has the mechanism on how backup are queued changed from 655/71 or is it a coincidence?

Mark_Solutions · ‎10-25-2011

I dont believe it has changed that much so i would say it is a coincidence

If theyt do transaction logs all day then as far as your NetBackup jobs go you may as well just do your nightly full backups

Hope this has resolved things for you

fdiaz · ‎10-27-2011

Talked to my DBA today and I mentioned the situation to him...He still does not understand why is it that a NetBackup SQL backup job does not wait for the TLOG backup to complete before its deamed a failure. He did tell me that he can stop/suspend TLOGs from running but only at a scheduled time or suspend if a backups is already running. Below is the log that I'm using to explain the situation...

1st - The DBA's have TLOG backups every 15 minutes

2nd - I perform fulls once a week and differentials daily except when a full runs. The reason is because we are constantly performing restore for RND purposes so it will not make sense to restore a full plus hundreds of TLOGs.

3rd - I still have quite a few SQL Clients that are running pre-7.0 Clients (which none of those backups fail)

What I'm noticing is that the TLOG backup runs at 05.30.00 and completes 23Sec later at 05:30:23. But at 5:30:08 a NBU backup of the same DB is executed per policy and 3Sec later is fails. Is there a way to increase the wait period before the backup is declared busy? For test purposes I'm going to uninstall the 7102 client and install a 655 client to see if it fails without making any changes to the policy of schedule.

NETBACKUP LOG

10/22/2011 5:30:11 AM - Info dbclient(pid=4484) ERR - Error <hdcvcl06042> is involved in another operation.

10/22/2011 5:30:13 AM - Info dbclient(pid=4484) INF - OPERATION #1 of batch HDCUNIPRDBSV026-SQL-NA-PROD-SLV-ALL-DB-D.bch FAILED with STATUS 3 (0 is normal). Elapsed time = 5(5) seconds.

SQL ERROLOG

2011-10-22 05:30:01.11 spid98 Setting database option RECOVERY to FULL for database hdcvcl06042 .

2011-10-22 05:30:23.00 Backup Log was backed up. Database: hdcvcl06042, creation date(time): 2011/10/08(01:14:13), first LSN: 7450143:1704:1, last LSN: 7450186:28:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'\\SHARE_DIRECTORY\TRN_LOGSShared\HDCUNIPRDBSV026\TRN\hdcvcl06042_TRN_2011-10-22 05.30.00.497.bak'}). This is an informational message only. No user action is required.

VOX

Failed SQL Backup due to <DBName> is involved in another operation