Database named "RollbackSnapshotTempDB" left after System Recovery 2013 backup

ammacdo
Level 3

We have a Windows Server 2008 R2 server with SQL Server 2008 R2 x64 installed. We have been using Backup Exec 2010 to perform full and transaction log backups of the databases on our SQL Server, and System Recovery 2010 to back up the entire server. A couple of weeks ago we upgraded System Recovery 2010 to 2013 and noticed that our Backup Exec 2010 jobs started failing. After investigating, we discovered that several databases named "RollbackSnapshotTempDB" were piling up. We tried to right-click one of these databases and open its properties, but received an error message.

 

The SQL logs showed a lot of errors related to those "RollbackSnapshotTempDB" databases, such as:

 

Date  7/2/2013 6:07:37 AM
Log  SQL Server (Current - 7/16/2013 8:38:00 AM)

Source  spid400

Message
The operating system returned error 21(The device is not ready.) to SQL Server during a read at offset 0x00000053fc2000 in file '\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy11\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\XXXXXX.MDF'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

And:

Date  7/2/2013 6:07:37 AM
Log  SQL Server (Current - 7/16/2013 8:38:00 AM)

Source  spid400

Message
Error: 823, Severity: 24, State: 2.

If I go into SQL Server Management Studio and try to delete the "RollbackSnapshotTempDB" database, I receive an error that says "Cannot drop the database because it is being used for replication". However, if I take the "RollbackSnapshotTempDB" database offline, I can delete it, and the next Backup Exec 2010 SQL backup job completes successfully.
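The offline-then-delete workaround can also be scripted. The following is only a minimal sketch, not something the poster ran: it generates T-SQL text for databases whose names match "RollbackSnapshotTempDB" followed by a braced GUID (the naming that appears in the SQL log excerpts in this thread). The function names and the WITH ROLLBACK IMMEDIATE option are assumptions added for illustration; review the generated statements before executing them.

```python
import re

# Assumed naming pattern: "RollbackSnapshotTempDB" followed by a braced GUID,
# as seen in the SQL log excerpts in this thread.
LEFTOVER_NAME = re.compile(r"^RollbackSnapshotTempDB\{[0-9A-Fa-f-]{36}\}$")


def quote_name(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def cleanup_statements(database_names):
    """Return T-SQL that takes each leftover snapshot database offline, then drops it.

    Names that do not match the leftover pattern are skipped, so a list of all
    databases on the instance can be passed in safely.
    """
    statements = []
    for name in database_names:
        if not LEFTOVER_NAME.match(name):
            continue  # never touch a database that is not a leftover snapshot DB
        quoted = quote_name(name)
        # ROLLBACK IMMEDIATE is an assumption; the poster only used "Take Offline" in SSMS.
        statements.append(f"ALTER DATABASE {quoted} SET OFFLINE WITH ROLLBACK IMMEDIATE;")
        statements.append(f"DROP DATABASE {quoted};")
    return statements


if __name__ == "__main__":
    example = ["master", "RollbackSnapshotTempDB{C4CF49CB-870B-47B5-BAE2-835956D9BB98}"]
    for statement in cleanup_statements(example):
        print(statement)
```

The script only prints statements; it does not connect to SQL Server, so nothing is dropped until someone pastes the output into a query window.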

These errors seem to indicate a file system problem; however, CHKDSK shows no problems. We have also rebooted the server, with no change.

I have googled this issue until I'm blue in the face and have not found any solutions. Does anyone have any ideas?

 

Here are more details on the server:

Virtual Server running on ESXi 5.1 Update 1

Physical host is HP Proliant DL380 G6

Storage provided by HP LeftHand iSCSI SAN

Server 2008 R2 with all Windows Updates applied

SQL Server 2008 R2 with all Windows Updates applied

Server has two vCPU's and 12GB of RAM

C: drive has 21GB free

D: drive has 88GB free (D: is where all the SQL data files are stored)

40 REPLIES

Andreas_Horlach
Level 6
Employee Accredited

Is this error consistent with SSR backups? If you reset the SQL writer and start another backup, will it fail again?

Andreas_Horlach
Level 6
Employee Accredited

One more thing: can you check the application logs for source SQLWRITER and source SQLVDI and tell me if you are getting any errors?

SER-VIS_s_r_l_
Level 3

Hi!

For us, the workaround of disabling SSR 2013 did the trick. We have SQL 2008 SP3 instead of SQL 2008 R2, so there is also some difference in the database software version.

Just for information: do you have any databases in your environment set to "Compatibility Level: SQL Server 2005 (90)"? I've noticed that the database that has the problem is the first one, in alphabetical order, in that instance with that compatibility level...
 Another instance running on the same server didn't show the problem with SSR 2013, and in that instance there are no databases with the compatibility level set to "SQL Server 2005".

ammacdo
Level 3

Sorry for the delay in getting back to this. The problem does consistently occur with SSR backups, and it also occurs with Backup Exec 2010 backups using AOFO.

I've also discovered something else. We have two databases set up with replication publications to sync them with a web host we use. If we remove all the replication settings and run a backup using SSR, the RollbackSnapshotTempDB database does not get left behind. Once we re-create the replication publication, the problem starts up again.

So that seems to be the issue, although I'm at a loss as to what to do about it. My workaround for the past week has been to use some batch files to stop the SQL VSS Writer service before SSR creates a snapshot and then start it again once the backup completes.
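For anyone who wants to reproduce the stop/start workaround, here is a rough sketch of the idea in Python rather than batch files. It is not the poster's actual script. It assumes the SQL Server VSS Writer service is registered under the name SQLWriter, and writer_stopped and run_backup are hypothetical names. Keep in mind that while the writer is stopped, the snapshot is taken without SQL Server taking part in the VSS freeze, so a separate SQL-aware backup (Backup Exec in the poster's case) is still needed.

```python
import subprocess
from contextlib import contextmanager

# Assumption: the "SQL Server VSS Writer" service is registered as "SQLWriter".
SQL_WRITER_SERVICE = "SQLWriter"


def service_command(action: str, service: str = SQL_WRITER_SERVICE):
    """Build a 'net stop' or 'net start' command for the given Windows service."""
    if action not in ("stop", "start"):
        raise ValueError(f"unsupported action: {action}")
    return ["net", action, service]


@contextmanager
def writer_stopped(run=subprocess.check_call):
    """Stop the SQL VSS Writer for the duration of the block, then start it again.

    The writer is restarted even if the backup step raises an exception.
    """
    run(service_command("stop"))
    try:
        yield
    finally:
        run(service_command("start"))


# Usage (run_backup is a hypothetical function that triggers the SSR job):
#
#     with writer_stopped():
#         run_backup()
```

The run parameter exists only so the command sequence can be checked without touching real services; on the server itself the default subprocess.check_call is used and the script has to run elevated.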

The only errors/warnings that show up in the event log are in my last post. There aren't any from the SQL VDI source.

Andreas_Horlach
Level 6
Employee Accredited

There is one test that you can run: Open Windows Server Manager and add the feature Windows Server Backup. Open Windows Backup and run a Backup Once... to create a one-time, full system-state backup to some desired location and see if there are any errors triggered on-screen or in the Windows application event logs. 

[Attached image: backup.jpg]

Both AOFO and SSR 2013 are simply requesting the VSS framework, which is what Windows Backup does, so this test should get us a step closer to what's going on.

ammacdo
Level 3

It's funny you posted that because an hour ago I actually started a backup using the Windows backup utility. It completed successfully and I was coming back to report my findings. There was no RollbackSnapshotTempDB left over and no errors in either the SQL or Windows logs. So even though SSR is just using the Microsoft VSS Provider, there must be something that is different.

I can see in the SQL logs where the snapshots were created, very similar to how the logs look when SSR is running a backup, but there was a difference. When both the SSR and Windows backups are running you will see each of these log entries in the SQL log, one for every database on the server:

 

Date  7/26/2013 12:53:22 PM
Log  SQL Server (Current - 7/26/2013 12:53:00 PM)

Source  spid105

Message
I/O is frozen on database model. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.

 

Date  7/26/2013 12:53:23 PM
Log  SQL Server (Current - 7/26/2013 12:53:00 PM)

 

Source  spid105

Message
I/O was resumed on database model. No user action is required.

 

 

Date  7/26/2013 12:53:25 PM
Log  SQL Server (Current - 7/26/2013 12:53:00 PM)

Source  Backup

Message
Database backed up. Database: model, creation date(time): 2003/04/08(09:13:36), pages dumped: 169, first LSN: 75:653:34, last LSN: 75:668:1, number of dump devices: 1, device information: (FILE=1, TYPE=VIRTUAL_DEVICE: {'{7289D6DB-5BCC-4154-8A68-918138C6D375}1'}). This is an informational message only. No user action is required.

 

With the Windows backup each database was reported as backed up by the last message and that was it, no more backup related messages in the SQL log. However when the SSR backup runs there are additional messages in the SQL log that look like this:

 

Date  7/26/2013 12:40:25 PM
Log  SQL Server (Current - 7/26/2013 12:53:00 PM)

Source  spid183

Message
Starting up database 'RollbackSnapshotTempDB{C4CF49CB-870B-47B5-BAE2-835956D9BB98}'.

 

Date  7/26/2013 12:40:26 PM
Log  SQL Server (Current - 7/26/2013 12:53:00 PM)

Source  spid183

Message
CHECKDB for database 'RollbackSnapshotTempDB{C4CF49CB-870B-47B5-BAE2-835956D9BB98}' finished without errors on 2013-07-26 09:53:44.937 (local time). This is an informational message only; no user action is required.

 

Date  7/26/2013 12:40:26 PM
Log  SQL Server (Current - 7/26/2013 12:53:00 PM)

Source  spid183

Message
Setting database option RECOVERY to SIMPLE for database RollbackSnapshotTempDB{C4CF49CB-870B-47B5-BAE2-835956D9BB98}.

 

Date  7/26/2013 12:40:26 PM
Log  SQL Server (Current - 7/26/2013 12:53:00 PM)

Source  spid183

Message
Setting database option SINGLE_USER to ON for database RollbackSnapshotTempDB{C4CF49CB-870B-47B5-BAE2-835956D9BB98}.

 

These 4 messages repeat a dozen or so times and then stop.  The name of the database that is shown in these messages matches the RollbackSnapshotTempDB database that is left over after SSR backup completes. After the above messages stop these errors start showing up:

 

Date  7/26/2013 12:48:09 PM
Log  SQL Server (Current - 7/26/2013 12:53:00 PM)

Source  spid5s

Message
Error: 17053, Severity: 16, State: 1.

 

Date  7/26/2013 12:48:09 PM
Log  SQL Server (Current - 7/26/2013 12:53:00 PM)

Source  spid5s

Message
LogWriter: Operating system error 21(The device is not ready.) encountered.

 

Date  7/26/2013 12:48:09 PM
Log  SQL Server (Current - 7/26/2013 12:53:00 PM)

Source  spid5s

Message
Write error during log flush.

 

Date  7/26/2013 12:48:09 PM
Log  SQL Server (Current - 7/26/2013 12:53:00 PM)

Source  spid92

Message
fcb::close-flush: Operating system error (null) encountered.

 

 

So all of the above messages are what show up in the SQL log. What follows are similar messages that show up in the Windows Application log. First these messages appear near the beginning of the SSR backup job:

 

Log Name:      Application
Source:        SQLWRITER
Date:          7/26/2013 12:40:35 PM
Event ID:      24583
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      servername
Description:
Sqllib error: OLEDB Error encountered calling ICommandText::Execute. hr = 0x80040e14. SQLSTATE: 42000, Native Error: 3724
Error state: 1, Severity: 16
Source: Microsoft SQL Server Native Client 10.0
Error message: Cannot drop the database 'RollbackSnapshotTempDB{C4CF49CB-870B-47B5-BAE2-835956D9BB98}' because it is being used for replication.

 

Log Name:      Application
Source:        VSS
Date:          7/26/2013 12:40:35 PM
Event ID:      8229
Task Category: None
Level:         Warning
Keywords:      Classic
User:          N/A
Computer:      servername
Description:
A VSS writer has rejected an event with error 0x800423f4, The writer experienced a non-transient error.  If the backup process is retried,
the error is likely to reoccur.
. Changes that the writer made to the writer components while handling the event will not be available to the requester.

 

Finally, these messages show up shortly after the SSR job finishes and seem to correlate exactly with similar messages in the SQL log:

 

Log Name:      Application
Source:        MSSQLSERVER
Date:          7/26/2013 12:48:09 PM
Event ID:      5084
Task Category: Server
Level:         Information
Keywords:      Classic
User:          username
Computer:      servername
Description:
Setting database option OFFLINE to ON for database RollbackSnapshotTempDB{C4CF49CB-870B-47B5-BAE2-835956D9BB98}.

 

Log Name:      Application
Source:        MSSQLSERVER
Date:          7/26/2013 12:48:09 PM
Event ID:      17053
Task Category: Server
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      servername
Description:
LogWriter: Operating system error 21(The device is not ready.) encountered.

 

Log Name:      Application
Source:        MSSQLSERVER
Date:          7/26/2013 12:48:09 PM
Event ID:      17053
Task Category: Server
Level:         Error
Keywords:      Classic
User:          username
Computer:      servername
Description:
fcb::close-flush: Operating system error (null) encountered.
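Matching the leftover database name across the SQL log and the Windows Application log by eye gets tedious, so here is a small sketch of how it could be automated. It assumes the log entries have been exported to plain text, one entry per line, and that the database names always follow the "RollbackSnapshotTempDB{GUID}" form shown above. The function name is made up for illustration.

```python
import re
from collections import Counter

# Assumed naming pattern, taken from the log excerpts in this post.
SNAPSHOT_DB = re.compile(r"RollbackSnapshotTempDB\{[0-9A-Fa-f-]{36}\}")


def snapshot_db_mentions(log_lines):
    """Count how often each RollbackSnapshotTempDB{GUID} name appears in the given lines."""
    counts = Counter()
    for line in log_lines:
        counts.update(SNAPSHOT_DB.findall(line))
    return counts


if __name__ == "__main__":
    sample = [
        "Starting up database 'RollbackSnapshotTempDB{C4CF49CB-870B-47B5-BAE2-835956D9BB98}'.",
        "Error message: Cannot drop the database "
        "'RollbackSnapshotTempDB{C4CF49CB-870B-47B5-BAE2-835956D9BB98}' "
        "because it is being used for replication.",
        "I/O was resumed on database model. No user action is required.",
    ]
    for name, count in snapshot_db_mentions(sample).items():
        print(name, count)
```

Running the same function over exports of both logs and comparing the names it returns shows whether the database named in the SQLWRITER error is the one left behind after the backup.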


 

dwa
Level 2

Hello,

Just wanted to chime in that I'm seeing this same issue with SQL Server 2012. Symantec System Recovery 2013 does appear to be the culprit, as the creation timestamps on these databases match the times this product runs.

Our system admin says he has just installed an update to SSR but...

SQL Replication was enabled on this server also and there was one publication being subscribed to. As part of the cleanup however, replication was dropped, so I'm not sure that I've got a good test bed any longer - if this is somehow replication related.

My error messages look similar to those above.

I have contacted Microsoft support but for now only through the MS Partner program forums.

If I receive any information there that is useful I will pass it along.

criley
Moderator
Employee Accredited

dwa,

Thanks for your input here. I'm just checking to see if you have received any feedback via the Microsoft forums?

dwa
Level 2

Hi Chris,

After a deeper look we were able to prove that the RollbackSnapshotTempDB databases were created at the times System Recovery 2013 was running. This pretty well proved the source of the temp DBs. Since I did not want to re-create this mess, however, I did not go back and re-enable SQL replication on our server to see if the issue came back (it was no longer necessary).

The response from the MS Partner forum, after they looked over my SQL error log, was that they agreed with me that this tool was the cause, but they only recommended checking whether the property that creates the temp DBs in SSR could be "turned off", which is not much help.

I did not escalate this to a full MS Support incident which is probably what needs to be done to get this resolved. Sorry, but I think our system admin just decided not to use SSR 2013 since they had other tools available also.

SER-VIS_s_r_l_
Level 3

I've just opened the case with Symantec Support.

Waiting for some news...

criley
Moderator
Employee Accredited

FYI - the following steps resolved the issue for a customer that I was working with:

1. Stop SSR service
2. Stop SQL Server VSS Writer service
3. Change the Logon account for the SQL Server VSS Writer service to use local admin account
4. Start SQL Server VSS Writer service
5. Start SSR service
6. Run SSR backup
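Steps 1 to 5 above can be expressed as a command sequence. The sketch below only builds the commands and is not an official Symantec procedure. It assumes the writer service is registered as SQLWriter and uses the standard "net stop", "net start" and "sc config ... obj= ... password= ..." commands. The SSR service name is left as a parameter because it is not given in this thread, and the account and password in the example are placeholders.

```python
def writer_logon_change_commands(ssr_service, account, password,
                                 writer_service="SQLWriter"):
    """Return the commands for steps 1-5, in order, as argument lists.

    ssr_service    -- name of the System Recovery service (not stated in this thread)
    account        -- logon account for the writer, e.g. r".\\LocalAdmin" (placeholder)
    password       -- password for that account (placeholder)
    writer_service -- assumed service name of the SQL Server VSS Writer
    """
    return [
        ["net", "stop", ssr_service],                       # 1. stop SSR service
        ["net", "stop", writer_service],                    # 2. stop SQL Server VSS Writer
        ["sc", "config", writer_service,                    # 3. change the logon account
         "obj=", account, "password=", password],
        ["net", "start", writer_service],                   # 4. start SQL Server VSS Writer
        ["net", "start", ssr_service],                      # 5. start SSR service
    ]


if __name__ == "__main__":
    # All three values below are placeholders for illustration only.
    for command in writer_logon_change_commands("<SSR service name>",
                                                r".\LocalAdmin", "<password>"):
        print(" ".join(command))
```

Step 6, running the SSR backup, is left out on purpose; start it from the SSR console once the services are back up. Each command could be passed to subprocess.check_call from an elevated prompt instead of being printed.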

SER-VIS_s_r_l_
Level 3

Today my case with Symantec Support about this issue was escalated to advanced support.

I have now changed the logon account from "Local System account" to a domain user in the local Administrators group. I'll wait until tomorrow to post an update...

 

Andreas_Horlach
Level 6
Employee Accredited

I sent you a private message. Please let me know your case # through that message so I can track it. 

SER-VIS_s_r_l_
Level 3

Well... The workaround provided by Chris (configuring the "SQL Server VSS Writer" service to log on with an admin account, which can also be a domain account) seems to work: there was no "RollbackSnapshotTempDB" left after last night's backup job.

I've restored the old backup plans (full-system backup with SSR and db backup with BackupExec) and will test it for a while.

I'm also waiting to be contacted by advanced support.

criley
Moderator
Employee Accredited

Thanks for the update.

I have published the following article which documents this:

http://www.symantec.com/docs/TECH210057

dwa
Level 2

Chris,

Our system admin applied your fix yesterday and I re-set up SQL Replication. After the SSR 2013 run this morning the problem has returned.

Are there any restarts of anything required in your procedure?

What version/service packs/etc. of SSR 2013 should we have?

Note that we are using SQL Server 2012.

Thanks.

criley
Moderator
Employee Accredited

You should have SSR 2013 SP1 (help/about should show 11.0.1 for version details).

The server does not need to be restarted. Sorry, I don't know why that solution is not working for you.

SER-VIS_s_r_l_
Level 3

 Symantec second-level support never contacted me (I was expecting to hear from them last week), but the "System Recovery" jobs no longer leave the temporary database behind, so your fix seems to have worked.

Thank you!

SER-VIS_s_r_l_
Level 3

 The support case is listed as "Severity 2" and still "Under Investigation", but Symantec second-level support has never contacted me.
 Anyway, the issue has not happened again since I assigned an administrative account to the "SQL Server VSS Writer" service logon, as Chris suggested.

 

[EDIT:]

Just spoke with Symantec second-level support. I complained too soon!

Greetings.

;)
 

saikumar_beera
Not applicable

The solution does not work.

 

We have the same issue, but we already have the SQL VSS Writer running under a local admin account. We changed it to use a domain service account, but the issue is still the same.

 

Also, I cannot delete the old RollbackSnapshotTempDBXXXX databases from SQL Server Management Studio.

 

Can you suggest any solutions for this?

 

Thanks,

Sai