Database Management job brings disk folders offline
I have just upgraded Backup Exec from 2010 to 2012 and have been finding that my backup-to-disk folders go offline for no apparent reason. On investigation, I see event log entries showing the Virtual Disk service restarting at 4AM, which coincides with the time the disks go offline.
It looks like the Backup Exec Database Maintenance routine runs at 4AM - I am guessing that this is somehow causing the problem. I cannot see how to modify the Maintenance routine, or why it would stop the Virtual Disk service.
Has anyone come across this before and if so, is there a resolution to it?
Thanks
Richard
It turns out it is not the database management job after all. Running the database job at any other time does not cause the drives to go offline, but the drives still go offline at 4AM. It took me about 5 hours to figure this one out. First, some info about my setup:
2 identical Dell 2950 servers, Server 2008R2 (64-bit), Backup Exec 2012 SP1a, 2 identical Dell MD300 arrays, directly attached. The drives go offline on one server, but not the other.
The drives go offline 5 seconds after the database job ends at 4AM (04:00:11). It is usually the same drive that fails.
The log files in C:\Program Files\Symantec\Backup Exec\Logs that are created when the management job runs or the services restart have a time stamp that matches the failure. The only OFFLINE error message is in the InitVDS log file, so it is the Virtual Disk Service Initialization job that causes the failure.
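For anyone who wants to repeat that search without opening each log by hand, here is a minimal Python sketch. It assumes the logs are plain text (UTF-8/ANSI, or UTF-16 with a byte-order mark), and the function name `find_marker_lines` is my own, not part of Backup Exec.

```python
from pathlib import Path


def find_marker_lines(log_dir, marker="OFFLINE"):
    """Return (file name, line number, text) for each log line containing marker."""
    hits = []
    for path in sorted(Path(log_dir).iterdir()):
        if not path.is_file():
            continue
        raw = path.read_bytes()
        # Assumption: a log is either UTF-16 with a BOM or UTF-8/ANSI text.
        if raw[:2] in (b"\xff\xfe", b"\xfe\xff"):
            text = raw.decode("utf-16", errors="replace")
        else:
            text = raw.decode("utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            # Case-insensitive match so "Offline" and "OFFLINE" are both found.
            if marker.lower() in line.lower():
                hits.append((path.name, number, line.strip()))
    return hits


if __name__ == "__main__":
    log_dir = r"C:\Program Files\Symantec\Backup Exec\Logs"
    for name, number, line in find_marker_lines(log_dir):
        print(f"{name}:{number}: {line}")
```

Comparing the file names it prints against the 4AM time stamps should show quickly whether the message only ever appears in the InitVDS log.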
The only difference between the two arrays is the LUN assigned to the "Access" virtual disk, which gives access to the host group. On the failing array it was using LUN 0 rather than LUN 31 as specified by Dell, and it shared that LUN with one of the virtual disks. The backup servers are not live yet, so I removed all of the virtual disks, hosts and host groups from the array and configured it again using the correct LUN for the "Access" disk. The drive no longer goes offline when the services are restarted. I believe the InitVDS job saw the access disk at LUN 0 and tried to initialize it. Now that it is not the first LUN, the job does not cause an error.
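If you want to sanity-check your own mappings, the two things that were wrong on my array can be tested with a few lines of Python. This is only a sketch: you have to type in the LUN numbers yourself from the array management software, the disk names below are made up, and `check_lun_map` is a hypothetical helper, not a Dell or Symantec tool.

```python
from collections import defaultdict


def check_lun_map(lun_by_disk, access_name="Access", access_lun=31):
    """Return a list of problems found in a {virtual disk name: LUN} mapping."""
    problems = []

    # Group disk names by LUN so that shared LUNs stand out.
    disks_by_lun = defaultdict(list)
    for disk, lun in lun_by_disk.items():
        disks_by_lun[lun].append(disk)
    for lun, disks in sorted(disks_by_lun.items()):
        if len(disks) > 1:
            problems.append(f"LUN {lun} is shared by: {', '.join(sorted(disks))}")

    # The access disk should sit on the expected LUN (31 in my case).
    actual = lun_by_disk.get(access_name)
    if actual is not None and actual != access_lun:
        problems.append(
            f"{access_name} is on LUN {actual}, expected LUN {access_lun}"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical mapping resembling the misconfigured array.
    for problem in check_lun_map({"Access": 0, "Backup1": 0, "Backup2": 1}):
        print(problem)
```

An empty list means neither problem was found; it does not prove the array is otherwise configured correctly.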
Symantec needs to patch that discovery procedure so that it handles the possibility of the access partition being scanned and does not crash when that happens.
I hope this helps you all and that this truly is the answer. It worked for me. I have a case open with Symantec and I will pass this info on to them; hopefully they will put out a KB article on it.
Maybe they will send me a check for the time I spent troubleshooting their product. Now they don't have to spend the time doing it! :)
We will see if it goes offline at 4AM tomorrow morning. Cross your fingers....
Good Luck!