cancel
Showing results for 
Search instead for 
Did you mean: 

DO NOT Run Count(*) in your database

SQL_Squirrel
Not applicable

Recently there was a problem and a posted solution regarding upgrading Enterprise Vault.

My Infrastructure Architech and I started an upgrade last night that is still running. Our Infrastructure is as follows:-

SQL Server Failover Cluster 3 Instances (serving various applications, with SAN storage).

EVFSA Vault Store DB 77GB

Exchange Vault Store DB 52GB

Upgrade has been running now for 16HRS. One forum member suggested we run the following queries on teh Vaultstore databases to get an idea of how long the upgrade would take and determine if there might beany complications:-

Run the following in SQL:

Select count(*) from saveset

Select count(*) from journalarchive

And let us know the results....though if the two come back with similar numbers, I'd get back on the phone to support. Posted by Jeff Shutton 2

As a SQL Server DBA I would not recommend you run these queries on you database either during an upgrade or even if you suspect that teh size of the database may cause problems with your upgrade. These queries are executed in a batch and once they kick off they will count every (Row,Column) combination in your table. Count(*) counts everything and running it with another Count(*) query in the same batch will but an inordinate amount of load on teh Query processing engine.

If you want to know how mych rows are in teh table, a better query to run is Count(Unique_Item) from SaveSet as this would simply count the number of rows in the table.  The count(*) query I kicked off has now been running for 5 hours without returning a result. The Count(Unique_Item) completed in 2 minutes and returned a count of 47 Million for my EVFSA Vaqult Store Database.

Good luck with your upgrades

1 REPLY 1

Jeff_Shotton
Level 6
Partner Accredited Certified

So I'll admit that running the select count(*) during the upgrade was not the best idea. However, I don't think that running a query against a specific column is going to have improved things (seriously - and despite your results). More likely the upgrade had moved past the stage where that specific table was being accessed and therefore suffering from very heavy I/O.
If you check the execution plan in SQL (2005 or 2008) for the following:

Select count(*) from saveset

Select count(savesetidentity) from saveset

You will find that the execution plan is exactly the same. SQL server is determining there is a suitable index available and using it. It doesn't scan every column AND row. I used to think the same as you - querying specific columns is better, but now believe it not to be so (since SQL 2005 I've always used count(*) ). There are many people in both camps on this, but I'm sticking with what the execution plan says. Don't take my word for it - check it yourself.

http://stackoverflow.com/questions/1221559/count-vs-count1

http://www.bigresource.com/Tracker/Track-ms_sql-IoDwVqhv/

Secondly, those queries are run as a batch, but in series, not in parallel. The first one completes, the second one starts. You run these in query analyser and the top query will ALWAYS return a result before the second one.

For the purpose of this test though, assuming that your indexes are actually working, a MUCH better way of doing this would be to query the rowcount of the indexes directly. You can do this with the following queries:

select [rows] from sysindexes where id=OBJECT_ID('saveset') AND name = 'PK_Saveset'
select [rows] from sysindexes where id=OBJECT_ID('journalarchive') AND [name]= 'PK_JournalArchive'

These should take under a second.

ANYWAY,

However you choose to get your results, it does not escape the fact that it appears you also have a problem with your databases. Neither 77GB for an FSA vault store, or 52GB for a mailbox store is especially large. If you are running with reasonable hardware (4core, 16GB RAM, SAN storage) then you should be able to upgrade a LOT faster than that. Then again, you are sharing SQL for EV between other applications.

For EV8 > 9, the upgrade document advises 5 minutes per million rows in the journalarchive table, and FSA should not generate a large journalarchive table. You never stated the row counts in the mailbox database.
It's actually the journalarchive table which is really important. If you are never backing up the savesets on your storage device, this would result in a row in the journalarchive table for each saveset record. This data is supposed to be temporary and removed after a period of time (i.e when backup and indexing is complete). For mailbox archiving the data will hang around longer because it is also used for vault cache (and retained according to the transaction history setting on the site).
I've seen a fair few lengthy upgrades with EV9, and all of them so far have been caused by this issue. The reason being that in EV2007 the storage setting Remove safety copies 'immediately after archive' did not require a backup of the vault store partition. In EV8, backing up the vault store correctly is mandatory (not to create shortcuts - it's just the application requires this to operate properly internally).

The other thing is you might not be running maintenance to rebuild/reorganise the indexes. The SQL index fragmentation that occurs can cause big performance problems.

Finally, if you do find that you have loads of items requiring backup (many millions of entries in the watchfile\journalarchive table) then be aware that for approx every 1 million rows cleared down, around 3GB of SQL transaction log will be generated when you restart the storage service. Simple mode transaction logging in SQL does not help, since this is done in one huge transaction. Therefore you might be better off backing up chunks rather than the whole lot in one go if you are lacking storage space.


Regards,

Jeff