Solved: EV8 - Deleted items waiting to be deleted from ind...

stullier · ‎11-01-2013

Need some help here from folks that might know more about indexing than I do ;)

We recently upgraded our EV site servers from 7.5 SP4 to 8.0 SP5. The upgrade went very smoothly, and nearly everything is working without issue. However, we started seeing a Critical Alert in the console that corresponds to event 41022 in the event logs on one of our servers. Simply put, the system is seeing deleted items that have not been deleted from the indexes.

Originally, there were 22 users' archives with uncommitted deletes, as noted in the JournalDelete table of the Vault Store DB, for a total of 245195 records. Just a few days ago, one of the archives apparently processed its data, because now there are 21 archives and 244805 records.

I have seen the record count in the JournalDelete table go up and come back down to the same amount throughout the day as various operations take place, but the base number of records and the set of associated archives have not changed, with the exception of that one archive that processed.

Most of those users had multiple index volumes with ranges of 1-0 and 1 - not set. I rebuilt those index volumes to correct any corruption on them.

The odd thing is that I can update or rebuild these users' indexes, and each time the process completes, it states that xxx items were deleted, which matches the count of items for that user in the JournalDelete table. If I run the update or rebuild again some time later, it again says the same number of items have been deleted.

I am currently working with support to try to isolate the cause of the matter, but we are not having any luck identifying the problem.

We have tried changing the SQL compatibility level down to SQL 2000 and back up to SQL 2005 (no SQL upgrade had occurred, but they suggested trying it) with no luck.

I have also verified that the system is not in backup mode.

Any insights or suggestions would be greatly appreciated.

stullier · ‎11-01-2013

I think you hit on something.

I was following the steps you just mentioned, only I didn't archive a new message, but instead chose to delete a random existing archived item.

After deleting the item, within a few minutes, all of the records that were uncommitted in the JournalDelete table for that archive (almost 100,000 of them) changed to committed.

I repeated the process for another archive having the issue and got the same results.

It's almost as if the index is stuck and needs a bump.


JesusWept3 · ‎11-01-2013

Your best bet is to contact Symantec Support.
I've seen a few people mention this exact same issue, and there was a "fix" of just setting "IndexCommitted" to 1 in the JournalDelete table. That's a hack really, barely a workaround, and I would say it's a bad thing, because you may find yourself in the exact same position in a year's time without having solved anything.

You should probably start by stopping the Storage and Indexing services, then starting a DTrace of IndexServer and StorageDelete. Then start the services up and look at what StorageDelete does when it first scans the JournalDelete table. It should try to push the deletions across to Indexing to remove the items, Indexing should come back with an "ok, deleted", and IndexCommitted gets flipped from 0 to 1.

Somewhere in that process is where it's breaking down.

https://www.linkedin.com/in/alex-allen-turl-07370146

stullier · ‎11-01-2013

JW3 - Thanks for the reply.

I've seen your other posts regarding similar issues and I agree that just setting the IndexCommitted to 1 is not a fix.

Like I mentioned, I do have a case open with support. I've sent them multiple dtrace files and a sql trace of indexing activity, but we have yet to identify where the actual "break down" is.

I figured I'd reach out to the community to see if you all could provide any other ideas, while support is trying to identify it.

JesusWept3 · ‎11-01-2013

Ah, sorry, I missed that point...
Well, here's something you could do:

1. Archive an item that you can easily identify, something like an email called "Delete Me!"
2. Find the item in search.asp, copy out the link for the item, and make a note of the SSID (the SaveSetID value in the link)

so the link would look like this:
http://evServer1/EnterpriseVault/properties.asp?VaultID=17F8889A5BA1BAF4FB65879E1CAF79FB51110000evsite&PVID=11ACC86150C58CB48A59B2D6CDE37232B1110000evsite&SaveSetID=201303203815437%7E201303202126020000%7EZ%7EE0953804A2352B8BD440B08F92809891

3. Copy out the bit after "SaveSetID=" and replace each %7E with a tilde (~), so it looks like this:
201303203815437~201303202126020000~Z~E0953804A2352B8BD440B08F92809891

4. Start a DTrace of IndexServer and StorageDelete
5. Delete the item through Search.asp or Archive Explorer
6. Query the JournalDelete table like this:

SELECT SavesetID, DeletionDate, DeletionStatus, IndexCommitted
FROM JournalDelete
WHERE SavesetID = '201303203815437~201303202126020000~Z~E0953804A2352B8BD440B08F92809891'

7. Wait for IndexCommitted to go from 0 to 1
8. Stop the DTrace
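
If you'd rather not hand-edit the link in step 3, here is a minimal Python sketch that pulls the SaveSetID out of the properties.asp link and decodes the %7E sequences for you. The function name `saveset_id_from_link` is just something made up for this example; the link is the one from the post above.

```python
from urllib.parse import urlparse, parse_qs


def saveset_id_from_link(link: str) -> str:
    """Return the decoded SaveSetID query parameter from a properties.asp link."""
    query = urlparse(link).query
    # parse_qs percent-decodes the values, so each %7E comes back as "~".
    params = parse_qs(query)
    return params["SaveSetID"][0]


link = (
    "http://evServer1/EnterpriseVault/properties.asp"
    "?VaultID=17F8889A5BA1BAF4FB65879E1CAF79FB51110000evsite"
    "&PVID=11ACC86150C58CB48A59B2D6CDE37232B1110000evsite"
    "&SaveSetID=201303203815437%7E201303202126020000%7EZ"
    "%7EE0953804A2352B8BD440B08F92809891"
)

print(saveset_id_from_link(link))
```

The printed value is the string you paste into the WHERE clause of the JournalDelete query.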

Now in that DTrace you have the working process of an item being deleted and removed from the index. Just start off by looking for all instances of the saveset ID and go from there
(i.e. in TextPad, do a search for the SSID, mark all lines, copy the marked lines, and paste them into a new file).
Then you can determine the thread that called it, etc.
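
If TextPad isn't handy, the same "mark all lines containing the SSID" step can be sketched in a few lines of Python. This is only a rough sketch: `lines_mentioning` is a made-up helper name, and the sample lines below are invented placeholders, not real DTrace output, so don't read anything into their format.

```python
def lines_mentioning(trace_lines, saveset_id):
    """Return (line_number, text) for every trace line containing saveset_id."""
    hits = []
    for number, line in enumerate(trace_lines, start=1):
        if saveset_id in line:
            hits.append((number, line.rstrip("\n")))
    return hits


ssid = "201303203815437~201303202126020000~Z~E0953804A2352B8BD440B08F92809891"

# Placeholder lines for illustration only; a real trace will look different.
sample = [
    "line about some unrelated item\n",
    f"line mentioning {ssid} the first time\n",
    "another unrelated line\n",
    f"line mentioning {ssid} again\n",
]

for number, text in lines_mentioning(sample, ssid):
    print(number, text)
```

Against a real trace you would pass the open file instead of `sample`, for example `open("dtrace.log", errors="replace")`, where the file name is whatever you saved your trace as. Keeping the line numbers makes it easy to jump back into the full trace and follow the thread from there.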

And if IndexCommitted doesn't go to 1, you then have a trace showing that new items are not getting removed either.

But if it does get removed, you can simply play spot the difference: find what path it's going down, and where one stops and the other carries on.
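
For the spot-the-difference part, one possible approach (an assumption on my side, not anything EV-specific) is to swap each saveset ID for a common placeholder and run the two sets of filtered lines through Python's difflib. The helper name `compare_traces` and the sample lines are invented for illustration and are not real DTrace messages. Real trace lines will also carry timestamps and thread IDs that differ on every line, so you would need to strip those first; I have not tried to guess that format here.

```python
import difflib


def compare_traces(working, failing, working_ssid, failing_ssid):
    """Diff two lists of trace lines after masking each trace's saveset ID."""
    a = [line.replace(working_ssid, "<SSID>") for line in working]
    b = [line.replace(failing_ssid, "<SSID>") for line in failing]
    return list(
        difflib.unified_diff(a, b, fromfile="working", tofile="failing", lineterm="")
    )


# Invented placeholder lines; only the shape of the comparison matters.
working = [
    "delete requested for AAA",
    "index asked to remove AAA",
    "IndexCommitted updated for AAA",
]
failing = [
    "delete requested for BBB",
    "index asked to remove BBB",
]

for line in compare_traces(working, failing, "AAA", "BBB"):
    print(line)
```

Lines prefixed with "-" appear only in the working trace, and lines prefixed with "+" appear only in the failing one, which points you at roughly where the stuck deletes stop following the normal path.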