08-17-2017 05:53 PM
Hi all,
Long-time listener, first-time caller.
Our customer is looking to turn on expiry in their environment for the first time. They use EV in a non-traditional manner, so of 32MM+ items, upwards of 2/3 (maybe even 9/10) are on legal hold via DA. Effectively the only items not on legal hold are from cases that were active but have since been closed, removing the holds. Many of the holds overlap, as well.
We have successfully tested expiry in a near-duplicate QA EV environment and removed about 2MM items without incident. After presenting that result, the customer's IT team has requested that we add an additional step to present to legal before advancing into Production. They'd like as granular a report as possible of what will be removed so they can confirm the data to be expired should be expired. Criteria are below:
After digging around the forums for quite some time, I've tested versions of @JesusWept3's query (which looks like it should do exactly what I need) but:
Please let me know if you have any tweaks, experience, or advice to offer.
08-18-2017 07:38 AM
I know you're looking for a SQL query, but have you considered using the index for this?
The ndte property represents the number of days until expiry for an item. You can run a search for all items that currently qualify for expiry by searching
ndte:<0
One advantage here is that it is easy to see the complete item-by-item details and previews using EV Search, as opposed to trying to scrounge what item properties are available in SQL (not many).
The other advantage is that if your item is subject to a Retention Plan, then the Retention Category listed on its Saveset record may not be the actual Retention Category against which its eligibility for expiry is calculated. The ndte index property takes this Retention Plan possibility into account and is therefore more accurate (provided your indexes are current, of course).
You can get environment-wide results by running the same search in Discovery Accelerator.
If you prefer to center the search around a specific date instead of number-of-days-until, you can also use the edat property to do that kind of search.
--Chris
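As a rough illustration of how the two index properties relate, here is a minimal sketch. It assumes, purely for illustration, that the expiry date (edat) is a base date plus a retention period in days, and that ndte is the difference between that date and today. The function names, the choice of base date, and the sample values are hypothetical and not taken from Enterprise Vault.

```python
from datetime import date, timedelta


def expiry_date(base_date: date, retention_days: int) -> date:
    """Hypothetical edat: the base date plus the retention period (assumption)."""
    return base_date + timedelta(days=retention_days)


def days_until_expiry(base_date: date, retention_days: int, today: date) -> int:
    """Hypothetical ndte: goes negative once the expiry date has passed."""
    return (expiry_date(base_date, retention_days) - today).days


today = date(2017, 8, 18)

# Two made-up items under a one-year retention period.
old_item = days_until_expiry(date(2015, 1, 1), 365, today)
new_item = days_until_expiry(date(2017, 6, 1), 365, today)

# An item matches a search like ndte:<0 only when its value is negative.
print("old item eligible:", old_item < 0)
print("new item eligible:", new_item < 0)
```

Under these assumptions, a search on ndte and a search on edat are two views of the same cutoff: one relative to today, one as an absolute date.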
08-18-2017 08:00 AM - edited 08-18-2017 08:01 AM
@ChrisLangevin wrote: I know you're looking for a SQL query, but have you considered using the index for this?
Hi Chris! First, thanks so much for your reply! I hadn't considered either of the properties you mentioned (SQL seemed a better fit off the top of my head) but am of course not opposed to anything, provided the end results meet the customer's needs. I should mention that prior to the review of these results, the customer has asked that all items with non-default retention categories be changed (via SQL update statement) to the default retention category, after which the indexes are being rebuilt.
The ndte property represents the number of days until expiry for an item. You can run a search for all items that currently qualify for expiry by searching
ndte:<0
One advantage here is that it is easy to see the complete item-by-item details and previews using EV Search, as opposed to trying to scrounge what item properties are available in SQL (not many).
The other advantage is that if your item is subject to a Retention Plan, then the Retention Category listed on its Saveset record may not be the actual Retention Category against which its eligibility for expiry is calculated. The ndte index property takes this Retention Plan possibility into account and is therefore more accurate (provided your indexes are current, of course).
Since this is EV11, not EV12, Retention Plans are not going to be a factor. I'm wondering, though, if this language about retention plans would apply to Legal Holds. If a given item was subject to expiry but also blocked from expiry by DA Legal Holds, would the user's item(s) show as expirable using either ndte or edat?
You can get environment-wide results by running the same search in Discovery Accelerator.
If you prefer to center the search around a specific date instead of number-of-days-until, you can also use the edat property to do that kind of search.
DA seems like the better option, except for the fact that the transition from Prod DA into QA DA has left some searches unable to conclude the way one would hope (for example, one case has 18MM items and two weeks later, no legal holds have applied).
Again, many thanks for the quick reply!
08-18-2017 08:20 AM
The index properties ndte and edat just translate the item's current Retention Category or Plan into a concrete date. They will not account for expiry "override" options such as:
- DA legal holds
- The "Delete expired items from this archive automatically" option on archive properties
- The "Prevent automatic deletion of expired items with this category" option on Retention Category properties
--Chris
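To make the gap between "eligible by date" and "will actually be deleted" concrete, here is a minimal sketch of the combined check. It assumes the three overrides named in this thread (DA legal holds, the archive-level "Delete expired items from this archive automatically" option, and the category-level "Prevent automatic deletion of expired items with this category" option) can be modeled as simple booleans. The `Item` class and its field names are hypothetical, not part of any EV API.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # All field names are hypothetical stand-ins for the settings in this thread.
    ndte: int                         # days until expiry from the index
    on_legal_hold: bool               # item has at least one DA legal hold
    archive_deletes_expired: bool     # archive option to delete expired items is on
    category_prevents_deletion: bool  # retention category option to prevent deletion is on


def will_be_expired(item: Item) -> bool:
    """True only if the item is past its date AND no override blocks deletion."""
    return (
        item.ndte < 0
        and not item.on_legal_hold
        and item.archive_deletes_expired
        and not item.category_prevents_deletion
    )


eligible_but_held = Item(ndte=-30, on_legal_hold=True,
                         archive_deletes_expired=True,
                         category_prevents_deletion=False)
eligible_and_free = Item(ndte=-30, on_legal_hold=False,
                         archive_deletes_expired=True,
                         category_prevents_deletion=False)

print(will_be_expired(eligible_but_held))
print(will_be_expired(eligible_and_free))
```

Both sample items would match an ndte:<0 search, but only the second passes every override check, which is why the index search alone overstates what storage expiry would remove.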
08-18-2017 08:25 AM
@ChrisLangevin wrote: The index properties ndte and edat just translate the item's current Retention Category or Plan into a concrete date. They will not account for expiry "override" options such as:
- DA legal holds
- The "Delete expired items from this archive automatically" option on archive properties
- The "Prevent automatic deletion of expired items with this category" option on Retention Category properties
--Chris
Chris,
Thanks. That, unfortunately, is precisely my problem. The customer wants a proactive window into exactly what will (at least by per-user count) be expired out of the environment upon initiation of storage expiry. A list of all items that are eligible for expiry would be close to 20MM. A list of all items eligible for expiry that are NOT on legal hold would be literally an order of magnitude lower at less than 2MM. They want to review the 2MM.
Any thoughts based on that narrowed definition of "scope?"
Thanks again,
Dan
08-18-2017 08:57 AM
Dan,
Based on this further information, then it does seem we will need to use SQL to get this information. I ran it by our DA support staff and they suggested you open a DA support case and they will assist you with it. Best of luck.
--Chris
08-18-2017 09:40 AM
08-18-2017 09:45 AM
Hi @JesusWept3, didn't realize the link was broken. I used this link (query inline as well):
- Makes sure the user is enabled for expiry (enterpriseVaultDirectory.dbo.DeleteExpiredItems = 1)
- Discounts items in the HoldSaveset table (if they're on hold, they would not be deleted)
- Links to the Archive table and not the ExchangeMailboxEntry table (so that it encompasses archives without users, shared archives etc)
SELECT A.ArchiveName, COUNT(DISTINCT(S.IdTransaction)) AS ItemsToExpire, RCE.RetentionCategoryName
FROM EnterpriseVaultDirectory.dbo.Archive A,
EnterpriseVaultDirectory.dbo.Root R,
EnterpriseVaultDirectory.dbo.RetentionCategoryEntry RCE,
EVVaultStore.dbo.Saveset S,
EVVaultStore.dbo.ArchivePoint AP,
EVVaultstore.dbo.HoldSaveset HS
WHERE S.ArchivePointIdentity = AP.ArchivePointIdentity
AND AP.ArchivePointId = R.VaultEntryId
AND R.RootIdentity = A.RootIdentity
AND S.RetentionCategoryIdentity = RCE.RetentionCategoryIdentity
AND S.SavesetIdentity != HS.SavesetIdentity
AND S.ArchivePointIdentity != HS.ArchivePointIdentity
AND S.IdDateTime < DateAdd(year, -1, GetDate())
AND RCE.RetentionCategoryName = 'Business'
AND A.DeleteExpiredItems = 1
GROUP BY A.ArchiveName, RCE.RetentionCategoryName
ORDER BY A.ArchiveName
08-18-2017 05:18 PM
I didn't see this pointed out specifically, so I figured it's worth chiming in. @JesusWept3's script works in the lab, where we have 1,000 archived items and 150 on hold. However, in the QA environment with the customer's data, where we have 30 million items with 24 million on hold (and multiple holds with lots of overlap), the query has been running for 48 hours and still hasn't completed. The QA environment has no other activity going on.
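One plausible explanation for the runtime, offered as an assumption based on general SQL join semantics rather than anything confirmed about the EV schema, is that a `!=` join condition pairs each saveset with nearly every hold row instead of filtering anything out. The sketch below reproduces that effect on toy tables in an in-memory SQLite database. The table and column names (`saveset`, `hold_saveset`, `saveset_id`, `hold_id`) are simplified, hypothetical stand-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical stand-ins for the Saveset and HoldSaveset tables.
    CREATE TABLE saveset (saveset_id INTEGER PRIMARY KEY);
    CREATE TABLE hold_saveset (hold_id INTEGER, saveset_id INTEGER);
""")

n_items, n_holds, held_per_hold = 300, 3, 200

conn.executemany("INSERT INTO saveset VALUES (?)",
                 [(i,) for i in range(n_items)])
# Three overlapping holds, each covering the same first 200 items.
conn.executemany("INSERT INTO hold_saveset VALUES (?, ?)",
                 [(h, i) for h in range(n_holds) for i in range(held_per_hold)])

hold_rows = conn.execute("SELECT COUNT(*) FROM hold_saveset").fetchone()[0]

# The same shape of condition as the thread's query: an inequality join.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM saveset s, hold_saveset hs
    WHERE s.saveset_id != hs.saveset_id
""").fetchone()[0]

print("savesets:", n_items)
print("hold rows:", hold_rows)
print("rows produced by the != join:", joined_rows)
print("full cross product:", n_items * hold_rows)
```

In this toy data the inequality join yields almost the whole cross product, since each hold row fails to match only one saveset. If the production tables behave similarly, 30 million items joined against 24 million or more hold rows would imply an intermediate result on the order of 10^14 pairings before the COUNT(DISTINCT ...) collapses it.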
08-23-2017 11:01 AM
@JesusWept3:
Edited a bit for brevity and then narrowed to a single test user with just 1,100 items in their archive, all currently on hold (using "where r.rootidentity = <user>"), the original query took a full 3.5 hours to complete.
Interestingly, I was able to get the same results immediately by running the same query with this line changed from:
and s.savesetidentity != hs.savesetidentity
to
and s.savesetidentity = hs.savesetidentity
Does that make sense to you?
Here was the query that returned immediate results:
SELECT A.ArchiveName, COUNT(DISTINCT(S.IdTransaction)) AS ItemsToExpire, RCE.RetentionCategoryName
FROM EnterpriseVaultDirectory.dbo.Archive A,
EnterpriseVaultDirectory.dbo.Root R,
EnterpriseVaultDirectory.dbo.RetentionCategoryEntry RCE,
EVVaultStore.dbo.Saveset S,
EVVaultStore.dbo.ArchivePoint AP,
EVVaultstore.dbo.HoldSaveset HS
WHERE R.RootIdentity = 5
AND S.ArchivePointIdentity = AP.ArchivePointIdentity
AND AP.ArchivePointId = R.VaultEntryId
AND R.RootIdentity = A.RootIdentity
AND S.RetentionCategoryIdentity = RCE.RetentionCategoryIdentity
AND S.SavesetIdentity = HS.SavesetIdentity
AND S.ArchivePointIdentity != HS.ArchivePointIdentity
-- AND S.IdDateTime < DateAdd(year, -1, GetDate())
-- AND RCE.RetentionCategoryName = 'Business'
-- AND A.DeleteExpiredItems = 1
GROUP BY A.ArchiveName, RCE.RetentionCategoryName
ORDER BY A.ArchiveName
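One possible reading of why both versions returned the same count for a user whose items are all on hold: with `=`, the join counts items that are on hold, and with `!=`, it counts every item that differs from at least one hold row, which is effectively every item. Neither excludes held items. The sketch below illustrates this on toy data and contrasts both with a NOT EXISTS anti-join, which is one common way to express "not present in the hold table". It is an assumption-laden illustration using hypothetical, simplified table names in SQLite, not a tested query for the EV databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for Saveset and HoldSaveset.
    CREATE TABLE saveset (saveset_id INTEGER PRIMARY KEY, archive TEXT);
    CREATE TABLE hold_saveset (hold_id INTEGER, saveset_id INTEGER);
    INSERT INTO saveset VALUES
        (1, 'alice'), (2, 'alice'), (3, 'alice'), (4, 'bob'), (5, 'bob');
    -- Items 1, 2 and 4 are on hold; item 1 is on two overlapping holds.
    INSERT INTO hold_saveset VALUES (10, 1), (11, 1), (10, 2), (10, 4);
""")


def join_counts(condition: str):
    """Per-archive distinct item counts for a join on the given condition."""
    sql = f"""
        SELECT s.archive, COUNT(DISTINCT s.saveset_id)
        FROM saveset s, hold_saveset hs
        WHERE {condition}
        GROUP BY s.archive
        ORDER BY s.archive
    """
    return conn.execute(sql).fetchall()


equal_join = join_counts("s.saveset_id = hs.saveset_id")       # items ON hold
not_equal_join = join_counts("s.saveset_id != hs.saveset_id")  # effectively all items

# Anti-join: items with no row at all in the hold table.
anti_join = conn.execute("""
    SELECT s.archive, COUNT(*)
    FROM saveset s
    WHERE NOT EXISTS (
        SELECT 1 FROM hold_saveset hs WHERE hs.saveset_id = s.saveset_id
    )
    GROUP BY s.archive
    ORDER BY s.archive
""").fetchall()

print("= join:       ", equal_join)
print("!= join:      ", not_equal_join)
print("NOT EXISTS:   ", anti_join)
```

Only the NOT EXISTS form returns the items that have no hold at all. An anti-join also avoids building the near cross product, because each saveset needs only a single lookup in the hold table.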
08-23-2017 11:16 AM
Thanks @ChrisLangevin. Unfortunately, after opening a case and talking about it at some length, DA Support declined to pursue this in depth on advice from management.