As briefly mentioned, we're getting a lot of msgs returned when running classification for 'UK PII'. The customer is treating them as "false positives", as it appears to be capturing practically all email msgs!
Firstly, can you clear up how the UK PII Policy should trigger the tag please? I'm reading it as it needs to have...
1) 'Any of' - 2 or more - DOB, email address, UK postal address.
2) 'Any of' - CC, UK Driving License, Electoral Reg, NHS no, National Insurance no, passport no, Tax ref, etc...
My understanding is that BOTH of these conditions need to be met for it to tag a file as "PII". However, from reviewing a few tagged files and testing an msg file, it seems to trigger PII with just 1), e.g. my test msg has 3 email addresses and 2 UK postal addresses.
There were NO elements from 2).
If so, this seems a bit simplistic as practically every email that's been forwarded or replied to has these elements! I'm hoping you have a simple fix or I'm missing something.
I haven't put the latest CHF on these environments (it's using base 6.1).
It's 1 OR 2 (not 1 AND 2). Note the root 'Any of' that joins 1 and 2.
But also note that the email address condition requires 10 or more email addresses.
So a document with 3 email addresses and 2 postal addresses should not get a match.
I suggest changing the confidence on the email condition to (very high to very high) instead of the default (high to very high). This will then require extra keywords to be present, which are intended to distinguish email addresses used in documents (non-email) from email addresses that simply appear in emails (too common).
And/or increase the number of email addresses required to 1000.
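To make the match logic concrete, here is a rough Python sketch of the policy as described in this thread. It is only a model for reasoning about the rule, not the product's API: the function and key names are hypothetical, and the minimum count of 1 for the DOB, postal address and group 2 conditions is an assumption (only the email minimum of 10 is stated above).

```python
# Hypothetical model of the 'UK PII' policy as described in this thread.
# Not the product's API; all names here are made up for illustration.

GROUP1_MIN_CONDITIONS = 2  # "2 or more" of the group 1 conditions

GROUP2 = (
    "credit_card", "uk_driving_license", "electoral_reg", "nhs_number",
    "national_insurance_number", "passport_number", "tax_ref",
)


def group1_met(counts, email_min=10):
    """Group 1: at least 2 of DOB, email address, UK postal address."""
    minimums = {
        "dob": 1,                # assumed minimum
        "email_address": email_min,  # 10 by default, per the answer above
        "uk_postal_address": 1,  # assumed minimum
    }
    satisfied = sum(
        1 for name, minimum in minimums.items()
        if counts.get(name, 0) >= minimum
    )
    return satisfied >= GROUP1_MIN_CONDITIONS


def group2_met(counts):
    """Group 2: any one of the identifier conditions (assumed minimum 1)."""
    return any(counts.get(name, 0) >= 1 for name in GROUP2)


def matches_uk_pii(counts, email_min=10):
    """Root 'Any of': group 1 OR group 2, not group 1 AND group 2."""
    return group1_met(counts, email_min) or group2_met(counts)


# The test msg from the question: 3 email addresses, 2 postal addresses.
test_msg = {"email_address": 3, "uk_postal_address": 2}
print(matches_uk_pii(test_msg))
```

Under these assumptions the test msg satisfies only the postal address condition in group 1, so it should not match. Passing a larger `email_min` models the suggestion to raise the required number of email addresses; the confidence range change is not modelled here.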