When Data Insight (DI) is combined with Enterprise Vault (EV), EV uses "alternate data streams" and writes a lot of hidden control files (e.g. evarchivepoint.xml, evfolderpoint.xml) into the file system. Data Insight picks these files up, which is a known issue in Data Insight.
We have now configured exclude rules for files such as evarchivepoint.xml, evfolderpoint.xml and ~$evfsatemp*.tmp, which seems to work as a workaround for new scans. However, our original scan ran without any such exclude rules, and the files it found still show up in DI. They have not been removed.
How can these files be deleted from the database? The documentation says nothing about database manipulation. There are some CLI files in the Data Insight directory, but I found nothing that makes it easy to find and delete database entries.
We would require a support case for this issue, because once a file makes it into the indices of the Symantec Data Insight (SDI) product, the only method of removal is aging under the data retention policy in place. That policy removes the access events, not the files themselves.
We would likely need a Product Action Request (PAR), which equates to an enhancement to the product design. Meanwhile, I will run some tests in preparation for the case and see what the limitations would be on removing specific files, from specific shares, from the indices stored on an indexer.
My assumption is that your exclusion rules are based on pattern matching and not on the directory structure itself. I would expect a scan resync to clear out the path hierarchy if you had excluded the root folder from scanning. I will also check whether there have been any other requests for similar removals. Most likely the only current method would be to manipulate the databases themselves and delete the record containing the file reference object that you see in the workspace of the Console.
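To illustrate the distinction between the two kinds of exclusion, here is a minimal Python sketch. It is not how Data Insight evaluates its rules internally; the function names, the sample paths and the use of shell-style wildcards are assumptions made purely for illustration. Only the three file patterns come from this thread.

```python
from fnmatch import fnmatchcase
from pathlib import PureWindowsPath

# Patterns quoted in the question; everything else here is hypothetical.
EXCLUDE_PATTERNS = ["evarchivepoint.xml", "evfolderpoint.xml", "~$evfsatemp*.tmp"]


def excluded_by_pattern(path, patterns=EXCLUDE_PATTERNS):
    """Pattern-based rule: only the file name is compared, case-insensitively."""
    name = PureWindowsPath(path).name.lower()
    return any(fnmatchcase(name, pattern.lower()) for pattern in patterns)


def excluded_by_folder(path, excluded_root):
    """Directory-based rule: everything at or below excluded_root is skipped."""
    candidate = PureWindowsPath(path)
    root = PureWindowsPath(excluded_root)
    return candidate == root or root in candidate.parents


if __name__ == "__main__":
    samples = [
        r"D:\shares\finance\evarchivepoint.xml",
        r"D:\shares\finance\~$evfsatemp123.tmp",
        r"D:\shares\finance\q1\report.docx",
    ]
    for sample in samples:
        print(sample,
              excluded_by_pattern(sample),
              excluded_by_folder(sample, r"D:\shares\finance"))
```

A pattern rule filters individual file names wherever they occur, while a folder rule removes a whole subtree from the scan. That is why the two could plausibly behave differently when a resync rebuilds the path hierarchy.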