Because many of my customers have unknown datas, I have wrote a DQL Query to identify file extensions which are not listed in filegroup.
The query is below and is working but I never try this request on a filer greater than1TB. The "WITH" Statement in SQL request used to find the last "." before extension may require a lots of CPU. What do you think of this report ?
GET basename, size
/* ADVANCED SQL QUERIES
CREATE table unknownExtension (path_rowid INTEGER, extension TEXT, size_MB INTEGER,filenum INTEGER);
WITH recursive reduce(path_rowid,extension,size ) as (
SELECT path_rowid,substr(basename,instr(basename,".")+1),size FROM path
SELECT path_rowid,SUBSTR(extension,instr(extension,".")+1),size FROM reduce
INSERT INTO unknownExtension
SELECT path_rowid, extension, SUM(size) AS size_MB,COUNT(extension)
group by extension
order by extension