08-23-2017 01:08 AM
Because many of my customers have unknown datas, I have wrote a DQL Query to identify file extensions which are not listed in filegroup.
The query is below and is working but I never try this request on a filer greater than1TB. The "WITH" Statement in SQL request used to find the last "." before extension may require a lots of CPU. What do you think of this report ?
FROM path GET basename, size IF extension='' AND type='FILE' AND substr(basename,'.')=1 SORTBY basename
/* ADVANCED SQL QUERIES
CREATE table unknownExtension (path_rowid INTEGER, extension TEXT, size_MB INTEGER,filenum INTEGER); WITH recursive reduce(path_rowid,extension,size ) as ( SELECT path_rowid,substr(basename,instr(basename,".")+1),size FROM path UNION ALL SELECT path_rowid,SUBSTR(extension,instr(extension,".")+1),size FROM reduce WHERE INSTR(extension,".")>0 ) INSERT INTO unknownExtension SELECT path_rowid, extension, SUM(size) AS size_MB,COUNT(extension) FROM reduce WHERE INSTR(extension,".")=0 group by extension order by extension