DQL "IF" vs "WHERE" - what's the difference? Why?

Hi all

First time asking a DQL quesiton, thanks for having an active forum and an opportunity to both ask and contribute.

I've been all over the Programmers Refernce Guide, and while "IF" is mentioned as part of the DQL syntax, the obvious SQL-like "WHERE" isn't ... or at least explaining what's best to use in what sort of situations.

to explain further: On one of the VOX discussions a member posted 

FROM path
GET absname, device.name, permissions.readable_permission
IF issensitive = 1

someone trying to help, suggested, amonst other changes, using WHERE instead of IF

WHERE last_accessed < datetime("2017/05/01 00:00", "YYYY/MM/DD HH:mm")
AND issensitive = 1 AND type = "FILE" AND matchi(parent.name, "*finance*") = 1 AND parent.type = "DIR"

(ignore the specific arguments)
https://vox.veritas.com/t5/Data-Insight/DQL-report-to-show-sensitive-files-user-activities-for-a-giv...

So, is the choice of using WHERE (instead of "IF") simply

  • the number of arguments are too many for "IF"? 
  • Is it a speed/indexing thing? (full table scans/query execution plans, etc)
  • "IF" is legacy syntax and "WHERE" is current practice?

or something else?

coming from ANSI SQL, I'd be more comfortable using "WHERE", but if "IF" has different capabilities, it'd be very handy to know about it.

thanks for your help.

Cheers
Barry

(PS: I've not seen an example that had both "WHERE" and "IF" in the same DQL command ... just curious...)

1 Solution

Accepted Solutions
Accepted Solution!

Re: DQL "IF" vs "WHERE" - what's the difference? Why?

Barry, this is a great question and use of the forum.

Let us start by confirming for you that DataInsight Query Language (DQL) is a hybrid  gleened from possible sqlite commands and not actually SQL being used, although as you have seen you can apply the normal SQL commands in the advanced options section of the query you are building. As such  DQL does not have the full command set functionality. You are correct to look to the Veritas_Data_Insight_SDK_Guide document for the instruction on usage of the commands included in the DQL offering.

Based on the guide and referencing your question on whether there is a significant difference or advantage to the usage of either function within a query let me quote directly from the latest version:

ref: page 34

IF clause
The IF clause is an optional clause that you can use to specify a set of conditions on the rows that you want to retrieve. It is similar to the WHERE clause of SQL.
DQL retrieves only those rows whose columns satisfy the condition(s) provided under the IF clause.

One major advantage you will notice as you peruse the information is the ability to use:

Conditions on multi-valued columns
You can use EACH or ANY prefixes to specify the conditions on multi-valued
columns. EACH specifies that each value of the multi-valued column should satisfy
the condition while ANY specifies that any value of the multi-valued column should
satisfy the condition.
Suppose that you want to retrieve only those paths on which the user John is active.
You can write a query as shown below.


FROM path
GET name, active_users.name
IF ANY active_users.name = "John"
FORMAT active_users AS CSV;


Suppose that you want to retrieve paths on which either John or Joe are active.
You can write a query (query a) as shown below.

FROM path
GET name, active_users.name
IF ANY active_users.name IN ("John","Joe")
FORMAT active_users AS CSV;

As you have seen the guide does quote usage of where but does not include it in the

DQL Query Syntax <pg 30>
The DQL query syntax and top-level grammatical constructs are as shown:
FROM <table>
GET <column expression> [AS alias], <column expression> [AS alias], ...
[IF <condition>]
[USING <definition>]
[FORMAT <column> AS (CSV|TABLE <tablename>) [<count>]]
[GROUPBY <column expression>, <column expression>, ...]
[HAVING <aggregate-condition>]
[SORTBY <column expression> [ASC|DESC]]
[LIMIT [<offset>,]<count>];

I have seen WHERE used in the fashion we would expect of SQL to filter on values that must be true and combine conditions to make a granular choice whereas I would not expect IF being able return the same value. The best suggestion would be to filter in DQL using IF and then condition your output report with Advanced SQL using the WHERE clause.  Although to be honest I really just have not had the time to dedicate to test it in order to respond to you with certainty.

I would expect you could use a SQL conditional such as:

from Table select Colum where (concat in(select concat from Table group by concat having (count(concat)>1))) order by Value-size desc, Value-basename asc;

Note: Advanced SQL section of query; Assumes definitions not included for brievity.

I would not expect much difference in DQL using either of the IF or WHERE on simple filters like:

if type = 'FILE' and isdeleted = 0

where type = 'FILE' and isdeleted = 0

I suggest you take your DQL query and interchange the IF and WHERE to test for yourself but if you are going to format the database created from the DQL query using SQL for presentation stick with your comfort zone and use WHERE.

hopefully you find that of some use.

Rod

View solution in original post

3 Replies
Accepted Solution!

Re: DQL "IF" vs "WHERE" - what's the difference? Why?

Barry, this is a great question and use of the forum.

Let us start by confirming for you that DataInsight Query Language (DQL) is a hybrid  gleened from possible sqlite commands and not actually SQL being used, although as you have seen you can apply the normal SQL commands in the advanced options section of the query you are building. As such  DQL does not have the full command set functionality. You are correct to look to the Veritas_Data_Insight_SDK_Guide document for the instruction on usage of the commands included in the DQL offering.

Based on the guide and referencing your question on whether there is a significant difference or advantage to the usage of either function within a query let me quote directly from the latest version:

ref: page 34

IF clause
The IF clause is an optional clause that you can use to specify a set of conditions on the rows that you want to retrieve. It is similar to the WHERE clause of SQL.
DQL retrieves only those rows whose columns satisfy the condition(s) provided under the IF clause.

One major advantage you will notice as you peruse the information is the ability to use:

Conditions on multi-valued columns
You can use EACH or ANY prefixes to specify the conditions on multi-valued
columns. EACH specifies that each value of the multi-valued column should satisfy
the condition while ANY specifies that any value of the multi-valued column should
satisfy the condition.
Suppose that you want to retrieve only those paths on which the user John is active.
You can write a query as shown below.


FROM path
GET name, active_users.name
IF ANY active_users.name = "John"
FORMAT active_users AS CSV;


Suppose that you want to retrieve paths on which either John or Joe are active.
You can write a query (query a) as shown below.

FROM path
GET name, active_users.name
IF ANY active_users.name IN ("John","Joe")
FORMAT active_users AS CSV;

As you have seen the guide does quote usage of where but does not include it in the

DQL Query Syntax <pg 30>
The DQL query syntax and top-level grammatical constructs are as shown:
FROM <table>
GET <column expression> [AS alias], <column expression> [AS alias], ...
[IF <condition>]
[USING <definition>]
[FORMAT <column> AS (CSV|TABLE <tablename>) [<count>]]
[GROUPBY <column expression>, <column expression>, ...]
[HAVING <aggregate-condition>]
[SORTBY <column expression> [ASC|DESC]]
[LIMIT [<offset>,]<count>];

I have seen WHERE used in the fashion we would expect of SQL to filter on values that must be true and combine conditions to make a granular choice whereas I would not expect IF being able return the same value. The best suggestion would be to filter in DQL using IF and then condition your output report with Advanced SQL using the WHERE clause.  Although to be honest I really just have not had the time to dedicate to test it in order to respond to you with certainty.

I would expect you could use a SQL conditional such as:

from Table select Colum where (concat in(select concat from Table group by concat having (count(concat)>1))) order by Value-size desc, Value-basename asc;

Note: Advanced SQL section of query; Assumes definitions not included for brievity.

I would not expect much difference in DQL using either of the IF or WHERE on simple filters like:

if type = 'FILE' and isdeleted = 0

where type = 'FILE' and isdeleted = 0

I suggest you take your DQL query and interchange the IF and WHERE to test for yourself but if you are going to format the database created from the DQL query using SQL for presentation stick with your comfort zone and use WHERE.

hopefully you find that of some use.

Rod

View solution in original post

Highlighted

Re: DQL "IF" vs "WHERE" - what's the difference? Why?

Thanks heaps, Rod.

So far, I've been "cheating" and getting away with the WHERE clause in the DQL, without incident.

I'm essentially kicking the tyres and seeing how fast this baby goes around corners. 

The highlighting of the "EACH" and "ANY" clauses can be particularly valueable, especially when looking for a specific value within a list.

eg: 

[path].[filegroups]     /* [String] List of filegroups for this path.*/



I haven't yet looked into using the "Advanced Query" SQLite additional processing. My initial thinking (before this deep dive into DQL) was to export what I need to know as CSV files into an SQLServer database and join from there.

I'm interrogating 110Tb of data to see what is worthwhile to migrate, and what's junk that can be disposed of, or archived in a dedicated repository. I'm careful not to "bite off" too much per report.

The "junk/keep" critieria is so complex and fluid I'm not expecting to perfectly craft it all within one DQL report - I'll be joining [PATH], [ACTIVITY], [USERS] and a couple of other objects to get a 360 degree view of the situation for content owners to decide what to do with their stuff.

So, thanks again for your help. Cheers

Barry

Re: DQL "IF" vs "WHERE" - what's the difference? Why?

Barry you have managed to tweak my interest and make your project sound super interesting.

It will likely be a trial and error thing at first until you reach a comfort level. Let me welcome you to Vox and encourage you to keep posting.

If you have queries you are questioning feel free to use the community for assistance we have some great people on here.


Rod