DQL activity report data records

Hello folks,

Recently I have been running a stress test of DI data collection.

The environment is based on DI 6.1, and the source data is in a NetApp shared folder accessed over CIFS. I kept all of the file activity in that one folder.

I used the query below to get a customized result:

FROM activity

         path.permissions.type AS permission_type,
         formatedate(timestamp, "YYYY/MM/DD HH:mm:ss") AS event_time,
         operation AS event,
         path.absname AS file_path,
         rename_target.absname AS renamed_path,
         formatedate(path.last_accessed, "YYYY/MM/DD HH:mm:ss") AS last_accessed_time,
         formatedate(path.last_modified, "YYYY/MM/DD HH:mm:ss") AS last_modified_time,
         ipaddr AS client_ip,
         (path.size/1024) AS size_kb,
         path.type AS object_type,
         path.extensions AS file_type,
         formatedate(path.created_on, "YYYY/MM/DD HH:mm:ss") AS file_creation_time,
         path.msu.type AS Share_type

FORMAT path.permissions.type AS CSV

SORTBY timestamp DESC

To test the data collection accuracy on this server, I wrote a Python script that simulates high-load file activity, so that I could check the collected records.

My Python script is below:

import os
import sys
import shutil
import msvcrt
import glob
import time

path = input('Please input the assigned path: ')
print("Your path is", path)

print("Start date & time " + time.strftime("%c"))

for n in range(0, 1600):
    # create a new folder (code not shown)
    # create 100 new .txt files and write contents
    for i in range(0, 100):
        with open("test" + str(i) + ".txt", "w+") as fo:
            fo.write("DI is a good product\n")

    # copy 50 files to new names newfile$ (loop body not shown)
    for j in range(0, 50):
        pass

    # rename the files from test$ to Veritasfile$ (loop body not shown)
    for k in range(0, 50):
        pass

    # move function (loop body not shown)
    for l in range(0, 50):
        pass

    # remove Veritas$ files (loop body not shown)
    for s in glob.glob("Veritas*.txt"):
        pass

    # remove newfile$ files (loop body not shown)
    for t in glob.glob("newfile*.txt"):
        pass

    # remove folder and included files (code not shown)

print("Finished date & time " + time.strftime("%c"))

The strange thing is that after running the loop 1600 times, the total number of records is not consistent with a single run.

When I ran the for loop above once (n=1), the DI report contained 602 rows.

Then, after I changed it to 1600 loops, theoretically the total would be 602*1600 = 963200 rows, but the total amount of data collected was 831554 rows.
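To put a number on the gap, here is a quick sketch of the arithmetic. The variable names are only for illustration; the three figures are the ones from my runs. It works out to 131646 missing rows, about 13.67% of the expected total, which is roughly what 1381 complete loops would have produced.

```python
rows_per_loop = 602      # rows in the DI report for a single loop (n=1)
loops = 1600             # loops in the stress run
collected = 831554       # rows in the DI report after the 1600-loop run

expected = rows_per_loop * loops              # rows if nothing were dropped
missing = expected - collected                # rows that never showed up
missing_pct = 100.0 * missing / expected      # shortfall as a percentage
equivalent_loops = collected / rows_per_loop  # loops' worth of rows collected

print("expected rows:   ", expected)
print("missing rows:    ", missing)
print("missing percent: ", round(missing_pct, 2))
print("equivalent loops:", round(equivalent_loops, 1))
```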

I wonder why so much data was dropped, even though the records for the first several loops are in accordance with 602 per unit.

I'm not sure whether it's related to NetApp async mode.

I ran the report around 10 hours after the script finished.
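One way to narrow this down might be to count the rows per operation in the exported CSV and compare them with the single-loop counts multiplied by 1600, to see whether one kind of event is missing more than the others. The sketch below is only an assumption about how that could be done: it assumes the CSV has a header row using the aliases from my query (so the operation column is named event), the helper names count_events and compare are made up, and the operation values op_a and op_b in the sample are placeholders, not real DI event names.

```python
import csv
import io
from collections import Counter


def count_events(report_file, column="event"):
    """Count report rows per value of `column` (assumed CSV header name)."""
    reader = csv.DictReader(report_file)
    return Counter(row[column] for row in reader)


def compare(counts, baseline, loops):
    """Map each operation to (expected, actual, missing) rows.

    `baseline` holds the per-operation counts from a single-loop report.
    """
    result = {}
    for event, per_loop in baseline.items():
        expected = per_loop * loops
        actual = counts.get(event, 0)
        result[event] = (expected, actual, expected - actual)
    return result


# Made-up sample standing in for an exported report (2 loops' worth of data).
sample = io.StringIO(
    "event_time,event,file_path\n"
    "2020/01/01 10:00:00,op_a,/share/test0.txt\n"
    "2020/01/01 10:00:01,op_a,/share/test1.txt\n"
    "2020/01/01 10:00:02,op_b,/share/test0.txt\n"
)
baseline = {"op_a": 1, "op_b": 1}  # hypothetical counts from an n=1 report

counts = count_events(sample)
gaps = compare(counts, baseline, loops=2)
for event, (expected, actual, missing) in sorted(gaps.items()):
    print(event, "expected", expected, "actual", actual, "missing", missing)
```

For the real data, the baseline would come from running count_events on the n=1 report, and the sample would be replaced by the opened CSV file from the 1600-loop report.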

Could you please give me some suggestions?