DQL activity report data records
Hello folks,
Recently I have been running a stress test of DI data collection.
The environment is based on DI 6.1, and the source data lives on a NetApp share accessed over CIFS; all of the simulated activity takes place in a single folder.
I used the query below to get a customized result:
FROM activity
GET user.name,
user.domain,
path.permissions.type AS permission_type,
formatedate(timestamp, "YYYY/MM/DD HH:mm:ss") AS event_time,
operation AS event,
path.absname AS file_path,
rename_target.absname AS renamed_path,
formatedate(path.last_accessed, "YYYY/MM/DD HH:mm:ss") AS last_accessed_time,
formatedate(path.last_modified, "YYYY/MM/DD HH:mm:ss") AS last_modified_time,
ipaddr AS client_ip,
(path.size/1024) AS size_kb,
path.type AS object_type,
path.extensions AS file_type,
formatedate(path.created_on, "YYYY/MM/DD HH:mm:ss") AS file_creation_time,
path.msu.type AS Share_type
FORMAT path.permissions.type AS CSV
SORTBY timestamp DESC
In order to test the accuracy of data collection on this server, I wrote a Python script to simulate high-load file activity so I can check it against the collected records.
My Python script is below:
import os
import glob
import shutil
import time

path = input('Please input the assigned path: ')
print("Your path is", path)
os.chdir(path)
print("Start date & time " + time.strftime("%c"))
for n in range(0, 1600):
    # create a new folder
    os.mkdir("test_folder")
    # create 100 new .txt files and write some content
    for i in range(0, 100):
        fo = open("test" + str(i) + ".txt", "w+")
        fo.write("DI is a good product\n")
        fo.close()
    # copy the 50 even-numbered test files to odd-numbered newfile copies
    for j in range(0, 50):
        shutil.copy("test" + str(2 * j) + ".txt", "newfile" + str(2 * j + 1) + ".txt")
    # rename the even-numbered test files to Veritas* files
    for k in range(0, 50):
        os.rename("test" + str(2 * k) + ".txt", "Veritas" + str(k) + "123825147.txt")
    # move the odd-numbered test files into the new folder
    for l in range(0, 50):
        os.rename("test" + str(l * 2 + 1) + ".txt", "test_folder/test" + str(l * 2 + 1) + ".txt")
    # remove the Veritas* files
    for s in glob.glob("Veritas*.txt"):
        os.remove(s)
    # remove the newfile* files
    for t in glob.glob("newfile*.txt"):
        os.remove(t)
    # remove the folder and the files it contains
    shutil.rmtree("test_folder")
    # pause between iterations
    time.sleep(120)
print("Finished date & time " + time.strftime("%c"))
The strange thing is that after running the script with 1600 loop iterations, the total record count is not consistent with a single iteration.
If I run the loop once (n=1), the DI report contains 602 rows.
After changing it to 1600 iterations, the total should theoretically be 602 * 1600 = 963,200 rows, but the total amount of data collected is 831,554, so about 131,646 rows (roughly 13.7%) are missing.
I wonder why so much data is dropped, even though the records for the first several iterations each match the expected 602 per iteration.
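For reference, a rough way to count the rows per iteration from the exported report is sketched below; the file name report.csv is just a placeholder, and I am assuming the CSV header uses the event_time alias from the query above.

import csv
from datetime import datetime

# Assumption: the report is exported to report.csv with a header row, and the
# event_time column uses the "YYYY/MM/DD HH:mm:ss" format from the query above.
with open("report.csv", newline="") as f:
    times = sorted(
        datetime.strptime(row["event_time"], "%Y/%m/%d %H:%M:%S")
        for row in csv.DictReader(f)
    )

# The script sleeps 120 s between iterations, so a gap of more than ~60 s
# between consecutive events should mark an iteration boundary.
counts, current = [], 1
for prev, cur in zip(times, times[1:]):
    if (cur - prev).total_seconds() > 60:
        counts.append(current)
        current = 1
    else:
        current += 1
counts.append(current)

# Each iteration is expected to contribute 602 rows; flag the ones that do not.
for i, c in enumerate(counts, 1):
    if c != 602:
        print("iteration", i, "has", c, "rows")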
I'm not sure whether this is related to the NetApp async mode.
I ran the report around 10 hours after the script finished.
Could you please give me some suggestions?