01-08-2014 02:24 AM
Hi,
I have a customer who has implemented HCP in their environment. We installed the Streamer app on the EV server and created a new partition that points to the new HCP.
Everything now works fine (archive/retrieve/restore), but I receive this event:
Storage File Watch - 28967
Watch file partition scan will be stopped for the following partition since 50 consecutive unsecured (not replicated or backed up) files/streams were found.
Partition Name = HCP1
I also receive an error stating that 199572 savesets have not yet been backed up or replicated.
The Storage guy tells me that replication works ok.
I would be grateful if someone could point us in some direction towards troubleshooting.
Here are the partition properties.
01-08-2014 04:46 AM
Pixa,
What SP level is EV 10?
Have you run a Dtrace of StorageFileWatch and then gone in and out of backup mode to see what the Dtrace is showing for that process?
What is the version of streamer software installed?
Regards,
Patrick
01-08-2014 04:52 AM
I haven't run the DTrace yet. I'll get it and post it here.
EV is 10.0.4, Streamer is 1.1.12.
01-08-2014 04:58 AM
Thanks for the update. I wanted to confirm the SP level and check that there were no known issues.
01-08-2014 06:22 AM
Hi, here is the DTrace. I couldn't find any problems in it.
01-08-2014 08:36 AM
Pixa,
From the DTrace I can see the following for all items checked.
Each line containing:
CStreamerItemSecuredVerifier::CheckItem Information: Checked stream status for StoreIdentifier
shows:
Safe = [0]
This indicates the item is not secured. If it were seen as secured, it would read:
Safe = [1]
Can you have the storage team verify the items that appear in those lines, such as:
evroot/314F41DE3277D64F9CADFB64D053C8A4/2014/Jan/07/427f0ea2-7fb3-4f2a-92eb-fac3cfa43a91/09_52_15/00001
Patrick
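For anyone who wants to pull the affected items out of a StorageFileWatch trace for the storage team, here is a minimal Python sketch. It assumes the trace has been saved as plain text and that each `evroot/...` path appears on or before the `Safe = [n]` line of the same entry. That layout is an assumption based on the excerpts above, not a documented format, and `summarize_safe_flags` is a hypothetical helper name.

```python
import re

# Patterns taken from the trace excerpts quoted in this thread.
SAFE_RE = re.compile(r"Safe\s*=\s*\[(\d)\]")
PATH_RE = re.compile(r"evroot/\S+")


def summarize_safe_flags(trace_text):
    """Count Safe flags and collect the paths reported as unsecured.

    Assumption: the item path is logged on the same line as, or on a line
    before, the matching 'Safe = [n]' line.
    """
    counts = {"0": 0, "1": 0}
    unsecured_paths = []
    last_path = None
    for line in trace_text.splitlines():
        path_match = PATH_RE.search(line)
        if path_match:
            last_path = path_match.group(0)
        safe_match = SAFE_RE.search(line)
        if safe_match:
            flag = safe_match.group(1)
            counts[flag] = counts.get(flag, 0) + 1
            if flag == "0" and last_path is not None:
                unsecured_paths.append(last_path)
            last_path = None  # do not reuse a path for the next entry
    return counts, unsecured_paths
```

Calling `summarize_safe_flags(open("trace.log").read())` (the file name is a placeholder) returns the flag counts and a list of paths that can be checked on the production and replica HCP.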
01-08-2014 07:17 PM
This was interesting to read just now. I have the same configuration at my customer and see the exact same messages. They can appear at random during the day, but they always appear after you take EV out of backup mode.
Event ID: 28967
"Watch file partition scan will be stopped for the following partition since 50 consecutive unsecured (not replicated or backed up) files/streams were found."
I have access to both HCPs, production and replica. I placed the EV tasks into report mode so that everything could be processed without any new items being added. I then looked at both HCPs, and each showed identical item counts after a few minutes to allow replication to complete. On the EV side, all unsecured item counts were zero following the next scan of the partition (by default this might be every 60 minutes, although I changed mine to 30). After weeks of looking at this and getting nowhere, I have concluded that this might just be normal EV chatter and could well be a red herring, caused by EV writing to the production HCP and then checking for replication to the second HCP before replication has completed.
If anyone knows better and can suggest a new avenue to check, I'd be happy to join in and help with the diagnosis, because I can reproduce this error at will and it's been bugging me for weeks.
01-09-2014 12:03 AM
What are the values that you have in Advanced tab on the partition properties for:
Local safety copies
Remote safety copies
01-09-2014 05:07 AM
Yes, this could very well be a timing issue between when EV checks the items and when they are actually replicated. Even a few minutes of lag could be enough to hit the 50 consecutive items.
From the DTrace I reviewed, the items did not seem to be that old.
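To make the timing argument concrete, here is a small Python model. It is only a sketch under stated assumptions: it treats an item as secured once its age exceeds the replication lag, and it assumes the scan visits items newest first. Neither is confirmed EV behaviour. The threshold of 50 comes from the event text, and `scan_partition` is a hypothetical name.

```python
def scan_partition(item_ages_seconds, replication_lag_seconds, threshold=50):
    """Model a scan that stops after `threshold` consecutive unsecured items.

    Returns (stopped_early, items_checked).
    Assumption: an item counts as secured once it is at least as old as
    the replication lag.
    """
    consecutive_unsecured = 0
    for checked, age in enumerate(item_ages_seconds, start=1):
        if age >= replication_lag_seconds:
            consecutive_unsecured = 0  # a secured item resets the run
        else:
            consecutive_unsecured += 1
            if consecutive_unsecured >= threshold:
                return True, checked
    return False, len(item_ages_seconds)


# One item archived every 2 seconds, listed newest first.
ages = [i * 2 for i in range(200)]
print(scan_partition(ages, replication_lag_seconds=300))  # (True, 50)
print(scan_partition(ages, replication_lag_seconds=60))   # (False, 200)
```

In this model a five-minute lag at that archiving rate leaves 150 recent items unreplicated, so the scan gives up after 50. With a one-minute lag only 30 items are unreplicated and the scan completes.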
01-09-2014 03:51 PM
Pixa
Local and remote safety copies are both set to '1' in my case.
Certainly, if you can connect to your HCPs and pause your archiving, you can watch the item counters. The replica takes at least five minutes to catch up until all data is replicated. In my case the counts always end up identical, with no errors shown at all.
If you DTrace this, you'll see entries such as this, 50 of them for each partition you look at, before it gives up and chucks an error out to the event log:
If you can get hold of the Hitachi migration tool, you can connect to the file system and check that the data is present on the replica. In my case, after looking at this over and over, I can't find anything wrong. The logged events, whilst alarming and no doubt likely to upset and confuse, seem at this point to be harmless information and a bit of a red herring, but I have so far been unable to have that confirmed.
01-16-2014 12:23 AM
Hi,
I would like to confirm that setting Local and Remote safety copies both to '1' resolved the problem with the "savesets have not yet been backed up" error.
It's strange that this setting defaults to 2. I think it should be 1 :)
Regards.
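One way to read this result is sketched below in Python. It is a toy model only: it assumes the safety copy settings act as minimum copy counts that the device must report before an item is treated as secured, and that a single production HCP with one replica reports one copy on each side. Neither assumption is confirmed in this thread, and `is_secured` is a hypothetical name, not an EV or Streamer API.

```python
def is_secured(reported_local, reported_remote, required_local, required_remote):
    """Toy check: secured only if both reported counts meet the required minimums.

    Assumption: 'Local safety copies' and 'Remote safety copies' are minimum
    copy counts. This is an illustration, not documented behaviour.
    """
    return reported_local >= required_local and reported_remote >= required_remote


# Assumed: one copy on the production HCP and one on the replica.
for required in (2, 1):
    result = is_secured(1, 1, required_local=required, required_remote=required)
    print(f"required copies = {required}: secured = {result}")
```

Under those assumptions a requirement of 2 could never be met by a one-plus-one setup, which would match the symptoms described, while a requirement of 1 is met as soon as replication completes.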