Forum Discussion

Slartybardfast
4 years ago

HPE OST Troubleshooting guidance

Good evening Gurus,

Does anyone have any documentation or diagrams on how the OST plugin works? I am having issues, possibly with communication between the media servers and an HPE StoreOnce 5650. I have trawled through the bpdm and bptm logs (media servers) and the Catalyst and plugin logs (Catalyst and plugin logging turned up to debug). Backup to the local disks (CatStore) on the StoreOnce works fine; it all goes to pot when trying to create a duplication to the S3 Cloud Bank store. In the NetBackup logs I am seeing errors opening images, and from image cleanup I am seeing errors deleting images. From memory, the detailed status log for the jobs reports errors 83 and 84 and error 2060022 from the software, and eventually the SLP duplication fails with an overall error of 191. Another error from the Catalyst log which I think could be key here is OSCTL_ERROR_DEDUPE_TIMEOUT (I hope that's right). I don't have the exact error messages to hand and will update the post tomorrow morning. The environment is Windows 2012 R2 for the master and media servers, with NetBackup version 8.1.2. Just hoping for a little guidance so I can get my ducks in order before I may have to open a support case.
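For anyone trawling the same logs, a minimal sketch of a filter for the codes mentioned above. It assumes the logs are plain text and that status codes appear after the word "status" or "error"; the real bpdm, bptm and Catalyst log layouts are not shown in this thread, so the patterns are guesses to adjust, and the OSCTL_ERROR_DEDUPE_TIMEOUT spelling is quoted from memory. The function name is hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: status codes follow the word "status" or "error" on the line.
STATUS_RE = re.compile(r"\b(?:status|error)\s*[:=]?\s*(83|84|191)\b", re.IGNORECASE)
# Error number reported in the job details, as quoted in the post.
SOFTWARE_CODE_RE = re.compile(r"\b2060022\b")
# Catalyst log token, spelling taken from the post (not verified).
CATALYST_TOKEN = "OSCTL_ERROR_DEDUPE_TIMEOUT"


def find_error_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, what_matched) for every line that mentions a code."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = STATUS_RE.search(line)
        if match:
            hits.append((number, f"status {match.group(1)}"))
        if SOFTWARE_CODE_RE.search(line):
            hits.append((number, "2060022"))
        if CATALYST_TOKEN in line:
            hits.append((number, CATALYST_TOKEN))
    return hits
```

It takes any iterable of lines, so an open log file can be passed in directly; lining up the hit line numbers across the media server and Catalyst logs may help show which side reports the failure first.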

  • Good evening all,

    As promised, I am reporting back on the findings from the vendors. It has been a harrowing few weeks, with requests for logs, commands being run, then more logs than I could shake a stick at. Well, we can at least now access the images on the Cloud Bank store, albeit we will never write to this area again.

    Some interesting information came back on our design and use of the StoreOnce. Firstly, the housekeeping processes run with a low priority, and you only have one housekeeping thread per CatStore or Cloud Bank store. Our original StoreOnce design was installed and configured by the vendor. They created a single store and a Cloud Bank store for each major file type. So we had, for argument's sake, a CatStore which we backed up to overnight, then during the day the opposite site duplicated copy 2 into the same CatStore. With the thread priority being very low, this caused a huge housekeeping backlog due to the constant I/O activity. Secondly, we were told that the recommended maximum size on disk is 50TB. We blew that figure out of the water by a factor of 3. Other limits were also explained: the maximum number of data sessions is 1024 and the maximum number of stores is 96. That includes local stores and Cloud Bank stores. In fairness, those limits are in their guide. What is not in their guide is the 50TB on-disk value. My memory vaguely serves me "almost unlimited", but I could just be getting old. So the moral of this story is to have separate CatStores for backups and duplications, so housekeeping at least gets a 12-hour window to do what it's meant to do. Don't blow the 50TB on-disk recommended maximum either. We are now waiting for the images to expire so we can delete both areas and write this all off as a bad dream.
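To make those figures easier to check against a planned layout, here is a rough sketch. The numbers (96 stores, 1024 data sessions, 50TB on disk per store) are the ones quoted to us above; the `Store` class, its fields and `review_layout` are made up for illustration and have nothing to do with any StoreOnce API.

```python
from dataclasses import dataclass
from typing import List

# Figures as quoted by the vendor in this thread.
MAX_STORES = 96                 # local plus Cloud Bank stores
MAX_DATA_SESSIONS = 1024
RECOMMENDED_MAX_TB = 50         # recommended size on disk per store


@dataclass
class Store:
    """Hypothetical description of one Catalyst or Cloud Bank store."""
    name: str
    size_on_disk_tb: float
    receives_backups: bool
    receives_duplications: bool


def review_layout(stores: List[Store], data_sessions: int) -> List[str]:
    """Return a warning for each limit or recommendation the layout breaks."""
    warnings = []
    if len(stores) > MAX_STORES:
        warnings.append(f"{len(stores)} stores exceeds the maximum of {MAX_STORES}")
    if data_sessions > MAX_DATA_SESSIONS:
        warnings.append(
            f"{data_sessions} data sessions exceeds the maximum of {MAX_DATA_SESSIONS}"
        )
    for store in stores:
        if store.size_on_disk_tb > RECOMMENDED_MAX_TB:
            warnings.append(
                f"{store.name}: {store.size_on_disk_tb}TB on disk is over the "
                f"recommended {RECOMMENDED_MAX_TB}TB"
            )
        if store.receives_backups and store.receives_duplications:
            warnings.append(
                f"{store.name}: takes both backups and duplications, "
                "so housekeeping never gets a quiet window"
            )
    return warnings
```

Our original design would have tripped both per-store warnings: a store at roughly three times the recommended size that took backups overnight and duplications during the day.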

8 Replies

  • Slartybardfast 

    I am reading through your post again.

    Have you verified that your config is supported by HPE?

    I see the following in the HPE guide:

    See the HPE Data Availability, Protection and Retention Compatibility Matrix for supported StoreOnce Systems and object storage.
    Cloud Bank Storage stores are supported as copy, not backup, targets. Configuration and integration with data protection software is similar to a standard StoreOnce Catalyst store.

     

    • Slartybardfast
      Level 5

      Hi Marrianne,

      Thanks for your response. I have read and re-read the OST documentation. We are using it as a copy target and not as a backup target. We are checking and rechecking connectivity. I think this issue is external to NetBackup. The StoreOnce is responsible for copying the data into S3, and that is what is failing. Backup to the local CatStore by the media servers is working, so that rules out any OST plugin issue. I could be completely off piste with that assumption.

      • davidmoline
        Level 6

        Hi Slartybardfast

        You are correct in that NetBackup will treat the cloud bank as just another Catalyst store (and target). NetBackup has no idea where this store exists (i.e. in the "cloud"), nor that it is located on object storage - this is all internal to the StoreOnce.

        Given that you appear to be having issues duplicating the images - have you also tried to restore from them? 

        I think you have the log files covered from the media server point of view.

        Another thing you could try is to create a second Catalyst store on the HPE and set up NBU to use this as a copy target as well, to see how that works (once verified one way or another, it shouldn't be too difficult to expire the images and remove the device from NBU).

        David

    • StefanosM
      Level 6

      Interesting.

      I have some installations with HP StoreOnce and I do not have these problems. We have some Catalyst stores over 50TB in size and use them with no problem.
      I know that HP suggests splitting backup types across different stores, but no one has been able to give me a clear answer as to why I have to do that. The only answer that makes sense to me is that they propose it so that they can show a good deduplication ratio in reporting.

      We do not follow that rule. But we use different stores for backups and duplications.

      A problem we do have with StoreOnce systems is that if we use FC connectivity to the media server to back up an Exchange server, the mailbox indexing does not work (backup error 84). If we do not use GRT, or we use the network, the backup is fine.