JeanB
Level 4
8 years ago

Disk staging to tape causes status 196 on regular policies

Weekly full backups are staged to an EMC Data Domain basic disk staging storage unit. Staging to a 6-drive LTO4 robot is triggered by a schedule at 3PM and takes longer than the scheduled window to complete. Tape-only policies that are scheduled at 11PM fail with status 196 because all drives are occupied by the staging. How can I ensure that staging has a lower priority than regular backup policies? Is this recommended? I want to avoid tweaking policy priorities, which are all at 0. The staging schedule has a priority of 90000, which makes no sense to me.

  • Why start staging only at 3pm?
    It is best to schedule staging fairly often, for example every 2-3 hours.
    • JeanB
      Level 4

      I was thinking of setting it to every hour with a 24-hour window. We started having these problems when we transferred more than 3TB of backups to this disk staging unit. Duplication takes ages. How can the size of the duplication be evaluated? Also, as we are not using SLPs, I understand that the high and low water marks are not used. They are set at 90% and 75%, and the unit is currently 72% full. My understanding is that all backups written to this storage unit since the last staging run will be duplicated, regardless of the water marks.

      • Nicolai
        Moderator

        You can speed up tape transfer by setting SIZE_DATA_BUFFERS to 262144 and NUMBER_DATA_BUFFERS to 256. This tweak can increase performance by up to 50% compared to the default values.
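
Before raising the buffer values, it may help to estimate how much memory they would claim on the media server. The sketch below assumes the commonly cited rule of thumb that shared memory use is roughly buffer size × buffer count × drives × multiplexing level, and it assumes a multiplexing level of 1. The function name is hypothetical, and the result is only an estimate to check against the tuning guide for your NetBackup version.

```python
def buffer_memory_bytes(size_data_buffers, number_data_buffers, drives, mpx=1):
    """Rough estimate of shared memory used by tape data buffers.

    Assumption: memory ~= size * count * drives * multiplexing level.
    """
    return size_data_buffers * number_data_buffers * drives * mpx


# Values suggested in the post above; 6 drives as in the original question.
per_drive = buffer_memory_bytes(262144, 256, drives=1)
all_drives = buffer_memory_bytes(262144, 256, drives=6)

print(per_drive // 2**20, "MiB per drive")
print(all_drives // 2**20, "MiB with all 6 drives active")
```

If that total is a large share of the media server's free memory, a smaller NUMBER_DATA_BUFFERS value might be a safer starting point.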

        Stage during the day and back up at night. Mixing workloads will affect performance.

  • Let's get back to basics - 

    You have backup requirement for xxx amount of data.
    You have duplication requirement for xx amount of data.
    You have x hours in which to complete these tasks.
    You have resources capable of reading data at ?/sec. (this includes read speed from client disk and disk STU)
    You have resources capable of writing data at ?/sec (including network transfer rate)

    If you don't know what each of these points amounts to, then you have your homework cut out.

    How long do the backups to disk take to complete? What transfer rates are you seeing?
    How long do duplications take to complete? At what transfer rate?
    If you are not seeing all 6 tape drives writing duplications at more than 100MB/sec each, then you need to look at the entire data path.
    If you are using a single master/media server, you need to verify that this server has sufficient resources for each of the processes involved in backups and duplications.
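
The sizing questions above can be turned into a quick back-of-the-envelope calculation. This is a minimal sketch: it assumes decimal units (1 TB = 1,000,000 MB), that all drives stream at the same sustained rate, and that the staging window is the 8 hours between the 3PM start and the 11PM tape policies. The function names are hypothetical.

```python
def hours_to_duplicate(data_tb, drives, mb_per_sec_per_drive):
    """Hours needed to write data_tb to tape with all drives streaming."""
    total_mb = data_tb * 1_000_000
    aggregate_mb_per_sec = drives * mb_per_sec_per_drive
    return total_mb / aggregate_mb_per_sec / 3600


def tb_per_window(window_hours, drives, mb_per_sec_per_drive):
    """Data (in TB) that fits in the window at the given per-drive rate."""
    aggregate_mb_per_sec = drives * mb_per_sec_per_drive
    return window_hours * 3600 * aggregate_mb_per_sec / 1_000_000


# 3 TB on 6 drives at the 100 MB/sec target mentioned above.
hours_3tb = hours_to_duplicate(3, drives=6, mb_per_sec_per_drive=100)
# Capacity of an assumed 8-hour window (3PM to 11PM) at the same rate.
window_tb = tb_per_window(8, drives=6, mb_per_sec_per_drive=100)

print(f"3 TB takes about {hours_3tb:.2f} hours")
print(f"An 8-hour window holds about {window_tb:.2f} TB")
```

Plugging in the rates you actually measure, instead of the 100 MB/sec target, shows whether the window is realistic for your data volume.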

    About disk staging: the HWM plays no role in duplication scheduling. That is the job of the schedule that you configure on the DSSU.
    The HWM is what starts disk cleanup.

    Herewith some reading matter about disk staging: 

    DOCUMENTATION: Description of NetBackup Disk Staging Relocation Behavior
    http://www.veritas.com/docs/000030287

    Disk Staging Storage Unit (DSSU) cleanup behavior
    http://www.veritas.com/docs/000036495

    • Nicolai
      Moderator

      Both X2's and Marianne's posts are valid and capture the essence of meeting backup windows.

      You really need to figure out how good or bad the duplication speed is. Please look at the NUMBER_DATA_BUFFERS/SIZE_DATA_BUFFERS tuning I suggested earlier. Resolving your duplication backlog may be as simple as that.

      How to configure the buffer settings is documented in the "NetBackup Backup Planning and Performance Tuning Guide".

      • JeanB
        Level 4

        I need to identify the problem that causes one of the media servers to take more than 43 hours to duplicate 18TB (still not complete as I write). The other media server (identical hardware) took 26 hours for 12TB, most of which was spent on one 4TB job.
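
To compare the two media servers on equal terms, the figures above can be converted into aggregate throughput. This is a rough sketch assuming decimal units (1 TB = 1,000,000 MB). Since the 18TB job had not finished at the time of writing, its figure is an upper bound on the real rate. The function name is hypothetical.

```python
def aggregate_mb_per_sec(data_tb, hours):
    """Average throughput in MB/sec for data_tb moved in the given hours."""
    return data_tb * 1_000_000 / (hours * 3600)


# Figures quoted in the post above.
slow_server = aggregate_mb_per_sec(18, 43)  # still running, so an upper bound
fast_server = aggregate_mb_per_sec(12, 26)

print(f"Slow media server: about {slow_server:.0f} MB/sec")
print(f"Other media server: about {fast_server:.0f} MB/sec")
```

Both results come out at roughly 116 and 128 MB/sec in total, which is close to what a single drive would deliver at the 100MB/sec target mentioned earlier in the thread, so the drives appear to be far from streaming in parallel.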