
Tomas_Pospichal
2 years ago

Duplication: Second Copies not expiring

Hello,

I would like to ask for a hint about SLP and duplication. In my environment there are many images (second copies only; the first copies are already gone) that should have expired days or weeks ago, but nothing has actually happened.

Example of copy:
Master Server : xxxx
Backup ID : xxxx_1672451498
Copy Number : 2
Copy Type : 1 (DUPLICATE)
Expire Time : 1675475498 (2023/02/04 02:51:38)
Expire LC Time : 1675475498 (2023/02/04 02:51:38)
Try To Keep Time : 1675475498 (2023/02/04 02:51:38)
Residence : xxxx
Copy State : 5 (COPYCOMPLETE)
Job ID : 111626461
Retention Type : 0 (FIXED)
MPX State : 0 (false)
RetryCount : 1
Last Retry Time : 1675417412 (2023/02/03 10:43:32)
Source : 1
Destination ID : (none specified)
Replica: : 0
DataFormat : 1 (DF_TAR)
SLP Index : 2

 

Is there any way to purge such data/duplicates?

Thank you.
Tom

11 Replies

  • When an image is under SLP control, its retention is set to infinity until the copy process is done. Then the original retention is applied.

    The SLP cheat sheet lists various commands to check for incomplete SLP images. Cross-check those against the images that should have expired.

    Storage Lifecycle Policy (SLP) Cheat Sheet

    https://www.veritas.com/support/en_US/article.100006475

    If everything is OK, then run a manual image cleanup:

    /usr/openv/netbackup/bin/admincmd/bpimage -cleanup -allclients

    If you want to cancel incomplete SLP processing, you can do that with the command:

    nbstlutil cancel 

    https://www.veritas.com/support/en_US/doc/123533878-127136857-0/v123553953-127136857

    Don't do this as a first action; first investigate why the SLPs don't complete.

     

    • Tomas_Pospichal

      Hello Nicolai,

      That's the problem: I'm getting RC88 errors from both the automatic and the manual image cleanup.

      Example:
      Feb 16, 2023 6:46:31 AM - Info bpdm (pid=20188) Failed to delete WORM locked image xxxxxx_C2_TIR_R1: error 2060069.
      Feb 16, 2023 6:46:31 AM - Info bpdm (pid=20188) Failed to delete WORM locked image xxxxxx_1672121642_C2_F1_R1: error 2060069.
      Feb 16, 2023 6:46:31 AM - Critical bpdbm (pid=14432) WORM unlock time is updated from <1678257213> to <1678257293>. Cleared delete pending flag for copy <2> of bid <xxxxxx_1672121642>.
      Feb 16, 2023 6:46:32 AM - Info bpdm (pid=20188) Failed to delete WORM locked image xxxxxx_1672121642_C2_F2_R1: error 2060069.

      This slows down the whole duplication processing, because for some reason the second operation (duplication) of the SLPs did not run for more than a month in a row, so the backlog grew to enormous numbers.

      I don't actually need to copy, or keep copies of, backups that should have expired weeks ago, so another solution would be to stop/cancel/remove/erase everything from, let's say, November 2022 until mid-January 2023. However, I do not know how to proceed with that. Do you have any idea?

      So far I don't understand why the second copies are still there, since they should have been removed some time ago according to the expiration date in the example output above.

      Thank you for any input.
      T.

  • Run the following command and post the output:
    bpexpdate -d 0 -backupid xxxx_1672451498

    Note that if the SLP for this image has finished, the command will delete the backup image.

    • Tomas_Pospichal

      Hello StefanosM,

      I used a different backup image, since there are currently tens of thousands of unexpired backup images in the same situation: the first copy is gone, but the second remains and cannot be removed.

      bpexpdate -d 0 -backupid  xxxx_1671497636
      Are you SURE you want to delete xxxx_1671497636 y/n (n)? y
      Expiration for Open Storage WORM cannot be shortened.

       

      C:\Windows\system32>nbstlutil list -backupid xxxx_1671497636 -U
      Image:
      Master Server :xxxx
      Backup ID : xxxx_1671497636
      Client : xxxx
      Backup Time : 1671497636 (2022/12/20 01:53:56)
      Policy : xxxx
      Client Type : 40
      Schedule Type : 1
      Storage Lifecycle Policy : xxxx
      Storage Lifecycle State : 3 (COMPLETE)
      Storage Lifecycle Is Inactive : false
      Time In Process : 1671499067 (2022/12/20 02:17:47)
      Data Classification ID : (none specified)
      Version Number : 4
      OriginMasterServer : (none specified)
      OriginMasterServerID : {00000000-0000-0000-0000-000000000000}
      Import From Replica Time : 0 (0 secs)
      Required Expiration Date : 0 (0 secs)
      Created Date Time : 1671497636 (2022/12/20 01:53:56)

      Copy:
      Master Server : xxxx
      Backup ID : xxxx_1671497636
      Copy Number : 2
      Copy Type : 1 (DUPLICATE)
      Expire Time : 1674521636 (2023/01/24 01:53:56)
      Expire LC Time : 1674521636 (2023/01/24 01:53:56)
      Try To Keep Time : 1674521636 (2023/01/24 01:53:56)
      Residence : xxxx
      Copy State : 5 (COPYCOMPLETE)
      Job ID : 110913060
      Retention Type : 0 (FIXED)
      MPX State : 0 (false)
      RetryCount : 1
      Last Retry Time : 1674546262 (2023/01/24 08:44:22)
      Source : 1
      Destination ID : (none specified)
      Replica: : 0
      DataFormat : 1 (DF_TAR)
      SLP Index : 2

      Fragment:
      Master Server : xxxx
      Backup ID : xxxx_1671497636
      Copy Number : 2
      Fragment Number : -1
      Resume Count : 1
      Media ID : @aaaep
      Media Server : xxxx
      Storage Server : (none specified)
      Media Type : 0 (DISK)
      Media Sub-Type : 6 (STSDYNAMIC)
      Fragment State : 1 (ACTIVE)
      Fragment Size : 178718
      Delete Header : 0
      Fragment ID : @aaaep
      Snap MountHost : (none specified)
      Media Description : 1;DataDomain;xxxx;DP_xxxx;xxxx_ddboost;0

      Fragment:
      Master Server : xxxx
      Backup ID : xxxx_1671497636
      Copy Number : 2
      Fragment Number : 1
      Resume Count : 1
      Media ID : @aaaep
      Media Server : xxxx
      Storage Server : (none specified)
      Media Type : 0 (DISK)
      Media Sub-Type : 6 (STSDYNAMIC)
      Fragment State : 1 (ACTIVE)
      Fragment Size : 89645020160
      Delete Header : 0
      Fragment ID : @aaaep
      Snap MountHost : (none specified)
      Media Description : 1;DataDomain;xxxx;DP_xxxx;xxxx_ddboost;0
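With tens of thousands of images in this state, it may help to scan saved `nbstlutil list -U` output for copies whose Expire Time has already passed. The following is only a minimal sketch: it assumes the "Key : Value" layout and the "Image:" / "Copy:" / "Fragment:" section headers seen in the output above, and the names `parse_sections`, `overdue_copies` and `SAMPLE` are hypothetical helpers, not part of NetBackup.

```python
# Abbreviated sample, copied from the output posted above.
SAMPLE = """\
Image:
Master Server :xxxx
Backup ID : xxxx_1671497636
Backup Time : 1671497636 (2022/12/20 01:53:56)

Copy:
Backup ID : xxxx_1671497636
Copy Number : 2
Expire Time : 1674521636 (2023/01/24 01:53:56)
Copy State : 5 (COPYCOMPLETE)
Replica: : 0

Fragment:
Backup ID : xxxx_1671497636
Copy Number : 2
Fragment Number : 1
"""


def parse_sections(text):
    """Split the listing into (section_name, fields) pairs.

    Assumes a section starts with a bare header such as 'Copy:' and that
    every field line contains ' :' between key and value.
    """
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if " :" in line:
            if sections:
                key, _, value = line.partition(" :")
                sections[-1][1][key.strip()] = value.strip()
        elif line.endswith(":"):
            sections.append((line[:-1], {}))
    return sections


def overdue_copies(text, now):
    """Return (backup_id, copy_number, days_overdue) for copies past expiry."""
    overdue = []
    for name, fields in parse_sections(text):
        if name != "Copy" or "Expire Time" not in fields:
            continue
        # The value looks like '1674521636 (2023/01/24 01:53:56)'; keep the epoch.
        expire = int(fields["Expire Time"].split()[0])
        if expire < now:
            days = (now - expire) / 86400
            overdue.append((fields["Backup ID"], int(fields["Copy Number"]), days))
    return overdue
```

Calling `overdue_copies(SAMPLE, int(time.time()))` (after `import time`) would list the copy above together with the number of days it is past its Expire Time.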

  • OK, it is clear now.

    The Data Domain configuration is protecting your backups from deletion.
    Your problem is that the retention lock on the Data Domain is longer than the retention of the backup.

    You have to understand how the Data Domain retention lock works and configure it according to your needs.
    In principle, you configure the retention lock with the minimum amount of time you want the Data Domain to protect your backups from deletion. The retention lock period must be shorter than your backup retention.
    There is a way to release the lock on the Data Domain and expire your backups.

    Unfortunately, I am unable to assist you further because I lack experience with Data Domain.
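To see how long the lock outlasts the backup, the bpdbm line from the cleanup log posted earlier can be taken apart. This is a rough sketch only: the regular expression is written against that single log line, and it assumes the numeric suffix of the backup ID is the backup time in epoch seconds (which matches the Backup ID and Backup Time fields in the `nbstlutil` output above). The function names are hypothetical.

```python
import re

# Log line copied from the cleanup job details earlier in the thread.
LOG_LINE = (
    "Feb 16, 2023 6:46:31 AM - Critical bpdbm (pid=14432) WORM unlock time is "
    "updated from <1678257213> to <1678257293>. Cleared delete pending flag for "
    "copy <2> of bid <xxxxxx_1672121642>."
)

PATTERN = re.compile(
    r"WORM unlock time is updated from <(\d+)> to <(\d+)>\. "
    r"Cleared delete pending flag for copy <(\d+)> of bid <([^>]+)>"
)


def parse_worm_update(line):
    """Extract old/new unlock times, copy number and backup ID, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    old, new, copy, bid = match.groups()
    return {
        "old_unlock": int(old),
        "new_unlock": int(new),
        "copy": int(copy),
        "backup_id": bid,
    }


def backup_time_from_id(backup_id):
    """Assumption: the part after the last '_' is the backup time (epoch seconds)."""
    return int(backup_id.rsplit("_", 1)[1])


def lock_days_after_backup(info):
    """Days between the backup time and the new WORM unlock time."""
    return (info["new_unlock"] - backup_time_from_id(info["backup_id"])) / 86400


info = parse_worm_update(LOG_LINE)
print(round(lock_days_after_backup(info), 1))  # about 71 days
```

Under these assumptions the unlock time falls roughly 71 days after the backup was taken, which is well beyond a retention of about five weeks.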

     

  • Hi Tomas_Pospichal 

    I see: Feb 16, 2023 6:46:31 AM - Info bpdm (pid=20188) Failed to delete WORM locked image

    WORM = write once, read many. Status code 88 also indicates issues related to WORM.

    /Nicolai

  • Good catch, StefanosM. I missed that the storage device was a Data Domain.

    I agree: if retention lock is configured on the Data Domain, it will mess with NetBackup's backup management.

    If retention lock is configured, the lock time must be shorter than backup image retention.

    /Nicolai

  • StefanosM, Nicolai: I think I found out why the copies were not removed. It is all caused by the one-month backlog that built up last year. It seems that it's really working as designed.

    As from the example above:
    Backup ID : xxxx_1671497636
    Backup Time : 1671497636 (2022/12/20 01:53:56) 

    Copy
    Backup ID : xxxx_1671497636
    Copy Number : 2
    Copy Type : 1 (DUPLICATE)
    Expire Time : 1674521636 (2023/01/24 01:53:56)
    Last Retry Time : 1674546262 (2023/01/24 08:44:22)

    So the backup was taken on 2022/12/20 and, at the same time, the record for copy number 2 was created with a calculated expiration time of 2023/01/24. However, the duplication job finished on 2023/01/24, so a new retention was applied from that date. That image should therefore be gone by 2023/02/28. Are my assumptions correct or not?
    The retention for backup and duplication is the same: 5 weeks.

    To remove it manually, I guess I will have to manually revert the retention lock settings on the DDs.
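The arithmetic behind these dates can be checked from the epoch values in the output above. This is only a sketch of the calculation; it does not settle which rule NetBackup actually applies. The constant and function names are made up for illustration, and dates are printed in UTC, so they are one hour behind the local times shown in the `nbstlutil` output.

```python
from datetime import datetime, timedelta, timezone

# Epoch values copied from the nbstlutil output for xxxx_1671497636.
BACKUP_TIME = 1671497636       # Backup Time
EXPIRE_TIME = 1674521636       # Expire Time of copy 2
LAST_RETRY_TIME = 1674546262   # Last Retry Time of copy 2


def retention_days(backup_time, expire_time):
    """Days between the backup time and the recorded expiration time."""
    return (expire_time - backup_time) / 86400


def expiry_if_counted_from(start_epoch, days):
    """Hypothetical expiry if the retention were counted from start_epoch."""
    start = datetime.fromtimestamp(start_epoch, tz=timezone.utc)
    return start + timedelta(days=days)


# The recorded Expire Time is exactly 35 days (5 weeks) after the backup time.
print(retention_days(BACKUP_TIME, EXPIRE_TIME))  # 35.0

# If the 5 weeks were instead counted from the Last Retry Time,
# the copy would expire on 2023-02-28 (07:44:22 UTC).
print(expiry_if_counted_from(LAST_RETRY_TIME, 35).isoformat())
```

So the Expire Time stored in the catalog corresponds to backup time plus five weeks, while the 2023/02/28 estimate corresponds to five weeks counted from the Last Retry Time.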


  • Hi Tomas_Pospichal 

    As I recall (and StefanosM, help me here), retention is always calculated from the backup creation date. But hey, the 24th is close, so why not wait and see what happens :)

    I suggest you remove the retention lock on the Data Domain. If NetBackup cannot delete an image, it doesn't forget about it: it will retry the operation until it succeeds or until you tell NetBackup to do a force delete. A force delete, however, will leave orphaned images on the Data Domain, taking up space until the end of time. You don't want that to happen.

    Removing retention lock must be done from the DDOS CLI.