Forum Discussion

jstoffel-tosh
8 years ago

NDMP, Netapp cDOT 8.2, and parent policy to coordinate snapshot creation?

Hi,

I'm trying to coordinate the creation of backup Snapshots when I back up two separate but related volumes of data.  Because we're running NBU 7.7.3 on Solaris and NetApp cDOT 8.2, we cannot do CAB (Cluster Aware Backup) until we upgrade to cDOT 8.3 sometime in the future, which is a pain.  I've got two volumes, Foo & Bar, which need to be snapshotted at the same time but are on separate nodes.  Would using a parent policy with two sub-policies be the way to make this happen?  We don't need the actual NDMP backups to run in parallel at all.

I've been looking at the docs, and we've done bpstart_notify.<POLICY> scripts in the past, but I'm a bit stumped on how I should do it now?

If we put both volumes into one policy, then it won't run properly because the policy is bound to a single Node on the Netapp.  And of course we have the volumes spread across multiple nodes. 

Thanks,

John

5 Replies

  • It has been a while since I worked with NetApp, but I think it can be done by having one NetBackup policy (or maybe a NetApp schedule) create the snapshots and another NetBackup policy back up the snapshots.

    Back then you could back up the volume snapshot by using /volume/.snapshot in the backup selection.

     

      jstoffel-tosh
      Level 3

      We thought about setting up a policy to schedule two backups at the same time, but what happens if there is only one tape drive available?  Only one policy will fire and the snapshots won't be coordinated.

      Right now I'm trying to write a script to do all the pre-setup of the volumes, which is to create a pair of flexclone volumes at the same time.  Then it runs bpbackup -w 0 -i .... to run the two policies.  Now I'm running into the problem where even though the path is there, it fails. 

      Does anyone know if NBU keeps track of the volume 'dsid' value, so that even if the path is the same, the backup fails when the dsid is different?

       

      So what might be even better is if I can just create a snapshot on the parent volume, point NBU (7.7.3 SPARC) to the full path of that snapshot, and back that up instead using NDMP.  This was trivial in CommVault 8 & 9.

      Do I need to install the Snapshot director module?  Gah!  What a pain!
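The pre-setup script described in this reply could be sketched roughly as below. This is only an illustration, not the poster's actual script: the clone suffix `_bkclone`, the policy names `Foo-NDMP` and `Bar-NDMP`, and the pre-existing snapshot name are hypothetical, and it assumes the ONTAP commands are sent to the cluster management interface (for example over ssh) while `bpbackup` runs on the NetBackup master server. The code only builds the command lines so they can be reviewed before anything is executed.

```python
from typing import Dict, List


def clone_command(vserver: str, parent: str, clone: str, snapshot: str) -> List[str]:
    """ONTAP CLI line that creates a FlexClone of `parent` from `snapshot`."""
    return [
        "volume", "clone", "create",
        "-vserver", vserver,
        "-flexclone", clone,
        "-parent-volume", parent,
        "-parent-snapshot", snapshot,
    ]


def backup_command(policy: str, schedule: str) -> List[str]:
    """bpbackup line that starts an immediate run of `policy` (-i) and waits (-w 0)."""
    return ["bpbackup", "-i", "-p", policy, "-s", schedule, "-w", "0"]


def build_plan(vserver: str, policies: Dict[str, str],
               snapshot: str, schedule: str) -> List[List[str]]:
    """Order the work so every clone exists before any backup is started.

    `policies` maps each parent volume to the policy that backs up its clone.
    """
    clones = [clone_command(vserver, vol, vol + "_bkclone", snapshot)
              for vol in policies]
    backups = [backup_command(policy, schedule) for policy in policies.values()]
    return clones + backups


# Hypothetical names throughout; only "Foo" and "Bar" come from the thread.
PLAN = build_plan("vsNA1", {"Foo": "Foo-NDMP", "Bar": "Bar-NDMP"},
                  "backup_snap", "Full")
for step in PLAN:
    print(" ".join(step))
```

Creating both clones first keeps the two point-in-time copies as close together as possible, regardless of how long each backup then waits for a tape drive.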

       

  • It sort of sounds like a crontab entry that executes a script or references both policies would be able to start them at the same time. Just make sure you give it a really high priority and maybe find a way to dedicate media server and tape drive resources to it so it's insulated from resource competition/contention.
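One way a cron-driven script might start both policies at the same moment is sketched below, under stated assumptions: the policy names are hypothetical, `bpbackup -i -p <policy>` is assumed to be run on the master server, and the `runner` parameter is an illustrative hook so the logic can be tried without touching NetBackup. It does nothing about tape drive contention, which still has to be handled through priorities and dedicated resources as suggested above.

```python
import subprocess
import threading
from typing import Callable, List, Optional, Sequence


def run_together(commands: Sequence[Sequence[str]],
                 runner: Callable[[List[str]], int] = subprocess.call
                 ) -> List[Optional[int]]:
    """Run every command in its own thread, releasing them all at once.

    Returns the exit codes in the same order as `commands`.
    """
    if not commands:
        return []
    barrier = threading.Barrier(len(commands))
    results: List[Optional[int]] = [None] * len(commands)

    def worker(index: int, command: Sequence[str]) -> None:
        barrier.wait()  # block until every worker thread is ready
        results[index] = runner(list(command))

    threads = [threading.Thread(target=worker, args=(i, cmd))
               for i, cmd in enumerate(commands)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Hypothetical policy names; called from the cron script like this:
#   codes = run_together([
#       ["bpbackup", "-i", "-p", "Foo-NDMP"],
#       ["bpbackup", "-i", "-p", "Bar-NDMP"],
#   ])
```

The barrier only narrows the gap between the two starts to a fraction of a second on the host running the script; when each job actually becomes active is still decided by the NetBackup scheduler.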

      jstoffel-tosh
      Level 3

      So this is what I've done and it almost works reliably.  Basically, I have a script called by cron which creates two flexclone volumes off the two parent volumes at the same time (within a few seconds of each other) and then kicks off the two dedicated policies which then do the backups.

      It worked great for two days, but now I'm getting error 23 when I run the script again.  It's as if NetBackup doesn't like the name or path or volume ID and thinks that these are not the same volumes to back up.  What a pain!

       

      Does anyone know if I can tell NetBackup to just back up a specific snapshot instead?  If I give it a path like:

       

      /vserver/volume/.snapshot/backup_snap

       

      would that work?  Time to test it out I guess.  

        jstoffel-tosh
        Level 3

        And the answer is no... this is frustrating because I can do stuff like this under 7-mode, but the documentation for cDOT (Clustered ONTAP) and NetBackup is annoyingly vague.  Here are the logs:

         


        03/28/2017 08:23:32 - Info nbjm (pid=22230) starting backup job (jobid=50279) for client ntap1-n2.na.toshiba.local, policy TestNDMP.NAR12.n2, schedule Full
        03/28/2017 08:23:32 - Info nbjm (pid=22230) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=50279, request id:{80BE7778-13CA-11E7-AC1D-00053326F93E})
        03/28/2017 08:23:32 - requesting resource NDMP_ntap1
        03/28/2017 08:23:32 - requesting resource ssc-vbku-lunp01.NBU_CLIENT.MAXJOBS.ntap1-n2.na.toshiba.local
        03/28/2017 08:23:32 - requesting resource ssc-vbku-lunp01.NBU_POLICY.MAXJOBS.TestNDMP.NAR12.n2
        03/28/2017 08:23:32 - granted resource ssc-vbku-lunp01.NBU_CLIENT.MAXJOBS.ntap1-n2.na.toshiba.local
        03/28/2017 08:23:32 - granted resource ssc-vbku-lunp01.NBU_POLICY.MAXJOBS.TestNDMP.NAR12.n2
        03/28/2017 08:23:32 - granted resource 000086
        03/28/2017 08:23:32 - granted resource HP.ULTRIUM6-SCSI.005
        03/28/2017 08:23:32 - granted resource ssc-vbku-lunp01-hcart3-robot-tld-0-ntap1-n2
        03/28/2017 08:23:32 - estimated 541 kbytes needed
        03/28/2017 08:23:32 - Info nbjm (pid=22230) started backup (backupid=ntap1-n2.na.toshiba.local_1490714612) job for client ntap1-n2.na.toshiba.local, policy TestNDMP.NAR12.n2, schedule Full on storage unit ssc-vbku-lunp01-hcart3-robot-tld-0-ntap1-n2
        03/28/2017 08:23:33 - Info bpbrm (pid=12149) ntap1-n2.na.toshiba.local is the host to backup data from
        03/28/2017 08:23:33 - Info bpbrm (pid=12149) reading file list for client
        03/28/2017 08:23:33 - Info bpbrm (pid=12149) starting ndmpagent on client
        03/28/2017 08:23:33 - started process bpbrm (pid=12149)
        03/28/2017 08:23:33 - connecting
        03/28/2017 08:23:33 - connected; connect time: 0:00:00
        03/28/2017 08:23:34 - Info ndmpagent (pid=12155) Backup started
        03/28/2017 08:23:34 - Info bpbrm (pid=12149) bptm pid: 12156
        03/28/2017 08:23:34 - Info ndmpagent (pid=12155) PATH(s) found in file list = 1
        03/28/2017 08:23:34 - Info ndmpagent (pid=12155) PATH[1 of 1]: /vsNA1/test_lpd01_u02_nar12/.snapshot/backup_snap
        03/28/2017 08:23:35 - Info bptm (pid=12156) start
        03/28/2017 08:23:35 - Info bptm (pid=12156) using 30 data buffers
        03/28/2017 08:23:35 - Info bptm (pid=12156) using 65536 data buffer size
        03/28/2017 08:23:36 - Info bptm (pid=12156) start backup
        03/28/2017 08:23:36 - Info bptm (pid=12156) Waiting for mount of media id 000086 (copy 1) on server ssc-vbku-lunp01.
        03/28/2017 08:23:36 - mounting 000086
        03/28/2017 08:24:55 - Info bptm (pid=12156) media id 000086 mounted on drive index 5, drivepath nrst11a, drivename HP.ULTRIUM6-SCSI.005, copy 1
        03/28/2017 08:24:55 - mounted 000086; mount time: 0:01:19
        03/28/2017 08:25:04 - Info ndmpagent (pid=12155) ntap1-n2.na.toshiba.local: SCSI: TAPE READ: short read for nrst11a
        03/28/2017 08:25:05 - positioning 000086 to file 1
        03/28/2017 08:25:10 - Info ndmpagent (pid=12155) NDMP Local - tape host and data host match
        03/28/2017 08:25:10 - positioned 000086; position time: 0:00:05
        03/28/2017 08:25:10 - begin writing
        03/28/2017 08:30:13 - Error ndmpagent (pid=12155) NDMP backup failed, path = UNKNOWN
        03/28/2017 08:30:13 - Error ndmpagent (pid=12155) ndmp_data_get_state_failed, status = 12 (NDMP_EOF_ERR)
        03/28/2017 08:30:13 - Critical bpbrm (pid=12149) unexpected termination of client ntap1-n2.na.toshiba.local
        03/28/2017 08:30:13 - Error bptm (pid=12156) io_ioctl_ndmp (MTBSF) failed on media id 000086, drive index 5, return code 18 (NDMP_XDR_DECODE_ERR) (../bptm.c.8439)
        03/28/2017 08:30:14 - Info bptm (pid=12156) EXITING with status 23 <----------
        03/28/2017 08:30:14 - end writing; write time: 0:05:04
        socket read failed (23)
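
For anyone scanning a long job log like the one above, a small filter can pull out just the failure lines and the exit status. This is a minimal sketch that assumes the line layout seen in this paste (timestamp, " - ", severity, process name, pid, message); the names `find_problems` and `exit_status` are made up for illustration, and the sample lines are copied from the log above.

```python
import re
from typing import List, NamedTuple, Optional

LINE_RE = re.compile(
    r"^\s*(?P<when>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) - "
    r"(?P<level>Error|Critical) (?P<proc>\w+) \(pid=(?P<pid>\d+)\) (?P<msg>.*)$"
)
STATUS_RE = re.compile(r"EXITING with status (\d+)")


class Problem(NamedTuple):
    when: str
    level: str
    process: str
    message: str


def find_problems(log_text: str) -> List[Problem]:
    """Return every Error or Critical entry, in the order it appears."""
    problems = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            problems.append(Problem(match["when"], match["level"],
                                    match["proc"], match["msg"]))
    return problems


def exit_status(log_text: str) -> Optional[int]:
    """Return the first 'EXITING with status N' value, or None if absent."""
    match = STATUS_RE.search(log_text)
    return int(match.group(1)) if match else None


SAMPLE = """\
03/28/2017 08:25:10 - begin writing
03/28/2017 08:30:13 - Error ndmpagent (pid=12155) NDMP backup failed, path = UNKNOWN
03/28/2017 08:30:13 - Error ndmpagent (pid=12155) ndmp_data_get_state_failed, status = 12 (NDMP_EOF_ERR)
03/28/2017 08:30:13 - Critical bpbrm (pid=12149) unexpected termination of client ntap1-n2.na.toshiba.local
03/28/2017 08:30:14 - Info bptm (pid=12156) EXITING with status 23 <----------
"""

for problem in find_problems(SAMPLE):
    print(problem.when, problem.level, problem.process, problem.message)
print("exit status:", exit_status(SAMPLE))
```

On this sample the filter reports the two ndmpagent errors and the bpbrm critical entry, followed by exit status 23.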