
Netbackup Backup Replication

TP81
Level 2

Hi, I have two Windows 2012 R2 servers in different domains, each running an NBU 7.6 master server.  I am trying to replicate backups from one server to the other so that if one server goes down, there is a full backup on the other.  I have tried using AIR to achieve this, but I don't seem to be getting the full image copied over to the other server.  I understand that AIR uses dedupe, so any data that is already on the target server won't be copied again, but even after the first backup the amount of data replicated is nowhere near what was backed up.  The backup job and replication job both show as successfully completed on one server, and the import shows as successful on the other.

I followed the guide here: http://www.settlersoman.com/how-to-configure-netbackup-auto-image-replication-air/ as I could not find an official Veritas guide.

Is there something wrong with this setup or is there another way to do what I want?


11 REPLIES

sdo
Moderator
Partner    VIP    Certified

If you could attach, as text files, the activity monitor detail for the backup job, the replication job, and the import job... then we can supply you with some commands to validate the backup images.

Please don't copy/paste the activity monitor text in to a reply.  Thanks.

Attached are the 3 files showing the job detail of my 20GB test file.

Thanks

sdo
Moderator
Partner    VIP    Certified

All looks good so far.


From the backup job:
image: ted_1295970329
scanned: 20971536 KB, CR sent: 665467 KB, CR sent over FC: 0 KB, dedup: 96.8%
...size: 20,971,536 KB = about 20 GB
...new data sent: 665,467 KB = about 0.6 GB
...data already seen (deduped): about 19.4 GB (i.e. 96.8% of 20 GB)

Replication job:
image: ted_1295970329
scanned: 20971536 KB, CR sent: 665211 KB, CR sent over FC: 0 KB, dedup: 96.8%
...very similar numbers.

Import job:
Image: ted_1295970329
read: 8 KB, CR received: 2 KB, CR received over FC: 0 KB, dedup: 0.0%
...does look odd, but if the receiving/importing site already had a copy of nearly all of the 665MB of data that was once new at the original backup domain / site, then this is entirely normal.


The image date appears to be:
# bpdbm -ctime 1295970329
1295970329 = Tue Jan 25 07:45:29 2011
...which looks oddly/remarkably old.

Try these commands on both masters:

# bpclimagelist -U -Listseconds -client ted -s 01/01/1970 00:00:00 -ct 13

# bplist -B -l -c -t 13 -C ted -s 25/01/2011 07:45:29 -e 25/01/2011 07:45:29 -R 999 -I -PI *
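
...and, if it helps, one more that I sometimes use (my addition, not something you strictly need - and assuming the backup ID really is the client name plus the ctime, i.e. ted_1295970329) to list the copies of that image on each master:

# bpimagelist -U -backupid ted_1295970329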

 

Of course, the acid test is always to perform a restore in the second NetBackup domain and compare MD5/SHA256 checksums of both files (original and restored), or even manually copy over the original and then diff / fc the two files.
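
On Windows the checksum / compare step needs nothing special - for example (just a sketch, the file names below are made up):

C:\> certutil -hashfile D:\data\testfile.bin SHA256
C:\> certutil -hashfile D:\restore\testfile.bin SHA256

...or a straight binary compare of the two copies:

C:\> fc /b D:\data\testfile.bin D:\restore\testfile.bin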

sdo
Moderator
Partner    VIP    Certified

Gosh - I'm being a doughnut... the replication job shows c. 665MB replicated.  The import size is just the image header, and so only reports a token amount of meta-data.

To me, your backup and replication and import jobs all look totally fine.  That's the wonders of dedupe.

Roughly 19.4 GB of the original file data must have been seen before by both sites.

sdo
Moderator
Partner    VIP    Certified

To see the backup, replication and import process for a file which won't (or shouldn't - though parts still might) deduplicate...

...here's a simple script to create a file containing random data:

Option Explicit
' Creates a file of (pseudo) random data - handy for testing dedupe behaviour.
' Usage:  cscript make-random.vbs [output-file] [size-in-MB]
Dim go_fso
Dim gs_script_spec, gs_script_path, gs_script_name
Dim gs_arg_file, gs_arg_mb, gl_arg_mb
  Call s_init()
' Call s_method1()          ' method 1: slower, writes straight to disk, uses almost no RAM
  Call s_method2()          ' method 2: faster, but buffers the whole file in RAM
WScript.Quit

Sub s_init()
  ' Parse arguments: P1 = output file name (defaults to <script name>.bin), P2 = size in MB (defaults to 1)
  Set go_fso = CreateObject( "Scripting.FileSystemObject" )
  gs_script_spec = WScript.ScriptFullName
  gs_script_path = go_fso.GetParentFolderName( gs_script_spec )
  gs_script_name = go_fso.GetBaseName(         gs_script_spec )
  gs_arg_file = gs_script_name & ".bin"
  gl_arg_mb   = 1
  Select Case WScript.Arguments.Count
  Case 0
  Case 1
    gs_arg_file = WScript.Arguments.Item( 0 )
  Case 2
    gs_arg_file = WScript.Arguments.Item( 0 )
    gs_arg_mb   = WScript.Arguments.Item( 1 )
    If Not IsNumeric( gs_arg_mb ) Then Call s_abort( "P2 must be numeric size for MB" )
    gl_arg_mb = CLng( gs_arg_mb )
    If gl_arg_mb < 1 Then Call s_abort( "P2 must be 1 or more" )
  Case Else
    Call s_abort( "too many arguments, only 2 allowed" )
  End Select
  Randomize
End Sub

Sub s_abort( ps_text )
  WScript.Echo ps_text
  WScript.Quit( 1 )
End Sub

' Method 1: build 1 MB of random characters at a time and write each chunk straight to a text file (slow, minimal RAM).
Sub s_method1()
  Dim lo_chan, ls_out, ls_add, ll_i, ll_j, ll_MB
  Set lo_chan = go_fso.CreateTextFile( gs_arg_file, True )
  For ll_MB = 1 To gl_arg_mb
    WScript.StdOut.Write vbCr & ll_MB & " "
    ls_out = ""
    For ll_i = 1 To 1023
      Randomize
      ls_add = ""
      For ll_j = 1 To 1024
        ls_add = ls_add & Chr( Int( Rnd * 256 ) )
      Next
      ls_out = ls_out & ls_add
    Next
    ls_add = ""
    For ll_j = 1 To 1022
      ls_add = ls_add & Chr( Int( Rnd * 256 ) )
    Next
    ls_out = ls_out & ls_add
    lo_chan.WriteLine ls_out
  Next
  lo_chan.Close
End Sub

' Method 2: build random binary data in 1 KB buffers, accumulate the whole file in an ADODB binary
' stream in RAM, then save it to disk in one go (fast, but RAM-hungry).
Sub s_method2()
  Dim lo_stream, lx_data(511), ll_i, ll_j, ls_buffer, ll_MB
  Set lo_stream = CreateObject( "ADODB.Stream" )
  lo_stream.Type = 1                              ' 1 = adTypeBinary
  lo_stream.Open
  For ll_MB = 1 To gl_arg_mb
    WScript.StdOut.Write vbCr & ll_MB & " "
    For ll_j = 1 To 1024
      Randomize
      For ll_i = LBound( lx_data ) To UBound( lx_data )
        lx_data( ll_i ) = ChrW( ( Int( Rnd * 256 ) * 256 ) + Int( Rnd * 256 ) )
      Next
      ls_buffer = Join( lx_data, "" )

      With CreateObject( "ADODB.Stream" )
        .Type = 2                                 ' 2 = adTypeText
        .Open
        .WriteText ls_buffer
        .Position = 2                             ' skip the 2-byte UTF-16 LE BOM before copying
        .CopyTo lo_stream
      End With
    Next
  Next
  lo_stream.SaveToFile gs_arg_file, 2
  lo_stream.Close
End Sub

.

Save it as:   C:\temp\make-random.vbs

.

Run it with:

C:\temp> cscript make-random.vbs a.bin 100

...to create a file named "a.bin" containing 100 MB of random bytes.

Thanks for taking a look.

I thought this was the first time the data had been seen, so I was expecting >20GB to be replicated.  If the backups and replications expired, would that cause the full 20GB to be replicated again?

sdo
Moderator
Partner    VIP    Certified

Any prior, still-retained backups containing similar data will always be used to dedupe currently running and future backups, as well as any incoming duplications and replications.

It takes a day or so for an individual MSDP pool to truly, fully de-reference chunks (i.e. blocks of data from prior, now-expired backups) which are no longer referenced by any still-retained backups... i.e. it takes a little while for these now-lonely chunks to be expunged from the dedupe meta-data and container files within the MSDP pool.

So, you could back up a truly random file and expire that backup, then back up the same truly random file again, and still have it dedupe 100%, because the blocks from the first backup have not yet been expunged.
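
If you ever want to watch this happen, one way (my suggestion only - the path below is just where the MSDP tools usually sit on Windows, so adjust for your install) is to run the MSDP crcontrol utility on the media server that hosts the pool and keep an eye on the pool statistics over a couple of days:

<install path>\Veritas\pdde\crcontrol.exe --dsstat

...the space used should only drop once the de-referenced chunks have actually been expunged.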

Thanks, that clears everything up for me.

Thanks for the script, I'll give it a try; that should give me an idea of what the performance will be.

sdo
Moderator
Partner    VIP    Certified

There's actually a benefit to this behaviour, or at least a good design reason for it.  For example:

1) a weekly full backup runs on Sat 4th July at 18:00:00, so the backup image is date-stamped then, is saved to MSDP, and has a one-week retention from the schedule or SLP, so it can be expired in exactly one week - and this backup is quite large and takes 14 hours to run.

2) no daily backups.

3) on Saturday 11th July at 18:00:00 two things happen - the backup from the previous week could expire immediately (or any time shortly after 18:00:00, but definitely within 12 hours, i.e. before the second backup has completed), and a new weekly full backup starts at 18:00 which also takes 14 hours.

.

Luckily for us, because the dereferenced chunks from the first backup have not yet been expunged from the dedupe container files, and their hash fingerprints have not yet been removed, the MSDP dedupe engine still has both the fingerprints and the original chunks.  It can effectively recall and bring back to life the dereferenced chunks that were pending final deletion, and so is able to dedupe the second backup without having to write the entire backup to the dedupe disk pool all over again.

sdo
Moderator
Partner    VIP    Certified

For anyone scoping and sizing backups - this has implications.  E.g. if you back up 1TB of compressed Oracle RMAN backups (i.e. totally random data, i.e. dedupe poison) every week, and have an eight-week retention in MSDP, then you really need to size for nine weeks, i.e. 9TB of MSDP disk usage.
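
Put another way (just my rough rule of thumb, not an official sizing formula):

required MSDP capacity  ≈  weekly non-dedupable data  x  (retention in weeks + 1)  =  1 TB  x  (8 + 1)  =  9 TB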

Of course, any backup admins and engineers actually using dedupe (from any vendor) will have already negotiated with their DBAs (SQL and/or Oracle and/or others) to get the database backups written in "non-compressed" form.  You did, didn't you?  ;)

But you get the idea... there is a slight delay in expunging the completely dereferenced chunks.  This is not a NetBackup-specific shortcoming; AFAIK pretty much all dedupe engines behave this way.

sdo
Moderator
Partner    VIP    Certified

You want to be a bit careful with that script.

It contains two methods of writing a file of random contents.

Method 1 - writes directly to a file, slower, but uses virtually no RAM.

Method 2 - buffers the file contents in RAM, faster, but uses RAM.

.

So the script above has method1 commented out and method2 active.

So, if I were to, say, run:  cscript make-random.vbs big-file.bin 2000

...then this will consume 2GB of RAM on the server, before the contents are flushed to disk.

So, if you want to create a 20GB file of random data, then you have two options:

1) run the script 20 times, creating a 1GB file each time, and then append the 20 files together (a rough sketch of this follows below).

2) amend the script to comment out the call to method2 and uncomment the call to method1, then run it for 20GB and just accept that it will take quite a while to run.
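
For option 1, something along these lines should do it from a command prompt (a rough sketch only - the file names are mine, and remember to double the % signs if you put the loop in a batch file):

C:\temp> for /L %i in (1,1,20) do cscript //nologo make-random.vbs part%i.bin 1000

C:\temp> copy /b part*.bin big-file.bin

...the order in which copy /b concatenates the parts doesn't matter, because the content is random anyway.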

HTH.