Forum Discussion

DPO
Level 6
5 years ago

Understanding data size written on tapes

We have a backup copy for one of our clients, containing various images, with a total size of 15 TB. We have LTO-6 tapes in our environment (with a rated compressed capacity of 6.25 TB each). But when we checked, the data had consumed only two tapes. We don't use NetBackup compression, so I think the tape library hardware must be doing hardware compression. How do I interpret the actual size on the tapes? Two tapes can accommodate at most 12.5 TB, so how did 15 TB of data fit onto them?

Are there any supporting documents to help understand this? I also verified in the catalog that all the images are present on only two tapes.

Also, what block size do NetBackup appliances use when writing to tape? If we try to import these tapes in another environment, do we need to set the block size to match the original backup server?

Would appreciate a prompt response.

 

  • Block size is whatever the buffer is set to in /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS

    For example, if you had 262144 in here, the buffer size is 256k (262144 / 1024).

    The default, if no file exists, is 64k, which will give awful performance.  There's no way to say straight off which settings are best; it's a tuning value and differs between systems.  A good starting point for LTO drives is 262144 or 524288.  NUMBER_DATA_BUFFERS also plays a part; 128 or 256 is a starting point.  The default of 30 is not sufficient.
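
    For example, to set them on a Unix media server (a minimal sketch; create the touch files if they don't already exist - the new values take effect for jobs that start after the change):

        # Tape buffer size in bytes (262144 bytes = 256k)
        echo 262144 > /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS

        # Number of in-memory buffers per drive (a plain integer)
        echo 256 > /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS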

    You cannot change the size of the buffers for restore; if the tape was written at 256k, it will be read at 256k - there is no way to change it.

    LTO uses hardware compression by default, or should do - this is far more efficient than software compression, and depending on the data type you can get varying amounts of data on a tape.  If, for example, you are writing sparse database files, you can fit very large amounts on (well, if the files are mostly empty).

    There are many posts around this subject.  In summary, NBU has no control over the amount of data on a tape; it doesn't even understand tape size.  Left alone, it would try to write to the same tape forever.  NBU only marks a tape full when end-of-tape is detected by the tape drive; the tape driver then informs the OS, which in turn informs NBU that the tape is full.  So if you have an issue with the amount of data on a tape (usually only an issue when tapes are marked full below the native, uncompressed, capacity), the problem is with the tape driver, the firmware, or possibly a faulty drive - it's impossible for NBU to be the cause.

    • DPO
      Level 6

      Thank you again...

      One last question,

      Does the size of the data buffers have a real impact during restore? Out of the 2 tapes, we were able to import 1 successfully, while the other reported a read error: "bpimport status = media read error" on the destination master server.

      When we checked on the source master server, there were no write errors reported. What else could be the problem? The same drives were able to read one tape but failed on the other.

      • mph999
        Level 6

        I missed a bit

        "Also we took another copy of the same images earlier but that time it took 3 tapes and the 3 of them were never used(scratch tapes). How would I technically justify this?"

        Kinda following on from my previous comment: talk to the tape drive vendor.  As per my explanation, there is absolutely nothing we can do to control this in NBU.

        Yes, buffer size can make a massive difference in performance - it can be the difference between the drive running at or near its maximum speed, and running so slowly it would be quicker to write the data out by hand ...

        Drives running slow have to stop-start: they run out of data to write and have to wait.  This can absolutely wreck them - LTO drives are not designed for this, they need to 'stream' - so buffer size has more than just a performance impact.

        "bpimport status = media read error"
        Impossible to say - NBU does not read or write to tapes itself; that is all done by the operating system.  We just send the data and say "please get this onto a tape using buffer size X".  The majority of read/write errors occur outside of NBU (despite popular opinion) - usually we are looking at worn-out drives, faults, firmware issues, etc ...

        Tape in theory is massively reliable, more so than disk, but only if treated correctly - kept in the correct environmental conditions, always streamed to, and not handled roughly (dropped, knocked about when being moved) ...  Even so, it will wear out eventually, depending on its age.  That said, it's harder to write to a tape than to read it (the 'signal' strength needs to be higher to write), so for a worn-out tape I'd expect the writes to start failing before the reads.

        It could be worth trying the tape in another drive; sometimes drives have an issue only with certain tapes (usually when things are older and getting worn).

  • If you run the command "bpmedialist -m <MEDIA_ID>" for each of the two tapes, the summary output will show what NetBackup has written to the tape (including the number of images and kbytes written).

    If you run the same command with the -mlist option ("bpmedialist -mlist -m <MEDIA_ID>"), the output will contain details about each image fragment written to the tape.

    Refer to the commands reference guide for more details on the bpmedialist command.
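
    For instance, with two hypothetical media IDs ABC001 and ABC002 (paths assume a default Unix install):

        # Summary per tape: number of images, fragments, kbytes written
        for m in ABC001 ABC002; do
            /usr/openv/netbackup/bin/admincmd/bpmedialist -U -m "$m"
        done

        # Per-fragment detail for one of the tapes
        /usr/openv/netbackup/bin/admincmd/bpmedialist -mlist -m ABC001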

    The fact that you have managed to "squeeze" 15TB of data onto two LTO6 tapes is good fortune - it implies your data is very compressible. The 6.25TB figure is only a marketing average; the native capacity of the tape is 2.5TB, and anything beyond that is due to compression. Depending on the compressibility of the data being written, the actual amount one tape can store will vary from 2.5TB upwards.
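
    As a rough sanity check, assuming both tapes were filled to capacity:

        2 tapes x 2.5 TB native = 5 TB of raw tape
        15 TB / 5 TB            = 3:1 effective compression ratio

    That is a little better than the 2.5:1 ratio behind the 6.25TB marketing figure (2.5 TB x 2.5 = 6.25 TB), so it is entirely plausible for compressible data.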

    • DPO
      Level 6

      Thanks for the prompt response. I'm very interested in this - do we have any supporting documents/articles for it?

      "The 6.25TB figure is only a marketing average; the native capacity of the tape is 2.5TB, and anything beyond that is due to compression. Depending on the compressibility of the data being written, the actual amount one tape can store will vary from 2.5TB upwards."

      Also, we took another copy of the same images earlier, but that time it took 3 tapes, and all 3 of them had never been used before (scratch tapes). How would I justify this technically?

      Also, what block size do NetBackup appliances use when writing to tape? If we try to import these tapes in another environment, do we need to set the block size to match the original backup server?

  • I have LTO5 tapes, and I get full tapes anywhere from 1.4 to 5.3TB, averaging around 3TB per tape. 

    I track my tape usage for this very reason, so I can guesstimate tape needs.

    Total TB / native capacity will always come out to MORE tapes than you actually need, which is good - you always want to err on the side of too many tapes.

    I always caution people to factor in the number of drives, since however many drives the data is spread over, you will end up with that many partially full tapes...
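
    A back-of-the-envelope version of that estimate (a sketch; the average-per-tape figure comes from my own tracking, and the drive count here is hypothetical):

        # Rough tape estimate: divide total TB by the measured average
        # per tape (rounding up), then add one partially full tape per
        # drive the job is spread across.
        TOTAL_TB=115        # data to be vaulted
        AVG_TB_PER_TAPE=3   # measured LTO5 site average
        DRIVES=4            # hypothetical drive count
        TAPES=$(( (TOTAL_TB + AVG_TB_PER_TAPE - 1) / AVG_TB_PER_TAPE + DRIVES ))
        echo "Estimated tapes needed: $TAPES"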

    1/19 I vaulted 60 tapes with 115TB of data

    1/20 I vaulted 60 tapes with 147TB of data

    1/31 I vaulted 66 tapes with 180TB of data!

    • mph999
      Level 6

      The reason phase 1 works is that it only reads the backup headers on the tape; it kinda skips over the actual data.

      When you run phase 2, it actually reads all the data, which is why it takes longer - and it is something in the data part of the backup that is causing the issue.

      • davidmoline
        Level 6

        Further to what mph999 says: because of what it reads, the phase 1 import is less affected by the block size used to create the tape, while the phase 2 import, which reads the actual data, is very dependent on it.

        Are the tapes from the same source, and would you expect them to have been created using the same block size, etc.?

        For the import, the relevant log files will be on the media server, not the master (unless they are one and the same, of course).
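
        For reference, a minimal two-phase import sketch (ABC001 and master1 are hypothetical names; the tape is read at whatever block size it was written with, so the importing media server's tape driver must be able to handle that size):

            # Phase 1: read the backup headers and recreate catalog entries
            /usr/openv/netbackup/bin/admincmd/bpimport -create_db_info -id ABC001 -server master1

            # Phase 2: read the actual data and import the images
            /usr/openv/netbackup/bin/admincmd/bpimport -id ABC001

            # If phase 2 fails, check the bptm log on the media server
            ls -l /usr/openv/netbackup/logs/bptm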