Walker_Yang1
Level 5
Employee

NetBackup Accelerator handles sparse files differently from normal files. Let's look at what happens to a sparse file.

First, let's look at what a sparse file is.

What is a sparse file?

A sparse file is a file whose apparent size is larger than the amount of storage actually allocated to it. The usual way to create one is to seek past the end of the file and write some new data; Unix-derived systems traditionally do not allocate disk blocks for the skipped portion between the previous end and the new data. The result is a “hole”: a piece of the file that logically exists but is not represented on disk. A read operation on a hole succeeds and returns all zeros.
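
The definition above can be tried out with a small sketch. It creates a 1 MiB file by seeking past the end and writing one byte, then compares the apparent size with the allocated space and reads from the hole. The temporary file and the variable names are illustrative only, and the allocated size reported by du depends on the filesystem (some filesystems do not support holes).

```shell
# Create a 1 MiB file by seeking past the end and writing a single byte.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1 count=1 seek=1048575 2>/dev/null

# Apparent (logical) size in bytes versus space actually allocated in KB.
apparent=$(wc -c < "$f")
allocated_kb=$(du -k "$f" | cut -f1)
echo "apparent: $apparent bytes, allocated: $allocated_kb KB"

# Reading from the hole succeeds; count the non-zero bytes that come back.
nonzero=$(head -c 4096 "$f" | tr -d '\0' | wc -c)
echo "non-zero bytes in the first 4096: $nonzero"

rm -f "$f"
```

On a filesystem that supports sparse files, the allocated figure should be far smaller than the apparent size.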


 

How does NetBackup deal with sparse files?

Relatively smart file archival and backup utilities recognize holes in files. These holes are not stored in the resulting archive and are not filled in when the file is restored from that archive.

NetBackup deals with sparse files in a similar way.

NetBackup identifies the holes in sparse files and does not store them in the backup image. Even if part of a sparse file has been filled with real zero data, NetBackup still treats that data as a hole, so the real zeros are not stored in the backup image either.
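
To make the idea of treating zero data as a hole concrete, here is a minimal sketch that scans a file in fixed-size blocks and counts the blocks containing only zero bytes. This is not NetBackup's implementation, which the article does not describe; the function name count_zero_blocks, the 4096-byte block size, and the sample file are all assumptions made for illustration.

```shell
# Hypothetical helper: print "<all-zero blocks> <total blocks>" for a file.
# Illustration only; it does not reflect how NetBackup scans data internally.
count_zero_blocks() {
  file=$1
  bs=$2
  size=$(wc -c < "$file")
  nblocks=$(( (size + bs - 1) / bs ))
  zero=0
  i=0
  while [ "$i" -lt "$nblocks" ]; do
    # Strip NUL bytes from one block; if nothing is left, the block is all zeros.
    nz=$(dd if="$file" bs="$bs" skip="$i" count=1 2>/dev/null | tr -d '\0' | wc -c)
    if [ "$nz" -eq 0 ]; then
      zero=$((zero + 1))
    fi
    i=$((i + 1))
  done
  echo "$zero $nblocks"
}

# Sample file: 8192 bytes of real zeros followed by 4 bytes of data.
sample=$(mktemp)
{ head -c 8192 /dev/zero; printf 'data'; } > "$sample"
result=$(count_zero_blocks "$sample" 4096)
echo "$result"
rm -f "$sample"
```

A tool that skips such blocks stores nothing for them, whether they are true holes or zeros actually written to disk, which is the behaviour described above.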

How does NetBackup Accelerator back up a sparse file?

Here is an example that illustrates the behaviour.

  1. Create a sparse file.

HOSTNAME:/walker # dd if=/dev/null of=spars-file1 bs=1k seek=2097152 count=1

0+0 records in

0+0 records out

0 bytes (0 B) copied, 1.4823e-05 s, 0.0 kB/s

HOSTNAME:/walker # ls -ls spars-file1

0 -rw-r--r-- 1 root root 2147483648 Nov 27 12:46 spars-file1

  2. Fill the first 500 MB of the sparse file with real zero data.

HOSTNAME:/walker # dd if=/dev/zero of=spars-file1 bs=1k count=512000 conv=notrunc

512000+0 records in

512000+0 records out

524288000 bytes (524 MB) copied, 1.3924 s, 377 MB/s

HOSTNAME:/walker # ls -ls spars-file1

512504 -rw-r--r-- 1 root root 2147483648 Nov 27 12:46 spars-file1

The inode details show how many blocks are now allocated to the file:

 

Inode: 2346845   Type: regular    Mode:  0644   Flags: 0x0

Generation: 3652504436    Version: 0x00000000

User:     0   Group:     0   Size: 2147483648

File ACL: 0    Directory ACL: 0

Links: 1   Blockcount: 1025008

Fragment:  Address: 0    Number: 0    Size: 0

ctime: 0x50b445af -- Tue Nov 27 12:46:39 2012

atime: 0x50b44588 -- Tue Nov 27 12:46:00 2012

mtime: 0x50b445af -- Tue Nov 27 12:46:39 2012

Size of extra inode fields: 4

BLOCKS:

(0-11):9496038-9496049, (IND):9496050, (12-1035):9496051-9497074, (DIND):9497075, (IND):9497076, (1036-2059):9497077-9498100,

……

(125964-126987):9777086-9778109, (IND):9778110, (126988-127999):9778111-9779122

TOTAL: 128126

 

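The apparent size is 2147483648 bytes (2 GiB). The filesystem has allocated 128126 blocks to the sparse file (data blocks plus indirect blocks). At 4 KB per block, that gives a physical size of 512504 KB, which matches the first column of the ls -ls output.

  3. Start the first full backup with Accelerator.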
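
The numbers in the inode dump can be cross-checked with shell arithmetic. This sketch assumes a 4 KB filesystem block size (inferred from 512504 / 128126, not stated in the output) and assumes that Blockcount is expressed in 512-byte units.

```shell
blocks=128126      # TOTAL from the block list
block_kb=4         # assumed filesystem block size in KB
sectors=1025008    # Blockcount from the inode, assumed to be 512-byte units
written_kb=512000  # amount written by dd (512000 records of 1 KB)

from_blocks=$(( blocks * block_kb ))       # physical size derived from TOTAL
from_sectors=$(( sectors * 512 / 1024 ))   # physical size derived from Blockcount
data_blocks=$(( written_kb / block_kb ))   # blocks that hold the written zeros
indirect=$(( blocks - data_blocks ))       # remainder: indirect blocks

echo "from TOTAL: $from_blocks KB, from Blockcount: $from_sectors KB"
echo "data blocks: $data_blocks, indirect blocks: $indirect"
```

Both derivations agree with the 512504 shown by ls -ls, and the difference between TOTAL and the data blocks accounts for the (IND) and (DIND) entries in the block list.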

Only 2560 bytes of data were sent to the server, as the job details show:

info bpbkar (pid=466958) accelerator sent 2560 bytes out of 2560 bytes to server, optimization 0.0%

 

The bpbkar log shows the same amount of data sent to the server:

bpbkar main: JBD - accelerator sent 2560 bytes out of 2560 bytes to server, optimization 0.0%

 

Although the sparse file occupies about 500 MB of disk space, only 2560 bytes of data are sent to the server and stored in the backup image. What happened?

As stated above, NetBackup treats zero data as a hole, and holes are not stored in the backup image. Here NetBackup treats the 500 MB of zeros as a hole and does not store it, so only a small amount of data ends up in the backup image.

 

When the sparse file is backed up with Accelerator a second time, there are two possible situations.

1) If the sparse file has not changed since the last full backup, Accelerator speeds up the backup, and the optimization rate can reach about 99%.

2) If the sparse file has changed since the last full backup, Accelerator does not speed up the backup. It backs up the whole sparse file again; that is, it sends the entire file to the server, not just the changed data.
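
The optimization figure in the log lines can be read as the share of data that did not have to be sent. The formula below is an assumption that is consistent with the log line shown earlier (2560 out of 2560 bytes gives 0.0%); it is not taken from NetBackup documentation. The function name optimization_pct is hypothetical, and the numbers in the second call are made up purely to illustrate the unchanged-file case.

```shell
# Hypothetical helper: optimization = (1 - sent / total) * 100.
# Assumed formula; total must be non-zero.
optimization_pct() {
  awk -v sent="$1" -v total="$2" 'BEGIN { printf "%.1f", (1 - sent / total) * 100 }'
}

# Values from the first full backup in this article.
first=$(optimization_pct 2560 2560)

# Made-up values for an unchanged file on a later backup.
unchanged=$(optimization_pct 1024 102400)

echo "first full backup: $first%"
echo "unchanged file (hypothetical numbers): $unchanged%"
```

Under this reading, a changed sparse file behaves like the first case: everything is sent again, so the optimization falls back to 0.0%.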

thanks

 

Last update: 07-08-2014 05:07 PM