Forum Discussion

t_jadliwala
10 years ago

Rehydration is slow from MSDP (5230) to SL500 robotic library

Hello Friends,
I need your help to fix an issue. I am running NetBackup in a guc cluster.
On the main site there are 3 media servers:
1) Main media 1 - Windows VMware host
2) Main media 2 - 5220 appliance
3) Main media 3 - 5230 appliance
The same setup exists on the DR site:
1) dr Main media 1 - Windows VMware host
2) dr Main media 2 - 5220 appliance
3) dr Main media 3 - 5230 appliance
I have created a storage unit 


We have an SL500 robotic library, which is connected to dr main media 2.

Every month we do a tape-based backup.

I have created an SLP in which the first backup goes to the Main media 3 (5230) appliance, is then duplicated to the dr main media 3 (5230) appliance, and is then duplicated to the SL500.

When it goes from the dr main media 3 server to the robotic library, even 1 GB of backup takes close to 50 minutes. Is there a way to increase the performance of the backup?

 

drmain-media-3.Settings> NetBackup DataBuffers Number Show
NUMBER_DATA_BUFFERS : 30 (Default)
NUMBER_DATA_BUFFERS_DISK : 30 (Default)
NUMBER_DATA_BUFFERS_FT : 16 (Default)
NUMBER_DATA_BUFFERS_RESTORE : 30 (Default)


main-media-3.Settings> NetBackup DataBuffers Size Show
SIZE_DATA_BUFFERS : 262144 (Default)
SIZE_DATA_BUFFERS_DISK : 262144 (Default)
SIZE_DATA_BUFFERS_FT : 262144 (Default)


main-media-2.Settings> NetBackup DataBuffers Number Show
NUMBER_DATA_BUFFERS : 30 (Default)
NUMBER_DATA_BUFFERS_DISK : 30 (Default)
NUMBER_DATA_BUFFERS_FT : 16 (Default)
NUMBER_DATA_BUFFERS_RESTORE : 30 (Default)


main-media-2.Settings> NetBackup DataBuffers Size Show
SIZE_DATA_BUFFERS : 262144 (Default)
SIZE_DATA_BUFFERS_DISK : 262144 (Default)
SIZE_DATA_BUFFERS_FT:262144


dr main-media-3.Settings> NetBackup DataBuffers Size Show
SIZE_DATA_BUFFERS : 262144 (Default)
SIZE_DATA_BUFFERS_DISK : 262144 (Default)
SIZE_DATA_BUFFERS_FT : 262144 (Default)


dr main-media-2.Settings> NetBackup DataBuffers Number Show
NUMBER_DATA_BUFFERS : 30 (Default)
NUMBER_DATA_BUFFERS_DISK : 30 (Default)
NUMBER_DATA_BUFFERS_FT : 16 (Default)
NUMBER_DATA_BUFFERS_RESTORE : 30 (Default)


dr main-media-2.Settings> NetBackup DataBuffers Size Show
SIZE_DATA_BUFFERS : 262144 (Default)
SIZE_DATA_BUFFERS_DISK : 262144 (Default)
SIZE_DATA_BUFFERS_FT:262144
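
For reference, the buffer memory one tape stream uses is the number of data buffers multiplied by the buffer size. Below is a minimal sketch of that arithmetic with the default values shown above. The function name buffer_bytes is made up for illustration, and multiplexing and multiple drives, which multiply the total further, are ignored.

```shell
# Bytes of buffer memory for one stream: number of buffers times buffer size.
# buffer_bytes is a hypothetical helper, not a NetBackup command.
buffer_bytes() {
    echo $(( $1 * $2 ))
}

# Defaults reported by the appliances above: 30 buffers of 262144 bytes each.
buffer_bytes 30 262144
```

With the defaults this comes to 7864320 bytes (7.5 MiB) per stream, so there is usually room to experiment with larger values.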


Your help is highly appreciated.

Below are the job details 

11/24/2014 10:52:40 AM - Info bptm(pid=27132) start            
11/24/2014 10:52:40 AM - started process bptm (27132)
11/24/2014 10:52:40 AM - Info bptm(pid=27132) start backup           
11/24/2014 10:52:41 AM - Info bpdm(pid=27149) started            
11/24/2014 10:52:41 AM - started process bpdm (27149)
11/24/2014 10:52:41 AM - Info bpdm(pid=27149) reading backup image          
11/24/2014 10:52:41 AM - Info bpdm(pid=27149) using 30 data buffers         
11/24/2014 10:52:41 AM - Info bpdm(pid=27149) requesting nbjm for media         
11/24/2014 10:52:41 AM - Info bptm(pid=27132) Waiting for mount of media id DG002 (copy 3) on server dr-media-2. 
11/24/2014 10:52:41 AM - started process bptm (27132)
11/24/2014 10:52:41 AM - mounting DG002
11/24/2014 10:52:41 AM - Info bptm(pid=27132) INF - Waiting for mount of media id DG002 on server dr-media-2 for writing.
11/24/2014 10:52:42 AM - begin reading
11/24/2014 10:52:56 AM - begin Duplicate
11/24/2014 10:52:56 AM - requesting resource LCM_dr-media-2-hcart-robot-tld-0
11/24/2014 10:52:56 AM - granted resource LCM_dr-media-2-hcart-robot-tld-0
11/24/2014 10:52:56 AM - started process RUNCMD (25835)
11/24/2014 10:52:57 AM - requesting resource dr-media-2-hcart-robot-tld-0
11/24/2014 10:52:57 AM - requesting resource @aaaaw
11/24/2014 10:52:57 AM - reserving resource @aaaaw
11/24/2014 10:52:57 AM - reserved resource @aaaaw
11/24/2014 10:52:57 AM - granted resource DG002
11/24/2014 10:52:57 AM - granted resource Drive006
11/24/2014 10:52:57 AM - granted resource dr-media-2-hcart-robot-tld-0
11/24/2014 10:52:57 AM - granted resource MediaID=@aaaaw;DiskVolume=PureDiskVolume;DiskPool=dp_disk_dr-media-3;Path=PureDiskVolume;StorageServer=dr-media-3;MediaServer=dr-media-2
11/24/2014 10:52:57 AM - ended process 0 (25835)
11/24/2014 10:53:29 AM - Info bptm(pid=27132) media id DG002  mounted on drive index 6, drivepath /dev/nst8, drivename Drive006, copy 3
11/24/2014 10:53:35 AM - Info bptm(pid=27132) waited for full buffer 0 times, delayed 0 times    
11/24/2014 10:53:36 AM - end reading; read time: 00:00:54
11/24/2014 10:53:36 AM - Info bpdm(pid=27149) completed reading backup image         
11/24/2014 10:53:36 AM - Info bpdm(pid=27149) using 30 data buffers         
11/24/2014 10:53:36 AM - begin reading
11/24/2014 10:53:40 AM - Info bptm(pid=27132) waited for full buffer 10 times, delayed 226 times    
11/24/2014 10:54:49 AM - end reading; read time: 00:01:13
11/24/2014 10:54:49 AM - Info bpdm(pid=27149) completed reading backup image         
11/24/2014 10:54:50 AM - Info bpdm(pid=27149) using 30 data buffers         
11/24/2014 10:54:50 AM - begin reading
11/24/2014 11:20:31 AM - Info bptm(pid=27132) waited for full buffer 222 times, delayed 375 times    
11/24/2014 11:27:01 AM - Info bptm(pid=27132) EXITING with status 0 <----------        
11/24/2014 11:27:01 AM - end reading; read time: 00:32:11
11/24/2014 11:27:01 AM - Info bpdm(pid=27149) completed reading backup image         
11/24/2014 11:27:01 AM - Info bpdm(pid=27149) EXITING with status 0         
11/24/2014 11:27:01 AM - Info dr-media-2(pid=27149) StorageServer=PureDisk:dr-media-3; Report=PDDO Stats for (dr-media-3): read: 1162010 KB, CR received: 1162586 KB, CR received over FC: 0 KB, dedup: 0.0%
11/24/2014 11:27:19 AM - end Duplicate; elapsed time: 00:34:23
the requested operation was successfully completed(0)
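
To put a number on the job above: the PDDO stats line reports 1162010 KB read, and the duplicate step took 00:34:23. Below is a minimal sketch of the rate calculation. The helper names are made up for illustration and are not NetBackup commands.

```shell
# Convert an elapsed time of the form HH:MM:SS into seconds.
elapsed_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Average rate in KB/s for a given amount of data (in KB) and an elapsed time.
rate_kb_per_sec() {
    kb=$1
    secs=$(elapsed_to_seconds "$2")
    awk -v kb="$kb" -v secs="$secs" 'BEGIN { printf "%.1f\n", kb / secs }'
}

# Figures taken from the job details above.
rate_kb_per_sec 1162010 00:34:23
```

For these figures it prints 563.3, i.e. roughly 0.55 MB/s averaged over the whole duplicate step, tape mount time included.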

  • Duplication is going over the network: from dr-media-3 to dr-media-2-hcart-robot-tld-0.

    Zone the tape drives to dr-media-3 as well and configure the drives as shared.

    Local duplication should give better performance than network duplication.

    PS:
    Please always share NBU/appliance versions in new discussions.
    Various improvements and enhancements are introduced in recent versions.

    • andy8171
      Level 1

      What do you mean by "Local duplication should give better performance than network duplication"?

      We have a Quantum Scalar i3 connected directly to an NBU 5220. Rehydration to tape is fast for Exchange (non-GRT) backups but takes hours for other SLP copies. What do you mean by local duplication? Shouldn't images that are already on disk go out to tape fast, since the library is directly connected?

      • Marianne
        Level 6

        This is an O-L-D discussion that was all about a particular user's specific environment where his setup was doing duplication across the network.

        Please start a new discussion where you give details of your specific environment and issues.

  • - That job did around 34 MB/sec, which is really not bad; I have definitely seen worse.

    - How does a backup written directly to tape perform? If it is slow, troubleshoot that first.

    - That said, there do appear to be waits for full buffers, meaning the tape drive was waiting on dedup.

    - Try:

    echo 256 > /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS
    echo 1048576 > /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS_DISK
    echo 512 > /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS_DISK
    echo 0 > /usr/openv/netbackup/NET_BUFFER_SZ
    echo 0 > /usr/openv/netbackup/NET_BUFFER_SZ_REST

    And try these 3 settings in contentrouter.cfg:

    PrefetchThreadNum=8 ...to speed up prefetching
    MaxNumCaches=1024 ...increases # of data container files kept open, avoiding frequent open/close
    ReadBufferSize=262144

    If the above doesn't help enough, you can try:
    PrefetchThreadNum=16 in /usr/openv/pdde/pdcr/etc/contentrouter.cfg
    PREFETCH_SIZE = 67108864 in /usr/openv/lib/ost-plugins/pd.conf

    Another item that can help: reduce the total number of concurrent tape jobs by reducing the Maximum concurrent write drives setting on the destination tape storage unit. Reduce it to 6 and test, then try 4 and test, and then go all the way down to 2 and test.

     

    Realize that if the backup image in question has poor segment locality (its segments are spread across many containers in dedup storage), then the read takes much longer.

    If the above does not help, open a support case so that support can help determine why certain dedup images duplicate to tape slowly.
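
The echo commands in the reply above overwrite whatever is currently in those touch files, so it can be worth saving the old value first to make a change easy to roll back. Below is a minimal sketch, assuming a helper called set_touch_file, which is made up for illustration and is not a NetBackup command.

```shell
# set_touch_file FILE VALUE
# Hypothetical helper: save the current contents of FILE (if it exists)
# to FILE.bak, then write VALUE into FILE.
set_touch_file() {
    file=$1
    value=$2
    if [ -f "$file" ]; then
        cp -p "$file" "$file.bak"
    fi
    echo "$value" > "$file"
}

# Example, left commented out so nothing changes unless you uncomment it:
# set_touch_file /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS 256
```

Changing one value at a time and re-running the same duplication between changes makes it easier to tell which setting actually helped.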