Forum Discussion

Sid1987
Level 6
12 years ago

Optimizing duplication of SLPs

Hi Team,

  I have an SLP which usually has an enormous amount of backlog data in it. I understand that a single backup image of the client can be huge, in the TBs. How can I get these duplications to run successfully? Many of them fail with status code 50 after replicating around 800 GB.

Here is the LIFECYCLE_PARAMETERS content.

MIN_GB_SIZE_PER_DUPLICATION_JOB 50
MAX_GB_SIZE_PER_DUPLICATION_JOB 200
MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB 60
DUPLICATION_SESSION_INTERVAL_MINUTES 30
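
(These are set in the LIFECYCLE_PARAMETERS file on the master server - on UNIX that is normally:

    /usr/openv/netbackup/db/config/LIFECYCLE_PARAMETERS

If I understand the batching right, duplication jobs are built between 50 GB and 200 GB of images, an undersized batch is forced after 60 minutes, and a new duplication session starts every 30 minutes - but a single image larger than MAX_GB_SIZE_PER_DUPLICATION_JOB still goes into one job, so these values cannot split one huge image.)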
 

Please suggest.

Thanks

Sid


  • Did you try enabling multistreaming to reduce the image size and backup time?

    Did you try increasing the timeout values on the media server?
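
    In case it helps, one concrete place to raise the read timeout (a sketch - 1800 seconds is only an illustrative value, not a recommendation) is the media server's bp.conf on UNIX, normally /usr/openv/netbackup/bp.conf:

        CLIENT_READ_TIMEOUT = 1800

    On Windows the same setting is under Host Properties > Media Servers > Timeouts.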

  • Hi Nagalla, that is a single file system in the TBs; if I multistream it, there will be a lot of streams, so I can't do that. Increasing the timeout values on the media server is, I guess, meant to make the duplication succeed, but I doubt a duplication of 11 TB in a single job will be accomplished by that. Thanks, Sid
  • Changing the SLP parameters will not help you make that single SLP successful. Something different is the problem. A status 50 is a "client process aborted".

    What else is shown with the status 50? Can you show us the text from the detailed status tab in the Activity Monitor?

  • What is the fragment size of your disk storage unit? It may help to use a smaller fragment size so that the duplication runs per chunk rather than as one huge chunk. Hope this helps. (I use 5000 MB for disk, as it helps with GRT backups too.)
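
    If you want to try a smaller fragment size, it can also be changed on the storage unit from the command line - a sketch, assuming a disk storage unit named dsu_example (a hypothetical name; -mfs sets the maximum fragment size in MB):

        bpsturep -label dsu_example -mfs 5000

    Note this only applies to images written after the change; already-written images keep their existing fragments.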
  • 11 TB in a single image - I am curious how much time your backup job takes to complete.

  • I use FlashBackup or synthetic fulls for all of my LARGE systems. They both work great.
  • Hi Nicolai, sorry for the late reply. Here is the detailed status of the failed duplication job:

    03/31/2013 10:50:06 - requesting resource LCM_XYZ
    03/31/2013 10:50:35 - granted resource LCM_XYZ
    03/31/2013 10:50:35 - started process RUNCMD (pid=3416)
    03/31/2013 10:50:36 - begin Duplicate
    03/31/2013 10:50:36 - ended process 0 (pid=3416)
    03/31/2013 10:50:37 - requesting resource XYZ
    03/31/2013 10:50:37 - requesting resource @aaaa5
    03/31/2013 10:50:37 - reserving resource @aaaa5
    03/31/2013 10:51:03 - resource @aaaa5 reserved
    03/31/2013 10:51:03 - granted resource MediaID=@aaaa7;DiskVolume=xxx;DiskPool=yyy;Path=xxx;StorageServer=yyyy;MediaServer=xyz
    03/31/2013 10:51:03 - granted resource XXX
    03/31/2013 10:51:03 - granted resource MediaID=@aaaa5;DiskVolume=tttt;DiskPool=tttt;Path=ttt;StorageServer=tttt;MediaServer=xyz
    03/31/2013 10:51:05 - Info bptm (pid=3132) start
    03/31/2013 10:51:05 - started process bptm (pid=3132)
    03/31/2013 10:51:09 - Info bptm (pid=3132) start backup
    03/31/2013 10:51:11 - Info bpdm (pid=3135) started
    03/31/2013 10:51:11 - started process bpdm (pid=3135)
    03/31/2013 10:51:11 - Info bpdm (pid=3135) reading backup image
    03/31/2013 10:51:11 - Info bpdm (pid=3135) using 30 data buffers
    03/31/2013 10:51:11 - Info bpdm (pid=3135) requesting nbjm for media
    03/31/2013 10:51:13 - begin reading
    03/31/2013 12:08:05 - end reading; read time: 1:16:52
    03/31/2013 12:08:05 - begin reading
    03/31/2013 14:35:13 - end reading; read time: 2:27:08
    03/31/2013 14:35:13 - begin reading
    03/31/2013 16:31:55 - end reading; read time: 1:56:42
    03/31/2013 16:31:55 - begin reading
    03/31/2013 18:06:20 - end reading; read time: 1:34:25
    03/31/2013 18:06:20 - begin reading
    03/31/2013 19:37:00 - end reading; read time: 1:30:40
    03/31/2013 19:37:00 - begin reading
    03/31/2013 20:41:02 - end reading; read time: 1:04:02
    03/31/2013 20:41:02 - begin reading
    03/31/2013 21:58:20 - end reading; read time: 1:17:18
    03/31/2013 21:58:20 - begin reading
    03/31/2013 23:47:29 - end reading; read time: 1:49:09
    03/31/2013 23:47:29 - begin reading
    04/01/2013 01:40:26 - Error bptm (pid=3132) media manager terminated by parent process
    04/01/2013 01:40:28 - Error bpdm (pid=3135) media manager terminated by parent process
    04/01/2013 01:40:37 - Error bpdm (pid=3135) media manager terminated by parent process
    04/01/2013 01:41:37 - Error bpduplicate (pid=3416) Duplicate of backupid xyz_1364704049 failed, termination requested by administrator (150).
    04/01/2013 01:41:37 - Error bpduplicate (pid=3416) Status = no images were successfully processed.
    04/01/2013 01:41:39 - end Duplicate; elapsed time 14:51:03
    client process aborted (50)
  • Apart from the fragment size (which may help, as it will register the status of the process more regularly), this may be caused by a keep-alive setting - there is just about a 2-hour break before it fails. It may just be a media server client read timeout that needs increasing, but you may also need to increase the keep-alive settings. How you do that depends on your operating system, and I don't see that referenced anywhere, so tell us what your master and media servers' OS is and I will try to guide you through the keep-alive settings.
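
    As an illustration for a Linux media server (example values only - tune them to your network, and this needs root):

        # start sending TCP keepalives after 10 minutes of idle
        sysctl -w net.ipv4.tcp_keepalive_time=600
        # then probe every 60 seconds
        sysctl -w net.ipv4.tcp_keepalive_intvl=60
        # and give up after 5 failed probes
        sysctl -w net.ipv4.tcp_keepalive_probes=5

    On Windows the equivalent is the KeepAliveTime value (DWORD, in milliseconds) under HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters, which needs a reboot to take effect.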
  • Do you have a firewall installed, Sid1987?

    Mark is right that nearly 2 hours elapsed before the SLP failed - however, it seems that more than 2 hours can elapse without errors:

    03/31/2013 14:35:13 - end reading; read time: 2:27:08

    I also believe you should try to reduce the fragment size - an image read time in the range of 2 hours is very long. What about disk performance - have you verified that it is good?