Netbackup SLP activity monitor
Hello, is there a way to check the status of an SLP that is in progress? I would like to know things like % completion, time, GB transferred, and current KB/s. Are there any best practices or tuning tips to improve the performance of optimized deduplication between a remote 5230 appliance and a datacenter appliance (also a 5230) over slow WAN links? Is there a way to calculate the possible throughput? Thank you, florin
Solved · 4.1K Views · 0 likes · 6 Comments

NBU 5220 Performance
We recently deployed a 5220 appliance into our environment; it was meant to be the savior in our battle against a backup window we could no longer meet. When we finally got it online and into our NBU environment, the initial performance was great. The area where we expected to benefit most was VMware backups. With the datastores mounted directly to the appliance, it would have direct access to the snapshots for fast and efficient backups. Before this, we were performing client-side backups, so the impact on the hosts every night was significant as we tried to back up 600+ VMs. The plan was to move all dev and test off-host backups to the middle of the day, since the performance impact was minimal, and the end result would be a larger window in which to complete everything.
When we started with noon-time backups, deduplication rates were high and so were the speeds. However, this performance gain was short-lived: as we began increasing the load, performance suddenly dropped to a point of concern. Backups were no longer speedy, running at 3,500 KB/s to 10,000 KB/s. Some might pop up to 24,000 KB/s, but in a sample of 15 as I write this, only one is showing 24,000. Now, I do have a few ideas.
1. We have a 72 TB appliance and therefore 2 disk trays. During the backups, one disk tray is going crazy: all the lights are flashing and you can really see that it is working. The second tray, however, is doing nothing. You might see a blink here or there, but it is almost nothing compared to the other tray. Is this to be expected? When we looked at the disk configuration, it shows concat. Is this normal?
2. We are sending too much data at once and are simply burying the appliance. Realistically, what sort of performance should I expect from the appliance?
3. Related to number 2: since we have only the one appliance right now while we wait to get the remote appliance in place, we are duping off to tape. This runs at the same time as the backups, which means that while the appliance is writing a lot of data, it is also reading it back out to tape.
4. We are overloading the datastore, so read speed from source to destination is bad. We have few hosts, so if we limit the jobs per host, we limit the number of machines backing up at once (obviously). That makes backups take far too long, so we removed the per-host limit and set only a per-datastore limit. As we are new to all of this, I am not sure which setting has what impact, but again, I am trying to list any and all ideas from the start.
5. The appliance does not support multipathing, so we have only a single path to the disk.
Beyond that I am not sure, but this does not help with showcasing the appliances to management at the moment. However, given the initial performance, I am confident we can get back there.
Solved · 3.2K Views · 4 likes · 50 Comments

Netbackup Appliance Slow Rehydration to tape and bad backup performance
Help! NetBackup Appliances 5230, 2.6.0.2. One of our appliances houses MS-Windows standard backups of all local drives and VMware snapshot backups. Backups run at good speeds; however, as soon as we start our SLPs to copy the data from the MSDP to tape, the backups slow to the point of barely moving, and the copy to tape is slow as well. As soon as we kill the SLPs, backup speed goes back up again. We have a case open, and backline engineers have taken a look; we were told this is expected because the rehydration process for tape-outs slows things down. We set max concurrent write drives to 4 and moved our SLP schedule to daytime so that it does not affect nighttime backups; however, with these settings we will never get the month-end copies to tape done. We thought of a couple of other options:
1. Back up to AdvancedDisk, then SLP to MSDP and/or tape. The issue here is that we would need to buy more trays, but even more important, I think we would lose some features in NetBackup. Last I checked, NetBackup Accelerator doesn't work to AdvancedDisk (I need to double-check).
2. Inline tape copies: during month-end, use inline tape copies to back up to MSDP and tape at once. The issue is that the job will run at the speed of the slower of the two destinations, but that might be okay if the speed is good.
Does anyone have any suggestions on this, or has anyone been through this? Thanks.
Solved · 2.9K Views · 0 likes · 13 Comments

Buffer Settings for Duplication to LTO 5 Tape
We're using a NetBackup 5230 Appliance connected via 8Gb fibre to an HP MSL 8096 LTO 5 tape library with 4 drives. I've read many posts recommending leaving the buffer sizes at their defaults:
SIZE_DATA_BUFFERS : 262144 (Default)
SIZE_DATA_BUFFERS_DISK : 262144 (Default)
SIZE_DATA_BUFFERS_FT : 262144 (Default)
This is how they are currently set, and when monitoring backups to disk, I see the following in the log:
01/29/2014 19:49:09 - Info bptm (pid=3752) using 262144 data buffer size
01/29/2014 19:49:09 - Info bptm (pid=3752) using 30 data buffers
This is expected. However, when duplicating to tape, the logs indicate the following:
01/31/2014 17:00:15 - Info bptm (pid=30607) using 65536 data buffer size
01/31/2014 17:00:15 - Info bptm (pid=30607) using 30 data buffers
Can someone clarify why it would be set this way, and whether this is expected? Thanks
Solved · 2.6K Views · 2 likes · 13 Comments

5230 Appliance Send Mail Notification
Hello Team, we use an appliance for backup. I want to configure send-mail notification from the master server admin GUI, not from OpsCenter. Is this possible? I succeeded in configuring it on the appliance web page, and it sent e-mail to the users, but when I tried to send e-mail notifications from the appliance master server GUI, it did not work. Could you please advise me on "send mail notification for appliance"? BR, Kacco
Solved · 2.6K Views · 1 like · 2 Comments

VMware servers get Info nbrb (pid=) Limit has been reached for the logical resource VMware.Datastore
I'm running a 5220 appliance with 2.5.1 and NetBackup 7.5.4 on Linux. We are backing up to disk. We have a policy that queries a VM cluster and pulls in many servers. They start running, but after about the 10th one the snapshot details state: "Info nbrb (pid=???) Limit has been reached for the logical resource primaster.VMware.Datastore.vmwplz-XIV02-LUN14". This happens on a lot of them, but they eventually continue. I have worked with my support and changed the number of jobs from 20 to 40 on my NetBackup master and media servers. I have also changed the parameters NUMBER_DATA_BUFFERS_DISK and SIZE_DATA_BUFFERS_DISK, trying the combinations 64 and 1048576, 64 and 262144, 128 and 1048576, and 128 and 262144. I have moved policies to stagger them between the master and media appliances to balance the load, but some servers take the same amount of time no matter which appliance is used. I can't see any change in time or performance and wanted to know what this message may point to: "Limit has been reached for the logical resource primaster.VMware.Datastore.vmwplz-XIV02-LUN14". I also ran a test during the day, when of course nothing else was running, and the time was cut in half. But the test selected only this one server. I'm wondering whether, instead of 1 policy selecting all VMs, I should break them into 2 policies, or whether the problem is something on the VMware side. I'm looking for guidance on where to find the bottleneck. Thanks
Solved · 2.5K Views · 1 like · 4 Comments

AIR Replication Jobs are getting queued.
We have been experiencing a huge AIR replication job queue over the last two weeks, and many active jobs are also running very slowly. We have also observed that a few of the oldest replications are failing with error code 227. My questions are:
1. Is there any way we can improve the performance of the replication jobs, considering the environment settings below?
2. Is there any way we can control the number of active replication jobs running at any point in time?
Please help me. Thanks in advance.
Environment details:
Master / Media Server : NetBackup Appliance 5220 2.5.2
NetBackup Version : 7.5.0.5
Replication with AIR
Replication bandwidth limitation:
master1:/disk/etc/puredisk # cat agent.cfg | grep bandwidth
# A bandwidth limit, in KiB/sec.
bandwidthlimit=1280
SLP parameters:
master1:/usr/openv/netbackup/db/config # cat LIFECYCLE_PARAMETERS
AUTO_CREATE_IMPORT_SLP = 1
MAX_GB_SIZE_PER_DUPLICATION_JOB = 100
MIN_GB_SIZE_PER_DUPLICATION_JOB = 25
Replication failed with 227 (detailed log):
06/17/2013 01:11:48 - requesting resource LCM_stu_disk_master1
06/17/2013 01:11:48 - Info nbrb (pid=21872) Limit has been reached for the logical resource LCM_stu_disk_master1
07/10/2013 02:42:21 - granted resource LCM_stu_disk_master1
07/10/2013 02:42:23 - started process RUNCMD (pid=6070)
07/10/2013 02:42:24 - Info bpdm (pid=6101) started
07/10/2013 02:42:24 - started process bpdm (pid=6101)
07/10/2013 02:42:24 - requesting resource @aaaac
07/10/2013 02:42:24 - reserving resource @aaaac
07/10/2013 02:42:24 - resource @aaaac reserved
07/10/2013 02:42:24 - granted resource MediaID=@aaaac;DiskVolume=PureDiskVolume;DiskPool=dp_disk_master1;Path=PureDiskVolume;StorageServer=master1;MediaServer=master1
07/10/2013 02:44:42 - Info master1 (pid=6101) Using OpenStorage to replicate backup id Client1-db_1371393521, media id @aaaac, storage server master1, disk volume PureDiskVolume
07/10/2013 02:44:42 - Info master1 (pid=6101) Replicating images to target storage server hkx1bak03.apac.experian.local, disk volume PureDiskVolume
07/17/2013 11:19:46 - Info master1 (pid=6101) StorageServer=PureDisk:master1; Report=PDDO Stats for (master1): scanned: 24790571 KB, CR sent: 24838491 KB, CR sent over FC: 0 KB, dedup: 0.0%
07/17/2013 11:19:46 - Info bpdm (pid=6101) EXITING with status 0
07/17/2013 11:19:46 - Replicated backup id Client1-db_1371393521 successfully
07/17/2013 11:19:47 - Info bpdm (pid=3444) started
07/17/2013 11:19:47 - started process bpdm (pid=3444)
07/17/2013 11:19:47 - requesting resource @aaaac
07/17/2013 11:19:47 - granted resource MediaID=@aaaac;DiskVolume=PureDiskVolume;DiskPool=dp_disk_master1;Path=PureDiskVolume;StorageServer=master1;MediaServer=master1
07/17/2013 11:21:29 - Info master1 (pid=3444) Using OpenStorage to replicate backup id Client1-db_1371393625, media id @aaaac, storage server master1, disk volume PureDiskVolume
07/17/2013 11:21:30 - Info master1 (pid=3444) Replicating images to target storage server hkx1bak03.apac.experian.local, disk volume PureDiskVolume
07/19/2013 02:55:00 - Info master1 (pid=3444) StorageServer=PureDisk:master1; Report=PDDO Stats for (master1): scanned: 4 KB, CR sent: 1 KB, CR sent over FC: 0 KB, dedup: 75.0%
07/19/2013 02:55:00 - Info bpdm (pid=3444) EXITING with status 0
07/19/2013 02:55:00 - Error nbreplicate (pid=6070) Failed to update image copy state for BID Client1-db_1371393625, replica copy 102. EMM error code = 2020005. Replication WAS successful no entity was found (227)
Solved · 2.1K Views · 1 like · 3 Comments

Duplication to tape slow
Running a 5220 with 7.5.0.6. I'm using SLPs to replicate to a master appliance at our DR site and to duplicate to tape, going to IBM.ULTRIUM-TD6 tape drives. Starting last week, my duplication jobs are running slowly and showing the records below in the job log. Not sure where else to look to troubleshoot. Thanks, Pat
07/13/2014 21:36:17 - begin reading
07/13/2014 22:12:19 - end reading; read time: 0:36:02
07/13/2014 22:12:20 - begin reading
07/13/2014 22:50:45 - end reading; read time: 0:38:25
07/13/2014 22:50:46 - begin reading
07/13/2014 23:52:42 - end reading; read time: 1:01:56
07/13/2014 23:52:42 - begin reading
07/14/2014 00:41:53 - end reading; read time: 0:49:11
07/14/2014 00:41:54 - begin reading
07/14/2014 01:34:46 - end reading; read time: 0:52:52
07/14/2014 01:34:46 - begin reading
07/14/2014 02:24:04 - end reading; read time: 0:49:18
Solved · 2.1K Views · 1 like · 9 Comments

nbostpxy.exe multiple threads on Windows 2008 R2 server
We have NetBackup 7.6.0.4 running on a 5220 appliance, version 2.6.0.4. Recently some resource issues forced us to review the processes on one of our SQL cluster nodes, where we found multiple nbostpxy.exe threads (see attached), each eating up 6% CPU. I stopped the client services and killed all of these threads manually. However, the next time a backup kicked in, the same threads reappeared and CPU utilization peaked again. I understand that this process is involved in deduplication processing on the client, but we are not running any client-side deduplication. We have media server deduplication configured with the 5220 appliances. Is this some kind of bug? Please share your thoughts.
Solved · 1.9K Views · 0 likes · 5 Comments

duplications from wrong media server
I have two 5200 appliances (2.0.1): one is in our primary data center and one is at our DR site, connected by a 43mb pipe. My primary appliance has two LTO tape libraries directly attached via fiber that are used for monthly backups. I am using SLPs for the backups and duplications. Every weekend we do a full backup (1.3 TB) that is backed up to my primary appliance and then duplicated to my DR appliance. During this backup I notice that the media server being used is primary > primary (for backup) and DR > DR (for duplications). This is perfectly fine.
Now here's the trouble. When I mix a duplication to tape into this process, the system tries to load balance the duplication between my media servers. Meaning: one duplication job will show the primary as the media server for the tape backup, and another job will show the DR appliance. This would all be OK if the DR appliance were on my local LAN, but it is not, and therefore any duplication to tape that uses it as the media server can take over 24 hours to complete. In the SLP for my monthlies I have removed the DR appliance as a duplication destination, so I don't know why it keeps responding.
Example: I had a 245 GB Exchange DB duplication that was using my DR appliance as the media server; it had been running for 48 hours and was only 60% complete. I noticed the job and cancelled it. When it retried, it picked up my primary appliance as the media server, and in 25 minutes it was complete (over fibre channel).
I opened a ticket with support 2 months ago, but we have yet to figure out the cause. This wasn't that big of an issue until another group wanted to replicate some VTLs to the DR site and realized that the bandwidth is 100% consumed all the time by NetBackup. I have been tinkering with this for months and cannot figure out what I have configured wrong, so I want to consult the almighty Community for answers. LOL! To me it sounds like the media servers are set up so that they can load balance, but I can't find where to disable that.
Solved · 1.9K Views · 4 likes · 6 Comments
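
To put the numbers from the post above side by side, here is a minimal Python sketch of the arithmetic. It assumes "43mb pipe" means 43 megabits per second and that GB means decimal gigabytes (10^9 bytes); both are guesses about the poster's units. The function names are made up for illustration and are not part of NetBackup.

```python
def best_case_hours(size_gb: float, link_mbit_s: float) -> float:
    """Hours to move size_gb over a link of link_mbit_s with no overhead.

    Assumes decimal units: 1 GB = 1e9 bytes, 1 Mbit = 1e6 bits.
    """
    bits = size_gb * 1e9 * 8
    return bits / (link_mbit_s * 1e6) / 3600


def observed_mb_per_s(size_gb: float, fraction_done: float, hours: float) -> float:
    """Average MB/s for a job that moved fraction_done of size_gb in `hours`."""
    megabytes_moved = size_gb * 1e3 * fraction_done
    return megabytes_moved / (hours * 3600)


if __name__ == "__main__":
    # Figures quoted in the post: 245 GB image, 43mb link (assumed Mbit/s),
    # 60% done after 48 hours via the DR appliance, 25 minutes via the primary.
    print(f"best case over WAN : {best_case_hours(245, 43):.1f} h")
    print(f"via DR media server: {observed_mb_per_s(245, 0.6, 48):.2f} MB/s")
    print(f"via primary (FC)   : {observed_mb_per_s(245, 1.0, 25 / 60):.1f} MB/s")
```

Under those assumptions, even a perfectly used link would need roughly 12.7 hours for the 245 GB image, while the job routed through the DR appliance averaged well under 1 MB/s. That gap suggests the data was not simply making one clean pass over the WAN, although the sketch cannot say why.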