Solaris Master/Media Server

hanskniep
Level 4

Hello,

I inherited a Solaris master/media server with 4 CPUs (1500 MHz) and 32 GB of RAM, running NBU 6.5. I can only run 45 streams at a time. When I try to raise this limit, all the jobs fail with resource errors. I want the O/S team to monitor the server, but I'm unsure what specifically to have them look at.

I know NBU 6.5 is EOSL, but we can't upgrade until we are sure the hardware is sufficient. So I need help working out how many clients this server should support and how the O/S team can help me.

 

Thanks,

1 ACCEPTED SOLUTION

Accepted Solutions

Genericus
Moderator
Moderator
   VIP   

Review the links Marianne has provided. Each child process uses resources. You need to make sure you have configured your server and NetBackup properly.

Check your default Project, or your NetBackup Project - what are the settings?

Check /etc/system - you may have settings there as well.
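
As a starting point for the O/S team, the two checks above might look like the sketch below. It assumes a POSIX shell and a grep that accepts -E (on Solaris 10 that is /usr/xpg4/bin/grep). The function name ipc_tunables and the project name "default" are illustrative, not taken from this thread. On Solaris 10 and later the System V IPC limits are normally project resource controls, while older releases set them in /etc/system, so it is worth looking at both.

```shell
#!/bin/sh
# Hypothetical helper: print the legacy System V IPC tunables
# (shmsys/semsys/msgsys) found in /etc/system-style text on stdin.
ipc_tunables() {
    grep -E '^[[:space:]]*set[[:space:]]+(shmsys|semsys|msgsys):'
}

# On the Solaris host itself (not executed here):
#   ipc_tunables < /etc/system                          # legacy tunables, if any
#   projects -l                                         # projects and their attributes
#   prctl -n project.max-shm-memory -i project default  # live shared memory limit
```

If neither place shows any shared memory or semaphore settings, the server is running on the OS defaults, which is useful to know before raising the job limit.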

 

I have a Solaris master and I have well over 200 jobs active with no issues.

But I have more memory and I am more up to date. I also write to a 990 - and that data flies!

NetBackup 9.1.0.1 on Solaris 11, writing to Data Domain 9800 7.7.4.0
duplicating via SLP to LTO5 & LTO8 in SL8500 via ACSLS

11 REPLIES 11

Yasuhisa_Ishika
Level 6
Partner Accredited Certified

The maximum number of concurrent jobs that a master server can handle depends on various factors. But in my experience, a SPARC Solaris server with fewer cores and less memory than yours can handle more jobs than that. You should first identify which type of resource is running short by looking into the job details, bperror, and the debug logs.

By the way, you should upgrade NetBackup to a currently supported version even if the hardware is sufficient. A NetBackup upgrade does not necessarily require a hardware refresh or upgrade.

Marianne
Level 6
Partner    VIP    Accredited Certified
45 streams backing up to what kind of devices, and how many? And what kind of throughput per device? Are all of these streams coming across the network? It really sounds to me as if you need at least one more media server. A proper sizing exercise is needed here: number of clients, data size per client, backup window, network bandwidth, available devices, etc. And yes, you should upgrade as a matter of urgency. NBU 6.5 ran out of support in 2012.

Andy_Welburn
Level 6

We had a SPARC 6.5 master/media server with 2x 1280 MHz CPUs and 6 GB of memory, and I'm sure we managed more than 45 streams at once over the many years that this system was actually in *full* production.

 

"When I try to raise this limit, all the jobs fail with resource errors"

 

I'd be interested to see these errors.

Marianne
Level 6
Partner    VIP    Accredited Certified

My understanding was that a single master/media server is writing 45 streams simultaneously...

Now, think of the data path (assuming that most master/media servers still only have a single 1Gb NIC).
How many network connections from clients can this 1Gb NIC accommodate, all trying to transfer data at about 10 MB/sec?
Well, if 45 clients are transferring data at 2 MB/sec each, then 45 streams will not be a problem.
The 1Gb NIC will just about be saturated and can stream a single LTO3 drive at 90 MB/sec.
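
The arithmetic above can be checked with a short sketch. It assumes the rates in this post are megabytes per second and takes roughly 125 MB/s as the ceiling of a 1Gb NIC before protocol overhead. The function name nic_load is made up for illustration.

```shell
#!/bin/sh
# Compare aggregate client throughput with NIC capacity, all in MB/s.
# 1 Gb/s is about 125 MB/s before protocol overhead (1000 Mb / 8).
nic_load() {
    streams=$1; mb_per_stream=$2; nic_mb=$3
    total=$((streams * mb_per_stream))
    if [ "$total" -le "$nic_mb" ]; then
        echo "$total MB/s fits within $nic_mb MB/s"
    else
        echo "$total MB/s exceeds $nic_mb MB/s"
    fi
}

nic_load 45 2 125    # 45 streams at 2 MB/s each
nic_load 45 10 125   # 45 streams at 10 MB/s each
```

At 2 MB/s per client the 45 streams add up to 90 MB/s, which a single 1Gb link can just carry. At 10 MB/s per client the same 45 streams would need 450 MB/s, well beyond one 1Gb NIC.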

A master server with those specs with 10Gb NIC can handle a lot more incoming data from clients.
It all comes down to network infrastructure and the amount and speed of data transfer from the clients.

As per my previous post - a proper sizing exercise is needed to see where the bottleneck/resource issue is.
It could be a simple matter of insufficient shared memory configured at OS level.

The Planning and Performance Tuning Guides will help you:

Overview of NetBackup performance testing. 
http://www.symantec.com/docs/TECH147296

Symantec NetBackup™ 7.0 - 7.1 Backup Planning and Performance Tuning Guide 
http://www.symantec.com/docs/DOC4483

Key performance considerations for NetBackup 7.5 master servers http://www.symantec.com/docs/TECH202840

Performance issues observed when running UltraSparc-T series architecture on a NetBackup 7.x Master Server 
http://www.symantec.com/docs/TECH204332

Designing your backup system 
http://www.symantec.com/docs/HOWTO56078

Improving NetBackup backup performance 
http://www.symantec.com/docs/HOWTO44836  

NetBackup™ Architecture Overview 
https://www-secure.symantec.com/connect/sites/default/files/b-whitepaper_nbu_architecture_overview_1...

NetBackup 7.6 Best Practices: Optimizing Performance

Updated NetBackup Backup Planning and Performance Tuning Guide for Release 7.5 and Release 7.6 

(Not sure if all the HOWTO links are still working - some have been expired and SYMwise seems to be offline at the moment...)

hanskniep
Level 4

More info: all the clients are workstations with a 1 Gb link at most, more likely 100 Mb. The master/media server is writing to a newer DataDomain DD990. No tape. Upgrading to a newer version is not possible at this time.

Genericus
Moderator
Moderator
   VIP   

Make sure you are counting all child jobs.

I set up a script that runs from crontab. It essentially uses bpdbjobs to gather the information:

#!/bin/ksh
# this runs the bpdbjobs command, assigns field values
# stores the output in the OUTFILE

OUTFILE=/usr/openv/tmp/job.counts.`date "+%Y%m"`
DATE=`date "+%m/%d/%Y"`

HOUR=`date "+%H%M"`

## information about output and what each field is...

#/usr/openv/netbackup/bin/admincmd/bpdbjobs -ignore_parent_jobs -most_columns | sed "s/,/ /g" |while read F01 F02 F03 F04 F05 F06 F07 F08 F09 F10 F11 F12 F13 F14 F15 F16 F17 F18 F19 F20 F21 F22 F23 F24 F25 F26 F27 F28 F29 F30 F31 F32 F33 F34 F35 F36 F37 F38 F39 F40 F41 F42 F43 F44 F45 F46 F47 F48 F49 F50
#jobid,jobtype,state,status,policy,schedule,client,server,started,elapsed,ended,stunit,try,operation,kbytes,files,pathlastwritten,percent,jobpid,owner,subtype,classtype,schedule_type,priority,group,masterserver,retentionunits,retentionperiod,compression,kbyteslastwritten,fileslastwritten,filelistcount,[files]...,trycount,[trypid,trystunit,tryserver,trystarted,tryelapsed,tryended,trystatus,trystatusdescription,trystatuscount,[trystatuslines]...,trybyteswritten,tryfileswritten]...parentjob,kbpersec,copy,robot,vault,profile,session,ejecttapes,srcstunit,srcserver,srcmedia,dstmedia,stream,suspendable,resumable,restartable,datamovement,snapshot,backupid,killable,controllinghost
### Field 02 = jobtype
 # 0=backup, 1=archive, 2=restore, 3=verify, 4=duplicate,   5=import,   6=catalog   backup,   7=vault,
 # 8=label,  9=erase,  10=tape   request,   11=clean, 12=format tape, 13=physical inventory, 14=qualification
### Field 03 = state # 0=queued, 1=active, 2=wait for retry, 3=done
### Field 04 = status code
##
 

I use SLP so I want to see how many are active or queued

SLPREPORT=`/usr/openv/netbackup/bin/admincmd/nbstlutil report | grep "Total: "| cut -c60-80`
## find queued duplications
## patterns are anchored with ^ so that e.g. jobtype 10 ("10,0,,") is not counted as "0,0,,"
QUEUED=`/usr/openv/netbackup/bin/admincmd/bpdbjobs -most_columns | cut -d, -f 2,3,4,5,6,12,24,57 | grep -c "^4,0,,SLP"`
EDLAQ=`/usr/openv/netbackup/bin/admincmd/bpdbjobs -most_columns | cut -d, -f 2,3,4,5,6,12,24,57 | grep "^4,0,,SLP" | grep -c med04np`
EDLBQ=`/usr/openv/netbackup/bin/admincmd/bpdbjobs -most_columns | cut -d, -f 2,3,4,5,6,12,24,57 | grep "^4,0,,SLP" | grep -c med03np`
## find active duplications  # change 4,1 to 4,,
## (the two per-server counts below match an empty state field, unlike ACTIVE; kept as posted)
ACTIVE=`/usr/openv/netbackup/bin/admincmd/bpdbjobs -most_columns | cut -d, -f 2,3,4,5,6,12,24,57 | grep -c "^4,1,,SLP"`
EDLAA=`/usr/openv/netbackup/bin/admincmd/bpdbjobs -most_columns | cut -d, -f 2,3,4,5,6,12,24,57 | grep "^4,,,SLP" | grep -c med04np`
EDLBA=`/usr/openv/netbackup/bin/admincmd/bpdbjobs -most_columns | cut -d, -f 2,3,4,5,6,12,24,57 | grep "^4,,,SLP" | grep -c med03np`
## find queued and active backups
BACKUPQ=`/usr/openv/netbackup/bin/admincmd/bpdbjobs -most_columns | cut -d, -f 2,3,4,5,6,12,24,57 | grep -c "^0,0,,"`
BACKUPA=`/usr/openv/netbackup/bin/admincmd/bpdbjobs -most_columns | cut -d, -f 2,3,4,5,6,12,24,57 | grep -c "^0,1,,"`
### note: SLPSUM is never assigned in this script as posted, so its column prints empty
## echo DATE\     HOUR\   QUEUED\       EDLAQ\ EDLBQ\ ACTIVE\        EDLAA\ EDLBA BACKUPQ\      BACKUPA\        SLPSUM\ SLPREPORT  >> $OUTFILE
echo $DATE\     $HOUR\  $QUEUED\        $EDLAQ\ $EDLBQ\ $ACTIVE\        $EDLAA\ $EDLBA\ $BACKUPQ\       $BACKUPA\       $SLPSUM\        $SLPREPORT >> $OUTFILE
 

I run this every 15 minutes to a file, then I can tail -f that file and see the job counts.
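
A lighter variant of the same idea is to take one bpdbjobs snapshot and count from it, instead of calling bpdbjobs once per counter. The sketch below is not the script above. The helper name count_jobs and the temporary file are made up for illustration, and it relies on the field layout described in the script's comments (field 2 = jobtype, field 3 = state). It also assumes an awk that accepts -v, which on Solaris means nawk or /usr/xpg4/bin/awk.

```shell
#!/bin/sh
# count_jobs JOBTYPE STATE
# Counts matching jobs in "bpdbjobs -most_columns" output read from stdin.
# Field 2 = jobtype, field 3 = state (0=queued, 1=active).
count_jobs() {
    awk -F, -v t="$1" -v s="$2" '$2 == t && $3 == s { n++ } END { print n + 0 }'
}

# Intended use on the master server (not executed here):
#   SNAP=/tmp/bpdbjobs.$$
#   /usr/openv/netbackup/bin/admincmd/bpdbjobs -most_columns > "$SNAP"
#   BACKUPQ=`count_jobs 0 0 < "$SNAP"`   # queued backups
#   BACKUPA=`count_jobs 0 1 < "$SNAP"`   # active backups
#   rm -f "$SNAP"
```

Because every counter reads the same snapshot, the numbers in one output row describe the same moment, and the job database is queried once per run rather than nine times.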

 


Andrew_Madsen
Level 6
Partner

Also validate that your memory tuning is sufficient.

Look at this guide. I realize it is for 7.5, but there is some good memory information in it, especially if you are using Solaris 10.

http://www.symantec.com/business/support/index?page=content&id=doc7449

 

hanskniep
Level 4

I found this technote:

http://www.symantec.com/business/support/index?page=content&id=TECH62633

Going to have the O/S team verify this is set.

 

Thanks all for your responses!

Marianne
Level 6
Partner    VIP    Accredited Certified

As per my post above, right?

It could be a simple matter of insufficient shared memory configured at OS level.