CPU at 100% during Oracle database backup

Question

We are seeing high CPU usage (it reaches 100%) on NetBackup version 7.5.0.7, and because of that major backups are failing.

First step: I changed /opt/SYMCOpsCenterServer/db/conf/server.conf twice, from

-n OPSCENTER_masterserverBZ -x tcpip(LocalOnly=YES;BROADCASTLISTENER=0;DOBROADCAST=NO;ServerPort=13786;) -gd DBA -gk DBA -gl DBA -gp 8192 -ti 0 -c 256M -ch 1024M -cl 256M -zl -os 1M -m

to:

-n OPSCENTER_masterserverBZ -x tcpip(LocalOnly=YES;BROADCASTLISTENER=0;DOBROADCAST=NO;ServerPort=13786;) -gd DBA -gk DBA -gl DBA -gp 8192 -ti 0 -c 512M -ch 2048M -cl 256M -zl -os 1M -m

and:

-n OPSCENTER_masterserverBZ -x tcpip(LocalOnly=YES;BROADCASTLISTENER=0;DOBROADCAST=NO;ServerPort=13786;) -gtc 4 -gd DBA -gk DBA -gl DBA -gp 8192 -ti 0 -c 1G -ch 2048M -cl 1G -zl -os 1M -m

I restarted the OpsCenter server each time:
	/opt/SYMCOpsCenterServer/bin/opsadmin.sh stop/start

But this didn't fix the issue.
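
For anyone comparing the server.conf variants above, here is a minimal sketch that pulls out only the cache options so the before/after values are easy to see side by side. It assumes -c, -ch and -cl carry their usual SQL Anywhere meanings (initial, maximum and minimum cache size). `cache_opts` is a hypothetical helper, not part of OpsCenter, and the two sample strings are just the tails of the first and third lines quoted above.

```shell
#!/bin/sh
# Hypothetical helper: print the cache-related options (-c, -ch, -cl)
# found in a server.conf option line.
# The body runs in a subshell ( ... ) so "set" does not leak to the caller.
cache_opts() (
    set -f            # no filename expansion while word-splitting
    set -- $1         # split the option line into positional parameters
    out=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -c|-ch|-cl)
                # Record "option=value" only if a value actually follows.
                if [ $# -ge 2 ]; then
                    out="${out:+$out }$1=$2"
                    shift
                fi
                ;;
        esac
        shift
    done
    echo "$out"
)

# Tails of the original and the last modified OpsCenter lines from the post.
before='-gp 8192 -ti 0 -c 256M -ch 1024M -cl 256M -zl -os 1M -m'
after='-gp 8192 -ti 0 -c 1G -ch 2048M -cl 1G -zl -os 1M -m'

echo "before: $(cache_opts "$before")"
echo "after:  $(cache_opts "$after")"
```

With the two strings above this prints `before: -c=256M -ch=1024M -cl=256M` and `after:  -c=1G -ch=2048M -cl=1G`, which makes it clear that only the cache sizing changed between attempts (plus -gtc 4 in the full line).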

Second step: I changed the file below:
cat /usr/openv/var/global/server.conf
-n NB_masterserverBZ
-x tcpip(LocalOnly=YES;ServerPort=13785) -gp 4096 -gd DBA -gk DBA -gl DBA -ti 0 -c 100M -ch 2048M -cl 100M -zl -os 1M -o /usr/openv/db//log/server.log -ud -m

Then to this:

-n NB_masterserverBZ
-x tcpip(LocalOnly=YES;ServerPort=13785) -gp 4096 -gd DBA -gk DBA -gl DBA -gn 32 -ti 0 -c 100M -ch 3G -cl 100M -zl -os 1M -o /usr/openv/db//log/server.log
-ud -m

The issue is still there.

Third step: I took a backup of NBDB and defragmented it:

/opt/SYMCOpsCenterServer/bin/dbbackup.sh /my_db_backup_dir

/opt/SYMCOpsCenterServer/bin/./dbdefrag.sh

Fourth step:

1- Created the (missing) file /usr/openv/var/global/emm.conf with these contents:
NUM_DB_BROWSE_CONNECTIONS=20
NUM_DB_CONNECTIONS=21
NUM_ORB_THREADS=35

2- Created the (missing) file /usr/openv/var/global/nbrb.conf with these contents:
SECONDS_FOR_EVAL_LOOP_RELEASE = 180
RESPECT_REQUEST_PRIORITY = 0
DO_INTERMITTENT_UNLOADS = 1

# /usr/bin/ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        8192
coredump(blocks)     unlimited
nofiles(descriptors) 256
vmemory(kbytes)      unlimited

ls -l /opt/SYMCOpsCenterServer/db/data/

-rw-------   1 root     root      181116928 Jan 18 09:58 symcOpscache.db
-rw-------   1 root     root         139264 Jan 13 10:53 symcopsscratchdb.db
-rw-------   1 root     root        1089536 Jan 13 10:53 symcsearchdb.db
drwxr-xr-x   2 root     root            512 Jan 13 10:19 temp
-rw-------   1 root     root     1048797184 Jan 18 10:19 vxpmdb.db
-rw-r--r--   1 root     root         589824 Jan 18 10:24 vxpmdb.log

Attached are the outputs of the following commands, captured while the issue was happening:

/bin/vmstat 30 1

/bin/prstat -a 5 2

/bin/iostat -xntcz

/bin/ps -e -o pcpu,pid,ppid,args | /bin/sort -rn | /bin/head -50

/opt/openv/netbackup/bin/bpps -x

Finally, here is some information taken from the NetBackup master server:

cat /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS
	128
	cat /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS_DISK
	128

ndd -get /dev/tcp tcp_max_buf
	16777216
	ndd -get /dev/tcp tcp_xmit_hiwat
	2097152
	# ndd -get /dev/tcp tcp_recv_hiwat
	2097152
	# ndd -get /dev/tcp tcp_wscale_always
	1

Below is an excerpt from bptm.log:

04:00:03.399 [7398] <2> read_config_file: using 262144 value from /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS_DISK
04:00:03.399 [7398] <2> io_init: using 262144 data buffer size
04:00:03.399 [7398] <2> set_job_details: Tfile (718113): LOG 1452826803 4 bptm 7398 using 262144 data buffer size

.........

04:00:03.399 [7398] <2> read_config_file: using 0 value from /usr/openv/netbackup/NET_BUFFER_SZ
04:00:03.399 [7398] <2> io_set_recvbuf: NOT doing setsockopt() to set network buffer size
04:00:03.399 [7398] <2> io_set_recvbuf: receive network buffer is 2102260 bytes
04:00:03.405 [7398] <2> read_config_file: using 128 value from /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS
04:00:03.417 [7398] <2> read_config_file: using 128 value from /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS_DISK
04:00:03.417 [7398] <2> io_init: using 128 data buffers
04:00:03.417 [7398] <2> set_job_details: Tfile (718113): LOG 1452826803 4 bptm 7398 using 128 data buffers

..........

04:00:08.088 [7438] <2> bptm: INITIATING (VERBOSE = 5): -w -c masterserverBZ -dpath master_su -stunit dd7200_master_su -cl sdp1a -bt 1452826806 -b masterserverBZ_1452826806 -st 0 -cj 1 -reqid -1452605272 -jm -brm -hostname masterserverBZ -ru root -rclnt masterserverBZ -rclnthostname masterserverBZ -rl 0 -rp 604800 -sl sdp1a -ct 0 -maxfrag 524288 -eari 0 -mediasvr masterserverBZ -no_callback -connect_options 0x01010100 -jobid 718114 -jobgrpid 718114 -masterversion 750000 -bpbrm_shm_id 1174405198 -blks_per_buffer 512 -shm
04:00:08.089 [7438] <2> main: bptm.c.1601: maximum fragment size is 524288000 Kbytes

........

04:00:03.417 [7398] <2> io_init: child delay = 10, parent delay = 15 (milliseconds)
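
A quick worked calculation on the buffer settings shown in the bptm log, as a sketch only. It assumes the commonly cited rule of thumb that the data-buffer shared memory for one backup stream is the buffer size times the number of buffers (multiplexing and multiple drives would multiply it further). `buffer_bytes` is a hypothetical helper name, and the two values are the ones bptm reports reading above.

```shell
#!/bin/sh
# Hypothetical helper: data-buffer memory for one stream, in bytes.
# Usage: buffer_bytes SIZE_DATA_BUFFERS NUMBER_DATA_BUFFERS
buffer_bytes() {
    echo $(( $1 * $2 ))
}

size=262144   # bytes, from SIZE_DATA_BUFFERS_DISK in the log
count=128     # from NUMBER_DATA_BUFFERS_DISK in the log

bytes=$(buffer_bytes "$size" "$count")
echo "$bytes bytes = $(( bytes / 1024 / 1024 )) MiB per stream"
```

For 262144 x 128 this prints `33554432 bytes = 32 MiB per stream`, so under this assumption each concurrent stream holds about 32 MiB of shared memory for its buffers.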

Thanks for your help.

riaanbadenhorst · Accepted Answer

Hi,

You didn't mention your server specs.

You're running OpsCenter on your master server; you should migrate it to a dedicated server if you're facing performance issues.

You're also running verbose logging. I don't have figures on its performance impact, but one can assume that minimal logging will use less CPU than verbose logging, or at least less memory.

sdo · Answer

Although this says you can:

http://eval.symantec.com/mktginfo/enterprise/white_papers/b-nbu_7_opscenter_analytics_WP.en-us.pdf

...I would take Riaan's advice, and the advice from here:

https://www.veritas.com/community/forums/running-opscenter-media-server

In practice I would never run OpsCenter Server on any NetBackup Server (master or media), even in a small virtual test environment.

nicolai · Answer

OpsCenter is very CPU greedy.

The attachment shows plenty of free memory and almost no I/O wait.

However, CPU may be an issue:

load averages: 4.94, 1.83, 1.05
	load averages: 6.37, 2.19, 1.17

How many cores do you have?

The number of cores influences how the load average should be interpreted.

eomaber · Answer

@Nicolai,

I have four cores:

==================================== CPUs ====================================

      CPU                 CPU                         Run    L2$    CPU   CPU
LSB   Chip                 ID                         MHz     MB    Impl. Mask
---   ----  ----------------------------------------  ----   ---    ----- ----
 00     0      0,   1,   2,   3                       2750   5.0        7  161

@Riaan,

I have increased the logging level just to troubleshoot the issue.

@sdo,

I will check the links.

nicolai · Answer

load averages: 4.94, 1.83, 1.05
	load averages: 6.37, 2.19, 1.17

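With 4 cores, a load average of 4 means all CPUs are 100% utilized. So during the 1-minute samples the load exceeded that by 0.94 and 2.37, i.e. the CPUs were overloaded by 94% and 237% of one core respectively (roughly 24% and 59% above the total capacity of the four cores).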

There is no problem in the 5- and 15-minute samples, but note that these are averages; you could still have CPU spikes.
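
As a rough illustration of reading load average against the core count, here is a small sketch that expresses a load average as a percentage of total CPU capacity. `load_pct` is a hypothetical helper, and the simple "load divided by cores" model is an approximation. On Solaris the legacy /usr/bin/awk may not accept -v, in which case nawk or /usr/xpg4/bin/awk would be needed instead.

```shell
#!/bin/sh
# Hypothetical helper: load average as a percentage of total CPU capacity.
# Usage: load_pct LOAD_AVERAGE CORES
load_pct() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v load="$1" -v cores="$2" \
        'BEGIN { printf "%.2f\n", load / cores * 100 }'
}

# The two 1-minute samples quoted in this thread, on a 4-core host.
for load in 4.94 6.37; do
    echo "load $load on 4 cores = $(load_pct "$load" 4)% of capacity"
done
```

Anything above 100% means more runnable work than the four cores can service at once; the two samples come out at 123.50% and 159.25%.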

As a workaround, I would suspend OpsCenter data collection during the time frame when the backup load is at its highest.

eomaber · Answer

@Nicolai,

I have stopped the OpsCenter server with:

/opt/SYMCOpsCenterServer/bin/opsadmin.sh stop

But I still get the CPU spikes.
