CPU at 100% during Oracle database backup
We are seeing high CPU usage (it reaches 100%) on NetBackup 7.5.0.7, and as a result major backups are failing.
First step: I changed /opt/SYMCOpsCenterServer/db/conf/server.conf twice, from:
-n OPSCENTER_masterserverBZ -x tcpip(LocalOnly=YES;BROADCASTLISTENER=0;DOBROADCAST=NO;ServerPort=13786;) -gd DBA -gk DBA -gl DBA -gp 8192 -ti 0 -c 256M -ch 1024M -cl 256M -zl -os 1M -m
To:
-n OPSCENTER_masterserverBZ -x tcpip(LocalOnly=YES;BROADCASTLISTENER=0;DOBROADCAST=NO;ServerPort=13786;) -gd DBA -gk DBA -gl DBA -gp 8192 -ti 0 -c 512M -ch 2048M -cl 256M -zl -os 1M -m
and:
-n OPSCENTER_masterserverBZ -x tcpip(LocalOnly=YES;BROADCASTLISTENER=0;DOBROADCAST=NO;ServerPort=13786;) -gtc 4 -gd DBA -gk DBA -gl DBA -gp 8192 -ti 0 -c 1G -ch 2048M -cl 1G -zl -os 1M -m
I restarted the OpsCenter server each time:
/opt/SYMCOpsCenterServer/bin/opsadmin.sh stop/start
But this didn't fix the issue.
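For anyone comparing the three option lines above, here is a minimal Python sketch (not a NetBackup or OpsCenter tool) that pulls out the cache-related switches and converts them to bytes. It assumes that -c, -ch and -cl are the initial, maximum and minimum cache sizes of the embedded database server, and that the K/M/G suffixes are binary units; the function name cache_sizes is made up for illustration.

```python
import re

# Assumed binary multipliers for the size suffixes used in the option lines.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def cache_sizes(options: str) -> dict:
    """Return the -c, -ch and -cl values found in an option line, in bytes."""
    sizes = {}
    # Match only whole switches (-c, -ch, -cl) followed by a number and a unit,
    # so that switches such as -os or -gtc are ignored.
    pattern = r"(?<!\S)-(c|ch|cl)\s+(\d+)([KMG])\b"
    for flag, number, unit in re.findall(pattern, options):
        sizes[flag] = int(number) * UNITS[unit]
    return sizes


# Cache-related parts of the original line and of the second change.
original = "-gp 8192 -ti 0 -c 256M -ch 1024M -cl 256M -zl -os 1M -m"
second_change = "-gtc 4 -gp 8192 -ti 0 -c 1G -ch 2048M -cl 1G -zl -os 1M -m"

for name, line in (("original", original), ("second change", second_change)):
    print(name, cache_sizes(line))
```

Printing the values side by side makes it easier to see that the second change raised the initial and minimum cache to the same size, with the maximum at twice that.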
Second step: I changed the file below:
cat /usr/openv/var/global/server.conf
-n NB_masterserverBZ
-x tcpip(LocalOnly=YES;ServerPort=13785) -gp 4096 -gd DBA -gk DBA -gl DBA -ti 0 -c 100M -ch 2048M -cl 100M -zl -os 1M -o /usr/openv/db//log/server.log -ud -m
I changed it to this:
-n NB_masterserverBZ
-x tcpip(LocalOnly=YES;ServerPort=13785) -gp 4096 -gd DBA -gk DBA -gl DBA -gn 32 -ti 0 -c 100M -ch 3G -cl 100M -zl -os 1M -o /usr/openv/db//log/server.log -ud -m
The issue is still there.
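To double-check exactly which switches differ between the two versions of /usr/openv/var/global/server.conf, a small Python sketch like the following can diff the option lines. It is only an illustration: parse_options and changed_options are hypothetical helper names, and the parser assumes every switch starts with "-" and takes at most one value that does not itself start with "-".

```python
def parse_options(line: str) -> dict:
    """Map each -switch to its value ('' for switches that take no value)."""
    tokens = line.split()
    options = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith("-"):
            # A following token that is not a switch is treated as the value.
            has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
            options[token] = tokens[i + 1] if has_value else ""
            i += 2 if has_value else 1
        else:
            i += 1
    return options


def changed_options(before: str, after: str) -> dict:
    """Return {switch: (old_value, new_value)} for switches that differ."""
    old, new = parse_options(before), parse_options(after)
    return {
        flag: (old.get(flag), new.get(flag))
        for flag in sorted(old.keys() | new.keys())
        if old.get(flag) != new.get(flag)
    }


before = ("-n NB_masterserverBZ -x tcpip(LocalOnly=YES;ServerPort=13785) "
          "-gp 4096 -gd DBA -gk DBA -gl DBA -ti 0 -c 100M -ch 2048M -cl 100M "
          "-zl -os 1M -o /usr/openv/db//log/server.log -ud -m")
after = ("-n NB_masterserverBZ -x tcpip(LocalOnly=YES;ServerPort=13785) "
         "-gp 4096 -gd DBA -gk DBA -gl DBA -gn 32 -ti 0 -c 100M -ch 3G -cl 100M "
         "-zl -os 1M -o /usr/openv/db//log/server.log -ud -m")

print(changed_options(before, after))
```

A switch that is missing on one side shows up with None for that side, so an added switch such as -gn is easy to spot.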
Third step: I took a backup of NBDB and ran a defrag:
/opt/SYMCOpsCenterServer/bin/dbbackup.sh /my_db_backup_dir
/opt/SYMCOpsCenterServer/bin/./dbdefrag.sh
Fourth step:
1- Created the missing file /usr/openv/var/global/emm.conf
and added these contents:
NUM_DB_BROWSE_CONNECTIONS=20
NUM_DB_CONNECTIONS=21
NUM_ORB_THREADS=35
2- Created the missing file /usr/openv/var/global/nbrb.conf
and added these contents:
SECONDS_FOR_EVAL_LOOP_RELEASE = 180
RESPECT_REQUEST_PRIORITY = 0
DO_INTERMITTENT_UNLOADS = 1
# /usr/bin/ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) unlimited
nofiles(descriptors) 256
vmemory(kbytes) unlimited
ls -l /opt/SYMCOpsCenterServer/db/data/
-rw------- 1 root root 181116928 Jan 18 09:58 symcOpscache.db
-rw------- 1 root root 139264 Jan 13 10:53 symcopsscratchdb.db
-rw------- 1 root root 1089536 Jan 13 10:53 symcsearchdb.db
drwxr-xr-x 2 root root 512 Jan 13 10:19 temp
-rw------- 1 root root 1048797184 Jan 18 10:19 vxpmdb.db
-rw-r--r-- 1 root root 589824 Jan 18 10:24 vxpmdb.log
Attached are the outputs of the following commands, taken while the issue was occurring:
/bin/vmstat 30 1
/bin/prstat -a 5 2
/bin/iostat -xntcz
/bin/ps -e -o pcpu,pid,ppid,args | /bin/sort -rn | /bin/head -50
/opt/openv/netbackup/bin/bpps -x
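When many short-lived processes each use a little CPU, the sorted ps listing can hide which program is responsible in total. The following Python sketch sums the pcpu column per executable name from output in the `ps -e -o pcpu,pid,ppid,args` format. The function name cpu_by_command and the sample rows are made up for illustration; they are not taken from the attached outputs.

```python
import os
from collections import defaultdict


def cpu_by_command(ps_output: str) -> list:
    """Sum the pcpu column per executable name, highest total first."""
    totals = defaultdict(float)
    for line in ps_output.splitlines():
        # Columns: pcpu, pid, ppid, then the full command line (args).
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        try:
            pcpu = float(fields[0])
        except ValueError:
            continue  # header row or anything else that is not a number
        executable = os.path.basename(fields[3].split()[0])
        totals[executable] += pcpu
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample rows, for illustration only.
sample = """%CPU PID PPID COMMAND
40.0 100 1 /opt/example/bin/dbserver @/opt/example/server.conf
1.5 200 1 /opt/example/bin/mover -w
2.5 201 1 /opt/example/bin/mover -w
0.5 300 1 /usr/sbin/cron
"""

for executable, total in cpu_by_command(sample):
    print(executable, total)
```

Grouping by the executable's base name is a deliberate simplification; grouping by parent PID instead would show which job spawned the busy children.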
Finally, here is some information taken from the NetBackup master server:
cat /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS
128
cat /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS_DISK
128
ndd -get /dev/tcp tcp_max_buf
16777216
ndd -get /dev/tcp tcp_xmit_hiwat
2097152
# ndd -get /dev/tcp tcp_recv_hiwat
2097152
# ndd -get /dev/tcp tcp_wscale_always
1
Below is an excerpt from bptm.log:
04:00:03.399 [7398] <2> read_config_file: using 262144 value from /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS_DISK
04:00:03.399 [7398] <2> io_init: using 262144 data buffer size
04:00:03.399 [7398] <2> set_job_details: Tfile (718113): LOG 1452826803 4 bptm 7398 using 262144 data buffer size
.........
04:00:03.399 [7398] <2> read_config_file: using 0 value from /usr/openv/netbackup/NET_BUFFER_SZ
04:00:03.399 [7398] <2> io_set_recvbuf: NOT doing setsockopt() to set network buffer size
04:00:03.399 [7398] <2> io_set_recvbuf: receive network buffer is 2102260 bytes
04:00:03.405 [7398] <2> read_config_file: using 128 value from /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS
04:00:03.417 [7398] <2> read_config_file: using 128 value from /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS_DISK
04:00:03.417 [7398] <2> io_init: using 128 data buffers
04:00:03.417 [7398] <2> set_job_details: Tfile (718113): LOG 1452826803 4 bptm 7398 using 128 data buffers
..........
04:00:08.088 [7438] <2> bptm: INITIATING (VERBOSE = 5): -w -c masterserverBZ -dpath master_su -stunit dd7200_master_su -cl sdp1a -bt 1452826806 -b masterserverBZ_1452826806 -st 0 -cj 1 -reqid -1452605272 -jm -brm -hostname masterserverBZ -ru root -rclnt masterserverBZ -rclnthostname masterserverBZ -rl 0 -rp 604800 -sl sdp1a -ct 0 -maxfrag 524288 -eari 0 -mediasvr masterserverBZ -no_callback -connect_options 0x01010100 -jobid 718114 -jobgrpid 718114 -masterversion 750000 -bpbrm_shm_id 1174405198 -blks_per_buffer 512 -shm
04:00:08.089 [7438] <2> main: bptm.c.1601: maximum fragment size is 524288000 Kbytes
........
04:00:03.417 [7398] <2> io_init: child delay = 10, parent delay = 15 (milliseconds)
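As a rough cross-check of the buffer settings in this excerpt, the Python sketch below reads the two io_init lines and multiplies the data buffer size by the number of data buffers. It assumes, as a simplification, that one bptm process allocates about size × count bytes of shared memory; multiplexing and concurrent jobs would multiply that further. The function name shared_buffer_bytes is hypothetical.

```python
import re


def shared_buffer_bytes(log_text: str):
    """Return buffer size * buffer count from bptm io_init lines, or None."""
    size = re.search(r"io_init: using (\d+) data buffer size", log_text)
    count = re.search(r"io_init: using (\d+) data buffers", log_text)
    if not size or not count:
        return None  # one of the two lines is missing from the excerpt
    return int(size.group(1)) * int(count.group(1))


# The two relevant lines from the excerpt above.
excerpt = """04:00:03.399 [7398] <2> io_init: using 262144 data buffer size
04:00:03.417 [7398] <2> io_init: using 128 data buffers
"""

total = shared_buffer_bytes(excerpt)
print(total, "bytes =", total // 1024**2, "MiB")
```

With the values logged here, 262144 bytes × 128 buffers comes to 33554432 bytes, i.e. 32 MiB per bptm process under that assumption.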
Thanks for your help.
Hi,
You didn't mention your server specs.
You're running OpsCenter on your master server; you should migrate it to a dedicated server if you're facing performance issues.
You're also running verbose logging. I don't have figures on its performance impact, but one can assume that minimal logging will use less CPU than verbose logging, or at least less memory.