
EMM failed all of a sudden

Anton_Panyushki
Level 6
Certified

Hello,

Master: NBU 6.0 MP6 running Solaris 9

The problem is that the EMM daemon failed and restarted a few times over the last weekend, whereas NBDB stayed up and running.

 

There are a number of status code 252 failures in Activity Monitor, and a message in the job status says:

 

NBEMM returned an extended error status: invalid error number (3000001)

 

I could not find this message either on the Internet or in the Troubleshooting Guide.

 

 

I think it would be nice to have at least the *.h files with the error codes in the NBU distribution.

Message Edited by Anton Panyushkin on 09-01-2008 06:51 AM
13 REPLIES 13

dami
Level 5

For starters, EMM logs to the unified logs in /usr/openv/logs (on UNIX), and these can only be viewed with the vxlogview command.

 

See for example: http://seer.entsupport.symantec.com/docs/280029.htm

 

That document describes commands such as vxlogview -i 111 -t 24:00:00.

 

You may also need additional verbosity in these logs (the default I think is 1 or 0)

 

To see the current setting:

/usr/openv/netbackup/bin/vxlogcfg -p 51216 -o Default -l

To set to 6 (of course you will need to wait till the error happens again to see additional detail):

/usr/openv/netbackup/bin/vxlogcfg -a -p 51216 -o ALL -s DebugLevel=6 -s DiagnosticLevel=6

back to 0:

/usr/openv/netbackup/bin/vxlogcfg -a -p 51216 -o ALL -s DebugLevel=0 -s DiagnosticLevel=0
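The raise-and-restore steps above could be wrapped in a small shell helper. This is only a sketch: the function name set_emm_loglevel and the DRY_RUN variable are made up for illustration, and the vxlogcfg flags are copied unchanged from the commands above. With DRY_RUN=1 (the default here) it only prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: wraps the vxlogcfg commands quoted above.
# set_emm_loglevel and DRY_RUN are hypothetical names, not part of NetBackup.
VXLOGCFG=/usr/openv/netbackup/bin/vxlogcfg
DRY_RUN=${DRY_RUN:-1}

set_emm_loglevel() {
    level=$1
    # Accept only a single digit 0-6, the range used in this thread.
    case "$level" in
        [0-6]) ;;
        *) echo "usage: set_emm_loglevel 0..6" >&2; return 1 ;;
    esac
    cmd="$VXLOGCFG -a -p 51216 -o ALL -s DebugLevel=$level -s DiagnosticLevel=$level"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$cmd"        # print the command, change nothing
    else
        $cmd               # actually apply the setting
    fi
}
```

For example, set_emm_loglevel 6 before the weekend and set_emm_loglevel 0 once the failure has been captured.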

 

 

 

It's been my experience that these logs are fairly hard to interpret and that Symantec (with their log searching tools) can often get to a root cause faster. I seem to recall loads of engineering binaries at 6.0 MP6.

sdo
Moderator
Partner    VIP    Certified
The EMM daemon will stop if free disk space falls below 1%.
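
One way to check this is to compute the free-space percentage from df output. The sketch below assumes the POSIX "df -P -k" output layout, where the second and fourth columns are the total and available 1K blocks (on Solaris you may need /usr/xpg4/bin/df to get the -P option). The names pct_free and check_openv are illustrative, and /usr/openv is assumed to be the file system that matters here.

```shell
#!/bin/sh
# Sketch only: pct_free and check_openv are hypothetical helpers,
# not NetBackup tools.

# Reads "df -P -k" output on stdin and prints the whole-number percentage
# of available space for the last file system listed (0 if none was read).
pct_free() {
    awk 'NR > 1 && $2 > 0 { pct = int($4 * 100 / $2) } END { print pct + 0 }'
}

# Warn if the file system holding /usr/openv is under 1% free.
check_openv() {
    free=$(df -P -k /usr/openv 2>/dev/null | pct_free)
    if [ "$free" -lt 1 ]; then
        echo "WARNING: /usr/openv has less than 1% free (or could not be read)"
    fi
}
```

Running check_openv from cron during the weekend would show whether free space ever dips that low between manual checks.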

Anton_Panyushki
Level 6
Certified

Hello,

 

I have checked the EMM log, and it says that EMM first shut down in a normal manner and was then brought back online. I know that EMM was started by a colleague from the duty admin team, but why EMM was shut down in the first place is a total puzzle to me.

Concerning free space, I checked the df output and can say that each file system has plenty of free blocks.

dami
Level 5

Maybe increase the UNIFIED logging to verbose 5 or 6 (I think they are the same anyway) and leave it to run over the weekend.

 

NOTE though that these logs at this verbosity require a HUGE amount of space (especially if you have a busy NetBackup server). If you have on the order of hundreds of free GB on your /usr/openv/ partition (these write to /usr/openv/logs) then fine; otherwise it could cause you even more headaches.

Omar_Villa
Level 6
Employee

stop netbackup

stop PBX

clean schedules

check IPC jobs and clean them up if they are hung

start PBX

start netbackup

 

and let us know what comes up in the logs

 

regards

Anton_Panyushki
Level 6
Certified

 

I suppose that by "clean schedules" you mean removing /usr/openv/netbackup/db/jobs/pempersist.

But what do you mean by cleaning IPC jobs?

V_Dinesh
Level 3
Certified

hi ,

 

  Try this command.....

 

This will start the EMM database separately...

 

Command : installpath\programfiles\veritas\netbackup\bin\nbdb_admin -start

 

 

Anton_Panyushki
Level 6
Certified
I have examined the NBDB log and there were no signs of the DB going down. The problem is in the EMM daemon itself.

sdo
Moderator
Partner    VIP    Certified

Anton, try Dami's suggestion of analyzing the VxUL logs again.

 

Here's a quick reference card to help with the parameters:

ftp://exftpp.symantec.com/pub/support/products/NetBackup_Enterprise_Server/287647.pdf

 

Anton_Panyushki
Level 6
Certified

EMM has not experienced any faults since then, so I suppose I can close this thread.

Thank you very much for your answers.

Status_Code_0
Level 4
Partner Accredited Certified

I realize that this thread was closed, but I have the same issue with EMM going down with an invalid error number. Please let me know if you have any updates.


Thanks.


James

Manoj_Siricilla
Level 4
Certified

Here are the steps that you need to do.

 

cat /usr/openv/db/log/server.log 

 

and verify if you see any entries like this

I. 09/22 12:14:07. Database "NBDB" (NBDB.db) stopped at Mon Sep 22 2008 12:14
I. 09/22 12:14:07. Database "utility_db" (utility_db) stopped at Mon Sep 22 2008 12:14

The lines above indicate that the database was stopped gracefully. If you do not see them, examine the last 100 or 200 lines of the log to find out why and how the database stopped.

 

Every time a checkpoint is taken, it is also recorded in this log.
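
The check above can be scripted roughly as follows. This is a sketch: the function name db_stop_lines is made up, and the pattern is based only on the two sample entries shown above, so entries written in a different format will not match.

```shell
#!/bin/sh
# Sketch only: db_stop_lines is a hypothetical helper.
# Reads server.log text on stdin and prints the entries that record a
# database stop, matching the sample lines quoted above.
db_stop_lines() {
    grep 'Database ".*" (.*) stopped at'
}

# Example: look for stop entries in the last 200 lines of the log.
SERVER_LOG=/usr/openv/db/log/server.log
if [ -r "$SERVER_LOG" ]; then
    tail -n 200 "$SERVER_LOG" | db_stop_lines || echo "no stop entries found"
fi
```

If nothing is found, the database was probably not stopped gracefully in that window, and the tail of the log is worth reading by hand.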

 

----- Next....

 

Run this command

 

/usr/openv/netbackup/bin/vxlogview -o nbemm -E -L -n 1 > nbemm_error.log

 

The above command dumps only the error-level log entries generated by nbemm during the last day.


Examine those logs; they should give you an idea of the failures.

 

Also check the free disk space on the file system where EMM resides. If EMM runs out of disk space, it logs a critical message rather than an error message, which you can verify as follows:

 

/usr/openv/netbackup/bin/vxlogview -o nbemm -C -L -n 1 > nbemm_critc.log
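
After running the vxlogview commands above, a quick way to see which severity levels actually produced entries is to list the output files that are non-empty. The helper below is a sketch: report_nonempty is a made-up name, and it assumes vxlogview writes nothing to its output when no records match, which I have not verified.

```shell
#!/bin/sh
# Sketch only: report_nonempty is a hypothetical helper.
# Prints the name of every argument that is an existing, non-empty file.
report_nonempty() {
    for f in "$@"; do
        [ -s "$f" ] && echo "$f"
    done
    return 0
}
```

Call it with the names of the two output files created above; any file it prints is worth opening first.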

 

Let me know the outcome, I will try and get you some more info.

 

 

 

Status_Code_0
Level 4
Partner Accredited Certified

What if I am attempting to view the log file on a different server? I took the unified log off the master and moved it to another NBU machine. Is it the -K (hostname) switch?


Thanks.