Hi,
I'm running NBU 6.5.2A on a Solaris master server and noticed that nbjm was constantly using over 40% of CPU. I ran vxlogview for nbjm covering the past 24 hours and found the following entries repeating from 06:47 onwards:
11/18/09 13:26:16.543 [Debug] NB 51216 nbjm 117 PID:12294 TID:8 File ID:117 [No context] 1 [JobAttr:FindJobFile] (fb67dfc8) cannot find param file for jobid=3339913(JobParam.cpp:540)
11/18/09 13:26:16.549 [Debug] NB 51216 nbjm 117 PID:12294 TID:1 File ID:117 [No context] 1 [JobManager_i::getJobById] (968098) job not found in map, jobid=3339913(JobManager_i.cpp:791)
I looked for job 3339913 in the Activity Monitor, but it was not there. The Problems report showed that this job had failed at 02:05 with status 86. I then ran vxlogview on the job ID, and it showed entries up until 02:05:
11/18/09 02:05:32.901 nbpem PID:12445 File ID:116 (ID:1774820) Active subtask count=0(PemTask.cpp:528)
11/18/09 02:05:32.910 nbpem PID:12445 File ID:116 [Info] CLIENT *************** POLICY BIB-S1-eTrust-7_Year_Logs SCHED Daily_Dinc EXIT STATUS 86 (media position error)
11/18/09 02:05:32.921 nbpem PID:12445 File ID:116 [Error] backup of client *************** exited with status 86 (media position error)
11/18/09 02:05:50.553 nbrb PID:11659 File ID:118 received release of mediaId=011449, driveName=S1SL85-16-9940B, STU=S1MED1-S1-9940B
11/18/09 02:05:50.845 nbrb PID:11659 File ID:118 received release of mediaId=020428, driveName=S2SL85-13-9940B, STU=S1MED1-S2-9940B
11/18/09 02:05:50.972 mds PID:11538 File ID:111 [Info] Drive S2SL85-13-9940B cannot be made available, it has pending actions: 1
Looking back at the nbjm log, it showed the following for the job ID:
11/18/09 02:05:32.831 [Debug] NB 51216 nbjm 117 PID:12294 TID:4 File ID:117 [No context] 1 [CallbackQueue::queueRequest] queueing BPJobdExpireJob jobid=3339913, secondary jobid count=2 -- retry count=0(CallbackQueue.cpp:1268)
11/18/09 02:05:32.832 [Debug] NB 51216 nbjm 117 PID:12294 TID:4 File ID:117 [No context] 1 [JobManager_i::doForgottenJobCleanup] (968098) job has been forgotten, perform cleanup, jobid=3339913(JobManager_i.cpp:2026)
11/18/09 02:05:32.898 [Debug] NB 51216 nbjm 117 PID:12294 TID:4 File ID:117 [No context] 1 [JobManager_i::deleteFrozenImage] (968098) frozen image delete failed, no snapid for jobid=3339913(JobManager_i.cpp:1961)
11/18/09 02:05:33.177 [Debug] NB 51216 nbjm 117 PID:12294 TID:5 File ID:117 [No context] 1 [CallbackQueue::handle_input] sending BPJobdExpireJob jobid=3339913, secondary jobid count=2(CallbackQueue.cpp:1391)
11/18/09 02:05:41.841 [Debug] NB 51216 nbjm 117 PID:12294 TID:1 File ID:117 [No context] 1 [JobMapper::startDelayedJob] (75ef58) job not found in the delayed job map, jobid=3339913(JobMapper.cpp:1026)
...........................................................
11/18/09 06:47:53.681 [Debug] NB 51216 nbjm 117 PID:12294 TID:7 File ID:117 [No context] 1 [CallbackQueue::queueRequest] queueing PEMretryJob : jobid=3339913, parentid=3339913, retryType=2 -- retry count=-1(CallbackQueue.cpp:1268)
11/18/09 06:47:58.031 [Debug] NB 51216 nbjm 117 PID:12294 TID:1 File ID:117 [No context] 1 [CallbackQueue::handle_input] sending PEMretryJob : jobid=3339913(CallbackQueue.cpp:1391)
11/18/09 06:47:58.038 [Debug] NB 51216 nbjm 117 PID:12294 TID:4 File ID:117 [No context] 1 [JobManager_i::getJobById] (968098) job not found in map, jobid=3339913(JobManager_i.cpp:791)
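For anyone wanting to check their own logs for the same pattern, here is a rough Python sketch that counts the "job not found in map" entries per job ID. It assumes the vxlogview output has been saved to a text file in the same line format as the excerpts above; the function name count_missing_jobs and the script layout are just illustrative, not anything shipped with NBU.

```python
import re
import sys
from collections import Counter

# Matches the job ID in nbjm lines such as:
#   ... [JobManager_i::getJobById] (968098) job not found in map, jobid=3339913(JobManager_i.cpp:791)
NOT_FOUND = re.compile(r"job not found in map, jobid=(\d+)")


def count_missing_jobs(lines):
    """Count 'job not found in map' entries per job ID (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = NOT_FOUND.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_missing_jobs.py saved_vxlogview_output.txt
    with open(sys.argv[1], errors="replace") as fh:
        for jobid, n in count_missing_jobs(fh).most_common(10):
            print(jobid, n)
```

A job ID that shows up with a very large count would be a candidate for the kind of retry loop seen here with 3339913.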
Can anyone advise what has happened here? Restarting the NBU services has dropped the nbjm CPU utilisation back to a normal level.
TIA