Forum Discussion

Andy_Welburn
16 years ago

Suspended jobs restart automatically?

Simple, stupid question just to keep you on your toes:

 

Any reason why a job suspended thru' the Activity Monitor should then literally fail after an undetermined period of time (with Status 157) & restart?

 

Job initially just sits there as expected in a suspended state. It then fails (big red x) & restarts as a different job id & commences backing up!!!!

 

Obviously, if I suspend a job I've done it for a reason & therefore would not expect, nor want, NB to suddenly think "OOhh, I know, why don't I start this job again, he must've forgotten all about it :P "

 

Can't find any timeouts anywhere for the length of time a job is suspended - maybe it's under the 'let's annoy the operator' tab :D

 

NB 6.5.1

Solaris 9 Master/Media

 

 


5 Replies

  • Are all the resources available? Does it have drives? Have you tried using 6.5.2a? This also sounds like a bug, so you can check with Symantec as well.
  • Thanks for the response Omar.

     

    It was more a case of why a suspended job should suddenly 'decide' that it didn't want to be suspended anymore. Resources were available I just didn't want it to use them at that time!

     

    As far as 6.5.2a is concerned - I think I might wait for 6.5.3 or 6.5.4 (I may have lost the responsibility for NB by then ;) )

  • Just as an update - we had this happen again where I had to suspend a backup (a colleague wanted to copy the contents of the directory that was being backed up & to prevent disk I/O contention I suspended the backup) & several hours later the job went from 'Suspended' to 'Failed' (157) & then restarted as a different job id.

     

    I had to suspend this new job for a few more hours before it could be resumed & also investigated what setting could be determining this action.

     

    All I can put it down to is the "Clean Up" option "Move backup job from incomplete state to done state". This was set to the default 3 hours and the suspended job 'failed' after 3 hours 1 minute & 0 seconds. Why a suspended job should be deemed to be incomplete is beyond me (I know it IS literally, but you know what I mean ...) it should be a separate entity. It's like cancelling a job & then NetBackup thinking - "Hang on, I haven't done that yet, let's kick it off again" (oh yeah, it does that as well!!)

     

    Anyway, have changed this setting to 6 hours so we'll see next time (or sooner if that setting causes me more problems in the meantime).

     

     

  • Well that's confirmed it! (But not sure if it's a FEATURE or a BUG)

     


    Andy Welburn wrote:

    All I can put it down to is the "Clean Up" option "Move backup job from incomplete state to done state". This was set to the default 3 hours and the suspended job 'failed' after 3 hours 1 minute & 0 seconds.

     

    Anyway, have changed this setting to 6 hours so we'll see next time (or sooner if that setting causes me more problems in the meantime).

     

     


     

    A job was suspended this morning & 6 hours later it 'failed' & restarted automatically !!!!

     

    (Actually I suspended a child process in a multi-streamed job - for info, yesterday's wasn't a multi-streamed job. The other children & parent processes continued running. When all the other child processes finished, the parent process then went into a suspended state & then nothing. I presume at this stage I would've had to wait for a further 6 hours to elapse before anything happened. Being impatient, I cancelled the suspended parent process, at which point new parent & child processes started.)

     

    So BEWARE - if you suspend a job, check the time in the Master server's "Clean Up" option "Move backup job from incomplete state to done state" as your job will restart after this period has elapsed.
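
    For anyone who wants to work out roughly when a suspended job is likely to be caught by this, here is a minimal sketch in Python. It only does the clock arithmetic and does not talk to NetBackup at all. It assumes, based purely on the behaviour seen above, that the job 'fails' about as long after being suspended as the "Move backup job from incomplete state to done state" setting allows (in practice it was about a minute later). The function names and the example date are made up for illustration.

```python
from datetime import datetime, timedelta


def expected_auto_fail_time(suspended_at: datetime, cleanup_hours: float) -> datetime:
    """Estimate when a suspended job would be moved to 'done' (and then restart).

    cleanup_hours is the value of the "Move backup job from incomplete state
    to done state" setting, read off the Clean Up properties by hand.
    This is an assumption drawn from the observed behaviour, not a documented rule.
    """
    return suspended_at + timedelta(hours=cleanup_hours)


def time_remaining(suspended_at: datetime, cleanup_hours: float, now: datetime) -> timedelta:
    """Time left before the estimated auto-fail, never negative."""
    remaining = expected_auto_fail_time(suspended_at, cleanup_hours) - now
    return max(remaining, timedelta(0))


if __name__ == "__main__":
    # Hypothetical example: job suspended at 09:00 with the setting at 6 hours.
    suspended = datetime(2009, 1, 5, 9, 0)
    print(expected_auto_fail_time(suspended, 6))  # 2009-01-05 15:00:00
    print(time_remaining(suspended, 6, datetime(2009, 1, 5, 13, 30)))  # 1:30:00
```

    So with the setting at 6 hours, a job suspended at 09:00 needs to be resumed before roughly 15:00, otherwise expect the Status 157 & a new job id.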