
PDDO Replication job fails, aborted by watchdog

jploui
Level 4
Partner Accredited

Hi,

 

I'm currently running PureDisk version 6.6.1.2.

I am replicating data from one Linux server to another locally, and I'm getting the following error.

I have applied eeb20-rollup2.

 

This is my error message.

*** Supportability Summary ***

jobid = 12915

jobstepid = 47212

agentid = 512000000

hostname =

starttimejobstep = March 28, 2012, 2:26 pm

endtimejobstep = March 28, 2012, 2:27 pm

workflowstepname = Prepare Replication

status = SUCCESS

Job 12915: Aborted by Watchdog window exceeded

Retry Kill

 

Agent Jobstep analysis: exitcode 15, status 3, progress 0.

*** Supportability Summary ***

jobid = 12915

jobstepid = 47213

agentid = 512000000

hostname = 

starttimejobstep = March 28, 2012, 2:27 pm

endtimejobstep = March 29, 2012, 6:03 am

workflowstepname = Forward Data

status = ABORTED_BY_WATCHDOG

Execute WFAction: Mark Error
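
For anyone reading a summary like this, a quick way to see how long the failing jobstep actually ran is to parse the `key = value` lines and subtract the timestamps. This is only a minimal sketch: the function names and the timestamp format string are my own assumptions based on the output pasted above, not part of any PureDisk tooling, and it assumes an English locale for the month name and am/pm marker.

```python
from datetime import datetime

# Fields copied from the "Forward Data" jobstep summary above.
SUMMARY = """\
jobstepid = 47213
starttimejobstep = March 28, 2012, 2:27 pm
endtimejobstep = March 29, 2012, 6:03 am
workflowstepname = Forward Data
status = ABORTED_BY_WATCHDOG
"""

# Assumed timestamp layout, inferred from the pasted summary.
TIME_FORMAT = "%B %d, %Y, %I:%M %p"


def parse_summary(text):
    """Turn 'key = value' lines into a dict; lines without '=' are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def jobstep_hours(fields):
    """Return the elapsed time of the jobstep in hours."""
    start = datetime.strptime(fields["starttimejobstep"], TIME_FORMAT)
    end = datetime.strptime(fields["endtimejobstep"], TIME_FORMAT)
    return (end - start).total_seconds() / 3600


fields = parse_summary(SUMMARY)
hours = jobstep_hours(fields)
print(f"{fields['workflowstepname']}: {fields['status']} after {hours:.1f} hours")
```

For the summary above this gives roughly 15.6 hours for the Forward Data step, which is useful to compare against whatever watchdog window is configured on the policy.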

 

Has anyone seen this before, or is anyone familiar with it?

Can I increase the watchdog timeouts, enable watchdog logging, or disable the watchdog?

I'm replicating 1.68 TB of data.

 

Thanks

Accepted Solutions

jploui
Level 4
Partner Accredited

Hi,

 

We did change the timeout to 1 month before I posted this, and it still failed after 15 hours.

We ended up logging a support call. It seems that an EEB we applied earlier didn't update the binary files, etc. The engineer took care of it.

 

Problem solved.

 

 


Replies

S_Williamson
Level 6

Hi

You set the watchdog timeout in the policy you are executing. The default for most is 5 days, I believe, but you can set it to whatever you like. If this is the first pass of the replication, I'd set it to 1 month and then set it back to something more sensible afterwards.

Just go to Manage -> Policies, expand Replication, and select the policy you are running.

On the first page you should see "Escalate error and terminate after : ", which is probably set to 5 days. Change this value to something large (1 month), which should give you plenty of time to get the first pass done. After that, set it back to 5 days or whatever you like.

Simon
