Jobs fail to run with no errors logged

One of my customers is backing up 12 servers (11 remote + the media server) in a D2D2T setup, so we expect to see 12 B2D jobs and 12 dups (B2T) every night: diffs Mon-Thu and fulls on Friday. All jobs are created from policy and template.
Trouble is that quite often one or two of the jobs just don't run. It's not that they run and leave an error in the Job Monitor/Job History list; they are still sitting in the Current Jobs section with a 'start time' of last night. If I tell the job to 'run now' it runs fine (if it's a dup I tell it which bkf files to back up first, of course).
The same server will back up on schedule the next night without a problem, just as it did the night before the failure.
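The nightly expectation described here (one B2D job and one dup per server, 24 in total) can be checked mechanically each morning. This is only a minimal sketch: the server names and the "<server>-B2D" / "<server>-DUP" naming scheme are hypothetical, not something Backup Exec produces, and the list of jobs that actually ran would have to be copied out of the Job History by hand.

```python
# Hypothetical names throughout: neither the server names nor the
# "<server>-B2D" / "<server>-DUP" scheme come from Backup Exec.

def expected_jobs(servers):
    """Each server should produce one B2D job and one duplicate (B2T) job."""
    return {f"{server}-{kind}" for server in servers for kind in ("B2D", "DUP")}


def missing_jobs(servers, ran):
    """Return the expected jobs that do not appear in the list of jobs that ran."""
    return sorted(expected_jobs(servers) - set(ran))


# 11 remote servers plus the media server.
servers = ["MEDIA01"] + [f"SRV{i:02d}" for i in range(1, 12)]

# Simulate a night where two jobs silently never started.
ran = expected_jobs(servers) - {"SRV03-DUP", "SRV07-B2D"}

print(missing_jobs(servers, ran))
```

Since the jobs that never start leave nothing in the Job History, comparing against the expected list is the only way such a check can catch them.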
The remote servers are a mix of W2K and W2K3 and it has happened on both.
No services have stopped, nothing else failed at the same time.
Debug logging is on (see article 250681) at the media server. If it's a dup that fails, in the 'bengine' log I can see the B2D job run, and then there is no mention of the server trying to do the B2T. In contrast, for the servers that back up OK I can see the dup (B2T) job kicked off just after the B2D job finishes. However, to me (I suspect I am not alone!) the debug files are pretty impenetrable, so there might be some issue in there that I am not picking up because it doesn't use a word like 'error' or 'failure'!
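Since the debug log may not use words like 'error', one way to make it less impenetrable is to pull out every line that mentions a given server and compare a server that worked with one that didn't. A rough sketch, assuming only that the log is plain text and that the server name appears somewhere in the relevant lines; the sample lines are placeholders and do not reflect the real bengine log format:

```python
def lines_mentioning(lines, server):
    """Return the lines that contain the server name, ignoring case."""
    needle = server.lower()
    return [line.rstrip("\n") for line in lines if needle in line.lower()]


# Placeholder text only; this is NOT what a real bengine debug log looks like.
sample = [
    "placeholder line one about SRV01\n",
    "placeholder line two about SRV02\n",
    "placeholder line three about srv01\n",
]

# Compare how often each server is mentioned.
for server in ("SRV01", "SRV02"):
    hits = lines_mentioning(sample, server)
    print(server, len(hits))
```

With a real log, the same function can be fed an open file, for example `lines_mentioning(open(path, errors="replace"), "SRV01")`, where `path` is wherever the debug log was written.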
I have recently completely reinstalled BE11d 7170 and it made no difference. They have SP2 and hotfixes 31-35. This is not the only site I have seen this at, but it's the one I am concentrating on at the moment.
Thanks for reading to the end - all suggestions welcome!