This has happened in the environment where I work. I believe our issue was that, with so much data, the job was timing out while reading the archive bits, which is why the full runs fine every time.
I believe in these cases we tried streaming the job by drive to determine whether the issue lay somewhere in particular. We also tried upping the client read timeout, the communication buffer size, and
the VSP Busy File Timeout in the client properties.
Another fix we implemented occasionally addresses dropped connections due to a timeout, which can occur quite often on congested networks. It is possible to increase the likelihood of keeping client connections alive during temporary congestion by increasing the value of the TcpMaxDataRetransmissions parameter in the server registry.
1. Start Registry Editor (Regedt32.exe).
2. Locate the following key in the registry:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
3. On the Edit menu, click Add Value, and then add the following registry value:
Value Name: TcpMaxDataRetransmissions
Data Type: REG_DWORD
Value: 6
4. Quit Registry Editor and reboot for the changes to take effect.
The default value is 5; it should be increased gradually, one step at a time, up to a maximum of 10.
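If you have to apply this on more than one server, the registry steps above can also be captured as a .reg file instead of clicking through Registry Editor each time. Below is a rough Python sketch that just builds the text of such a file. The function name build_reg_file and the 5 to 10 range check are my own, taken from the numbers in this post, and the "Version 5.00" header assumes Windows 2000 or later. Treat it as a starting point, not a tested procedure for your environment.

```python
# Sketch: generate the text of a .reg file for the TcpMaxDataRetransmissions change.
# build_reg_file is a hypothetical helper, not part of any backup product or Windows tool.

KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
VALUE_NAME = "TcpMaxDataRetransmissions"


def build_reg_file(retransmissions: int = 6) -> str:
    """Return .reg file text that sets the REG_DWORD value under the Tcpip key."""
    # Assumption from the post: default is 5, raise gradually, do not go past 10.
    if not 5 <= retransmissions <= 10:
        raise ValueError("expected a value between 5 and 10")
    lines = [
        "Windows Registry Editor Version 5.00",  # header for Windows 2000 and later
        "",
        f"[{KEY}]",
        # .reg files store DWORDs as 8 hexadecimal digits
        f'"{VALUE_NAME}"=dword:{retransmissions:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_reg_file(6))
```

Save the output to a file with a .reg extension, import it on the server, and reboot as in step 4. Back up the key first, and move up one step at a time rather than jumping straight to 10.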
Well, hopefully something here helps or points you in the right direction. Good luck.