Master: 188.8.131.52 running on W2k8 R2
Media Server: 5230 running 2.5.6
Clients running 184.108.40.206
We have three AIX LPARs running on the same physical chassis. I can back up their filesystems over Fibre Transport just fine unless I'm also running a DB2 or Informix backup on one of the other LPARs (never two jobs on the same box at once). It fails with any two of the three job types running together, or with all three.
Then one of the two or three things running (again, one per LPAR) will throw an out of sequence error and bomb out.
Filesystem alone: works
Filesystem on one box + DB2 on another: error
Filesystem on one box + Informix on another: error
DB2 on one box + Informix on another: error
If I change the DB2 box and the Informix box over to "don't use fibre even if it's there" then everything works great. Slower but great.
Per our Fibre guy the overall traffic (NetBackup plus what the boxes use reading/writing the SAN) isn't even hitting 2 GB out of a possible 8 GB link. And the AIX folks say all the drivers (LPAR and host) were updated last December.
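To put that figure in perspective, here is a quick sketch of the utilization arithmetic. It uses only the two numbers quoted above (roughly 2 observed against 8 possible, in whatever unit the SAN tool reports), and the function name is just an illustration.

```python
def link_utilization(observed, capacity):
    """Return the fraction of the link in use (both values in the same unit)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return observed / capacity

# Figures quoted by the Fibre admin: about 2 out of a possible 8.
print(f"{link_utilization(2, 8):.0%}")  # prints 25%
```

At about a quarter of the link, raw bandwidth saturation looks unlikely to be the cause, although an average figure would not show short bursts.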
I've played around with shrinking the buffer size while increasing the number of buffers (divide one by 2 and multiply the other by 2, that sort of thing) with no appreciable change...
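For anyone following along, the halve-and-double sweep described above can be sketched like this. The starting values (262144-byte buffers, 16 of them) are hypothetical placeholders, not the settings from this environment, and the function name is made up for illustration.

```python
def buffer_sweep(size_bytes, count, steps):
    """Return (size, count) pairs, halving the size and doubling the count each step."""
    configs = []
    for _ in range(steps):
        configs.append((size_bytes, count))
        if size_bytes % 2:  # stop if the size can no longer be halved evenly
            break
        size_bytes //= 2
        count *= 2
    return configs

# Hypothetical starting point, for illustration only.
START_SIZE = 262144
START_COUNT = 16

for size, count in buffer_sweep(START_SIZE, START_COUNT, 4):
    print(f"size={size:>7} bytes  count={count:>3}  total={size * count} bytes")
```

Note that the total (size times count) stays the same at every step, so this kind of sweep changes the buffer granularity but not the overall amount of buffer memory. That may be one reason it made no visible difference here.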
So do I just give up and declare Fibre Transport a lost cause when a database is involved? Or has someone managed to get this kinda thing working?
Please reply with the detailed status of that job. It will have some details that will help.
When you say out of sequence, are you getting 174 errors? TECH147561 talks about this error and suggests it's caused by defective initiator drivers or by an overabundance of commands from client utilities or device monitoring programs.
Yes, I've been through that one to the limit of my non-Admin rights. And when questioned, the Admins say that everything at their end is up to date and is just fine.
I also occasionally get this error (but not as often) http://www.symantec.com/business/support/index?page=content&id=TECH71062 which is what led me to try buffers.
I don't have the detailed status available since I clean up my GUI every morning after addressing any >1 error codes. But I pulled the entries for one job from "all log entries" and I've attached them. This was a filesystem backup that failed when I turned on fibre for the Informix backup that was running at the same time. I finally turned fibre back off and waited until the Informix threads had completed (and the new ones were using the network) before restarting it one last time. It finally worked.
It's worth a shot. I was under the impression that most of the .4 -> .7 changes were in the Windows clients (I love the fixes for DFS!) but double-checking those release notes does show a lot of fibre-related fixes.
I'll get the change request paperwork started and report back once that's done (in a few weeks)...