05-05-2014 10:51 AM
Master: 7.5.0.7 running on W2k8 R2
Media Server: 5230 running 2.5.6
Clients running 7.5.0.4
We have three AIX LPARs running on the same physical chassis. I can back up their filesystems over Fibre Transport just fine unless I'm also (not simultaneously on the same box) running a DB2 or Informix backup (either all three at once or any two of the three in combination).
Then one of the two or three things running (again, one per LPAR) will throw an out of sequence error and bomb out.
So:
Filesystem alone: works
Filesystem on one box + DB2 on another: error
Filesystem on one box + Informix on another: error
DB2 on one box + Informix on another: error
If I change the DB2 box and the Informix box over to "don't use fibre even if it's there" then everything works great. Slower but great.
Per our Fibre guy the overall traffic (NetBackup plus what the boxes use reading/writing the SAN) isn't even hitting 2 GB out of a possible 8 GB link. And the AIX folks say all the drivers (LPAR and host) were updated last December.
I've played around with shrinking the buffer size while increasing the number of buffers (divide one by 2 and multiply the other by 2, that sorta thing) with no appreciable change...
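For anyone repeating the buffer experiment, here is a rough sketch of the "halve one, double the other" sweep described above. The starting size and count are made-up placeholders, not the values from this environment, and the script only does the arithmetic; it does not touch any NetBackup setting. It does show one thing worth keeping in mind: every variant adds up to the same total buffer memory, which may be part of why the change made no visible difference.

```python
def buffer_variants(size_bytes, count, steps=3):
    """Return (size, count) pairs made by repeatedly halving size and doubling count."""
    variants = []
    for _ in range(steps + 1):
        variants.append((size_bytes, count))
        if size_bytes % 2:  # stop if the size can no longer be halved evenly
            break
        size_bytes //= 2
        count *= 2
    return variants


# Hypothetical starting point, chosen only for illustration.
start_size = 262144
start_count = 16

for size, count in buffer_variants(start_size, start_count):
    # The product is identical on every line: the sweep trades size for count
    # without changing the total memory set aside for buffers.
    print(f"{size:>7} bytes x {count:>3} buffers = {size * count} bytes total")
```

If total buffer memory matters here, varying only one of the two values at a time would be a different test from the one described above.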
So do I just give up and declare Fibre Transport a lost cause when a database is involved? Or has someone managed to get this kinda thing working?
05-05-2014 02:28 PM
Hello,
Please reply with the detailed status of that job. It will have some details that will help.
When you say out of sequence, are you getting 174 errors? TECH147561 talks about this error and suggests it's caused by defective initiator drivers or overabundant commands from client utilities or device monitoring programs.
05-05-2014 02:51 PM
Yes, I've been through that one to the limit of my non-Admin rights. And when questioned, the Admins say that everything at their end is up to date and is just fine.
I also occasionally get this error (but not as often) http://www.symantec.com/business/support/index?page=content&id=TECH71062 which is what led me to try buffers.
I don't have the detailed status available since I clean up my GUI every morning after addressing any >1 error codes. But I pulled the entries for one job from "all log entries" and I've attached it. This was a filesystem backup that failed when I turned on fibre for the Informix backup that was running at the same time. I finally turned fibre back off, waited until the Informix threads had completed (and the new ones were using the network) before I restarted this for the last time. It finally worked.
05-06-2014 03:00 AM
Well worth getting your clients up to 7.5.0.7 to match the media / master servers.
There are lots of FT fixes in the 7.5.0.7 bundle, which also turns on FTCLIENT_VALIDATE_VIA_CHECK_CONDITION by default.
05-06-2014 08:53 AM
It's worth a shot. I was under the impression that most of the .4 -> .7 changes were in the Windows clients (I love the fixes for DFS!) but double-checking those release notes does show a lot of fibre-related fixes.
I'll get the change request paperwork started and report back once that's done (in a few weeks)...