Forum Discussion

psvines
Level 3
9 years ago

IBM AIX PowerVM and Appliance

 

We are considering replacing our tape library with an Appliance, 5230.

We have multiple IBM Power servers (P770, E870) with 100+ AIX clients. Each Power server currently has a media server which is SAN attached to the tape library. The AIX clients on the same Power server connect to the media server via the internal hypervisor network. We have multiple Oracle database (3 TB) environments.

One of the existing media servers is also the master server.

We're not clear on how the new configuration would work, and sales is struggling to clarify.

Do we keep the media servers and they would be SAN attached to the 5230?

Is the SAN Client or Fibre Transport feature needed? Neither is in use now.

We are concerned about sharing the existing virtual fiber network with both data I/O and backup I/O. The SAN client best practices guide says not to do this.

We would like to discuss this with a customer that is currently using an Appliance on IBM Power running AIX. I already have plenty of "it should work", "sounds reasonable", etc... from sales, but they are unable to locate a solid reference.

 

Thanks in advance for any assistance.

 

  • First, yes, I am using an Appliance to back up Oracle from AIX servers. That said, the sizing for this is not really any different from backing up Linux servers or any other Unix, so your sales team should be able to help you size it.

    Here are a few things to consider:

     

    * Yes, Fibre Transport does work, although honestly if you have a good 10G network (or even 1G) you probably don't need it.

    * I would not share Fibre Transport on the same media with disk IO.

    * I would bet your transport will not be the issue. Focus on how fast the appliance can ingest the data and how many streams will be running. The appliances are great until you overload one.

    * Also pay attention to replication or tape-out sizing. Both can be challenging.

     

    Personally, I try to keep things simple until I find that simple will not work. By that I mean: install the client and use the network to transport the data to the appliance.

     

    And as others have said here, make the transition slowly.

     

     

  • FYI - I hope to review this later this evening (GMT time), unless someone else beats me to it before then.

  • I'm hoping to find an actual user of an appliance on an IBM Power/AIX system, to avoid excessive opinions and theories. It's been frustrating that our Veritas sales can't find anyone.

    Responses below:

    1) What is the range of AIX versions on the physical P770 and E870 servers?  (i.e. lowest in use, to highest in use).

    NBU master and media - AIX 6.1

    All other LPARs - AIX 7.1

    2) Are the NetBackup Media Server instances (and single NetBackup Master Server instance) installed at the physical server AIX hypervisor layer?  Or are they also VMs?

    All LPARs are completely virtual, except NBU, which has dedicated HBAs.

    3a) Are the NetBackup "Server" instances (all media servers, and master) the same version of NetBackup?

    Yes, 7.1.0.4 (the highest supported version for the ovpass driver)

    3b) If yes, which version of NetBackup ?

    7.1.0.4

    3c) If no, then what is the range of NetBackup Server versions installed?

    4) Is the tape library shared across all NetBackup Server instances?

    Yes, Quantum i6000

    5) How many P770 and E870 servers are there?

    P770 (2), P770+ (1), E870 (1)

    6) How many tape drives are there in the tape library?

    Dozens. Our partition has 8 IBM LTO5 drives.

    7a) Are all tape drives visible to all NetBackup Media Servers?

    Yes

    7b) If yes, then how many SSO tape drive paths are there?

    All 8 drives are visible to all NBU LPARs, although no multipathing is configured.

    7c) If no, then please try to describe the tape drive presentation topology/scheme?

    8) Near the beginning of the original post, you say "Each currently has a media server which is SAN attached to the tape library", but then later in the post you say "We are concerned about sharing the existing virtual fiber network with both data I/O and backup I/O"... so, what types of SAN connectivity are in use and where ?  i.e. where is the "physical SAN" layer in use, and where is the "virtual SAN" layer in use?

    NBU has physical dedicated HBAs for tape access. All remaining LPARs are fully virtual via VIOs and NPIV, i.e. SAN physically attached to VIOs.

    We've been told we wouldn't have media servers on each frame with an appliance, i.e. the appliance would be the only media server. This would mean all backup traffic would travel on the same virtual fiber as the existing data traffic, which is frowned upon and mentioned in the SAN client best practices.

    If the existing media servers can remain in use, this issue wouldn't exist. Can the existing media servers continue to be used to move the backup traffic from the clients to the appliance?

    9) Make, model and firmware version of HBAs in the P770 and E870 servers?

    Part Number.................00FX604
    EC Level....................N46430
    Feature Code/Marketing ID...EN0Y

     

     

  • I suspect that if we had a bit more detail around versions and where the instances of NetBackup "Server" are installed, and topology then someone else might be able to formulate a more sensible answer for you.

    .

    1) What is the range of AIX versions on the physical P770 and E870 servers?  (i.e. lowest in use, to highest in use).

    2) Are the NetBackup Media Server instances (and single NetBackup Master Server instance) installed at the physical server AIX hypervisor layer?  Or are they also VMs?

    3a) Are the NetBackup "Server" instances (all media servers, and master) the same version of NetBackup?

    3b) If yes, which version of NetBackup ?

    3c) If no, then what is the range of NetBackup Server versions installed?

    4) Is the tape library shared across all NetBackup Server instances?

    5) How many P770 and E870 servers are there?

    6) How many tape drives are there in the tape library?

    7a) Are all tape drives visible to all NetBackup Media Servers?

    7b) If yes, then how many SSO tape drive paths are there?

    7c) If no, then please try to describe the tape drive presentation topology/scheme?

    8) Near the beginning of the original post, you say "Each currently has a media server which is SAN attached to the tape library", but then later in the post you say "We are concerned about sharing the existing virtual fiber network with both data I/O and backup I/O"... so, what types of SAN connectivity are in use and where ?  i.e. where is the "physical SAN" layer in use, and where is the "virtual SAN" layer in use?

    9) Make, model and firmware version of HBAs in the P770 and E870 servers?

    .

    I think that if we had answers to the above, then that might help some of us generate perhaps one more round of discovery questions, and then after that next round of discovery questions, then one or more of us might be able to comment on design considerations.

    Apologies if the previous posts from me were a bit woolly, but, IMO, there wasn't enough detail to work with, so I ended up generalising with what I hoped might be some useful angles of thought.

     

  • Thanks for the great technical assistance Marianne. 

     

    I first posted in the NetBackup forum and later realized the topic was more appropriate for the appliance forum, hence the second posting. None of the responses are duplicates, and none answered my true question.

    I'm still searching for a user of the appliance that is also using a similar IBM environment. 

  • My advice: do not jump in without scaling and sizing. Veritas Sales should have a sizing calculator that they keep (you/we cannot have it), so work with them on scaling and sizing.

    Be very careful about scaling/sizing the Oracle data. If the Oracle backups are compressed, you will get essentially no deduplication for that backup data. Here's an example.

    Your 3 TB Oracle backup, full every day, compressed:

    (3 TB x 1 weekly full on Saturday x 4 week retention) + (3 TB x 6 daily fulls, Sunday to Friday, x 2 week retention) = 12 TB + 36 TB = 48 TB of appliance back-end disk space, because the Oracle backups/dumps are compressed.

    N.B.: This is not a "Veritas Appliance" issue. This is true for all "de-dupe storage", no matter who the vendor is. Compressed backup data is effectively unique every time, so it neither de-dupes nor compresses again, and it consumes the same space on the back-end appliance disk storage (inside the MSDP pool inside the appliance) as it does on the front-end (the source of the backup data).
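    The arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `backend_tb` and the `dedupe_ratio` parameter are made up here, and a ratio of 1.0 stands for compressed data that does not reduce at all. It is not the vendor's sizing calculator.

```python
def backend_tb(full_size_tb, fulls_per_week, retention_weeks, dedupe_ratio=1.0):
    """Estimate back-end disk (TB) for one backup schedule.

    dedupe_ratio is a hypothetical knob: 1.0 means no reduction at all,
    which is the case described above for already-compressed Oracle dumps.
    """
    retained_copies = fulls_per_week * retention_weeks
    return full_size_tb * retained_copies / dedupe_ratio


# Figures taken from the example above: 3 TB full every day.
weekly_tb = backend_tb(3, fulls_per_week=1, retention_weeks=4)  # Saturday full, kept 4 weeks
daily_tb = backend_tb(3, fulls_per_week=6, retention_weeks=2)   # Sunday to Friday fulls, kept 2 weeks
total_tb = weekly_tb + daily_tb

print(f"weekly: {weekly_tb} TB, daily: {daily_tb} TB, total: {total_tb} TB")
```

    Re-running it with a `dedupe_ratio` above 1.0 for the uncompressed case gives a rough feel for how much the compression setting changes the back-end footprint, but any real ratio has to come from testing your own data.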

    SAN FT is very specific to only a few very specific configurations and Operating System types and Operating System versions - plan very carefully.

    Are your existing SAN switches Brocade or Cisco? If Brocade, you should be OK if your SAN switches are single-ASIC Kondor, but research the term "cut-through routing": if you place your initiator AND target on the same ASIC on the same switch, then Brocade will automatically employ cut-through routing, whereby the head of the FC frame from the initiator begins to be routed to the target even before the tail of the FC frame is received, i.e. nanosecond latency and no chance of buffer-to-buffer credit exhaustion. If your Brocade SAN switches are multi-ASIC, then careful allocation/cabling of initiator and target within the same ASIC port group will still achieve cut-through routing.

    Whereas all Cisco FC SAN switches use "store and forward" FC frame routing, i.e. the whole FC frame has to be received, checksummed, and then forwarded, all of which occurs using buffers.

    I've worked on very large SANs, and believe me, if you have Brocade SAN switches *AND* you get your thinking, design and cabling done well, then you can have your entire backup traffic in cut-through routing mode, which removes the SAN as a potential bottleneck to your backup traffic. EVEN BETTER: the backup traffic then cannot become the cause of SAN or IO performance issues for any other traffic, servers or storage services.

    .

    You can help yourself, by spending say 30 minutes googling the term "data producer" and "data consumer"... what you read will help you think about where your data is originating and where you need to get it to... when you then add the "volumes/quantities" of backup data to your "producer versus consumer map"... then this will greatly help you select the most appropriate hardware with your budget.

    E.g. in a typical backup to tape:

    - via SAN: the storage (where your LUNs are that have the databases on them) is the data producer and the backup client is the data consumer
    - via LAN: the backup client is the data producer and the media server is the data consumer
    - via SAN: the media server is the data producer and the tape drive is the data consumer
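    A producer/consumer map like the one above can be written down as data, which makes it easy to spot the slowest hop once you add volumes. The sketch below is a rough illustration only: the `Hop` class, the helper names and every throughput figure are hypothetical placeholders, not measurements from any real environment.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    producer: str
    consumer: str
    transport: str
    mb_per_sec: float  # hypothetical sustained throughput for this hop


def bottleneck(hops):
    """Return the slowest hop; the whole chain cannot run faster than this."""
    return min(hops, key=lambda h: h.mb_per_sec)


def hours_to_move(volume_tb, hops):
    """Hours to push volume_tb through the chain at the bottleneck rate (1 TB = 1,000,000 MB)."""
    return volume_tb * 1_000_000 / bottleneck(hops).mb_per_sec / 3600


# The typical backup-to-tape chain, with made-up throughput numbers.
chain = [
    Hop("storage LUNs", "backup client", "SAN", 800.0),
    Hop("backup client", "media server", "LAN", 400.0),
    Hop("media server", "tape drive", "SAN", 140.0),
]

slowest = bottleneck(chain)
print(f"slowest hop: {slowest.producer} -> {slowest.consumer} via {slowest.transport}")
print(f"3 TB takes about {hours_to_move(3, chain):.1f} hours at that rate")
```

    Swapping the last hop for an appliance, or adding parallel streams, is then just a matter of editing the list and comparing the result against your backup window.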

  • Your environment sounds like one of those that has been carefully crafted over a fair period of time.

    You need to spend some time scaling and sizing.

    There's absolutely nothing wrong with the Veritas Appliances.  They're great general work horses.  They can do a lot.

    But... just because they are good and quite performant (with typical workloads)... this does not automatically dictate that they will be 100% suitable for all types of backup data in your environment.  Remember, you don't have to de-dupe everything.  You could easily partition (and it is really easy to partition the Veritas Appliances - on the fly - at *any* time) into Advanced Disk and MSDP Disk, and use the different types of backup target disk storage for different purposes... e.g.:

    - partition, say, 12 TB of Advanced Disk for 3 day retention of the 3 TB daily full backups from Oracle (and daily duplicate to tape) - very fast, very nice.

    - partition the rest of the disk for MSDP Disk for 4 week retention of the other backup data which does typically de-dupe very well.

    .

    *IF*, when scaling and sizing, it becomes clear that "de-dupe" just does not suit your environment, then *any* vendor's dedupe is not going to help you - and you'll need to look at some fairly fast disk to act as Advanced Disk, or perhaps even a VTL, or just keep going direct to tape.

    .

    Re-establish your RPO and RTO - these are your true guides!