
PDDO PureDisk - Maximum I/O Configuration

austin_lazanows
Level 4
Certified

Hi Everyone,

I am attempting to design as optimal a data path as I can within some constraints. Here is the information I can provide today:

PDDO configuration only. No PureDisk clients. Version 6.6.1.2. Optimized duplication to a remote site.

6 high-end PureDisk servers (36GB DDR3 1333MHz RAM and two 4-core 2.93GHz processors, with local 73GB 15k SAS drives for root), plus 1 failover node. Two 8Gb/s Fibre Channel paths to an AMS2500 array. 32TB for /Storage/data per server, +10% for /Storage/databases, +5% for the rest of /Storage. That makes a total of 192TB for data, plus 28.8TB for databases and other files under /Storage. 1 SPA; the rest are metabase engines and content routers. 10Gb Ethernet ports, with 10Gb Ethernet at the media servers as well.
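For reference, here is how those totals work out (a quick sanity-check sketch; it assumes exactly six data-holding nodes, with the failover node excluded):

```python
# Quick sanity check of the sizing above (all figures in TB).
NODES = 6                    # data-holding PureDisk nodes (the failover node excluded)
DATA_PER_NODE = 32.0         # /Storage/data per node

db_per_node = DATA_PER_NODE * 0.10     # /Storage/databases (+10%)
other_per_node = DATA_PER_NODE * 0.05  # rest of /Storage (+5%)

print(f"data total: {NODES * DATA_PER_NODE:.1f} TB")                    # 192.0 TB
print(f"db + other: {NODES * (db_per_node + other_per_node):.1f} TB")   # 28.8 TB
```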

AMS2500 with 480 disk drives, each a 600GB 15k SAS drive. RAID 10+2 configuration (40 disk groups of 12 drives each). All disk groups in a single pool.

PureDisk segment size: 128KB. This will not change.
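For anyone following along, the 128KB segment is the unit that gets fingerprinted for deduplication. A rough illustration of the idea only, not PureDisk's actual code; the hash choice, the fixed segment boundaries, and the example file path are assumptions:

```python
import hashlib

SEGMENT_SIZE = 128 * 1024   # matches the 128KB PureDisk segment size above

def segment_fingerprints(path):
    """Yield one fingerprint per fixed-size segment of a file (illustration only)."""
    with open(path, "rb") as f:
        while chunk := f.read(SEGMENT_SIZE):
            yield hashlib.md5(chunk).hexdigest()

known = set()            # stands in for the content router's fingerprint database
total = new_segments = 0
for fp in segment_fingerprints("/tmp/example.bin"):   # hypothetical input file
    total += 1
    if fp not in known:
        known.add(fp)
        new_segments += 1                # only unique segments would need to be stored
print(f"{new_segments} unique of {total} segments")
```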

This environment does not exist today but will be ordered soon; I just want to plan before setting it up and going "oops!" later.

 

Based on this information, does anyone have recommendations on the following?

 

AMS2500 RAID stripe size.

PureDisk LUN division and count.

PureDisk VxVM stripe size.

Any adjustments to block sizes in any configuration files?

Any adjustments to any other tuning settings (kernel, application, etc.)?

 

Assuming we have 6 other media servers (same performance design) doing PDDO fingerprinting, is there a recommended buffer size for the media servers to use, or should we stick with the standard recommendation of 1MB in SIZE_DATA_BUFFERS_DISK?
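In case it is useful to anyone else, the disk buffer tuning is applied via touch files on each media server. A minimal sketch: the 1MB value is the standard recommendation, the buffer count of 64 is only an example to tune by testing, and the UNIX path and names should be confirmed against the tuning guide for your NetBackup release:

```python
from pathlib import Path

# NetBackup disk-buffer touch files on a UNIX/Linux media server.
# Confirm names and values against the tuning guide for your release.
CONFIG_DIR = Path("/usr/openv/netbackup/db/config")

settings = {
    "SIZE_DATA_BUFFERS_DISK": 1048576,    # 1MB buffer size, in bytes
    "NUMBER_DATA_BUFFERS_DISK": 64,       # example count only -- tune per memory and testing
}

for name, value in settings.items():
    (CONFIG_DIR / name).write_text(f"{value}\n")
    print(f"set {name} = {value}")
```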

 

I've read the tuning guides, but some of this information seems to elude me, especially items such as stripe sizes.

5 REPLIES

austin_lazanows
Level 4
Certified

Hi Everyone,

I am documenting this build design as I go along, as a hopefully useful guide for others in the future. Here are some of the specs that have gone into the design so far:

Server Specifications

(7) half-height blade servers - NetBackup PureDisk servers

OS: None installed (the PureDisk installer lays down its own OS, based on SUSE Linux Enterprise 10)

Internal Disk: 73GB 15k SAS, RAID 1 configuration

Controller: internal battery-backed cache preferred

CPU: (2) 4-core 3.20GHz Intel/AMD processors or better

Memory: 36GB DDR3 1333MHz RAM

Networking:

(2) 100Mb/s or better private network connections for heartbeat

(2) 10Gb/s copper Ethernet, for load balancing and failover

(2) 8Gb/s Fibre Channel ports, to be used for multipathing to the backup SAN storage

(1) iDRAC connection with virtual console (or similar device)

 

(6) half-height blade servers - NetBackup media servers

OS: (5) SUSE Linux Enterprise 11 SP1, (1) Windows Server 2008 R2

Internal Disk: 73GB 15k SAS, RAID 1 configuration

Controller: internal battery-backed cache preferred

CPU: (2) 4-core 3.20GHz Intel/AMD processors or better

Memory: 8GB DDR3 1333MHz RAM or better

Networking:

(2) 10Gb/s copper Ethernet, for load balancing and failover

(2) 8Gb/s Fibre Channel ports, to be used for multipathing to the external SAN storage and tape library

(1) iDRAC connection with virtual console (or similar device)

 

(2) half-height blade servers - NetBackup master servers (clustered for failover)

OS: SUSE Linux Enterprise 11 SP1

Internal Disk: 73GB 15k SAS, RAID 1 configuration

Controller: internal battery-backed cache preferred

CPU: (2) 4-core 3.20GHz Intel/AMD processors or better

Memory: 8GB DDR3 1333MHz RAM

Networking:

(2) 10Gb/s copper Ethernet, for load balancing and failover

(2) 8Gb/s Fibre Channel ports, to be used for multipathing to the external SAN storage and tape library

(1) iDRAC connection with virtual console (or similar device)

 

Note: Using blade enclosures reduces the traffic and hop count for the inter-communication between the PureDisk servers, master servers, and media servers. If multiple enclosures are needed for the design, a backplane link capable of handling the performance demands between the enclosures should be in place to remove the impact of an intermediate switch.

 

 

Storage Specifications

Based on the server design, and assuming the 8Gb/s Fibre Channel paths are multipathed and load-balanced, the limiting factors are the overall network, the back-end disk storage, and the servers' CPUs. The theoretical throughput the 6 PureDisk nodes could push to storage is 60Gb/s. Alongside the backup-target writes, there are also hash-table databases used for segment lookups (the deduplication engine) and for queue-processing jobs such as compaction and deletion. The faster the hash-table databases and the backup storage target, the better the overall performance.
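For clarity, the 60Gb/s figure is simply the aggregate of the six nodes' single active 10Gb/s interfaces; the Fibre Channel side has more headroom. A rough back-of-envelope sketch (real throughput will be gated by dedup CPU, the databases, and disk, not these ceilings):

```python
NODES = 6

net_per_node_gbps = 10        # one active 10Gb/s Ethernet port per node
fc_per_node_gbps = 2 * 8      # two multipathed 8Gb/s FC paths per node

agg_network_gbps = NODES * net_per_node_gbps   # 60 Gb/s theoretical ingest ceiling
agg_fc_gbps = NODES * fc_per_node_gbps         # 96 Gb/s toward the AMS2500

# Rough MB/s equivalent of the network ceiling, ignoring protocol overhead.
print(f"network ceiling: {agg_network_gbps} Gb/s (~{agg_network_gbps * 1000 // 8} MB/s)")
print(f"FC ceiling:      {agg_fc_gbps} Gb/s")
```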

AMS2500, Fully Configured (used for /Storage, /Storage/data, and /Storage/databases):

Number of Disks: 480

Drive Type: 600GB SAS 15k

RAID Configuration: RAID 5 (6+1); hot spare pool available.

Disk Pool Configuration: all disk groups in a single pool

Disk Pool Usable Size: 224,635 GB

Fibre Channel: multipathing enabled, load balancing enabled, all ports utilized

Replication: performed at the application level, NOT the hardware level

Average I/O Size: 64KB

Application I/O Block Size: 64KB (for backup data segments), 8KB (for database entries)

Stripe Size: 64KB
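As a rough cross-check of the usable figure above (which I read as GB), RAID 5 (6+1) across the 480 drives lands in the same ballpark. The hot-spare count and per-drive formatted capacity below are assumptions for illustration, not array facts:

```python
TOTAL_DRIVES = 480
HOT_SPARES = 11                  # assumption: enough spares to leave whole 7-drive groups
GROUP_DATA, GROUP_PARITY = 6, 1  # RAID 5 (6+1): 6 data + 1 parity drive per group
DRIVE_USABLE_GB = 558.8          # assumed formatted capacity of a "600GB" 15k SAS drive

groups = (TOTAL_DRIVES - HOT_SPARES) // (GROUP_DATA + GROUP_PARITY)  # 67 RAID groups
usable_gb = groups * GROUP_DATA * DRIVE_USABLE_GB

print(f"{groups} RAID groups, ~{usable_gb:,.0f} GB usable")   # ~224,638 GB
```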

 

Note: Optionally, some of the systems could have their /Storage/databases portion migrated to the USPV if additional IOPS/performance is needed. The faster the database access, the faster cleanup and hash lookups can be performed.

 

 

Production Network Only - Fully Redundant:

(16) 10Gb copper ports for the master/media servers -- 1 for the network, 1 for failover

(4) 1Gb copper ports for the master cluster heartbeat

(14) 1Gb copper ports for the PureDisk servers' heartbeat network

(14) 10Gb copper ports for the PureDisk servers -- 1 for the network, 1 for failover

(5) 10Gb copper ports for performance-based clients

(OPTIONAL) Jumbo frames set to 9k on all network connections. However, if a client (an ordinary server) does not have jumbo frames set, or its network port or switch port is incompatible with jumbo frames, this will cause network failures even outside of NetBackup. A quick end-to-end check is sketched below.
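One quick pre-go-live check for the jumbo-frame caveat above is a don't-fragment ping just under the 9000-byte MTU between hosts; a sketch using the standard Linux iputils ping flags (the hostnames are placeholders):

```python
import subprocess

# 9000-byte MTU minus 20 bytes IP header and 8 bytes ICMP header = 8972-byte payload.
PAYLOAD = 9000 - 28
HOSTS = ["puredisk-cr1", "media1", "master1"]   # placeholder hostnames

for host in HOSTS:
    result = subprocess.run(
        ["ping", "-M", "do", "-c", "3", "-s", str(PAYLOAD), host],
        capture_output=True, text=True,
    )
    status = "jumbo frames OK" if result.returncode == 0 else "MTU/fragmentation problem"
    print(f"{host}: {status}")
```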

bartman10
Level 4

You can find my config in this post, in the PureDisk section.

There's also some good info there about issues you may run into. I notice you are allocating a lot of space for the /Storage/databases mount. I have a 92TB PD pool used only for PDDO, and it never uses more than 150GB for the database. The mount point is 490GB just to be safe, but the Symantec planning doc for this is way out of spec compared to what it actually uses.

Also, your statement about the database affecting speeds is not quite true. If you watch the I/O on the DB mount point, you'll notice it almost never gets touched except during the last quarter of a CR queue-processing job. That's the only time it makes changes to this database, after it processes the t-logs and has new updates for the database. The rest of the time, PureDisk uses memcached to create a large cache pool that all the nodes share and that caches the database. This is one of the reasons RAM is so important.
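Roughly, that pattern is a cache-aside lookup: check memcached first and only fall back to the on-disk CR database on a miss. A stand-in sketch only; the pymemcache client and the stubbed database lookup are illustrative, not PureDisk internals:

```python
from pymemcache.client.base import Client   # assumes the pymemcache package is installed

cache = Client(("127.0.0.1", 11211))         # stands in for the memcached pool the PD nodes share

def lookup_in_cr_database(fingerprint: str) -> bool:
    """Stub for the content router's on-disk fingerprint database (slow path)."""
    return False

def segment_is_known(fingerprint: str) -> bool:
    """Cache-aside lookup: RAM first, disk only on a miss -- hence RAM matters so much."""
    if cache.get(fingerprint) is not None:
        return True                          # served from memcached, no disk I/O
    if lookup_in_cr_database(fingerprint):
        cache.set(fingerprint, b"1")         # warm the cache for the next lookup
        return True
    return False
```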

BUT when it does get to the last part of the CR processing, it does put up some good I/O numbers on the database mount. I see around 4-5k IOPS while it's working. Put it on RAID 10.

My 92TB PD pool has 3 servers, each controlling 32TB of disk and with 64GB of RAM each. The Symantec deployment guide does not suggest enough RAM for the servers, by the way. The servers are all combo servers, with one of them also running the SPA. I never see any stress on the CPU, but keep in mind PureDisk is not threaded well!!! Go for fewer cores at as high a clock speed as you can get. I have 2x 8-core Opteron CPUs in each, and most of the time half the cores don't even have threads to run and sit at 0%. I've heard future releases of PD will be better threaded; t-log processing drastically needs multithreading!

I'm converting my PD servers to 10Gb, but they really don't need it at all. You'll be shocked at how little data actually gets transferred to them, especially considering you're planning on having 6 content routers.

 

If you are looking for speed and have any concern about rehydration speed out of PureDisk, read my post above!!! You will WANT to get the early "tape out" EEBs from day ONE! If not, you'll have to do what Symantec calls "re-staging" to take advantage of the "data co-location" coming in PureDisk 6.6.3.

How many PD setups have you deployed, or is this your first one? I'd love to talk PureDisk with you and help you avoid the "WW3" battles I've had with Symantec over the poor documentation and the crazy things the sales and support people will tell you about PureDisk.

austin_lazanows
Level 4
Certified

Hi There,

 

Yeah, I agree that they oversize the database portion. However, I am trying to prevent one of those "we won't support you because the configuration is wrong" situations. Based on the speed of those AMS2500s and the IOPS they will allow, I won't need to split the databases off to a separate RAID 10 group, because Hitachi does a "wide stripe" with its pools: all the RAID groups are put together and the LUNs are carved across the entire pool. At least for this configuration I will be fine.

Yeah, we're probably undersized on the memory; I might adjust that. Do you have the per-node configuration yours is using? That may help me with my design.

Regarding the CPU: I agree that it was not built well for multithreading. I have some Symantec documents that are not for customers (seeing as I am a consulting partner) that give me a better idea of what to configure, why, etc.

I don't imagine they'll need that much of the 10Gb, but I find it more practical than having to bond 1Gb ports.

 

Any idea when PureDisk 6.6.3 is coming out? Are they simply skipping 6.6.2?

 

This build won't occur until Q2 2012, so this is just some pre-planning.

 

I've done a few massive implementations before, but they were on older versions where the content routers couldn't even support 32TB yet. I've also done some implementations for customers who ignored my recommendations for the system build and storage requirements, so of course they ended up getting what they paid for. Either way, I would be glad to talk with you, share notes, and help build more realistic documents and design recommendations for future customers and for the Symantec community.

 

The tape-out portion for us will only occur after an OST movement from production to DR, with copying to tape from there. The "need for speed" is lower there, but I absolutely do look forward to improved rehydration rates for the PureDisk nodes. I am curious whether anyone out there using MSDP has seen the improvement with 7.1.0.2.

teiva-boy
Level 6

For an environment of this size, are you sure you want to use PD? So many moving pieces, with absolutely no way to predict performance versus a more mature appliance from an established appliance vendor...

 

austin_lazanows
Level 4
Certified

No thanks. I'm not a big fan of your EMC Data Domain product.