How ZFS Cache Impacts NBU Performance
Problem: On Solaris 10, the ZFS ARC cache in its default configuration can gradually degrade NetBackup performance at the memory level, forcing NBU to use large amounts of swap even when several gigabytes of RAM appear to be "available". On the following Solaris 10 server, we initially see that 61% of memory is owned by ZFS File Data (the ARC cache):

# echo ::memstat | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                    1960930             15319   24%
ZFS File Data             5006389             39112   61%
Anon                       746499              5832    9%
Exec and libs               37006               289    0%
Page cache                  22838               178    0%
Free (cachelist)           342814              2678    4%
Free (freelist)            103593               809    1%

Total                     8220069             64219
Physical                  8214591             64176

Using the ARChits.sh script, we can see how often OS memory requests are served from the ARC cache. In our sample the hit rate is essentially 100%, meaning the ARC sits as a middleman between NBU and physical memory.

# ./ARChits.sh
HITS         MISSES       HITRATE
2147483647   692982       99.99%
518          4            99.23%
2139         0            100.00%
2865         0            100.00%
727          0            100.00%
515          0            100.00%
700          0            100.00%
2032         0            100.00%
4529         0            100.00%
1040         0            100.00%
...
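ARChits.sh is a helper script that is not reproduced in this article. Purely for reference, below is a minimal sketch of what such a script might look like, assuming it derives per-interval hit rates from the hits and misses counters of the zfs:0:arcstats kstat (with the first line reporting cumulative totals since boot, as in the output above); the actual script may differ.

#!/bin/sh
# Sketch of an ARC hit-rate monitor (hypothetical reconstruction of
# ARChits.sh). Reads the cumulative hits/misses counters from the
# zfs:0:arcstats kstat and prints per-interval deltas with a hit rate.
interval=${1:-1}

printf "%-12s %-12s %s\n" HITS MISSES HITRATE
first=1
while :; do
    hits=`kstat -p zfs:0:arcstats:hits | awk '{print $2}'`
    miss=`kstat -p zfs:0:arcstats:misses | awk '{print $2}'`
    if [ "$first" -eq 1 ]; then
        # First sample: report the totals accumulated since boot.
        d_hits=$hits
        d_miss=$miss
        first=0
    else
        d_hits=`expr $hits - $prev_hits`
        d_miss=`expr $miss - $prev_miss`
    fi
    total=`expr $d_hits + $d_miss`
    if [ "$total" -gt 0 ]; then
        rate=`echo "scale=2; 100 * $d_hits / $total" | bc`
    else
        rate=100.00
    fi
    printf "%-12s %-12s %s%%\n" "$d_hits" "$d_miss" "$rate"
    prev_hits=$hits
    prev_miss=$miss
    sleep "$interval"
done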
To see which processes hit the ARC cache or request memory from it, we use dtrace to count ARC hits and misses per process:

# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @[execname] = count() }'
...
  nbproxy                                1099
  nbpem                                  1447
  nscd                                   1649
  bpstsinfo                              1785
  find                                   1806
  fsflush                                2065
  bpclntcmd                              2257
  bpcompatd                              2394
  perl                                   2945
  bpimagelist                            4019
  bprd                                   4268
  avrd                                   8899
  grep                                   9249
  dbsrv11                               20782
  bpdbm                                 37955

As we can see, dbsrv11 and bpdbm are the main consumers of ARC cache memory. The next step is to measure the sizes of the memory requests, because the ARC slices memory into blocks and the block-size distribution determines how much it impacts NBU's requests:

# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @["bytes"] = quantize(((arc_buf_hdr_t *)arg0)->b_size); }'
  bytes
           value  ------------- Distribution ------------- count
             256 |                                         0
             512 |@@@@@                                    10934
            1024 |                                         1146
            2048 |                                         467
            4096 |                                         518
            8192 |@@@@                                     9485
           16384 |@                                        1506
           32768 |                                         139
           65536 |                                         356
          131072 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@            67561
          262144 |                                         0

The majority of memory requests are for 128KB (131072-byte) blocks and a few are very small; this is the profile while there are no major requests at the NBU level. Things change when many NBU requests come in: the share of small-block requests rises sharply. The following output shows a master server pulling data while running several vmquery commands:

# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @["bytes"] = quantize(((arc_buf_hdr_t *)arg0)->b_size); }'
  bytes
           value  ------------- Distribution ------------- count
             256 |                                         0
             512 |@@@@@@@@@@@@                             78938
            1024 |@                                        7944
            2048 |                                         1812
            4096 |@                                        3751
            8192 |@@@@@@@@@@@@                             76053
           16384 |@                                        9030
           32768 |                                         322
           65536 |                                         992
          131072 |@@@@@@@@@@@@                             77239
          262144 |                                         0

vmquery dominates the memory requests, and the OS is forced to rehydrate the fragmented cache into bigger blocks to meet NBU's block-size requirements, hurting application performance mainly at the NBDB/EMM database level:

# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @[execname] = count() }'
...
  avrd                                   1210
  bpimagelist                            2865
  dbsrv11                                2970
  grep                                   4971
  bpdbm                                  6662
  vmquery                               94161

This memory rehydration forces the OS to use a lot of swap even though, on paper, plenty of memory is "available" under ZFS File Data:

# vmstat 1
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s1 s2 s3 s4   in   sy   cs us sy id
 0 0 0 19244016 11342680 432 1518 566 604 596 0 0  8 -687 8 -18 8484 30088 9210 10  5 84
 0 2 0 11441128 3746680   44   51   8  23  23 0 0  0    0 0   0 6822 19737 7929  9  3 88
 0 1 0 11436168 3745440   14  440   8  23  23 0 0  0    0 0   0 6460 18428 7038  9  4 87
 0 2 0 11440808 3746856    6    0  15 170 155 0 0  0    0 0   0 6463 18163 6996  9  4 87
 0 2 0 11440808 3747000  295  822  15 147 147 0 0  0    0 0   0 7604 27577 8989 11  5 84
 0 1 0 11440552 3746872  122  683   8  70  70 0 0  0    0 0   0 5926 20430 6444  9  3 88

In this case 39GB of RAM is allocated to ZFS File Data (the ARC cache). That memory is supposed to be freed whenever an application needs it, but because the ARC slices it into small pieces, the OS takes a long time to reclaim it and respond to memory requests:

# echo ::memstat | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                    1960930             15319   24%
ZFS File Data             5006389             39112   61%
Anon                       746499              5832    9%
Exec and libs               37006               289    0%
Page cache                  22838               178    0%
Free (cachelist)           342814              2678    4%
Free (freelist)            103593               809    1%

Total                     8220069             64219
Physical                  8214591             64176

When the master is rebooted there is initially no ZFS File Data allocation and NBU runs perfectly; performance then degrades slowly, at a rate that depends on how quickly the ARC cache consumes the memory:

# echo ::memstat | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     479738              3747    6%
Anon                       422140              3297    5%
Exec and libs               45443               355    1%
Page cache                  83530               652    1%
Free (cachelist)          2200908             17194   27%
Free (freelist)           4988310             38971   61%

Total                     8220069             64219
Physical                  8214603             64176

Solution: We ran into this problem quite often on heavily loaded systems using Solaris 10 and ZFS. To address it, we limited the ZFS ARC cache on each problematic system, determining the limit value with the procedure below.

NOTE: As with any change of this nature, bear in mind that the setting may have to be tweaked to accommodate additional load and/or memory changes. Monitor and adjust as needed.

1. After the system is fully loaded and running backups, sample total memory use. Example:

prstat -s size -a
 NPROC USERNAME  SWAP   RSS MEMORY      TIME  CPU
    32 sybase     96G   96G    75%  42:38:04 0.2%
    72 root      367M  341M   0.3%   9:38:11 0.0%
     6 daemon   7144K 9160K   0.0%   0:01:01 0.0%
     1 smmsp    2048K 6144K   0.0%   0:00:22 0.0%

2. Compare the percentage of memory in use to total physical memory:

prtdiag | grep -i Memory
Memory size: 131072 Megabytes

3. In the example above, approximately 75% of physical memory is used under typical load. Add a few percent for headroom (call it 80%), leaving 20% for the ARC.

4. 20% of 128GB is roughly 26GB; 26 x 1024^3 = 27917287424 bytes.

5. Configure the ZFS ARC cache limit in /etc/system:

set zfs:zfs_arc_max=27917287424

6. Reboot the system. (A suggested post-reboot sanity check is sketched after the references.)

References:

https://forums.oracle.com/thread/2340011
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Limiting_the_ARC_Cache
http://dtrace.org/blogs/brendan/2012/01/09/activity-of-the-zfs-arc/
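As a suggested post-reboot sanity check (our addition, not part of the original procedure): the arcstats kstat should now report the /etc/system value as c_max, and the current ARC footprint (size) should stay at or below it. The mdb ::arc dcmd, where available, shows the same statistics; the last command simply re-derives the byte value from step 4.

# kstat -p zfs:0:arcstats:c_max
# kstat -p zfs:0:arcstats:size
# echo ::arc | mdb -k
# echo '26 * 1024 * 1024 * 1024' | bc
27917287424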