Memory Overcommitment

Virtualization is often about overcommitting resources, so the topic comes up frequently. First, one needs to agree on a definition of overcommitment. In this document, we define it as the ratio of the total virtual memory of the started (logged-on) virtual machines to the total real memory available to the z/VM system.

For example, if 10GB of real memory is available to z/VM and 15 virtual machines of 1GB each are started, then the ratio of virtual to real memory is 15:10, or 1.5:1. Some simplify this by saying 1.5.
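Expressed as a calculation, here is a minimal sketch in Python (the sizes are the hypothetical values from the example; the function name is ours, purely for illustration):

# Sketch of the overcommitment calculation described above.

def overcommit_ratio(guest_sizes_gb, real_memory_gb):
    """Ratio of total logged-on virtual memory to real memory."""
    return sum(guest_sizes_gb) / real_memory_gb

# Fifteen 1GB guests started on a z/VM system with 10GB of real memory.
guests = [1.0] * 15
print(overcommit_ratio(guests, 10.0))  # 1.5, i.e. a 1.5:1 ratio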

Guidelines for Acceptable Ratios

A number of factors determine what ratio is acceptable. These include:

  • What percentage of virtual machines are idle/active?
  • How much is shared memory exploited?
  • Were the virtual machines defined with the appropriate amount of virtual memory?
  • What memory management capabilities are being exploited (e.g. Cooperative Memory Management)?
  • What sort of Service Level Agreements (SLAs) are there? Do they require 100% of transactions to meet a threshold, or something less restrictive such as 99.9%?
  • The performance characteristics of the z/VM paging configuration. Are you using all-flash storage? Multiple channels? HyperPAV and zHPF?

Some guidelines to keep in mind are:

  • When planning whether memory can be overcommitted in a z/VM LPAR, the most important thing is to understand the usage pattern and characteristics of the applications, and to plan for the peak period of the day. This will allow you to plan the most effective strategy for utilizing your z/VM system's ability to overcommit memory while meeting application-based business requirements.
  • For z/VM LPARs where all started guests are heavily used production WAS servers that are constantly active, exercise caution when attempting to overcommit memory.
  • In other cases where started guests experience some idle time, overcommitment of memory is possible.
  • As discussed, how much you can reasonably overcommit depends on a number of factors. While exceptions exist, if you overcommit more than 1.8 to 1 for production workloads or more than 3 to 1 for test environments, you should do detailed analysis to ensure adequate performance; a simple check of these thresholds is sketched below.
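To make those rules of thumb concrete, here is a small hypothetical helper (the function and threshold table are ours, not part of any z/VM tool; only the 1.8:1 and 3:1 values come from the guideline above):

# Hypothetical helper applying the rule-of-thumb thresholds above:
# beyond 1.8:1 for production or 3:1 for test, do detailed analysis.

THRESHOLDS = {"production": 1.8, "test": 3.0}

def needs_detailed_analysis(ratio, workload="production"):
    """True if the ratio exceeds the rule-of-thumb threshold."""
    return ratio > THRESHOLDS[workload]

print(needs_detailed_analysis(1.5))          # False
print(needs_detailed_analysis(2.0))          # True
print(needs_detailed_analysis(2.0, "test"))  # False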

Determining Impact of Memory Overcommitment

While you can plan for a particular level of overcommitment, it is valuable to validate the plan and measure any impact. Some very simple tools are available for this. One is VIR2REAL, a simple exec that can be downloaded to report a point-in-time ratio for the started virtual machines. Note that the ratio it reports includes VDisk virtual memory as well. A link is included in the references section of this article.

For more detailed analysis, you may want to use a performance monitor such as Performance Toolkit. We briefly describe some of the key fields here and how to interpret them. For historical reasons, these reports at times refer to 'storage' when they really mean 'memory'.

FCX113 UPAGE - User Paging Activity Report

For each virtual machine, the Toolkit UPAGE report shows both the location of the pages making up the virtual machine and the movement of those pages. Let's look at the following example.

          Data   <---------- Paging Activity/s ---------->
          Spaces <--Page Rate-->    Page <-Page Migration->
Userid    Owned   Reads Write Steals >2GB>  X>MS  MS>X  X>DS
LNUDB5        .0   18.1  46.2     .0    .0  30.5  47.9  46.3
LNUDB4        .0   16.1   9.9     .0    .0  24.6  36.5  10.3
LNUDB3        .0   14.3  12.4     .0    .0  35.0  44.3  12.0
LNIHS3        .0   13.4    .8     .0    .0   4.4  20.2    .5
LNUDB2        .0   13.2   7.1     .0    .0  23.6  34.1   6.9
LNIHS5        .0   12.4  26.2     .0    .0   8.1  20.8  28.3
LNWAS5        .0   11.4  31.7     .0    .0    .9  41.5  29.8
LNIHS2        .0    9.9  17.8     .0    .0   3.4  19.9  19.2
LNIHS4        .0    9.5   1.1     .0    .0   6.4  16.9   1.2
LNIHS1        .0    9.3   4.0     .0    .0   5.4  19.6   4.2
LNWAS4        .0    8.5  59.3     .0    .0   3.2  34.0  59.1
LNUDB1        .0    7.5   8.9     .0    .0  33.1  41.3   9.0

         <------------------ Number of Pages ------------------>
                 <---Resident--->  <--Locked-->                Stor
Userid      WSS  Resrvd   R<2GB   R>2GB  L<2GB  L>2GB  XSTOR   DASD  Size
LNUDB5    18283       0    1096   17416     69    160  62498  49803 1800M
LNUDB4    19997       0     992   19243     41    195  62552  45815 1800M
LNUDB3    18585       0     808   18024     71    170  74444  36532 1800M
LNIHS3     9316       0    1844    7876    114    292  12559  43340  750M
LNUDB2    22580       0     850   21962     56    169  56431  51688 1800M
LNIHS5     8499       0    1628    7283     74    339  16199  37563  750M
LNWAS5   311657       0  123148  188720     77    135  17340  62507 2048M
LNIHS2     8904       0    1667    7643     54    340  15076  37092  750M
LNIHS4     7719       0    1348    6778     79    333  15985  40486  750M
LNIHS1     9538       0    1525    8443    114    316  15988  38038  750M
LNWAS4   310012       0   95342  214903     77    130  21697  62252 2048M
LNUDB1    20007       0     900   19339     83    145  71134  38778 1800M

The report above shows several virtual machines on this system. The two largest are defined as 2GB, shown as 2048M in the "Stor Size" field. Looking at LNWAS4 (second from last), you see its pages are located in real memory ("R<2GB" and "R>2GB"), in expanded storage ("XSTOR"), and on DASD ("DASD"). The "R<2GB" and "R>2GB" fields give the number of the virtual machine's pages that are resident in real memory below or above the 2GB line.
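To put the residency columns in perspective, the following sketch (our own illustration in Python, not a Performance Toolkit function) totals where LNWAS4's pages live, using the counts from the report above:

# Where do LNWAS4's pages live? Counts from the UPAGE report above.

pages = {
    "resident (R<2GB + R>2GB)": 95342 + 214903,
    "expanded storage (XSTOR)": 21697,
    "paged out to DASD": 62252,
}
total = sum(pages.values())
for location, count in pages.items():
    print(f"{location}: {count} pages, {100 * count / total:.1f}%")
# About 79% of this 2048M guest is resident in real memory.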

The "Paging Activity" section shows the rate (per second) of page movements between the three locations. The movements that have the greatest impact tend to be the read type operations, as these are the cases where z/VM control program needs to make pages resident in order for the virtual machine to run. For Linux guests, some page fault requests may not stop the virtual machine as Linux and z/VM handshake on some of these requests to allow Linux to run other processes when appropriate. The read type requests would be reflected in the "Page Read" and "Page Migration X>MS" (page from expanded storage to main storage).

FCX114 USTAT - User State Sampling Report

The User State Sampling Report is useful for determining delays to virtual machines. A subset of it is shown below:

Userid    %ACT %RUN %CPU %PGW %IOW %SIM %TIW %CFW  %TI %PGA %LIM %OTH
>System<    44   39   17    0    0    3   41    0    0    0    0    0
LNIHS1     100   16   14    0    0    2   68    0    0    0    0    0
LNUDB1     100   48   19    0    0    4   29    0    0    0    0    0
LNUDB2     100   45   21    0    0    7   28    0    0    0    0    0
LNUDB3     100   45   23    0    0    4   28    0    0    0    0    0
LNUDB4     100   43   25    0    0    4   28    0    0    0    0    0
LNUDB5     100   44   19    0    0    3   34    0    0    0    0    0
LNWAS1     100   64   21    0    0    3   12    0    0    0    0    0
LNWAS2     100   64   19    0    0    4   13    0    0    0    0    0
LNWAS3     100   66   17    0    0    2   14    0    0    0    0    0
LNWAS4     100   66   19    0    0    3   11    0    0    0    0    0
LNWAS5     100   67   18    0    0    3   12    0    0    0    0    0

This report is based on user high-frequency state sampling, so the metrics reflect samples rather than measured time. The sampling can be affected by various commands and other factors. The report shows the percentage of samples in which a virtual machine was in each of various states. The percentage is relative to the samples in which the virtual machine was active, not to wall-clock time; the percentage of samples in which the virtual machine was active is shown as %ACT. The two states of most interest for memory overcommitment are:
  • %PGW - percentage of samples in which the virtual machine was waiting on synchronous page faults.
  • %PGA - percentage of samples in which the virtual machine was waiting on asynchronous page faults.
Paging can also have secondary impacts. For example, if the control program is busy with page write operations, the CPU may not be available to run virtual machines, which can show up as increased %CPU. Since the page wait states are zero for the guests above, the impact on throughput is minimal, even though there is paging activity.
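Because the state percentages are relative to active samples, converting one to an approximate share of elapsed time is a simple scaling. A minimal sketch, using hypothetical numbers since %PGW is zero for every guest above:

# USTAT state percentages are out of *active* samples, not wall-clock
# time. Scaling by %ACT gives an approximate elapsed-time share.

def wall_clock_pct(state_pct, act_pct):
    """Approximate percent of elapsed time implied by a state %."""
    return state_pct * act_pct / 100.0

# A guest active in 60% of samples, in page wait for 10% of those:
print(wall_clock_pct(10.0, 60.0))  # 6.0 -> roughly 6% of elapsed time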

FCX143 PAGELOG - Total Paging Activity

The final report discussed here is the PAGELOG report, which shows overall z/VM system paging information over time. Again, we show a couple of subsets of the report. There are three main sections, one for each of the three key areas: Expanded Storage, Real Storage, and Paging DASD.

          <------------- Expanded Storage -------------> <--Real Stor-->
                         Fast-                Est.  Page     DPA    Est.
Interval  Paging   PGIN  Path   PGOUT  Total  Life  Migr  Pagable  Page
End Time  Blocks    /s     %     /s     /s    sec    /s    Frames  Life
>>Mean>>  524273  291.6  86.4  588.4  880.1    891 294.0  1793761  2031
09:48:29  524276  526.5  88.2   1159   1685    452 643.4  1793544   995
09:49:29  524276  389.7  88.1  739.7   1129    709 347.3  1793588  1658
09:50:29  524276  256.0  83.6  504.8  760.8   1039 245.6  1793684  2360
09:51:29  524276  372.6  88.7  668.2   1041    785 289.8  1793640  1869
09:52:29  524276  232.3  85.3  529.1  761.4    991 301.1  1794134  2079

          <-------- Paging to DASD --------->  <--Single Reads-->
Interval  Reads  Write  Total   Shrd   Guest  Systm  Total
End Time   /s     /s     /s      /s     /s     /s     /s
>>Mean>>  165.8  294.6  460.4   49.4   15.7     .0   15.7
09:48:29  319.2  642.3  961.5   69.8   62.8     .1   62.9
09:49:29  169.2  341.9  511.1   54.0   23.9     .0   23.9
09:50:29  115.8  255.0  370.8   50.4   12.8     .0   12.8
09:51:29  173.6  291.1  464.7   51.0   12.8     .0   12.8
09:52:29  220.4  333.7  554.0   44.7    7.1     .0    7.1

This report gives overall paging rates to and from expanded storage and to and from DASD. Note that CP never moves pages from DASD to expanded storage; therefore there is a single migration field (Page Migr), which reflects pages migrating from expanded storage to DASD. One of the guidelines associated with memory is that the paging activity to and from expanded storage should be higher than the paging activity to and from DASD. In the example above, the expanded storage total rate averages 880.1 per second, compared with 460.4 per second to DASD. This is a well-behaved system, since the expanded storage rate is higher.
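As a quick check of that guideline, here is a minimal sketch (our own comparison, not a Toolkit output) using the >>Mean>> values from the report above:

# Guideline check using the >>Mean>> line of the PAGELOG report:
# paging to/from expanded storage should outpace paging to/from DASD.

xstor_total_rate = 880.1  # Expanded Storage "Total /s", mean
dasd_total_rate = 460.4   # Paging to DASD "Total /s", mean

if xstor_total_rate > dasd_total_rate:
    print("OK: expanded storage is absorbing most of the paging load")
else:
    print("Investigate: DASD paging rate exceeds expanded storage rate")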

References

The following are links to additional reference material:

  • Understanding and Tuning z/VM Paging: an article describing the basic workings of z/VM paging, the metrics that describe its performance, and some of the tuning options.
  • VIR2REAL Tool: a very simple tool to compute the logged-on virtual to real memory ratio.
