CPU Pooling
Abstract
CPU pooling, added to z/VM 6.3 by PTF, implements the notion of group capping of CPU consumption. Groups of guests can now be capped collectively; in other words, the capped or limited quantity is the amount of CPU consumed by the group altogether. The group of guests is called a CPU pool. The pool's limit can be expressed either as a percentage of the system's total CPU power or as an absolute amount of CPU power. z/VM lets an administrator define several pools of limited guests. For each pool, a cap is defined for exactly one CPU type. The cappable types are CPs and IFLs.
CPU pooling is available on z/VM 6.3 with APAR VM65418.
Introduction
This article discusses the capabilities provided by CPU pooling. It also provides an overview of the monitor records that have been updated to include CPU pooling information and explains how those records can be used to understand CPU pooling's effect on the system. Further, the article demonstrates the effectiveness of CPU pooling using examples of workloads that include guests that are members of CPU pools. Finally, the article shows that the z/VM Control Program overhead introduced with CPU pooling is very small.
Background
This section explains how CPU pooling can be used to control the amount of CPU time consumed by a group of guests. It also summarizes the z/VM monitor records that have been updated to monitor limiting with CPU pools. Finally, the section explains how to use the monitor records to understand CPU pooling's effect on guests that are members of a CPU pool.
CPU Pooling Overview
A CPU pool can be created by using the DEFINE CPUPOOL command. The command's operands specify the following:
- The name of the CPU pool.
- The pool's consumption limit. There are two types of consumption limits that can be set for a given CPU pool:
  - LIMITHARD limits the CPU pool to a specific percentage of the available CPU power of the specified CPU type.
  - CAPACITY limits the CPU pool to a specific amount of CPU power of the specified CPU type.
- The type of CPU, either IFL or CP.
Guests can be assigned to a CPU pool by using the SCHEDULE command. The command's operands specify the following:
- The name of the CPU pool.
- The guest being assigned to the pool. Note that a given guest virtual machine can be assigned to only one CPU pool at a time.
To make a guest's CPU pool assignment permanent, so that the guest always belongs to a specific pool, place the SCHEDULE command in the guest's CP directory entry.
A CPU pool exists only from the time the DEFINE CPUPOOL command is issued until the z/VM system shuts down or DELETE CPUPOOL is issued against it. Further, the CPU pool must be defined before any SCHEDULE commands are issued against it. If a permanent CPU pool is desired, add the DEFINE CPUPOOL command to AUTOLOG1's PROFILE EXEC or add the command to an exec AUTOLOG1 runs. This will ensure that the CPU pool is created early in the IPL process before guests to be assigned to the group are logged on.
For more information about the DEFINE CPUPOOL and SCHEDULE commands, refer to z/VM CP Commands and Utilities Reference.
For more information about using CPU pools, refer to z/VM Performance.
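For example, a minimal setup might look like the following command sequence. The pool name LNXPOOL and the guest user IDs are hypothetical, and the operand order shown is illustrative; consult the references above for the exact syntax. The sequence creates a pool capped at two IFLs' worth of CPU power and assigns two guests to it:

```
DEFINE CPUPOOL LNXPOOL CAPACITY 2 TYPE IFL
SCHEDULE LINUX01 CPUPOOL LNXPOOL
SCHEDULE LINUX02 CPUPOOL LNXPOOL
```

Because the pool must exist before a SCHEDULE command names it, the DEFINE CPUPOOL command would go in AUTOLOG1's PROFILE EXEC and the SCHEDULE commands in the guests' directory entries, as described above.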
Monitor Changes for CPU Pooling
The CPU Pooling enhancement added or changed the following monitor records:
- MRMTRCPC: Domain 1 (MONITOR) Record 28 - a new CONFIGURATION record. The new CPU Pool Configuration monitor record comes out whenever Monitor is started. It is also given to every new guest that connects to Monitor. The record gives the current information about each CPU pool already defined on the system.
- MRMTRCPD: Domain 1 (MONITOR) Record 29 - a new EVENT record. The new CPU Pool Definition monitor record comes out whenever a DEFINE CPUPOOL, SET CPUPOOL, or DELETE CPUPOOL command is issued. The record gives the current information about the CPU pool that just changed.
- MRPRCCPU: Domain 5 (PROCESSOR) Record 19 - a new SAMPLE record. The new CPU Pool Utilization monitor record comes out for each CPU pool defined on the system. The record gives information describing how the CPU pool is performing: how much CPU time it is consuming, and how often and how long it is being limited.
- MRUSECPC: Domain 4 (USER) Record 13 - a new EVENT record. The new CPU Pool Change monitor record comes out whenever a guest is added to a CPU pool, moved from one CPU pool to another, or removed from a CPU pool. It is produced when a SCHEDULE command changes a guest's CPU pool, when the guest leaves a CPU pool by logging off, or when a VMRELOCATE command moves a guest in a CPU pool from one system to another. The record gives the previous and current CPU pool information for the specific guest whose CPU pool has been changed.
- MRUSEACT: Domain 4 (USER) Record 3 - an existing SAMPLE record. The mapping of the User Activity Data monitor record is updated to add one field, the CPU pool name, to the record.
- MRSCLALL: Domain 2 (SCHEDULER) Record 13 - an existing EVENT record. The mapping of the Add VMDBK To The Limit List monitor record is updated to add two fields to the record. The first field is a code indicating why the guest is being put onto the limit list: because it hit an individual limit or because it hit the pool limit. The second field is the name of the CPU pool to which the guest is assigned.
- MRSCLDLL: Domain 2 (SCHEDULER) Record 14 - an existing EVENT record. The mapping of the Drop VMDBK From The Limit List monitor record is updated to add one field, the CPU pool name, to the record.
- MRMTRUSR: Domain 1 (MONITOR) Record 15 - an existing CONFIGURATION record. The mapping of the Logged on User monitor record is updated to add one field, the CPU pool name, to the record.
Using Monitor Records to Calculate CPU Utilization with CPU Pooling
The CPU utilization of a CPU pool is calculated using data contained in Domain 5 Record 19 (D5R19) and Domain 1 Record 29 (D1R29).
The difference between PRCCPU_LIMMTTIM values in consecutive D5R19 records provides the total CPU time consumed by CPU pool members during the most recently completed limiting interval.
The difference between PRCCPU_LIMMTODE values in consecutive D5R19 records provides the elapsed time of the most recently completed limiting interval.
In calculating CPU utilization percentages, the PRCCPU_LIMMTODE delta must be used for the denominator; do NOT use the MRHDRTOD delta.
If between two D5R19 records there is an intervening D1R29 record with the value of x'01' (DEFINE CPUPOOL) in MTRCPD_COMMAND, the MRHDRTOD value of the D1R29 record must be used in place of the PRCCPU_LIMMTODE value from the previous D5R19 record to calculate the elapsed time of the most recently completed limiting interval.
The total CPU time consumed in the most recently completed limiting interval divided by the elapsed time of the most recently completed limiting interval gives the CPU utilization for the most recently completed limiting interval.
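The calculation above can be sketched in Python. The record layout here is a simplification: each D5R19 sample is represented as a dict of the two named fields, and the CPU-time and TOD fields are assumed to be expressed in mutually consistent time units.

```python
def pool_utilization(prev_d5r19, curr_d5r19, define_event_tod=None):
    """CPU utilization (percent) of a CPU pool over the most recently
    completed limiting interval, from two consecutive D5R19 samples.

    prev_d5r19, curr_d5r19: dicts with keys
      'LIMMTTIM' - total CPU time consumed by pool members
      'LIMMTODE' - end time of the limiting interval
    define_event_tod: if a D1R29 record with MTRCPD_COMMAND = x'01'
      (DEFINE CPUPOOL) falls between the two samples, pass its
      MRHDRTOD value here; it replaces the previous sample's
      LIMMTODE as the start of the interval.
    """
    # CPU time consumed during the most recently completed interval.
    cpu_time = curr_d5r19['LIMMTTIM'] - prev_d5r19['LIMMTTIM']

    # Elapsed time of the interval, substituting the DEFINE CPUPOOL
    # event time when the pool was created mid-interval.
    start = define_event_tod if define_event_tod is not None \
        else prev_d5r19['LIMMTODE']
    elapsed = curr_d5r19['LIMMTODE'] - start

    return 100.0 * cpu_time / elapsed
```

With 50 units of CPU time consumed over a 100-unit interval, the function reports 50% utilization; if a DEFINE CPUPOOL event at time 50 shortens the interval to 50 units, it reports 100%.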
Using Monitor Records to Understand CPU Utilization Variations with CPU Pooling
The following monitor records provide information that helps explain variations in expected CPU utilization.
- MRSYTPRP: Domain 0 (SYSTEM) Record 2 - Processor Data (Per Processor) - an existing SAMPLE record. If a CPU pool is being limited by CAPACITY, the intervening D0R2 records can be used to determine the number of CPs and IFLs online. If the number of online processors of the type being limited is less than the CAPACITY setting, no limiting will take place.
- MRMTRCPD: Domain 1 (MONITOR) Record 29 - a new EVENT record. If there is an intervening D1R29 record with the value x'02' (SET CPUPOOL) in MTRCPD_COMMAND, the LIMITHARD or CAPACITY setting of a CPU pool might have changed, and the CPU utilization might not be what was expected. If there is an intervening D1R29 record with the value x'03' (DELETE CPUPOOL) in MTRCPD_COMMAND, the CPU utilization might not be what was expected.
- MRUSECPC: Domain 4 (USER) Record 13 - a new EVENT record. If there is an intervening D4R13 record with USECPC_COMMAND containing the value x'01' (SCHEDULE or VMRELOCATE added user to CPU pool), x'02' (SCHEDULE moved user from one CPU pool to another), x'03' (SCHEDULE removed user from CPU pool), or x'04' (VMRELOCATE or LOGOFF removed user from CPU pool), the CPU utilization might not be what was expected.
- MRPRCVON: Domain 5 (PROCESSOR) Record 1 - Vary On Processor - an existing EVENT record. If a CPU pool is being limited by CAPACITY, and the type of processor being varied on matches the type being limited, the effects of limiting might change. If the number of online processors of the type being limited is still less than or equal to the CAPACITY setting, no limiting will take place and CPU utilization will be limited by the number of processors online. If the number of online processors of the type being limited is greater than the CAPACITY setting, limiting will begin to take place.
- MRPRCVOF: Domain 5 (PROCESSOR) Record 2 - Vary Off Processor - an existing EVENT record. If CPU pooling is being limited by CAPACITY, and the type of processor being varied off matches the type being limited, the effects of limiting might change. If the number of online processors of the type being limited is still greater than the CAPACITY setting, limiting will continue to take place. If the number of online processors of the type being limited is less than or equal to the CAPACITY setting, limiting will stop taking place, and CPU utilization will be limited by the number of processors online.
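The vary-on and vary-off rules in the last two bullets reduce to a simple comparison between the online processor count and the CAPACITY setting, sketched below. The function names are illustrative and not part of any z/VM interface.

```python
def capacity_limiting_active(online_cpus, capacity):
    """CAPACITY limiting only constrains a pool when more processor
    power of the limited type is online than the pool is allowed to
    use; with online_cpus <= capacity, the hardware itself is the
    effective bound and no limiting takes place."""
    return online_cpus > capacity

def effective_cpu_ceiling(online_cpus, capacity):
    """The CPU power a CAPACITY-limited pool can actually consume is
    bounded by whichever is smaller: the number of online processors
    of the limited type, or the CAPACITY setting."""
    return min(online_cpus, capacity)
```

For example, a pool with CAPACITY 4 is actively limited on a system with 8 online processors of the limited type, but not after 5 of them are varied off.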
Method
Virtual Storage Exerciser and Apache were used to create specialized workloads to evaluate the effectiveness of CPU pooling. Base measurements with no CPU pools were conducted to quantify the amount of CP overhead introduced by CPU pooling.
Virtual Storage Exerciser (VIRSTOEX) workload variations included the following:
- Multiple guests in multiple CPU pools with each CPU pool having the same LIMITHARD setting. The guests in each CPU pool had a variety of individual share settings, including some with ABSolute SHARE LIMITHARD settings. The CPU pool limit settings were designed to limit the total CPU consumption by the guests in the pool to less than the total CPU consumption they would have achieved without CPU pooling. In measurements where some guests also had individual limits, the individual limits were designed to further limit their CPU consumption to less than they would have achieved with just CPU pooling limits.
- Multiple guests in multiple CPU pools with each CPU pool having different LIMITHARD settings. The guests in each CPU pool had a variety of individual share settings, including some with ABSolute SHARE LIMITHARD settings. The CPU pool limit settings were designed to limit the total CPU consumption by the guests in the pool to less than the total CPU consumption they would have achieved without CPU pooling. In measurements where some guests also had individual limits, the individual limits were designed to further limit their CPU consumption to less than they would have achieved with just CPU pooling limits.
- Multiple guests in multiple CPU pools with each CPU pool having the same CAPACITY settings. The guests in each CPU pool had a variety of individual share settings, including some with ABSolute SHARE LIMITHARD settings. The CPU pool limit settings were designed to limit the total CPU consumption by the guests in the pool to less than the total CPU consumption they would have achieved without CPU pooling. In measurements where some guests also had individual limits, the individual limits were designed to further limit their CPU consumption to less than they would have achieved with just CPU pooling limits.
- Multiple guests in multiple CPU pools with each CPU pool having different CAPACITY settings. The guests in each CPU pool had a variety of individual share settings including some with ABSolute SHARE LIMITHARD settings. The CPU pool limit settings were designed to limit the total CPU consumption by the guests in the pool to less than the total CPU consumption they would have achieved without CPU pooling. In measurements where some guests also had individual limits, the individual limits were designed to further limit their CPU consumption to less than they would have achieved with just CPU pooling limits.
Apache workloads included the following variations. Note that none of the Apache workload variations included CPU pool guests that had individual limits. For these measurements, limiting was done with CPU pool limits only.
- Multiple guests in multiple CPU pools with each CPU pool having the same LIMITHARD specification. The CPU pool limit settings were designed to limit the total CPU consumption by the guests in the pool to less than the total CPU consumption they would have achieved without CPU pooling.
- Multiple guests in multiple CPU pools with each CPU pool having the same CAPACITY specification. The CPU pool limit settings were designed to limit the total CPU consumption by the guests in the pool to less than the total CPU consumption they would have achieved without CPU pooling.
- Multiple guests in multiple CPU pools with some having LIMITHARD specifications and some having CAPACITY specifications. The CPU pool limit settings were designed to limit the total CPU consumption by the guests in the pool to less than the total CPU consumption they would have achieved without CPU pooling.
Results and Discussion
Table 1 contains measurement results obtained with VIRSTOEX. VIRSTOEX workloads use CMS guest virtual machines. The table contains:
- Limit type for each of the 4 CPU pools
- Limit type for each individual guest in VIRSTOR group 3
- Maximum share setting for each individual guest in VIRSTOR group 3
- Maximum share setting for each of the 4 CPU pools
- ETR and ITR ratios for the entire workload
- Processor utilization
- Minimum, maximum, and average utilization of each CPU pool from the monitor intervals of the measurement
- Average processor utilization for each of the 4 VIRSTOR groups
These measurement results illustrate the following points:
- The cost of using CPU pooling is very small. The z/VM Control Program overhead incurred is indicated by the Internal Throughput Rate (ITR). The ITR Ratio row in Table 1 compares each measurement's ITR to the base measurement that did not use CPU pools. The reduction in ITR ratio for the measurements conducted with CPU pools is very small; in fact, it is so small that it could be attributed to measurement run variation.
- CPU pools effectively limit their guest groups with either LIMITHARD or CAPACITY limiting. The measurement results shown in Table 1 for the LIMITHARD-limited CPU pool case and the CAPACITY-limited CPU pool case show that CPU pool limiting is very effective.
- CPU pools limit appropriately when CPU pool guests have individual share limits. The measurement results in Table 1 show that CPU pool guests with individual share limits were limited by those limits because the individual share limits were lower than the CPU pool limits. In general, a guest with an individual share limit that is also a member of a CPU pool is limited by the lower of the two limits. In other words, if the CPU pool limit would limit the guest to less CPU than its individual share limit, the CPU pool limit is the controlling factor; otherwise, the guest is limited by its individual share limit.
- CPU pooling can diminish the distribution of CPU time based on relative share settings. In the first measurement, without CPU pooling, the four groups of guests consumed CPU time in proportion to their relative share settings. In the second and third measurements, the total CPU time consumed by the VIRSTOR2, VIRSTOR3, and VIRSTOR4 groups was significantly diminished. In the fifth measurement, with VIRSTOR3 guests limited with individual limits and CPU pooling in effect, the other three groups of guests had nearly identical total CPU consumption.
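The lower-of-two-limits rule can be sketched as follows. This is an illustration, not z/VM scheduler code, and it carries a simplifying assumption: the pool limit actually applies to the group as a whole, so the per-guest value passed here stands for the share of CPU the pool limit would leave this guest given the demand of the other pool members.

```python
def controlling_limit(individual_limit, pool_allowed):
    """Which limit governs a pooled guest that also has an individual
    share limit. Both arguments are the percentage of a processor the
    guest could consume under that limit alone; the lower of the two
    is the controlling factor."""
    return min(individual_limit, pool_allowed)
```

For instance, a guest whose individual limit allows 300% of a processor but whose pool would leave it only 150% is governed by the pool limit; reverse the numbers and the individual limit governs.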
Table 1. CPU Pooling - VIRSTOEX Measurement Comparisons
Run ID | STXH4080 | STXH4081 | STXH4082 | STXH4083 | STXH4084 |
CPUPOOL1 Limit Type | na | LIMITHARD | CAPACITY | na | CAPACITY |
CPUPOOL2 Limit Type | na | LIMITHARD | CAPACITY | na | CAPACITY |
CPUPOOL3 Limit Type | na | LIMITHARD | CAPACITY | na | CAPACITY |
CPUPOOL4 Limit Type | na | LIMITHARD | CAPACITY | na | CAPACITY |
VIRSTOR3 Limit Type | ... | ... | ... | Hard | Hard |
VIRSTOR3 Max Abs Shr | ... | ... | ... | 3 | 3 |
CPUPOOL1 Max Share | na | 16.00 | 150.0 | na | 100.0 |
CPUPOOL2 Max Share | na | 16.00 | 150.0 | na | 100.0 |
CPUPOOL3 Max Share | na | 16.00 | 150.0 | na | 100.0 |
CPUPOOL4 Max Share | na | 16.00 | 150.0 | na | 100.0 |
ETR Ratio | 1.000 | 0.628 | 0.742 | 1.017 | 0.555 |
ITR Ratio | 1.000 | 0.978 | 0.985 | 1.014 | 1.104 |
Total Util/Proc | 96.4 | 64.6 | 73.1 | 96.3 | 49.1 |
Emul Util/Proc | 92.3 | 60.3 | 68.4 | 92.3 | 45.2 |
CP Util/Proc | 4.1 | 4.3 | 4.7 | 4.0 | 3.9 |
System Util/Proc | 0.5 | 0.6 | 0.6 | 0.5 | 0.8 |
CPUPOOL1 Min %Util | na | 128.0 | 150.0 | na | 100.0 |
CPUPOOL2 Min %Util | na | 128.0 | 150.0 | na | 100.0 |
CPUPOOL3 Min %Util | na | 128.0 | 150.0 | na | 100.0 |
CPUPOOL4 Min %Util | na | 128.0 | 150.0 | na | 100.0 |
CPUPOOL1 Mean %Util | na | 128.0 | 150.0 | na | 100.0 |
CPUPOOL2 Mean %Util | na | 128.0 | 150.0 | na | 100.0 |
CPUPOOL3 Mean %Util | na | 128.0 | 150.0 | na | 100.0 |
CPUPOOL4 Mean %Util | na | 128.0 | 150.0 | na | 100.0 |
CPUPOOL1 Max %Util | na | 128.0 | 150.0 | na | 100.0 |
CPUPOOL2 Max %Util | na | 128.0 | 150.0 | na | 100.0 |
CPUPOOL3 Max %Util | na | 128.0 | 150.0 | na | 100.0 |
CPUPOOL4 Max %Util | na | 128.0 | 150.0 | na | 100.0 |
VIRSTOR1 Util | 19.950 | 21.737 | 21.383 | 25.333 | 25.367 |
VIRSTOR2 Util | 39.750 | 29.772 | 32.533 | 51.767 | 25.317 |
VIRSTOR3 Util | 57.600 | 35.789 | 41.800 | 22.867 | 20.433 |
VIRSTOR4 Util | 74.417 | 40.719 | 49.050 | 91.533 | 25.350 |
Notes: 2827-795, 8 dedicated CPs, 1 TB of storage. Four 8.0 Gbps Fibre-channel switched channels, 2107-E8 control unit, 34 3390-3 volumes. 4 VIRSTOR1 guests (20 GB, SHARE 100), 4 VIRSTOR2 guests (20 GB, SHARE 200), 4 VIRSTOR3 guests (20 GB, SHARE 300), 4 VIRSTOR4 guests (20 GB, SHARE 400). VIRSTOR parms (C=3 I=4 T=600). z/VM 6.3 of April 8, 2014.
Table 2 contains measurement results obtained with Apache web serving. Apache measurements were done with Linux guest virtual machines. The Linux client machines make requests to the Linux server machines. The Linux server machines serve web pages that satisfy the client requests. The table contains:
- Limit type for each of the 4 CPU pools
- Maximum share setting for each of the 4 CPU pools
- Minimum, maximum, and average utilization of each CPU pool from the monitor intervals of the measurement
- Processor utilization
- Percentage of high frequency wait samples that found APACHE servers on the Limit list
- ETR and ITR ratios for the entire workload
- Processor time per transaction for the AWM clients and for the Apache servers
- T/V ratio
These measurement results illustrate the following points:
- The cost of using CPU pooling is very small. With the Apache web serving workload, ITR ratio is not the appropriate metric to illustrate this point; observe in the table that ITR ratio actually increases in the measurements that include CPU pools. This is most likely because Linux guest time (emulation time) is reduced when the servers are limited because of their pool memberships. Instead, the CP usec/Tx and T/V ratio rows show that the overhead of using CPU pooling is very small.
- CPU pools effectively limit their guest groups with either LIMITHARD or CAPACITY limiting. The measurement results shown in Table 2 for the LIMITHARD-limited CPU pool case, the CAPACITY-limited CPU pool case, and the case that includes both CAPACITY and LIMITHARD CPU pools show that CPU pool limiting is very effective.
Table 2. CPU Pooling - Apache Measurement Comparisons
Run ID | A8XT4083 | A8XT4080 | A8XT4081 | A8XT4082 |
CPUPOOL1 Limit Type | na | LIMITHARD | CAPACITY | LIMITHARD |
CPUPOOL2 Limit Type | na | LIMITHARD | CAPACITY | LIMITHARD |
CPUPOOL3 Limit Type | na | LIMITHARD | CAPACITY | CAPACITY |
CPUPOOL4 Limit Type | na | LIMITHARD | CAPACITY | CAPACITY |
CPUPOOL1 Max Share | na | 4.999 | 40.00 | 4.999 |
CPUPOOL2 Max Share | na | 4.999 | 40.00 | 4.999 |
CPUPOOL3 Max Share | na | 4.999 | 40.00 | 40.00 |
CPUPOOL4 Max Share | na | 4.999 | 40.00 | 40.00 |
CPUPOOL1 Max %Util | na | 39.99 | 40.00 | 39.99 |
CPUPOOL2 Max %Util | na | 39.99 | 40.00 | 39.99 |
CPUPOOL3 Max %Util | na | 39.99 | 40.00 | 40.00 |
CPUPOOL4 Max %Util | na | 39.99 | 40.00 | 40.00 |
CPUPOOL1 Mean %Util | na | 38.98 | 39.05 | 39.04 |
CPUPOOL2 Mean %Util | na | 38.96 | 39.06 | 39.02 |
CPUPOOL3 Mean %Util | na | 38.96 | 39.05 | 39.04 |
CPUPOOL4 Mean %Util | na | 38.97 | 39.07 | 39.04 |
CPUPOOL1 Min %Util | na | 9.493 | 11.53 | 11.41 |
CPUPOOL2 Min %Util | na | 9.033 | 11.63 | 10.80 |
CPUPOOL3 Min %Util | na | 8.982 | 11.53 | 11.15 |
CPUPOOL4 Min %Util | na | 9.287 | 12.07 | 11.09 |
Total Util/Proc | 1.000 | 0.679 | 0.683 | 0.684 |
APACHE LIMIT Wait | 0 | 25 | 25 | 24 |
ETR Ratio | 1.000 | 0.735 | 0.740 | 0.739 |
ITR Ratio | 1.000 | 1.087 | 1.090 | 1.086 |
CP usec/Tx | 1.000 | 1.076 | 1.067 | 1.067 |
Emul usec/Tx | 1.000 | 0.891 | 0.889 | 0.893 |
AWM Emul CPU/Tx | 1.000 | 0.899 | 0.899 | 0.899 |
APACHE Emul CPU/Tx | 1.000 | 0.800 | 0.800 | 0.800 |
APACHE CP CPU/Tx | 1.000 | 1.000 | 1.000 | 1.000 |
AWM CP CPU/Tx | 1.000 | 0.875 | 1.000 | 1.000 |
T/V Ratio | 1.000 | 1.034 | 1.034 | 1.034 |
Notes: 2827-795, 8 dedicated CPs, 128 GB of storage. Four 8.0 Gbps Fibre-channel switched channels, 2107-E8 control unit, 224 3390-54 volumes. z/VM 6.3 of April 8, 2014. 6 AWM clients, 16 Apache servers, 2 URL files, 15 KB avg URL size.
Summary and Conclusions
- CPU pools provide effective CPU time limiting, regardless of the type of guest.
- CPU pools limit effectively for both types of limits, LIMITHARD and CAPACITY.
- CPU pools limit appropriately when a pool contains guests that have individual limits as well. Guest individual limits take effect only if they limit the guest more than the CPU pool limit would.
- CPU pooling can diminish the distribution of CPU consumption based on relative share settings.
- CPU pooling has very little additional resource consumption.