
Specialty Engine Enhancements

Abstract

z/VM 5.4 provides support for the new z/VM-mode logical partition available on the z10 processor. A partition of this mode can include zAAPs (IBM System z10 Application Assist Processors), zIIPs (IBM System z10 Integrated Information Processors), IFLs (Integrated Facility for Linux processors), and ICFs (Internal Coupling Facility processors), in addition to general purpose CPs (central processors).

A virtual configuration can now include virtual IFLs, virtual ICFs, virtual zAAPs, and virtual zIIPs in addition to virtual general purpose CPs. These types of virtual processors can be defined by issuing the DEFINE CPU command or by placing the DEFINE CPU command in the directory.

According to settings established by the SET CPUAFFINITY command for the given virtual machine, z/VM either dispatches virtual specialty engines on real CPUs that match their types (if available) or simulates them on real CPs.

A new SET VCONFIG MODE command lets a user set a virtual machine's mode to one appropriate for the guest operating system. The SET SHARE command now allows settings by CPU type.

On system configurations where the CPs and specialty engines are the same speed, performance results are similar whether virtual specialty engines are dispatched on real specialty engines or simulated on CPs. On system configurations where the specialty engines are faster than CPs, performance results are better when using the faster specialty engines and scale correctly based on the relative processor speed.

CP monitor data and Performance Toolkit for VM provide information relative to the specialty engines.

Introduction

This article provides general observations about performance results when using the zAAP, zIIP, IFL, and ICF engines. The central result of our study is that performance results were always consistent with the speed and number of engines provided to the application. However, without proper balance between the LPAR, z/VM, and guest settings, a system can have a large queue for one processor type while other processor types remain idle. Accordingly, this article illustrates the performance information available for effective use of specialty engines.

The purpose of the z/VM-mode partition is to allow a single z/VM image to support a broad mixture of workloads. This is the only LPAR mode that allows IFL and ICF processors to be defined in the same partition as zIIP and zAAP processors. Different virtual configuration modes are necessary to enable the desired processor combinations for individual virtual machines. The SET VCONFIG MODE command was introduced to establish these virtual machine configurations. Once a configuration has been established, the CP DEFINE CPU command can be used to create the desired combination of virtual processors. Valid combinations of processor types and VCONFIG MODE settings are defined in z/VM: Running Guest Operating Systems. Other LPAR settings and z/VM settings can affect the actual performance characteristics of these virtual machines. Details of these effects are included in the Results section.

Because z/VM virtualizes the z10's z/VM-mode partition, a guest can be defined with a VM mode on a z9 and affinity will be suppressed for any specialty engines not supported by the real LPAR.

On some System z models, the specialty engines are faster than the primary engines. Specialty Engine Support describes how to identify the relative speeds.

This article contains examples of both Performance Toolkit data and z/OS RMF data. Terminology for processor type has varied in both and includes: CP for central processors; IFA, AAP, and ZAAP for zAAP; IIP and ZIIP for zIIP; and ICF and CF for ICF.

New z/VM monitor data available with the specialty engine support is described in z/VM 5.4 Performance Management.

The specialty engine support that existed prior to z/VM 5.4 is described in Specialty Engine Support. Because that writeup is still valid for non-z/VM-mode logical partitions, this new article will deal mostly with the new z/VM-mode logical partition, in which all specialty processors (zIIP, zAAP, IFL, and ICF) can coexist with standard (CP) processors.

Method

The specialty engine support was evaluated using z/OS guest virtual machines with four separate workloads plus a Linux guest virtual machine with one workload.

A z/OS JAVA workload described in z/OS JAVA Encryption Performance Workload provided use of zAAP processors. Workload parameters were chosen to maximize the amount of zAAP-eligible processing and the specific values are not relevant to the discussion. This workload will run a processor at 100% utilization and is mostly eligible for a zAAP.

A z/OS IPSec workload described in z/OS IP Security Performance Workload provided use of zIIP processors. Workload parameters were chosen to maximize the amount of zIIP-eligible processing and the specific values are not relevant to the discussion. It is capable of using about 100% of a zIIP processor.

A z/OS zFS workload described in z/OS File System Performance Tool was used in a sysplex configuration to provide use of ICF processors. Workload parameters were chosen to maximize the amount of ICF-eligible processing and the specific values are not relevant to the discussion. Three separate guests were used in the configuration for this workload. One z/OS guest was used for the application, another z/OS guest contained the zFS files that were being requested by the application, and a coupling facility was active to connect the two z/OS systems. This workload configuration is capable of using about 60% of an ICF processor.

A z/OS SSL Performance Workload described in z/OS Secure Sockets Layer (System SSL) Performance Workload provided utilization of the CP processors. Workload parameters were chosen to maximize the amount of CP-eligible processing and the specific values are not relevant to the discussion. It is capable of using all the available CP processors.

A Linux OpenSSL workload described in Linux OpenSSL Exerciser provided use of IFL processors. Workload parameters were chosen to maximize the amount of IFL-eligible processing and the specific values are not relevant to the discussion. This workload is capable of using all available IFL processors.

The workloads were measured independently and together in many different configurations. The workloads were measured with and without specialty engines in the real configuration. The workloads were measured with and without specialty engines in the virtual configuration. The workloads were measured with all available SET CPUAFFINITY values (ON, OFF, and Suppressed). The workloads were measured with all available SET VCONFIG MODE settings. The workloads were also measured with z/OS and Linux running directly in an LPAR. Measurements of individual workloads were used to verify quantitative performance results. Measurements involving multiple workloads were used to evaluate the various controlling parameters and to demonstrate the available performance information but not for quantitative results.

This article will deal mostly with the controlling parameters and the available performance information rather than the quantitative results.

Results and Discussion

Effectively using a z/VM-mode LPAR requires attention to LPAR settings such as weight, sharing, and capping. It also requires attention to z/VM settings, such as SET VCONFIG MODE, SET SHARE, and SET CPUAFFINITY. Finally, it requires attention to guest definition, namely, in selecting guest virtual processor types. In these results we illustrate how LPAR, z/VM, and guest settings interact and show examples of performance data relevant to each.

Specialty Engines from an LPAR Perspective

A z10 z/VM-mode LPAR can have a mixture of central processors and all types of specialty processors. Processors can be dedicated to the LPAR or they can be shared with other LPARs. The LPAR cannot contain a mixture of dedicated and shared processors. For LPARs with shared processors, the LPAR weight is used to determine the capacity factor for each processor type in the z/VM-mode LPAR. The weight can be different for each processor type. Shared processors can be capped or non-capped. Capping can be selected by processor type. Capped processors cannot exceed their defined capacity factor but non-capped processors can use excess capacity from other LPARs.

Quantitative results can be affected by how the processors are defined for the z/VM-mode LPAR. With dedicated processors, the LPAR gets full utilization of the processors. With shared processors, the LPAR's capacity factor is determined by the LPAR weight, the total weights for each processor type, and the total number of each processor type. If capping is specified, the LPAR cannot exceed its calculated capacity factor. If capping is not specified, the LPAR competes with other LPARs for unused cycles by processor type.

Here is an example (run E8430VM1) excerpt of the Performance Toolkit LPAR screen for the EPRF1 z/VM-mode LPAR with dedicated CP, zAAP, zIIP, IFL, and ICF processors. It shows 100% utilization regardless of how much is actually being used by z/VM because it is a dedicated partition. The actual workload used nearly 100% of the zAAP and IFL processors, about half of the CP and ICF processors, but almost none of the zIIP processor.

 Partition #Proc Weight Wait-C Cap %Load CPU %Busy  Type
 EPRF1        11    DED    YES  NO  19.6   0 100.0  CP
                    DED         NO         1 100.0  CP
                    DED         NO         2 100.0  CP
                    DED         NO         3 100.0  CP
                    DED         NO         4 100.0  ZAAP
                    DED         NO         5 100.0  ZIIP
                    DED         NO         6 100.0  ICF
                    DED         NO         7 100.0  IFL
                    DED         NO         8 100.0  IFL
                    DED         NO         9 100.0  IFL
                    DED         NO        10 100.0  IFL

Here is an example (run E8415CF1) excerpt of the Performance Toolkit LPAR screen showing a shared capped z/VM-mode LPAR with CP, zAAP, zIIP, IFL, and ICF processors. The capped weight is the same for all engine types. However, because the total weight and number of processors varies by processor type, the actual capacity is not the same for all processor types. The actual workload in this example could not exceed the allocated capacity for any engine type so the workload was not limited by the capping.

 Partition  #Proc Weight Wait-C Cap %Load CPU %Busy  Type
 EPRF1          8     80     NO YES   5.9   0  65.0  CP
                      80        YES         1  65.0  CP
                      80        YES         2  65.0  CP
                      80        YES         3  64.8  CP
                      80        YES         4    .0  ZAAP
                      80        YES         5    .0  ZIIP
                      80        YES         6  70.3  ICF
                      80        YES         7    .0  IFL
 
 Summary of physical processors:
 Type  Number  Weight  Dedicated
 CP        34     170          0
 ZAAP       2     120          0
 IFL       16     120          0
 ICF        2     110          0
 ZIIP       2     120          0
 

Here is an example (run E8730BS1) excerpt of the Performance Toolkit LPAR screen showing a shared capped z/VM-mode LPAR with CP, zAAP, zIIP, IFL, and ICF processors. The capped weight is the same for all engine types. However, because the total weight and number of processors varies by processor type, the actual capacity is not the same for all processor types. The actual workload in this example is limited by the capped capacity of the zIIP processor. The capped capacity for zIIP processors is 27% (3 processors times a weight of 5 divided by the total weight of 55).

FCX126  Run 2008/07/30 15:05:40         LPAR
                                        Logical Partition Activity
 
 Partition #Proc Weight Wait-C Cap %Load CPU %Busy  Type
 EPRF1         8      5     NO YES   1.8   0  18.1  CP
                      5        YES         1  17.1  CP
                      5        YES         2  16.4  CP
                      5        YES         3  15.8  CP
                      5        YES         4    .7  ZAAP
                      5        YES         5  28.4  ZIIP
                      5        YES         6   1.1  ICF
                      5        YES         7    .8  IFL
 
 
 Summary of physical processors:
 Type  Number  Weight  Dedicated
 CP        34     105          0
 ZAAP       2      45          0
 IFL       16     105          0
 ICF        1       5          0
 ZIIP       3      55          0
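
The capped-capacity arithmetic quoted for the zIIP pool in this run can be applied to every processor type in the summary. The following Python sketch is illustrative only: the function name and the dictionary layout are assumptions made for this example, and the formula is simply the one stated in the text (number of physical processors times the LPAR weight, divided by the total weight for that processor type).

```python
# Illustrative sketch: capped capacity per processor type for run E8730BS1.
# Formula taken from the text: physical processors * LPAR weight / total weight.
# Function and variable names are hypothetical, not part of any IBM tool.

def capped_capacity(physical_processors, lpar_weight, total_weight):
    """Return the capped capacity in units of one physical processor."""
    return physical_processors * lpar_weight / total_weight

# (number of physical processors, total weight) copied from the
# "Summary of physical processors" part of the LPAR screen excerpt.
physical_summary = {
    "CP": (34, 105),
    "ZAAP": (2, 45),
    "IFL": (16, 105),
    "ICF": (1, 5),
    "ZIIP": (3, 55),
}

EPRF1_WEIGHT = 5  # EPRF1 weight, the same for every processor type in this run

capacities = {
    cpu_type: capped_capacity(count, EPRF1_WEIGHT, total)
    for cpu_type, (count, total) in physical_summary.items()
}

for cpu_type, capacity in capacities.items():
    print(f"{cpu_type:5s} {capacity * 100:6.1f}% of one processor")
```

For the zIIP pool this gives roughly 27% of one processor, which is close to the 28.4% busy reported for the single logical zIIP and is consistent with the workload being limited by the cap.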
 

Specialty Engines from a z/VM Perspective

The CPUAFFINITY value is used to determine whether simulation or virtualization is desired for a guest's specialty engines. With CPUAFFINITY ON, z/VM will dispatch a user's specialty CPUs on real CPUs that match their types. If no matching CPUs exist in the z/VM-mode LPAR, z/VM will suppress this CPUAFFINITY and simulate these specialty engines on CPs. With CPUAFFINITY OFF, z/VM will simulate specialty engines on CPs regardless of the existence of matching specialty engines. Although IFLs can be the primary processor in some modes of LPARs, they are always treated as specialty processors in a z/VM-mode LPAR.
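
The dispatch rule described above can be summarized as a small decision function. This is a minimal Python model for illustration, not z/VM code: the function name, the parameter names, and the returned labels are all assumptions introduced here.

```python
# Illustrative model of the CPUAFFINITY dispatch rule described in the text.
# All names are hypothetical; this does not reflect z/VM internals.

SPECIALTY_TYPES = {"ZAAP", "ZIIP", "IFL", "ICF"}

def dispatch_target(virtual_type, affinity_on, real_types):
    """Return (real_type, effective_affinity) for one virtual processor.

    virtual_type -- "CP", "ZAAP", "ZIIP", "IFL", or "ICF"
    affinity_on  -- True if CPUAFFINITY is ON for the guest
    real_types   -- set of processor types present in the z/VM-mode LPAR
    """
    if virtual_type not in SPECIALTY_TYPES:
        # Virtual general purpose CPs run on real CPs; affinity is not involved.
        return "CP", "n/a"
    if not affinity_on:
        # CPUAFFINITY OFF: simulate on CPs even if a matching engine exists.
        return "CP", "OFF"
    if virtual_type in real_types:
        # CPUAFFINITY ON and a matching real engine is available.
        return virtual_type, "ON"
    # CPUAFFINITY ON but no matching real engine: affinity is suppressed.
    return "CP", "SUPPRESSED"

# Example: an LPAR without real IFLs.
lpar_types = {"CP", "ZAAP", "ZIIP", "ICF"}
print(dispatch_target("IFL", True, lpar_types))    # affinity suppressed, runs on CP
print(dispatch_target("ZAAP", True, lpar_types))   # matching real zAAP exists
print(dispatch_target("ZAAP", False, lpar_types))  # simulated on CP by request
```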

z/VM's only use of specialty engines is to dispatch guest virtual specialty processors. Without any guest virtual specialty processors, z/VM's real specialty processors will appear nearly idle in both the z/VM monitor data and the LPAR data. Interrupts are enabled, though, so their usage will not be absolute zero.

The Performance Toolkit SYSCONF screen was updated to provide information about the processor types and capacity factor by processor type.

Here is an example (run E8415CF1) excerpt of the Performance Toolkit SYSCONF screen showing a shared capped z/VM-mode LPAR with CP, zAAP, zIIP, IFL, and ICF processors. The capped weight is the same for all engine types. However, because the total weight and number of processors varies by processor type, the capacity factor is not identical for all processor types and the LPAR will not allow the capacity of any processor type to exceed its capped capacity.

FCX180  Run 2008/04/15 18:51:44   SYSCONF
                                  System Configuration, Initial and Changed
___________________________________________________________________________
 
 Log. CP  : CAF   117, Total  4, Conf  4, Stby  0, Resvd  0, Ded  0, Shrd  4
 Log. ZAAP: CAF   666, Total  1, Conf  1, Stby  0, Resvd  0, Ded  0, Shrd  0
 Log. IFL : CAF   117, Total  4, Conf  4, Stby  0, Resvd  0, Ded  0, Shrd  4
 Log. ICF : CAF   727, Total  1, Conf  1, Stby  0, Resvd  0, Ded  0, Shrd  0
 Log. ZIIP: CAF   666, Total  1, Conf  1, Stby  0, Resvd  0, Ded  0, Shrd  0

The Performance Toolkit PROCLOG screen was updated to provide the processor type for each individual processor and to include averages by processor type.

Here is an example (run E8430VM1) excerpt of the Performance Toolkit PROCLOG screen showing the utilization of the individual processors and the average utilization by processor type. This data is consistent with the LPAR-reported utilization for this measurement, which is shown in the earlier Performance Toolkit LPAR example for the same run. The actual workload in this example included a Linux guest to use the IFL processors, z/OS SYSPLEX guests (two z/OS guests and a Coupling Facility guest) to use the ICF and CP processors, and a z/OS guest to use the zAAP processors. There is no guest with a virtual zIIP, so the only zIIP usage is z/VM interrupt handling. CPUAFFINITY is ON for all of these guest machines. The actual workload used nearly 100% of the zAAP and IFL processors, about half of the CP and ICF processors, but almost none of the zIIP processor. These values are consistent with the workload characteristics.

FCX144  Run 2008/04/30 21:19:57         PROCLOG
                                        Processor Activity, by Time
 
                  <------ Percent Busy ----
           C
 Interval  P
 End Time  U Type Total  User  Syst  Emul
 >>Mean>>  0 CP    58.4  56.1   2.3  46.6
 >>Mean>>  1 CP    58.1  56.3   1.7  47.2
 >>Mean>>  2 CP    56.9  55.2   1.6  46.2
 >>Mean>>  3 CP    57.1  55.4   1.7  46.4
 >>Mean>>  4 ZAAP  97.5  96.1   1.4  95.7
 >>Mean>>  5 ZIIP   1.9    .0   1.9    .0
 >>Mean>>  6 ICF   60.2  57.9   2.3  11.7
 >>Mean>>  7 IFL   97.7  97.2    .4  88.1
 >>Mean>>  8 IFL   98.0  97.5    .5  88.4
 >>Mean>>  9 IFL   97.7  97.2    .5  87.9
 >>Mean>> 10 IFL   97.9  97.3    .6  87.5
 
 >>Mean>>  . CP    57.6  55.7   1.8  46.6
 >>Mean>>  . ZAAP  97.5  96.1   1.4  95.7
 >>Mean>>  . IFL   97.8  97.3    .5  87.9
 >>Mean>>  . ZIIP   1.9    .0   1.9    .0
 >>Mean>>  . ICF   60.2  57.9   2.3  11.7
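
The by-type averages at the bottom of this PROCLOG excerpt can be reproduced, to within rounding of the displayed values, as simple means of the per-processor rows. The Python sketch below is only a cross-check on the numbers shown; the data layout and names are assumptions, and it makes no claim about how Performance Toolkit computes its averages internally.

```python
# Illustrative cross-check: average the per-processor "Total" percent busy
# values from the PROCLOG excerpt (run E8430VM1) by processor type.
from collections import defaultdict

# (processor type, total percent busy) copied from the per-processor rows.
per_cpu_total_busy = [
    ("CP", 58.4), ("CP", 58.1), ("CP", 56.9), ("CP", 57.1),
    ("ZAAP", 97.5), ("ZIIP", 1.9), ("ICF", 60.2),
    ("IFL", 97.7), ("IFL", 98.0), ("IFL", 97.7), ("IFL", 97.9),
]

values_by_type = defaultdict(list)
for cpu_type, busy in per_cpu_total_busy:
    values_by_type[cpu_type].append(busy)

# Simple arithmetic mean for each processor type.
mean_by_type = {
    cpu_type: sum(values) / len(values)
    for cpu_type, values in values_by_type.items()
}

for cpu_type, mean in mean_by_type.items():
    print(f"{cpu_type:5s} {mean:5.1f}")
```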
 
Specialty Engines from a Guest Perspective

In a z/VM-mode LPAR, performance of an individual guest is controlled by the z/VM share setting, the CPUAFFINITY setting, the VCONFIG setting, and the virtual processor combinations.

The share setting for a z/VM guest determines the percentage of available processor resources for the individual guest. The share setting can be different for each virtual processor type or can be the same for each processor type. Shares are normalized to the sum of shares for virtual machines in the dispatcher list for each pool of processor type. Because the sum will not necessarily be the same for each processor type, an individual guest could get a different percentage of a real processor for each processor type. Although Performance Toolkit does not provide any information about the share setting by processor, it can be determined from the QUERY SHARE command or from z/VM monitor data Domain 4 Record 3. The total share setting for individual guests is shown in the Performance Toolkit UCONF screen.
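
The effect of normalizing shares separately for each processor-type pool can be sketched as follows. This Python example is a simplified model under stated assumptions: it considers only relative shares, it assumes every listed guest is in the dispatch list, and the guest names and share values are hypothetical rather than taken from any measurement.

```python
# Illustrative sketch of per-pool share normalization.
# Simplifying assumptions: relative shares only, all guests in the dispatch
# list. Guest names and share values are hypothetical.

def normalized_shares(guest_shares):
    """Map guest -> {processor type -> fraction of that processor pool}.

    guest_shares maps guest -> {processor type -> relative share}.
    """
    pool_totals = {}
    for shares in guest_shares.values():
        for cpu_type, share in shares.items():
            pool_totals[cpu_type] = pool_totals.get(cpu_type, 0) + share
    return {
        guest: {
            cpu_type: share / pool_totals[cpu_type]
            for cpu_type, share in shares.items()
        }
        for guest, shares in guest_shares.items()
    }

guests = {
    "GUESTA": {"CP": 100, "ZAAP": 100},
    "GUESTB": {"CP": 100},
    "GUESTC": {"CP": 200, "ZAAP": 100},
}

fractions = normalized_shares(guests)
for guest, by_type in fractions.items():
    print(guest, by_type)
```

In this hypothetical case GUESTA has the same share value for both processor types, yet it is entitled to a quarter of the CP pool and half of the zAAP pool because the sums differ between the two pools.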

Because some operating systems cannot run in a z/VM-mode partition, a new SET VCONFIG MODE command lets a user change a virtual machine's mode to one appropriate for the guest operating system. Valid modes are ESA390, LINUX, or VM. Use of an ICF in a z/VM-mode LPAR cannot be accomplished with the SET VCONFIG MODE; it requires OPTION CFVM in the directory. When OPTION CFVM is specified in the directory, the virtual configuration mode is automatically set to CF and cannot be changed by the SET VCONFIG MODE command.

The QUERY VCONFIG command can be used to display the virtual machine mode for all virtual machine types except CFVM. When a virtual machine becomes a CFVM, it no longer has the ability to issue a QUERY command, so even though CF is its virtual configuration mode, QUERY VCONFIG can never display the mode and thus product documentation does not list CF as a valid response. The INDICATE USER command will show the virtual machine as CF.

For a z/OS guest, the virtual configuration mode must be set to ESA390 with valid virtual processor types of CP, zAAP, and zIIP.

Coupling Facility guests require OPTION CFVM in the directory and the virtual configuration mode is automatically set to CF.

Linux guests are supported in all valid virtual configuration modes with all available processor types. However, not all processors available to the guest or to z/VM will be used. With a virtual configuration mode of ESA390, the guest will use only CP processors and thus be dispatched on only CP processors. With a virtual configuration mode of LINUX and virtual processor type of IFL, the guest will be dispatched on either CP or IFL processors depending on the CPUAFFINITY setting and the availability of real IFL processors. With a virtual configuration mode of LINUX and virtual processor type of CP, the guest will be dispatched on real CP processors. With a virtual configuration mode of VM, only virtual processors that match the primary processor will be used by Linux and they will be dispatched on real primary processors.
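
The Linux cases listed above can be collected into a single lookup. The following Python sketch is illustrative and covers only the combinations described in the text; the function and parameter names are assumptions, and it presumes z/VM is running in a z/VM-mode LPAR, where CPs are the primary processors.

```python
# Illustrative summary of where a Linux guest's work is dispatched.
# Names are hypothetical. Assumes a z/VM-mode LPAR, where CP is the
# primary processor type.

def linux_dispatch_type(vconfig_mode, virtual_type, affinity_on, real_ifl_available):
    """Return the real processor type used for a Linux guest's virtual CPUs.

    vconfig_mode       -- "ESA390", "LINUX", or "VM"
    virtual_type       -- virtual processor type the guest uses ("CP" or "IFL")
    affinity_on        -- True if CPUAFFINITY is ON for the guest
    real_ifl_available -- True if the LPAR contains real IFLs
    """
    if vconfig_mode == "ESA390":
        # Only virtual CPs are used, and they run on real CPs.
        return "CP"
    if vconfig_mode == "LINUX":
        if virtual_type == "IFL":
            # Real IFLs are used only with affinity ON and IFLs present.
            return "IFL" if (affinity_on and real_ifl_available) else "CP"
        if virtual_type == "CP":
            return "CP"
    if vconfig_mode == "VM":
        # Linux uses only the primary type, dispatched on real primary CPUs.
        return "CP"
    raise ValueError("combination not described in this article")

print(linux_dispatch_type("LINUX", "IFL", True, True))    # real IFLs in use
print(linux_dispatch_type("LINUX", "IFL", True, False))   # affinity suppressed
print(linux_dispatch_type("LINUX", "IFL", False, True))   # simulated on CPs
```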

Because z/VM 5.4 virtualizes the z/VM-mode logical partition, a guest can be defined with a virtual configuration of VM when z/VM is running in an ESA/390-mode LPAR.

The overall processor usage for individual guests is shown in the Performance Toolkit USER screen but it does not show individual processor types.

Here is an example (run E8430VM1) excerpt of the Performance Toolkit USER screen showing the processor usage for the individual guests in the active workload. It shows the LINMAINT guest using nearly 4 processors but does not show that the processor type is IFL. It shows the ZOSCF1 and ZOSCF2 guests each using slightly more than 1 processor but does not show that the processor type is CP. It shows the ZOS1 guest using slightly more than 1 processor but does not show that its virtual configuration consists of 4 CPs and 1 zAAP. It shows the CFCC1 guest using 58% of a processor but does not show that the processor type is ICF.

FCX112  Run 2008/04/30 21:19:57         USER
                                        General User Resource Utilization
From 2008/04/30 21:00:03
 
           <----- CPU Load
                <-Seconds->
 Userid    %CPU  TCPU  VCPU  Share
 LINMAINT   389  4633  4187    100
 ZOSCF2     108  1283  1076    100
 ZOSCF1     107  1273  1045    100
 ZOS1       104  1242  1236    200
 CFCC1     57.9 689.0 139.3    100

The Performance Toolkit USER Resource Detail Screen (FCX115) has additional information for a virtual machine but it does not show processor type so no example is included.

For a z/OS guest, RMF data provides number and utilization of CP, zAAP, and zIIP virtual processors. The RMF reporting of data is not affected by the CPUAFFINITY setting but the actual values can be affected. Specialty Engine Support contains two examples to demonstrate the effect.

Although Performance Toolkit does not provide any information about the CPUAFFINITY setting, it can be determined from the QUERY CPUAFFINITY command or from a flag in z/VM monitor data Domain 4 Record 3.

Here is an example (run E8430VM1) excerpt of the RMF CPU Activity report showing the processor utilization by processor type for the ZOS1 userid with the JAVA workload active, a virtual configuration of 4 CPs and 1 zAAP, and CPUAFFINITY ON.

The RMF-reported processor utilization for the zAAP processor type matches the z/VM-reported utilization because this is the only virtual zAAP in the active workload. The RMF-reported processor utilization for the CP processor type does not match the z/VM-reported utilization because other users in the active workload are using CP-type processors. The LPAR-reported and z/VM-reported utilizations for this measurement are shown in the earlier Performance Toolkit LPAR and PROCLOG examples for run E8430VM1.

                                       C P U  A C T I V I T Y
 
            z/OS V1R9                           DATE 04/30/2008
---CPU---    ---------------- TIME %-
NUM  TYPE    ONLINE     MVS BUSY
 0    CP     100.00      1.35
 1    CP     100.00      1.33
 2    CP     100.00      1.32
 3    CP     100.00      6.51
TOTAL/AVERAGE            2.63
 4    AAP    100.00     96.27
TOTAL/AVERAGE           96.27

Here is an example (run E8429FL2) excerpt of the Performance Toolkit PROCLOG screen showing the utilization of the individual processors and the average utilization by processor type. The active workload in this example is a Linux guest with a virtual configuration mode of LINUX, four virtual IFL processors, and CPUAFFINITY ON. The z/VM supporting this guest is running in a z/VM-mode LPAR with dedicated CP, zAAP, zIIP, IFL, and ICF processors. It shows nearly 100% utilization of the IFL processors and nearly zero on all the other processor types. This example shows the configuration that should be used when moving an existing Linux-only mode IFL partition to a z/VM-mode partition and using real IFL processors.

 
FCX144  Run 2008/04/29 21:12:44         PROCLOG
                                        Processor Activity, by Time
 
                  <------ Percent Busy ----
           C
 Interval  P
 End Time  U Type Total  User  Syst  Emul
 >>Mean>>  0 CP     1.8    .0   1.8    .0
 >>Mean>>  1 CP     1.0    .0   1.0    .0
 >>Mean>>  2 CP     1.1    .0   1.1    .0
 >>Mean>>  3 CP     1.0    .0   1.0    .0
 >>Mean>>  4 ZAAP    .7    .0    .7    .0
 >>Mean>>  5 ZIIP    .8    .0    .8    .0
 >>Mean>>  6 ICF    1.6    .0   1.6    .0
 >>Mean>>  7 IFL   96.3  95.8    .4  86.6
 >>Mean>>  8 IFL   96.7  96.3    .4  87.2
 >>Mean>>  9 IFL   96.3  95.9    .4  86.7
 >>Mean>> 10 IFL   96.7  96.2    .4  86.9
 
 >>Mean>>  . CP     1.2    .0   1.2    .0
 >>Mean>>  . ZAAP    .7    .0    .7    .0
 >>Mean>>  . IFL   96.4  96.0    .4  86.8
 >>Mean>>  . ZIIP    .8    .0    .8    .0
 >>Mean>>  . ICF    1.6    .0   1.6    .0
 

Here is an example (run E8429FL3) excerpt of the Performance Toolkit PROCLOG screen showing the utilization of the individual processors and the average utilization by processor type. The active workload in this example is a Linux guest with a virtual configuration mode of LINUX, four virtual IFL processors, and CPUAFFINITY ON (identical to the guest in the previous example). The z/VM supporting this guest is running in a z/VM-mode LPAR with dedicated CP, zAAP, zIIP, and ICF processors. Because there are no real IFLs, CPUAFFINITY will be suppressed and the virtual IFLs will be dispatched on CP processors. It shows nearly 100% utilization of the CP processors and nearly zero on all the other processor types. Nearly identical results would be expected in several other valid Linux scenarios: a LINUX IFL virtual configuration with CPUAFFINITY OFF, a LINUX CP virtual configuration, an ESA390 virtual configuration with a primary type of CP, or a VM virtual configuration with a primary type of CP (this virtual configuration can include IFLs, but Linux will dispatch to only the primary CPU type).

FCX144  Run 2008/04/29 22:11:55         PROCLOG
                                        Processor Activity, by Time
 
                  <------ Percent Busy ----
           C
 Interval  P
 End Time  U Type Total  User  Syst  Emul
 >>Mean>>  0 CP    97.1  96.0   1.1  86.6
 >>Mean>>  1 CP    97.2  96.5    .7  87.0
 >>Mean>>  2 CP    97.0  96.4    .7  87.4
 >>Mean>>  3 CP    97.2  96.5    .7  87.3
 >>Mean>>  4 ZAAP   1.6    .0   1.6    .0
 >>Mean>>  5 ZIIP   1.5    .0   1.5    .0
 >>Mean>>  6 ICF    2.8    .0   2.8    .0
 
 >>Mean>>  . CP    97.1  96.3    .8  87.0
 >>Mean>>  . ZAAP   1.6    .0   1.6    .0
 >>Mean>>  . ZIIP   1.5    .0   1.5    .0
 >>Mean>>  . ICF    2.8    .0   2.8    .0
 

Summary and Conclusions

Results were always consistent with the speed and number of engines provided to the application. Balancing the LPAR, z/VM, and guest processor configurations is the key to optimal performance. Merging multiple independent existing partitions with unique processor types into a single z/VM-mode partition requires careful consideration of the available processor types and their relative speeds to ensure the optimum virtual configuration and CPUAFFINITY setting.
