IBM: VM/ESA 2.1.0 Performance Changes

Performance Improvements
Performance Considerations
Performance Management

Performance Improvements

CP Module Linkage Changes

In order to reduce overhead in the linkage of CP modules, changes were made in three areas for some frequent linkage cases. The first was to change some dynamic linkages to use fast dynamic linkage. Fast dynamic linkage was first introduced in VM/ESA 1.1.1, and is a more efficient method to do CP dynamic linkage.

The second area was to move some highly used modules from the pageable list to the resident list to save overhead. This increases the size of the resident nucleus slightly.

The last area was to avoid cutting trace entries for calls from HCPALLVM. This module is passed the address of a routine and then calls the passed routine for every VMDBK on the system. This is done for such functions as CP monitor high frequency user state sampling. By avoiding the trace entries, much overhead is saved and the trace table is not cluttered with less useful trace entries.

Improved VMCF Interrupt Processing

The amount of processing time required to handle VMCF interrupts has been reduced. The amount of improvement is directly related to the number of pending VMCF interrupts. If there are only a few pending interrupts, the effect is negligible. If there are hundreds of pending interrupts, the processing time per interrupt is reduced by about a factor of two. This improvement only applies to customers who are migrating from VM/ESA 1.2.1 or VM/ESA 1.2.2.

Virtual Channel to Channel Locking

The virtual channel to channel (VCTC) locking scheme has been improved. The new scheme uses a separate lock word for each VCTC instead of a global lock. This may improve VCTC throughput, especially in cases where a virtual machine is doing a high rate of requests through several VCTCs.

The global lock is still required in certain cases such as processing the COUPLE command.

CP Trace Table Default

The default calculation for the size of the CP trace table has been changed. This will ordinarily result in a smaller trace table for systems that take the default. As before, the trace table size can also be set explicitly.

In the past, the trace table size defaulted to 1/64th of available storage for each processor. With VM/ESA 2.1.0, that calculation is done for the master processor. However, if the result is greater than 100 pages, the master trace table size is set to 100 pages. The trace table size for each alternate processor is then set to 75% of the size of the master processor trace table. For example, a 512MB 6-way system saves over 46MB.
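
The arithmetic behind the 512MB 6-way example can be checked with a short sketch. It is illustrative only: it assumes 4KB pages, treats all 512MB as available storage, and assumes the 75% figure is rounded down to whole pages. The function names are hypothetical and are not part of CP.

```python
PAGE_SIZE = 4096  # bytes per page (assumed)
MB = 1024 * 1024


def old_default_pages(available_bytes, processors):
    """Prior default: 1/64th of available storage for each processor."""
    per_processor = available_bytes // 64 // PAGE_SIZE
    return per_processor * processors


def new_default_pages(available_bytes, processors, cap_pages=100):
    """VM/ESA 2.1.0 default: capped master table, alternates at 75% of it."""
    master = min(available_bytes // 64 // PAGE_SIZE, cap_pages)
    alternate = master * 3 // 4  # 75%, rounded down (the rounding is an assumption)
    return master + alternate * (processors - 1)


old_pages = old_default_pages(512 * MB, 6)
new_pages = new_default_pages(512 * MB, 6)
saved_mb = (old_pages - new_pages) * PAGE_SIZE / MB
print(old_pages, new_pages, round(saved_mb, 1))
```

Under these assumptions the old default is 12288 pages (48MB) and the new default is 475 pages, a saving of roughly 46.1MB, which agrees with the figure quoted above.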

Most people feel that the new calculation represents a much better tradeoff between serviceability and real storage usage, especially considering that VM/ESA reliability has improved dramatically over the last several releases.

MDC Fair Share Limit

This improvement applies only to migrations from VM/ESA 1.2.2. It is also available on VM/ESA 1.2.2 as APAR VM59590.

Minidisk caching (MDC) has a fair share algorithm to prevent any one user from flooding the cache with data. This can be disabled with the NOMDCFS directory option. The fair share insert limit is dynamic but has a floor (minimum value). Analysis of benchmarks and customer data on VM/ESA 1.2.2 systems showed that the floor was too low and that this was degrading system performance. Accordingly, the old floor of 8 inserts per minute was increased to 150 inserts per minute.

The I/Os excluded due to the fair share limit being exceeded and the current fair share limit are reported by RTM (MDCACHE screen) and VMPRF (MINIDISK_CACHE_BY_TIME report). If, on your current system, this information shows that a significant number of I/Os are being excluded from minidisk caching due to the fair share limit being exceeded, the system may benefit from this change.

Extended Record Cache Support

Prior to VM/ESA 2.1.0, VM/ESA supported guest use of the Record Cache I function in the 3990-6 storage control. This was for guest operating systems that built their own channel programs. Record Cache I support is now extended to users of DIAGNOSE X'A4', DIAGNOSE X'250', and the block-I/O facility under certain conditions. When VM knows that data being written meets the control unit's criteria for "regular data format", VM sets a channel-program indicator to achieve improved performance for I/O requests issued through these I/O-service facilities. I/O requests must meet all of the following conditions:

The request is to write data.
The request is eligible to be stored in VM's minidisk cache.
The request is for a track that was previously read and found to be in standard format. (1)

The record-cache function requires the DASD Fast Write (DFW) function to be enabled and adds to the performance benefits of this function. With DFW, the control unit completes the I/O request almost immediately. The host need not wait for the data to be written (destaged) to the DASD volume, since the data is protected from loss by residing in the control unit's nonvolatile storage (NVS) while waiting to be destaged. However, if the record is not already in the control unit's cache when an update is received from the host, the entire track must be staged (read) into the cache before the I/O request can complete. The additional performance advantage of the record-cache function is that staging of the track containing the record can be avoided. Because VM tells the control unit that the record being written is in a standard format, the control unit knows that the record will fit within the existing format of the track when the record is ultimately destaged from the NVS.

Improved LOCATEVM Command

The LOCATEVM CP command (class G) can use a very large amount of processor time when a large search range is given. Because of this, use of this command can adversely impact system performance. As a precaution, some installations have chosen to reassign this command to a more restrictive class.

In VM/ESA 2.1.0, LOCATEVM processor requirements have been substantially reduced (by 75% in one test). While LOCATEVM can still use a significant amount of processor and paging resources, it is now less risky to leave it available to class G users.

Compiled REXX for CMS

Most of the CMS REXX execs and XEDIT macros on the S-disk are now shipped as compiled REXX files. This includes all files (except SYSPROF EXEC) that are in the CMSINST shared segment and a number of others. They make use of a subset REXX run-time library that is shipped with VM/ESA 2.1.0.

Some of the CMS execs (most notably FILELIST, RDRLIST, and PEEK) are written in EXEC2. They remain in EXEC2 and are not affected by this change.

This change can significantly improve the performance of CMS intensive workloads that use REXX-implemented CMS functions such as DIRLIST, DISCARD, NAMES, NOTE, RECEIVE, SENDFILE, TELL, and VMLINK, as well as Xedit macros such as ALL and SPLTJOIN. Processor capacity improvements exceeding 6% have been observed.

The uncompiled source files are provided on the S-disk for customers who wish to make modifications. Customers with the REXX compiler are advised to recompile the updated files before placing them back onto CMSINST so as to retain the performance advantages.

CMS Nucleus Restructure

The CMS nucleus was restructured in VM/ESA 2.1.0. This improved performance in a number of ways:

More of the CMS code has been moved above the 16MB line. This can improve performance by allowing more use of SAVEFD and by allowing more shared segments to be created that require space below the 16MB line.
- The CMS shared system now starts at X'F00000' instead of X'E00000'.
- NLS language repository segments can now reside above the 16MB line.
In prior releases, the default installation procedure placed the VMLIB, VMMTLIB, and PIPES segments below the 16MB line (even though they can be run above the line) because of the possibility that CMS could be run in a 370-mode machine. With VM/ESA 2.1.0, default installation puts these segments above the 16MB line. (VMMTLIB has been integrated into the portion of the CMS saved system that is above the line.) This change frees storage that was previously taken below the 16MB line.
CMS now allocates some of its control blocks (such as the IUCV path table) above the 16MB line when such space is available.
The 370-mode code has been removed from the mainline paths in CMS.
The fast path through the SVC interrupt handler (DMSITS) has been further optimized.
The following modules have been moved from the S-disk back into the CMS nucleus:
- DMSQRC - query COMDIR
- DMSQRE - query ENROLL
- DMSQRF - query CMS (window manager)
- DMSQRG - query CMS (window manager)
- DMSQRH - query CMS (window manager)
- DMSQRN - query NAMEDEF
- DMSQRP - query FILEPOOL
- DMSQRQ - query LIMITS, FILEWAIT, RECALL
- DMSQRT - query AUTOREAD, CMSTYPE, and so forth
- DMSQRU - query FILEDEF, LABELDEF
- DMSQRV - query INPUT, OUTPUT, SYNONYM
- DMSQRW - query libraries (MACLIB, and so forth)
- DMSQRX - query DOS, DOSPART, UPSI, DLBL
- DMSSEC - set COMDIR
- DMSSEF - set CMS (window manager)
- DMSSML - set/query MACLSUBS
This can benefit the performance of workloads that use these functions if they had not previously been used from a shared segment.
In VM/ESA 1.2.2, these modules resided in the CMSQRYL and CMSQRYH logical segments. These segments no longer exist.

CMS Pipeline Stages in Assembler

CMS Pipelines now provide assembler macros that perform basic pipeline functions and are the building blocks for writing assembler stage commands. User-written assembler stage commands provide increased performance over similar stage commands written in REXX.

Data Compression Support

VM/ESA 2.1.0 includes data compression API support so vendors and customers can more easily create applications that exploit the use of compression services. Both a macro interface (CSRCMPSC) and a CSL interface (DMSCPR) are provided. Use of this support can save DASD space, tape storage space, and transmission line costs. The increase in processing time associated with data compression and expansion is greatly reduced on processors that have hardware compression (CMPSC instruction).

In addition, CMS and GCS support the VSE/VSAM Version 6 Release 1.0 interface for data compression. Using the COMPRESS parameter of the DEFINE function will cause VSAM to automatically expand or compress data during a VSAM read or write operation, respectively. When available on the processor, the CMPSC instruction is used for this purpose. CMS and GCS system users can read and write to VSAM files that have been compressed under the control of the VSE/VSAM program. No application program changes are necessary.

SUPERSET Xedit Subcommand

This new XEDIT subcommand performs the same function as the existing SET subcommand. However, it can be used to set multiple options in one invocation. The following CMS Productivity Aids were changed to use this subcommand: FILELIST, RDRLIST, NOTE, SENDFILE, PEEK, DIRLIST, and the EXECUTE XEDIT macro. It can also be used to improve the performance of user-written applications that include performance-sensitive XEDIT macros.

GCS PAGEX Support

You can now make use of the CP PAGEX facility with GCS. PAGEX is specified on a virtual machine basis. When PAGEX is ON and a given GCS task takes a page fault, GCS will dispatch other active GCS tasks in the virtual machine while waiting for that page fault to be resolved. This can result in increased capacity for that virtual machine to do work.

PAGEX is especially useful in cases where a virtual machine has a large number of GCS tasks and these tasks are active on an intermittent basis. A good example would be an RSCS machine with many line drivers.

If this is not the case, SET RESERVE remains the best method to minimize the effects of paging. SET RESERVE works best when the virtual machine's reference pattern has good locality of reference and its working set size does not change much over time. In intermediate cases, the best tuning solution might be to use a combination of PAGEX ON and SET RESERVE. SET RESERVE would be used to protect the most frequently used pages, while PAGEX ON would be used to keep those page faults that do occur from serializing the whole virtual machine.

PAGEX is not recommended for the VTAM machine because most VTAM execution is on one GCS task.

GCS SET TSLICE Command

In prior releases, the GCS time slice was fixed at 300 milliseconds. With VM/ESA 2.1.0, 300 milliseconds is retained as the default setting but this can be altered for any given virtual machine in a GCS group by using the new SET TSLICE GCS command.

A smaller time slice setting can be used to help avoid time-out situations when multiple tasks are involved. You can estimate whether the default time slice setting is likely to result in a time-out situation. For example, if the QUERY TSLICE command shows 100 active tasks, the maximum delay before a given task is run is 100 times 0.300, or 30 seconds. If 30 seconds is the line time-out limit, you should set the time slice lower.

Note: Setting the time slice lower than it needs to be will tend to increase GCS dispatching overhead.
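
The time-out estimate described above can be written as a small calculation. This is a minimal sketch: the function names are hypothetical, and it assumes the worst case in which every other active task uses its full time slice before a given task runs.

```python
def worst_case_delay_seconds(active_tasks, tslice_ms=300):
    """Longest wait before a task runs if each active task uses a full slice."""
    return active_tasks * tslice_ms / 1000


def slice_is_safe(active_tasks, tslice_ms, timeout_seconds):
    """True when the worst-case delay stays strictly below the time-out."""
    return worst_case_delay_seconds(active_tasks, tslice_ms) < timeout_seconds


delay = worst_case_delay_seconds(100)  # default 300 ms time slice
print(delay, slice_is_safe(100, 300, 30), slice_is_safe(100, 200, 30))
```

With 100 active tasks, the default setting gives a 30-second worst case, which reaches a 30-second time-out limit, while a 200-millisecond time slice keeps the worst case at 20 seconds.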

VMSES/E - VMFBLD and VMFCOPY Improvements

The performance of the VMFBLD function has been improved. Elapsed time and processor time reductions exceeding 20% have been observed. This improvement was first introduced through VM/ESA 1.2.2 APAR VM57938.

The performance of VMFCOPY has been improved by providing an SPRODID option. In prior releases, all files that met the fn ft fm criteria were copied regardless of what product they belonged to. When the SPRODID option is specified, only those files that belong to the specified product are copied.

Performance Considerations

Minidisk Cache and VM Data Space Mapped Minidisks

A minidisk that is mapped to a VM data space should have minidisk cache disabled. Support for improved data integrity was introduced in this release for a unique scenario. If a minidisk is mapped to a VM data space and is also eligible for minidisk cache, CP will now attempt to flush minidisk cache whenever the Save function is invoked for the mapped minidisk. This can result in significant CPU overhead. There is typically no performance benefit from using minidisk cache for a mapped minidisk.

Additional CMS Paging Space Requirements

The number of pages that are referenced during IPL CMS but are (typically) unused thereafter has increased by about 12. This increases DASD paging space requirements to some extent. Since these referenced pages must ultimately be paged out, they can also reduce performance in situations where large numbers of CMS users are logging on over a short period of time.

Many additional virtual pages in the user's virtual machine are referenced when the OpenExtensions environment is initialized. This occurs implicitly when the first OpenExtensions request is made. Nearly all of these additional pages are no longer referenced if there are no subsequent OpenExtensions requests. However, these pages will add to the number of occupied page slots on DASD. This leads to the following two recommendations:

If many users are (even occasionally) using the OpenExtensions environment, check whether the system's page space is still sufficient.
Do not put OpenExtensions-oriented commands such as OPENVM MOUNT in your PROFILE EXEC unless you will normally be using OpenExtensions functions subsequent to starting CMS.

CMS Working Set Size Increase

CMS references more non-shared pages than it did in VM/ESA 1.2.2. This will tend to increase paging, especially in storage-constrained environments with large numbers of CMS users.

Potential Overlap of CMS with Shared Segments

In VM/ESA 1.2.2, the CMS saved system occupied megabytes E, F, and 10. In VM/ESA 2.1.0, it occupies megabytes F, 10, 11, and 12. If your installation has defined any shared segments in megabytes 11 or 12, they will need to be moved in order to avoid overlapping CMS.

Performance Management

Monitor Enhancements

A number of new monitor records and fields have been added. Some of the more significant changes are summarized below. For a complete list of changes, see the MONITOR LIST1403 file for VM/ESA 2.1.0. See for information about this file.

User State Sampling
A number of changes were made to improve the usefulness of the user state sampling data.
- Users doing diagnose I/O used to show up as being in simulation wait. They now appear in I/O wait.
- Users in CP SLEEP or CP READ used to be shown as being in console function mode wait. They now appear as idle.
- A new state, active page wait, has been added for virtual machines that have a page request outstanding but can handle it with PAGEX or asynchronous page fault handling.
CP Configurability II
The CP Configurability II support allows I/O devices to be added or removed from the I/O hardware configuration while VM/ESA is running. In order to track these changes, several new event I/O domain records were added, such as the delete device record (domain 6 record 15, D6/R15).
A measurement block, sometimes referred to as a subchannel measurement block, is a control block that is associated with a given device. It contains measurement data for the device such as the I/O rate and timing values for the various components of service time. The hardware is responsible for updating this information. From the measurement block information, performance products can compute the device's service time, I/O rate, and utilization. With the CP Configurability II support, it is now possible for a given device to not have an associated measurement block. Accordingly, information has been added to the monitor to indicate when this is the case.
The new SET SCMEASURE command allows an administrator to enable or disable the collection of subchannel measurement data for a specific device or range of devices. An event record is created each time the SET SCMEASURE command is used.
SET THROTTLE
Monitor fields have been added in support of the new SET THROTTLE command. These include:
- whether a device has been throttled
- the throttle rate for a device
- the number of times I/O was delayed on a given device
- the number of times a given user had I/O delayed due to throttle
RAMAC Support
Monitor support for RAMAC is available for VM/ESA 1.2.1 and VM/ESA 1.2.2 through development APAR VM59200. This support is integrated into VM/ESA 2.1.0. Since RAMAC DASD appear to VM as either 3380s or 3390s, additional fields have been added to the device configuration data record (D1/R6) and the vary on device record (D6/R1) to indicate the actual DASD and control unit type where possible. Cache activity data records (D6/R4) have been made available for the RAMAC subsystem.
SFS APPLDATA
The APPLDATA domain monitor data contributed by SFS filepool servers has been extended to include counts and timings that pertain to the byte file system. These include byte file request counts for each type of request, lock conflict counts for each type of byte file lock conflict, and token callback information.
CMS Multitasking APPLDATA
CMS multitasking can contribute application data to the monitor in the APPLDATA (10) monitor domain. This includes the following information:
- Thread creation and deletion counts and timings
- Thread switch rates
- Number of blocked threads
- Highest number of threads and POSIX processes in use.

Dynamic Allocation of Subchannel Measurement Blocks

The I/O service times and related information for a device are computed from data found in its associated subchannel measurement block, which the hardware is responsible for updating. With the new functions provided by CP Config II, there can now be scenarios where there is not a subchannel measurement block associated with a device. In such cases, the service times and related data are not available and are shown as zeros in the monitor data.
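
As a rough illustration of how a performance product might derive these values, the sketch below works from two snapshots of accumulated counters. The `MeasurementSample` layout, its field names, and the microsecond units are assumptions made for the example and do not describe the actual measurement block format. Service time is taken as the sum of the assumed time components divided by the number of I/Os, and zeros are returned when no measurement block is available.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MeasurementSample:
    """Hypothetical snapshot of accumulated counters for one device."""
    io_count: int       # I/Os started so far
    pending_us: int     # accumulated pending time, microseconds
    disconnect_us: int  # accumulated disconnect time, microseconds
    connect_us: int     # accumulated connect time, microseconds


def device_stats(prev: Optional[MeasurementSample],
                 curr: Optional[MeasurementSample],
                 interval_s: float) -> dict:
    """Derive I/O rate, service time, and utilization for one interval."""
    zeros = {"io_rate": 0.0, "service_time_ms": 0.0, "utilization": 0.0}
    if prev is None or curr is None:
        return zeros  # no measurement block: data is reported as zeros
    ios = curr.io_count - prev.io_count
    if ios <= 0:
        return zeros
    busy_us = ((curr.pending_us - prev.pending_us)
               + (curr.disconnect_us - prev.disconnect_us)
               + (curr.connect_us - prev.connect_us))
    return {
        "io_rate": ios / interval_s,                        # I/Os per second
        "service_time_ms": busy_us / ios / 1000,            # average per I/O
        "utilization": busy_us / (interval_s * 1_000_000),  # fraction busy
    }


before = MeasurementSample(0, 0, 0, 0)
after = MeasurementSample(600, 600_000, 1_200_000, 1_800_000)
stats = device_stats(before, after, interval_s=60)
print(stats)
```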

SET THROTTLE Command

This is a new CP command that can be used to set a maximum rate at which the system's virtual machines, in aggregate, are permitted to initiate I/Os to a given device. This limit does not apply to I/Os initiated by CP. CP converts the specified rate into an interval representing the minimum time that must pass after one I/O is started before the next I/O to that device can start. If CP receives an I/O request to a device that has been limited by SET THROTTLE, that I/O request is delayed, if necessary, until the minimum time interval has completed.
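
The rate-to-interval conversion described above can be modeled in a few lines. This is a simplified sketch and not CP's implementation: the `DeviceThrottle` class is hypothetical, the rate is assumed to be expressed in I/Os per second, and times are kept as integer microseconds.

```python
class DeviceThrottle:
    """Toy model of a per-device throttle shared by all virtual machines."""

    def __init__(self, ios_per_second):
        # Convert the rate into the minimum spacing between I/O starts.
        self.interval_us = round(1_000_000 / ios_per_second)
        self.next_allowed_us = 0

    def start(self, arrival_us):
        """Return the time at which an I/O arriving at arrival_us may start."""
        start_us = max(arrival_us, self.next_allowed_us)
        self.next_allowed_us = start_us + self.interval_us
        return start_us


throttle = DeviceThrottle(ios_per_second=10)
starts = [throttle.start(t) for t in (0, 10_000, 20_000, 500_000)]
print(starts)
```

In this model, three requests arriving within 20 milliseconds of each other are spread 100 milliseconds apart, while a request arriving after the device has been idle for longer than the interval starts immediately.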

In multi-system configurations which have shared channels, control units, or devices, SET THROTTLE can be used to help prevent any one system from overutilizing the shared resources.

QUERY FILEPOOL Command Extensions for BFS

Information has been added to the QUERY FILEPOOL commands to provide byte file system performance information. In particular, byte file system counts and timings have been added to QUERY FILEPOOL REPORT and its subset, QUERY FILEPOOL COUNTER.

Effects on Accounting Data

The following list describes fields in the virtual machine resource usage accounting record (type 01) that may be affected by performance changes in VM/ESA 2.1.0. The columns where the field is located are shown in parentheses.

Milliseconds of processor time used (33-36)

This is the total processor time charged to a user and includes both CP and emulation time. For most workloads, this should not change much as a result of the changes made in VM/ESA 2.1.0. Exception: CMS intensive workloads that make significant use of DIRLIST, DISCARD, NAMES, NOTE, RECEIVE, SENDFILE, TELL, and VMLINK, and/or Xedit macros such as ALL and SPLTJOIN. Such workloads can experience a significant reduction in total processor time arising from CMS's use of compiled REXX. Most of this decrease will be virtual processor time.

Milliseconds of virtual processor time (37-40)

This is the virtual time charged to a user. See the above discussion of total processor time.

Requested virtual nonspooled I/O starts (49-52)

This is a total count of requested starts. Not every request necessarily completes. The value of this field could change, depending on the system I/O characteristics, because of two changes made to CP:

In previous releases, this counter was incremented for each real I/O done. This included the scenario where CP splits a virtual I/O into a separate real I/O for each cylinder involved. In VM/ESA 2.1.0, this counter will be incremented only once per virtual I/O.
In the past, virtual I/Os eligible for minidisk caching, but not satisfied from the minidisk cache, were not always being counted. This has been corrected.

Completed virtual nonspooled I/O starts (73-76)

This is a total count of completed requests. The previous discussion of "requested virtual nonspooled I/O starts" also applies to this field.
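
For installations that post-process accounting data, the sketch below shows one way to pull these four fields out of a record by column position. It assumes an 80-byte record, 1-based inclusive column numbers, and unsigned big-endian binary values; the field names and the sample record are invented for illustration, so check the record layout for your release before relying on it.

```python
RECORD_LENGTH = 80  # assumed record length in bytes

# 1-based, inclusive column ranges taken from the list above.
USAGE_FIELDS = {
    "total_cpu_ms": (33, 36),
    "virtual_cpu_ms": (37, 40),
    "requested_io_starts": (49, 52),
    "completed_io_starts": (73, 76),
}


def parse_usage_fields(record):
    """Extract the listed fields, assuming unsigned big-endian binary values."""
    if len(record) != RECORD_LENGTH:
        raise ValueError("expected an 80-byte accounting record")
    return {
        name: int.from_bytes(record[first - 1:last], "big")
        for name, (first, last) in USAGE_FIELDS.items()
    }


# Build a synthetic record for demonstration; the values are made up.
sample = bytearray(RECORD_LENGTH)
sample[32:36] = (1500).to_bytes(4, "big")
sample[36:40] = (1200).to_bytes(4, "big")
sample[48:52] = (42).to_bytes(4, "big")
sample[72:76] = (40).to_bytes(4, "big")
fields = parse_usage_fields(bytes(sample))
print(fields)
```

Comparing these values for the same workload before and after migration is one way to see whether the counting changes described above affect charge-back figures.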

VM Performance Product Considerations

VM Performance Reporting Facility 1.2.1 (VMPRF) requires APAR VM59656 (PTF UM27312) to run on VM/ESA 2.1.0. VMPRF at this service level includes the following functional enhancements:

3990-6 cache controller support
RAMAC support
LE/370 support
page active and limit list data have been added to the state sample reports

Realtime Monitor VM/ESA 1.5.2 (RTM/ESA) requires APAR GC05374 (PTF UG03792) to run on VM/ESA 2.1.0. It can be run on CMS11 (or earlier) in 370 mode or on CMS12 in XA mode with CMS370AC on. If it was built on CMS12 and is run on CMS12, it will set CMS370AC on. RTM/ESA at this service level can be built using HLASM Release 2. Support for RAMAC DASD has been added.

Performance Analysis Facility/VM 1.1.3 (VMPAF) will run on VM/ESA 2.1.0 with the same support as VM/ESA 1.2.2.

Footnotes:

(1) Minidisk caching considers a track to be in standard format when it meets certain criteria. For example, all DASD records on the track must have the same length and an integral number of these records must fit into a 4KB page. There are other criteria as well; for a complete definition of standard format, see the minidisk caching chapter in VM/ESA CP Diagnosis Reference.
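
The two example criteria in this footnote can be expressed as a simple check. This is only a sketch of those two conditions, not the complete definition of standard format; the function name is hypothetical, and "an integral number of records fit into a 4KB page" is interpreted here as the record length dividing 4096 evenly.

```python
PAGE_BYTES = 4096


def meets_listed_criteria(record_lengths, page_bytes=PAGE_BYTES):
    """Check the two example criteria: equal-length records that tile a page."""
    if not record_lengths:
        return False
    length = record_lengths[0]
    same_length = all(r == length for r in record_lengths)
    # Assumed interpretation: the record length must divide the page size.
    return same_length and 0 < length <= page_bytes and page_bytes % length == 0


print(meets_listed_criteria([1024] * 12),
      meets_listed_criteria([512, 1024]),
      meets_listed_criteria([800] * 5))
```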
