VM/ESA 1.1.1 Performance Changes
- Performance Improvements
- Pending Page Release
- CP Fast Dynamic Linkage
- In-Line Page Table Invalidation Lock
- Minidisk Cache Spin Lock Fix
- IUCV Improvements
- Block Paging Improvements
- Default DSPSLICE
- VM Data Spaces
- SFS Performance Improvements
- Callable Services Library (CSL) Loaded above the 16MB Line
- Disconnect/Reconnect Handshaking
- Coordinated Resource Recovery Logical Unit of Work IDs
- Performance Considerations
- Performance Management
Prior to VM/ESA 1.1.1, CMS issued DIAGNOSE code X'10' to release pages of storage. CP would reclaim host resources that had been required to support these pages. Because of the nature of the CMS Storage Manager and applications, these pages (virtual frames) are often requested for re-use in a short period of time. This behavior resulted in significant overhead in managing associated host resources.
Starting in VM/ESA 1.1.1, CMS uses DIAGNOSE code X'214' in the management of page release. DIAGNOSE code X'214' provides functions to establish or cancel the pending release for a range of pages. This allows CP to delay or omit processing to reclaim host resources. A storage key option is also provided in the DIAGNOSE code X'214' functions. Unlike DIAGNOSE code X'10', DIAGNOSE code X'214' is not for general use. It is documented only in the CP Diagnosis Reference manual. The CMS SET RELPAGE OFF command is still respected.
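The idea behind pending release can be sketched as follows. This is a hypothetical model, not the CP implementation: the class, method, and field names are illustrative only. The point is that a release followed by a quick re-request costs nothing, because the reclaim work was deferred and then cancelled.

```python
# Hypothetical sketch of "pending release": instead of reclaiming host
# resources the moment a guest releases a page, the release is queued and
# cancelled for free if the page is re-requested soon afterward.
class PagePool:
    def __init__(self):
        self.backing = {}        # page number -> host resources
        self.pending = set()     # pages released but not yet reclaimed
        self.reclaims = 0        # count of real reclaim operations

    def release(self, page):
        """DIAGNOSE X'10' style: reclaim host resources immediately."""
        if self.backing.pop(page, None) is not None:
            self.reclaims += 1

    def release_pending(self, page):
        """DIAGNOSE X'214' style: mark released, defer the reclaim."""
        if page in self.backing:
            self.pending.add(page)

    def touch(self, page):
        """Guest references the page again; cancel any pending release."""
        self.pending.discard(page)           # no reclaim ever happened
        self.backing.setdefault(page, object())

    def drain(self):
        """Reclaim whatever is still pending (e.g. under storage pressure)."""
        for page in self.pending:
            del self.backing[page]
            self.reclaims += 1
        self.pending.clear()
```

In this model, the common CMS pattern of releasing pages and then re-requesting them shortly afterward incurs no reclaim processing at all, which is the source of the savings described above.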
This improvement addresses the majority of CMS environments. The use of DIAGNOSE code X'214' improves performance as follows:
- Reduces overhead in CP for page release.
- Minimizes first time page faults.
- Reduces the number of PTLB (purge translation-lookaside buffer) instructions.
- Allows CMS to avoid expensive SSKE instructions through use of the storage key option. This also eliminates the need for CMS to reference every page at IPL time, thus eliminating the creation of associated page tables at that time.
DIAGNOSE code X'214' is included in the "Count of IBM supplied DIAGNOSE instructions" for each processor in the CP monitor (Domain 0 Record 1). Performance management tools such as VMPRF and RTM/ESA show the rate of various diagnose instructions. If the rate for DIAGNOSE code X'214' is zero or very low, check that both CMS and CP are Version 1 Release 1.1 or higher and that SET RELPAGE OFF has not been issued.
In VM/ESA 1.1.0, two forms of CP linkage exist: static and dynamic. Static linkage is efficient, but the implementation is more involved. Static linkage requires hardcoded, preallocated save areas. Static linkage is only used for a small number of frequently called entry points. Dynamic linkage is much less efficient, but easy to program with. Analysis showed that VM/XA and VM/ESA 1.1.0 (ESA Feature) systems spend a considerable amount of time in HCPSVC, which handles dynamic linkage. Fast dynamic linkage addresses this area. It is somewhere in between static and dynamic linkage, being relatively efficient and easy to work with.
There are some restrictions associated with fast dynamic linkage. These are:
- The called module must be resident.
- Fast dynamic linkage cannot be used by a multiprocessor module to call a non-multiprocessor module (master only module).
The following factors lead to the performance improvement:
- Code is in line and therefore avoids a costly call to HCPSVC. Note that there is an option to use HCPSVC.
- No check is made to determine if the module is pageable at execution.
- No check is made to determine if the module is master only at execution.
Also, the cross processor return queue for save blocks is not used with fast dynamic linkage. This is not a problem since any temporary imbalance is corrected by normal dynamic linkage use.
Approximately 40 entry points from a total of 20 key modules were converted to use fast dynamic linkage in VM/ESA 1.1.1. These can be determined by looking at the HCPMDLAT MACRO.
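The trade-off between the linkage styles can be illustrated with a hypothetical sketch (the names below are illustrative, not CP structures). Dynamic linkage routes every call through a dispatcher that validates the target at run time, whereas fast dynamic linkage performs those checks once, when the call site is set up, and thereafter calls directly.

```python
# Hypothetical sketch of the linkage trade-off. "dynamic_call" models the
# HCPSVC path, which checks the target module on every call; fast dynamic
# linkage validates the restrictions once and then calls with no per-call
# checks.
class Module:
    def __init__(self, fn, resident=True, master_only=False):
        self.fn = fn
        self.resident = resident          # module is resident (not pageable)
        self.master_only = master_only    # module must run on the master

def dynamic_call(module, *args):
    # Per-call checks, analogous to going through the dispatcher.
    if not module.resident:
        raise RuntimeError("would need to page in the module")
    if module.master_only:
        raise RuntimeError("would need to switch to the master processor")
    return module.fn(*args)

def make_fast_linkage(module):
    # The two restrictions from the text are enforced once, up front;
    # the returned callable has no per-call checking overhead.
    if not module.resident or module.master_only:
        raise ValueError("fast dynamic linkage restrictions not met")
    return module.fn
```

This mirrors the restrictions listed above: a module that is not resident, or that is master only, cannot be given the fast path, because the fast path skips exactly those run-time checks.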
VM/ESA 1.1.0 introduced the ability to page Page Management Blocks (PGMBKs). At that time, the page table invalidation lock (VMDPTIL) was made a formal lock managed by the HCPLCK module. Because of the resources required to manage a formal lock and the frequency with which this lock was obtained and released, it was identified as an area for significant improvement. This item implemented efficient in-line macros to handle the most frequent scenarios.
The original change in VM/ESA 1.1.0 to a formal lock also made holding the VMDPTIL a critical process. For each virtual machine, a count of critical processes is kept. These are meant to represent locks or resources held that are critical to the system performance. When a virtual machine's critical process count is non-zero, CP temporarily gives this virtual machine special treatment to keep it from impacting other virtual machines. This is known as a lock shot. In VM/ESA 1.1.1, VMDPTIL is no longer considered a critical process and now avoids the associated overhead.
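The shape of an in-line lock fast path can be sketched hypothetically as follows (the class and counters are illustrative; CP's macros operate on real hardware interlocks, not Python objects). The uncontended case is handled entirely in line, and only contention falls through to the formal lock manager.

```python
# Hypothetical sketch: handle the common uncontended acquire/release in
# line, and fall back to the "formal" lock manager only on contention.
class InlineLock:
    def __init__(self):
        self.holder = None
        self.fast_acquires = 0      # handled by the in-line path
        self.formal_acquires = 0    # would go to the formal lock manager

    def acquire(self, vm):
        if self.holder is None:     # in-line fast path: free, just take it
            self.holder = vm
            self.fast_acquires += 1
            return True
        self.formal_acquires += 1   # contended: queue in the lock manager
        return False

    def release(self):
        self.holder = None          # in-line fast path for release as well
```

Because the VMDPTIL is obtained and released very frequently but is rarely contended, almost all operations take the cheap in-line path in a scheme like this.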
In environments with large amounts of expanded storage being used heavily by minidisk caching, the potential existed for sporadic periods of very high lock spin time. Minidisk cache management requires certain system locks when reorganizing its hash table structure. This caused high lock spin time spikes. The problem is not always obvious by looking at the average spin time because the minidisk cache management processing is only periodic. The severity of the problem is proportional to the number of processors on the system, the degree of minidisk cache activity, and the size of the minidisk cache.
Changes went into the minidisk cache management processing to eliminate this problem in VM/ESA 1.1.1. These changes are also available to previous releases as corrective service.
This includes both IUCV and APPC/VM. With the growing reliance on server virtual machines, the need for efficient communication functions grows. IUCV and APPC/VM were shown to be more expensive in VM/ESA than in the 370 based CPs. These factors led to a focus on improving IUCV performance.
Several changes led to the improved performance:
- Storage management for MSGBKs and IUSBKs was made more efficient. Semi-permanent control blocks and stack management were exploited.
- Several high frequency entry points were converted to fast dynamic linkage (previously mentioned).
- The processing of external interrupts was optimized.
The efficiency of block paging is decreased when the optimal blocking factor cannot be used. Several conditions that previously prevented optimal blocking or "broke" a block have been removed:
- Blocks not broken on segment faults.
- Blocks not broken when available list is empty.
- Blocking up to 64 pages.
- Blocking as large as the virtual machine specifies using REFPAGE CP macro. REFPAGE is new and involves VM Data Spaces. It deals with an application giving CP hints about the virtual machine page references.
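The blocking-factor idea above can be sketched with a hypothetical helper (the function name and 64-page limit follow the text; the grouping rule is a simplification): contiguous pages are gathered into blocks up to the maximum size, so fewer, larger paging I/Os are issued.

```python
# Hypothetical sketch: group contiguous page numbers into blocks of up to
# max_block pages, as a model of the larger blocking factor (up to 64)
# described in the text.
def form_blocks(pages, max_block=64):
    blocks, current = [], []
    for p in sorted(pages):
        # Start a new block when contiguity breaks or the block is full.
        if current and (p != current[-1] + 1 or len(current) == max_block):
            blocks.append(current)
            current = []
        current.append(p)
    if current:
        blocks.append(current)
    return blocks
```

With 130 contiguous pages this yields blocks of 64, 64, and 2 pages, so three paging operations replace up to 130 single-page ones.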
When a VM/ESA system is IPLed, CP initialization logic uses a timing loop to determine the default dispatching minor time slice. This is an attempt to determine an appropriate value for the speed of the processor VM is running on. Analysis and experimentation showed that the default calculation resulted in a less than optimal value for several high end processors. Further analysis showed that a floor of 5 milliseconds provided improved internal throughput rate for the affected processors.
In VM/ESA 1.1.1, the initialization logic is the same for the calculation of the default dispatch slice, except an additional check is added. If the computed default value is less than 5 milliseconds, it is changed to be 5 milliseconds. The value can still be changed using the SET SRM DSPSLICE command. The range of acceptable values stays the same (1 to 100 milliseconds). The current setting can be determined with the QUERY SRM DSPSLICE command.
The benefit associated with this change depends on the workload and processor. The ITR improvements result from reduced CP resource usage: fewer time slices are required per user transaction, and therefore less CP dispatcher processing. For example, if the old value was 2 milliseconds and the majority of transactions required 3 milliseconds to complete, the new value (5 milliseconds) would allow these transactions to complete in a single dispatch time slice.
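The floor and its effect on dispatching can be expressed in a couple of lines (a hypothetical model of the rule described in the text; `default_dspslice` and `slices_needed` are illustrative names):

```python
import math

# VM/ESA 1.1.1 rule: same timing-loop result, but never below 5 ms.
def default_dspslice(computed_ms):
    return max(computed_ms, 5)

# How many minor time slices a transaction of txn_ms needs.
def slices_needed(txn_ms, slice_ms):
    return math.ceil(txn_ms / slice_ms)
```

Using the example from the text: a 3-millisecond transaction needs two slices at the old 2-millisecond value, but only one at the new 5-millisecond floor, which is where the dispatcher savings come from.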
However, there are some other considerations to this change. Most measurements also showed improved response time corresponding to internal throughput/transaction rate change. The exceptions were systems with remote users connected using VTAM and VSCS. In these environments, it is believed that the increased dispatch time slice made VTAM less responsive, which resulted in response time staying the same or becoming somewhat worse.
In addition to the response time factors, the following need to be kept in mind:
- Impact to explicit tuning where the dispatching minor time slice is a factor. For example, the SET SRM IABIAS command parameters are related to the minor time slice.
- Impact to runaway or CPU-intensive users, since these virtual machines may run for longer periods of time before CP gains control.
VM Data Spaces provide increased storage addressability and therefore can move the burden of I/O from an application to CP. The use of VM Data Spaces also extends the concept of sharing data. This has two chief advantages:
- It reduces storage requirements. One copy can be shared among many virtual machines instead of a copy per virtual machine.
- It reduces the need to transfer data by IUCV, APPC/VM, or some other communication vehicle.
In VM/ESA 1.1.1, SFS performance was enhanced through two changes. The first, and more significant, is the exploitation of VM Data Spaces. The second is a change in the checkpoint processing for SFS.
SFS exploits several of the VM Data Space features to provide improved performance for DIRCONTROL directories containing read-mostly data. SFS exploitation improves performance by avoiding file pool server requests and by the sharing of data within the data space. The data space is owned and maintained by the file pool server.
The actual contents of the data space are as follows:
- The part of the Active Disk Table (ADT) control block that can be shared.
- File Status Table (FST) control blocks. Previously, only EDF minidisks could share FSTs. (EDF accomplishes this by using saved segments created with the SAVEFD command.)
- File data blocks. The minidisk mapping functions are used to maintain these.
Checkpoint processing, a normal part of managing the SFS logs, causes serialization of the file pool server. This serialization can be unpleasant because it affects response time for the server's users.
This improvement doubled the interval between checkpoints. It is now done every 100 filled log pages as opposed to 50 in VM/ESA 1.1.0. This improves response time. However, because checkpoints are relatively infrequent, there is no significant reduction in I/Os or processor usage.
Callable Services Library (CSL) provides many routines callable from high level languages. It is loaded at IPL time by the SYSPROF EXEC using the RTNLOAD command. It is also located in a user's virtual storage, occupying more than 350KB in VM/ESA 1.1.0. This led to an increase in "virtual storage exhausted" messages. The capability exists to use a saved segment and add SEGMENT LOAD to SYSPROF prior to the RTNLOAD command, allowing a shared copy of CSL to be used. However, prior to VM/ESA 1.1.1, CSL only ran below the 16MB line and many sites do not have much room in that area. Therefore, in VM/ESA 1.1.1 enhancements were made so CSL can run above the 16MB line.
Users expect CMS to be able to handle the scenario where they disconnect from one terminal and reconnect on a different size terminal. CMS is expected to adjust to the new screen size. In the past, CMS was constantly issuing DIAGNOSEs for terminal characteristics in order to accomplish this. This is costly in terms of CP processing associated with the DIAGNOSE handling and the SIE breaks that were caused. Now, CP and CMS shake hands using the new DIAGNOSE code (X'264'). DIAGNOSE code X'264' is not for general use. During CMS IPL processing, CMS will issue the DIAGNOSE to inform CP of a communication area. A flag is established that CP will use to indicate that CMS needs to redetermine screen characteristics. CMS merely checks the flag in virtual storage instead of continuously issuing DIAGNOSEs.
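The handshake can be sketched with a hypothetical model (the class, flag, and method names are illustrative; the real mechanism is a flag in a guest-storage communication area established via DIAGNOSE X'264'). The expensive query runs only when the flag says the screen may have changed.

```python
# Hypothetical sketch of the disconnect/reconnect handshake: CP sets a flag
# in the shared communication area on reconnect; CMS checks the flag in
# virtual storage instead of issuing a DIAGNOSE before every operation.
class CommArea:
    def __init__(self):
        self.screen_changed = True   # force one query at IPL
        self.diagnose_calls = 0
        self._size = None

    def cp_reconnect(self):
        """CP's side: the user reconnected, screen size may differ."""
        self.screen_changed = True

    def cms_get_screen_size(self, query_terminal):
        """CMS's side: issue the costly query only when the flag is set."""
        if self.screen_changed:
            self.diagnose_calls += 1         # the (rare) DIAGNOSE
            self._size = query_terminal()
            self.screen_changed = False
        return self._size                    # cheap flag check otherwise
```

In this model, thousands of screen-size lookups between reconnects cost one flag test each, instead of one DIAGNOSE (and SIE break) each.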
This was really a positive side effect of a general change for SFS. It was determined that there were some obscure situations in which the system could get into an undetectable deadlock with SFS. These scenarios were quite complex. In order to resolve these, the SFS file pool server needed to be passed a global LUWID (logical unit of work identifier) from the end user for each file pool request. The CRR (Coordinated Resource Recovery) server manages these LUWIDs. In the past, CMS code in the end user virtual machine would request a LUWID from the CRR server and a single LUWID was passed back.
The additional CRR server requests would have been a performance problem. Therefore, LUWID processing was changed to have the CRR Recovery server return multiple LUWIDs (255). The impact to normal SFS regression performance is negligible. However, for CRR exploitation cases this results in a significant performance improvement. A CRR server request to get an LUWID is needed only once every 255 commits instead of for every commit.
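The batching scheme amounts to a simple client-side ID pool, sketched hypothetically below (class and callback names are illustrative; the batch size of 255 comes from the text):

```python
# Hypothetical sketch of LUWID batching: fetch identifiers from the CRR
# server 255 at a time, so a server request is needed only once per 255
# commits instead of once per commit.
class LUWIDPool:
    BATCH = 255

    def __init__(self, server_fetch):
        self._fetch = server_fetch    # callable(n) -> n new LUWIDs
        self._pool = []
        self.server_requests = 0

    def next_luwid(self):
        if not self._pool:            # pool exhausted: one server trip
            self._pool = list(self._fetch(self.BATCH))
            self.server_requests += 1
        return self._pool.pop(0)      # all other commits are local
```

This is a standard amortization pattern: the per-commit cost drops to roughly 1/255 of a server round trip, matching the improvement described above.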
A large amount of new function has gone into CMS in the past few releases. That new function has also led to a growth of the CMS nucleus. In fact, the rate of growth has also increased. In VM/ESA 1.1.0, the growth resulted in some installations not having enough room for the S-STAT and Y-STAT (saved FSTs for the S- and Y-disks). This can lead to performance problems. In VM/ESA 1.1.1, continued growth would have led to CMS requiring an additional segment in the already crowded area below the 16MB line. Two key changes were made to address this. The first was in the management of the CMS message repository. The second was moving some CMS code from the nucleus to the S-disk.
Analysis showed that the greatest single source of growth was the CMS message repository. There were messages associated with all the new function that had been going into CMS. In VM/ESA 1.1.1, the repository now starts at the 16MB line. Access for XA and XC mode is straightforward. In 370 mode, a version of DIAGNOSE code X'248' (copy-to-primary) is used. For all modes, the management of messages was enhanced to provide true caching. In the past, some key messages were cached by hard coding them to avoid message repository processing.
To further reduce the size of the CMS nucleus, some commands were removed from the nucleus and placed as modules on the S-disk. (This change was introduced in VM/ESA 1.1.0 by APAR VM49762 and is included in the VM/ESA 1.1.1 base.) A total of sixteen modules were moved. While an attempt was made to ensure that performance-sensitive modules were not removed from the nucleus, some environments may make frequent use of a subset of these commands. Invoking a module residing on the S-disk results in it being loaded into the end user's virtual storage as a nucleus extension. This storage is not shared and therefore can cause performance degradation in storage-constrained environments because of the increase in user working set size and system paging.
One of the steps that can be taken to offset this effect is to place some or all of these modules into a logical shared segment, thus allowing all users of these modules to share a single copy. For more information on this, see the VM/ESA: Performance book.
A number of new monitor records and fields have been added. Some of the more significant changes are summarized below. For a complete list of changes, see the MONITOR LIST1403 file for VM/ESA 1.1.1. For information about the content and format of the monitor records file, see the VM/ESA: Performance book.
- VM Data Spaces
New monitor event records are generated when an address space is created (D3/R12) or deleted (D3/R13).
A new sample record (D3/R14) provides paging information for each shared address space.
- DASD Fast Write - An existing sample record (D6/R4) contains applicable subsystem performance data as returned by the Perform Subsystem Function command.
- Integrated Cryptographic Facilities
Configuration information has been added to D1/R5.
There are new monitor event records for adding (D5/R6) or removing (D5/R7) access to ICRF.
Sample and event user data have been added to D4/R3 and D4/R9 to count virtual CPU re-dispatches because of a crypto operation exception.
- SSCHs Avoided Due to Minidisk Caching
SSCHs avoided for a given DASD volume has been added to D6/R3.
SSCHs avoided per user has been added to the D4/R3 sample data and the D4/R9 event data.
- I/O assist information has been added to D6/R3.
- A count of SSCHs for block I/O by a given user has been added to the D4/R3 sample record and the D4/R9 event record.
This new class E CP command can be used to display the number of users in the dispatch, eligible, and dormant lists that were active (at any time) during a specified time interval.
Five new options (PAGE, SPOOL, TDISK, DRCT, and MAP) have been added to the QUERY ALLOC command to provide additional information and additional data subsetting capabilities. Total and allocated space was formerly shown in units of cylinders. This is still the default, but through the use of these new options, the number of total and allocated pages can be displayed.