Storage Management Improvements

Abstract

z/VM 6.2 provides several storage management serialization and search enhancements. Some of these enhancements are available through the service stream on previous releases.

These enhancements help only those workloads that were adversely affected by the particular search or serialization. Therefore, they do not provide uniform benefit to all workloads. However, none of the enhancements cause any significant regression to any measured workload.

Of all the enhancements, VM64774 SET/QUERY REORDER is the only one that introduces any new or changed externals. Our tips article Reorder Processing contains a description and guidelines for using the new SET REORDER and QUERY REORDER commands.

Introduction

This article addresses several serialization issues within the z/VM storage management subsystem that can result in long delays for an application or excessive use of processor cycles by the z/VM Control Program. These serializations generally involve spinning while waiting on a lock or searching a long list for a rare item. All of them can cause long application delays or apparent system hangs.

VM64715 changed page release serialization to reduce exclusive lock holding time. This reduces long delays during page release. Applications most affected by this generally involved address space creations and deletions.

VM64795 and VM65032 changed the page release function to combine all contiguous frames as pages are released. This reduces long delays while the system is searching for contiguous frames.

z/VM 6.2 eliminates elective use of below-2-GB storage in certain configurations or environments when doing so would not harm the workload. This reduces long delays incurred while the system is searching for a below-2-GB frame.

VM64774 introduced the command CP SET REORDER OFF to suppress the page reorder function. This lets the system administrator reduce long delays during reorders of guests having large numbers of resident pages.

Monitor records D0 R3 MRSYTRSG Real Storage Data (Global) and D3 R1 MRSTORSG Real Storage Management (Global) have been updated. For more information see our data areas and control blocks page.

Background

Identifying Potential Search Conditions

Performance Toolkit for VM can be used to identify certain conditions that might cause an application delay due to long searches.

Here is an example (run ALSWA041) of the Performance Toolkit MDCSTOR screen showing 412 MDC pages below 2 GB, 1066000 MDC pages above 2 GB, and a non-zero steal rate. This illustrates a system that is exposed to long searches in trying to recover below-2-GB frames from MDC.

FCX178   MDCSTOR
         Minidisk Cache Storage Usage, by Time
________________________________________
 
           <---------- Main Storage Frames
 Interval   <--Actual--->     Steal
 End Time     <2GB   >2GB  Invokd/s
 >>Mean>>      412  1066k     3.353

Here is an example (run ST6E9086) of the Performance Toolkit UPAGE screen showing a user with 38719 pages below 2 GB, 3146000 pages above 2 GB, and a non-zero steal rate. This illustrates a system that is exposed to long searches in trying to recover below-2-GB frames from a user.

FCX113  Run 2011/10/17 13:49:50         UPAGE
                                        User Paging Activity
_____________________________________________________________
 
             Page     <-Resident-> <--Locked-->
 Userid    Steals     R<2GB  R>2GB L<2GB  L>2GB  XSTOR   DASD
 CMS00007    3253     38719  3146k     0      0      0  1257k
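
The MDCSTOR and UPAGE examples above share one pattern: a small number of frames below 2 GB, a much larger number above 2 GB, and non-zero steal activity. The following Python sketch expresses that pattern as a check. It is not part of Performance Toolkit; the function names, the reading of a trailing `k` as thousands, and the 5% threshold are illustrative assumptions rather than values taken from the measurements.

```python
def parse_count(field: str) -> int:
    """Convert a screen count such as '412' or '1066k' to an integer.

    Assumption: a trailing 'k' means thousands, matching how the text
    reads 1066k as 1066000.
    """
    field = field.strip()
    if field.lower().endswith("k"):
        return int(field[:-1]) * 1000
    return int(field)


def exposed_to_long_search(below_2gb: str, above_2gb: str, steals: float,
                           max_below_share: float = 0.05) -> bool:
    """Hypothetical heuristic: flag the pattern shown on the screens above.

    Returns True when some frames are held below 2 GB, they are a small
    share of the total (max_below_share is an assumed cutoff), and steal
    activity is non-zero.
    """
    below = parse_count(below_2gb)
    above = parse_count(above_2gb)
    if below == 0:
        # Nothing below 2 GB, so there is nothing to search for there.
        return False
    share = below / (below + above)
    return steals > 0 and share <= max_below_share


# Values from the MDCSTOR screen (run ALSWA041) and the UPAGE row for CMS00007.
print(exposed_to_long_search("412", "1066k", 3.353))
print(exposed_to_long_search("38719", "3146k", 3253))
```

Both calls report the exposure. A system with no frames below 2 GB in use, as in the z/VM 6.2 measurements later in this article, would not be flagged.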

Here is an example (run ST6E9086) of the Performance Toolkit PROCLOG screen showing percent system time of 36.3%. High system utilization is another indicator of system serialization or searching.

FCX144  PROCLOG
        Processor Activity, by Time
 
                  <------ Percent Busy -->
           C
 Interval  P
 End Time  U Type Total  User  Syst  Emul
 
   Mean    . CP    40.1   3.7  36.3   1.6
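
To put the PROCLOG figures in proportion, it can help to express system time as a share of total busy time. The short Python sketch below does that arithmetic with the mean values from the screen above. The function name is hypothetical, and the ratio is only an illustrative way of reading the screen, not a metric that Performance Toolkit reports.

```python
def system_share_of_busy(total_busy: float, system_busy: float) -> float:
    """Return the fraction of total processor busy time that is system time."""
    if total_busy <= 0:
        raise ValueError("total_busy must be positive")
    return system_busy / total_busy


# Mean values from the PROCLOG screen for run ST6E9086.
share = system_share_of_busy(total_busy=40.1, system_busy=36.3)
print(f"{share:.1%} of busy time is system time")  # prints 90.5% ...
```

In this run nearly all of the busy time is spent in the Control Program rather than in the guests, which is consistent with serialization or searching overhead.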

Method

The VM64715 page release serialization change to reduce exclusive lock holding time was evaluated using a specialized workload to create and destroy address spaces.

The VM64795 and VM65032 function to combine all contiguous frames as pages are released was evaluated using a specialized storage-fragmenting workload.

The z/VM 6.2 change to eliminate elective use of below-2-GB storage in some situations was evaluated using Virtual Storage Exerciser Tool and Apache to create specialized workloads that would exercise known serialization and search conditions.

Table 1 contains the configuration parameters for the Virtual Storage Exerciser Tool.

Table 1. Configuration parameters for the Virtual Storage Exerciser Tool

Characteristic            Value
Real memory               120 GB
Xstor                     0 GB
Processors                4
Guests                    8
Guest virtual memory      128 GB
Guest virtual processors  3

Table 2 contains the configuration parameters for the Apache workload.

Table 2. Configuration parameters for non-paging Apache workload

Characteristic                          Value
Real memory                             64 GB
Xstor                                   2 GB
Processors                              3
Number of clients / virtual processors  3 / 1
Number of servers / virtual processors  6 / 1
Number of HTML files                    10,000
Note: * Includes clients and servers

Pre-measurement setup included prereading to place the static HTML files into the server Linux file cache.

Results and Discussion

Page Serialization Enhancements

The specialized workload to evaluate the page serialization enhancements does not have any specific throughput metrics. Its only measure of success is less wait time and higher utilization for the application.

Contiguous Frame Coalesce

The specialized workload to evaluate contiguous frame coalesce at page release does not have any specific throughput metrics. However, system utilization decreased by more than 50% and virtual utilization increased by more than 300%.

No MDC Pages Below 2 GB

Table 3 contains results for an Apache workload. Eliminating below-2-GB usage for MDC reduced system utilization by 91% and improved throughput by 18%.

Table 3.

Metric                      ALSWA041  ALSWA040     Delta       Pct
CP level (p)                     6.1       6.2
Tx/sec (c)                    215.73    255.53     39.80      18.4
System util/proc (p)             9.9       0.8      -9.1     -91.9
MDC < 2GB real pages (p)         412         0      -412    -100.0
Resident pages < 2GB (p)      517686        52   -517634    -100.0
Emul util/proc (p)              60.8      69.3       8.5      14.0
Total util/proc (p)             99.7     100.0       0.3       0.3
Note:

(p) = Data taken from Performance Toolkit; (c) = Data has been calculated; System Util/Proc = System utilization per processor; MDC < 2GB Real Pages = Number of MDC pages below 2 GB; Resident pages < 2GB = Number of resident pages below 2 GB; Emul Util/Proc = Guest utilization per processor; Total Util/Proc= Total utilization per processor
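
The Delta and Pct columns in Table 3 are consistent with simple difference and percent-change arithmetic against the z/VM 6.1 run. The Python sketch below reproduces three rows under that assumption. The function name and the formula are inferred from the published numbers, not taken from the measurement tooling.

```python
def delta_and_pct(base: float, new: float) -> tuple[float, float]:
    """Return (new - base, percent change relative to base).

    Assumed formula: Pct = 100 * Delta / base, which matches Table 3.
    """
    delta = new - base
    pct = 100.0 * delta / base
    return delta, pct


# (ALSWA041, ALSWA040) value pairs copied from Table 3.
rows = {
    "Tx/sec": (215.73, 255.53),
    "System util/proc": (9.9, 0.8),
    "Emul util/proc": (60.8, 69.3),
}

for name, (base, new) in rows.items():
    delta, pct = delta_and_pct(base, new)
    print(f"{name:<18}{delta:>8.2f}{pct:>8.1f}")
```

Rounded to one decimal place, the percent changes come out as 18.4, -91.9, and 14.0, the same values shown in the Pct column.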

Here is an example (run ALSWA040) of the Performance Toolkit MDCSTOR screen showing 0 MDC pages below 2 GB, 450436 MDC pages above 2 GB, and a non-zero steal rate.

FCX178  MDCSTOR
        Minidisk Cache Storage Usage, by Time
 
           <Main Storage Frames >
 Interval   <--Actual--->      Steal
 End Time     <2GB   >2GB   Invokd/s
 >>Mean>>        0 450436      1.078

Here is an example (run ALSWA040) of the Performance Toolkit PROCLOG screen showing percent system time of 0.7%. Low system utilization is another indicator that serialization or searching has been eliminated. Not using pages below 2 GB reduced system utilization by 91% for this workload.

FCX144  PROCLOG
        Processor Activity, by Time
 
             <------ Percent Busy ------->
           C
 Interval  P
 End Time  U Type Total  User  Syst  Emul
 >>Mean>>  . CP    99.9  99.2    .7  69.3

No User Pages Below 2 GB

Table 4 contains results for a Virtual Storage Exerciser Tool measurement. Eliminating below-2-GB usage for user pages reduced system utilization by 81% and improved throughput by 117%.

Table 4.

Metric                      ST6E9086  STWEA033     Delta       Pct
CP level (p)                     6.1       6.2
STTHRU1 million page rate     0.2362    0.5130    0.2768     117.2
Resident pages < 2GB (p)      516000        48   -515952    -100.0
System util/proc (p)            36.4       6.6     -29.8     -81.9
Emul util/proc (p)               1.6       2.8       1.2      75.0
Total util/proc (p)             40.2      12.3     -27.9     -69.4
Note:

(p) = Data taken from Performance Toolkit; STTHRU1 Million Page Rate = Rate pages are accessed by the virtual store application; Resident pages <2GB = Number of resident pages below 2 GB; System Util/Proc = System utilization per processor; Emul Util/Proc = Guest utilization per processor; Total Util/Proc= Total utilization per processor

Here is an example (run STWEA033) of the Performance Toolkit UPAGE screen showing that no users have any pages below 2 GB despite each having more than 100000 pages on DASD. Not using pages below 2 GB reduced system utilization by 81% for this workload.

FCX113  UPAGE
        User Paging Activity and Storage Utilization
 
                     <--- Number of Pages ----------------->
             Page    <-Resident-> <--Locked-->
 Userid    Steals    R<2GB  R>2GB L<2GB  L>2GB  XSTOR   DASD
 CMS00001    1330        0  3785k     0      0      0 947591
 CMS00002    2100        0  3765k     0      0      0  1536k
 CMS00003    3180        0  3295k     0      0      0  2116k
 CMS00004    1632        0  4142k     0      0      0 186615
 CMS00005    4454        0  2818k     0      0      0  1653k
 CMS00006    1558        0  3922k     0      0      0  1240k
 CMS00007    3228        0  2776k     0      0      0  2617k
 CMS00008    1173        0  4113k     0      0      0 619279

SET REORDER OFF

Reorder Processing contains results for using the SET REORDER OFF command.

Summary and Conclusions

These enhancements provided a large improvement for specific situations but do not provide a general benefit to all workloads and configurations.

The page release serialization change reduced long application delays for specialized workloads that involved address space creates and destroys.

The function to combine all contiguous frames as pages are released reduced long delays in specialized storage-fragmenting workloads.

Eliminating elective usage of below-2-GB storage when doing so would not harm the workload reduced long application delays for a variety of workloads.

The SET REORDER command lets a user bypass long application delays caused by reorder processing.

The most visible change to users will be that in some situations the system will no longer use below-2-GB frames to hold pageable data.
