VM/ESA 1.2.2 Performance Changes
- Performance Improvements
- Performance Considerations
- Performance Management
Performance Improvements
Major enhancements have been made to minidisk caching in VM/ESA 1.2.2.
These enhancements broaden minidisk cache's scope of applicability,
improve minidisk cache performance, and provide better controls over
the caching process.
Prior to VM/ESA 1.2.2, minidisk caching was subject to a number of
restrictions:
- CMS 4K-formatted minidisks only
- access via diagnose or *BLOCKIO interfaces only
- the cache must reside in expanded storage
- the minidisk cannot be on shared DASD
- dedicated devices are not supported
- FBA minidisks must be on page boundaries
With VM/ESA 1.2.2, there are fewer restrictions. The last three of the
above restrictions still apply and the minidisk must be on 3380, 3390,
9345, or FBA DASD. However:
- The minidisk can be in any format.
- In addition to the diagnose and *BLOCKIO interfaces, minidisk
caching now also applies to DASD accesses that are done using SSCH,
SIO, or SIOF.
- The cache can reside in real storage, expanded storage, or a
combination of both.
By lifting these restrictions, the benefits of minidisk caching are now
available to a much broader range of computing situations.
Guest operating systems are a prime example. Other examples include
CMS minidisks that are not 4K-formatted, applications that use SSCH,
SIO, or SIOF to access minidisks, and systems that do not have expanded
storage.
Even in situations where minidisk caching is already being used,
enhanced minidisk caching can provide performance benefits. This is
primarily due to the following factors:
- The new minidisk cache reads and caches whole tracks instead of
individual blocks. The entire track is read in with one I/O, resulting
in improved efficiency and reduced average access times.
- When the minidisk cache is placed in real memory, performance is
improved relative to having the cache in expanded storage.
Commands have been added that allow you to:
- set and query the size of the cache
- set and query cache settings for a real device or for a minidisk
- purge the cache of data from a specific real device or minidisk
- change a user's ability to insert data into the cache
- bias the arbiter for or against minidisk caching.
The CP scheduler has been enhanced in two significant ways in VM/ESA 1.2.2.
First, surplus processing from users who are not using their entire
share will be given out to other users in a manner that is proportional
to their shares. In prior releases, surplus processing was given to
the ready-to-run user having the highest share.
Second, an installation is now able to specify a limit on the amount of
processing resource a user may receive. Three types of limits are
supported: NOLIMIT, LIMITSOFT, and LIMITHARD.
- NOLIMIT (the default) means that the user will not be limited.
This results in scheduling that is equivalent to prior VM/ESA releases.
- LIMITSOFT means that the user will not get more than its share if
there is any other user that can use the surplus without exceeding its
own limit. However, if no other user can use the processing time, the
limit will be overridden to give the user more than its share rather
than letting the processor time go to waste.
- LIMITHARD means that the user will not get more than its share,
even if no one else can use the surplus.
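The three limit types can be summarized in a small sketch. This is an
illustrative model only, not CP scheduler code: the function name
processing_cap and its parameters are hypothetical, and a share is
treated as a plain number.

```python
import math


def processing_cap(limit_type, share, another_user_can_use_surplus):
    """Return the most processing a user may receive in this simplified model.

    Hypothetical helper for illustration only; it is not a CP interface.
    math.inf means "no cap".
    """
    if limit_type == "NOLIMIT":
        # Default: the user is not limited.
        return math.inf
    if limit_type == "LIMITHARD":
        # Never more than the share, even if the surplus goes unused.
        return share
    if limit_type == "LIMITSOFT":
        # Capped at the share only while some other user can use the surplus.
        return share if another_user_can_use_surplus else math.inf
    raise ValueError(f"unknown limit type: {limit_type}")
```

In this model, LIMITSOFT and LIMITHARD differ only when no other user
can absorb the surplus: the soft limit is then overridden, while the
hard limit still holds.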
SPXTAPE saves spool files (standard spool files and system data files)
on tape and restores saved files from tape to the spooling system.
SPXTAPE provides upgraded and extended function compared with the
SPTAPE command so that it can handle the large number of spool files
used by VM/ESA customers.
In most cases, the elapsed time required to dump or load spool files is
an order of magnitude less than with SPTAPE. The performance is
good enough to make it practical to consider backing up all spool files
on a nightly basis if desired.
The large reduction in elapsed time is primarily due to the following
factors:
- Spool files are written to tape in 32K blocks.
- Tape I/O is overlapped with DASD I/O.
- Many spool file blocks are read at one time using the paging
subsystem. This results in reduced DASD access time and overlap of
DASD I/Os that are done to different spool volumes.
- Multiple tape drives are supported. These are used to eliminate
tape mount delays and increase the amount of overlap between tape I/O
and DASD I/O.
SPTAPE writes a tape mark between each backed up spool file.
SPXTAPE writes the spool file data as one tape file consisting of 32K
blocks. This reduces the number of tape volumes required to hold the
spool files. Relative to SPTAPE, reductions ranging from 30% to 60%
have been observed. The smaller the average spool file size, the
larger the reduction.
IUCV and APPC/VM processor usage was reduced substantially in VM/ESA 1.1.1
and VM/ESA 1.2.1. Processor usage has been further reduced in VM/ESA 1.2.2.
Prior to VM/ESA 1.2.2, all ISFC activity was serialized by the ISFC global
lock. With VM/ESA 1.2.2, each active link has a link lock associated with
it and the I/O-related functions of ISFC are now serialized by this
link lock instead of by the ISFC global lock.
This change reduces potential contention for the ISFC global lock, thus
improving responsiveness and increasing the maximum amount of message
traffic that ISFC can handle when there are multiple active links.
VM/ESA's support of the ADMF hardware feature provides an extension to
the existing channel subsystem which is capable of off-loading page
move activity onto the I/O processor, freeing the instruction processor
for other work while the page movement is performed. No external
control or intervention is necessary. ADMF is made available for
qualified guest use provided they:
- Have VM/ESA 1.2.2 or above running natively on the hardware
- Have the Dynamic Relocation Facility (DRF) available
- Are a preferred guest (V=R, V=F)
This enhancement provides conditional statements for type data traces
so that the user can determine what data (if any) needs to be collected
when a trace point is executed. If, on a given trace point occurrence,
no data is to be collected, no trace record is cut.
This capability allows high frequency code paths to be traced with
minimal impact on system performance.
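The idea of a conditional trace point can be sketched as follows. This
is a generic illustration, not the CP trace syntax: trace_point, the
event dictionaries, and the return-code condition are all hypothetical.

```python
def trace_point(trace_table, should_collect, collect, event):
    """Cut a trace record only when the condition holds for this occurrence."""
    if should_collect(event):
        trace_table.append(collect(event))
    # Otherwise nothing is collected and no record is cut.


trace_table = []
events = [
    {"rc": 0, "user": "OPER"},
    {"rc": 8, "user": "MAINT"},
    {"rc": 0, "user": "TCPIP"},
]
for event in events:
    trace_point(
        trace_table,
        lambda e: e["rc"] != 0,          # condition: only nonzero return codes
        lambda e: (e["user"], e["rc"]),  # data to collect when it holds
        event,
    )
# trace_table now holds a single record: [("MAINT", 8)]
```

Because the two uninteresting occurrences produce no record at all, the
cost of tracing a frequently executed path stays close to the cost of
evaluating the condition.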
A clean start IPLs the system without attempting to recover spool files
and system data files that existed prior to system shutdown.
A clean start will typically take less time than a cold start because
cold start recovers the system data files.
APAR VM57456 includes three CP storage management changes that can
benefit performance. Two of them affect the demand scan algorithm
while the third change is to the expanded storage migration algorithm.
- Formerly, as soon as the available list was replenished to the high
threshold, the demand scan was discontinued. With VM57456, when demand
scan visits a virtual machine, it now steals all of that virtual
machine's eligible (1) page frames. If that results in the high
threshold being reached or
exceeded, the demand scan is then discontinued.
This change is not expected to have any significant performance effect
for most types of system usage. It is directed towards situations
where a virtual machine is rapidly referencing large numbers of pages
and many of those pages are not referenced again for a long time.
With VM57456, these unneeded page frames are identified more quickly
and returned to the system for other uses.
- Demand scan has changed how it handles pages that have been read in
as part of a block but are never referenced. In some cases, it now
steals such pages much more aggressively and does not page them out to
expanded storage. The net effect is to further increase the preference
that is given to referenced pages.
- An optimization has been added to expanded storage migration. A
bit is maintained in invalid segment table entries that indicates
whether or not any of that segment's pages reside in expanded storage.
If there are none, that segment's PGMBK is not examined for migration
candidates. The primary performance benefit occurs in the case where
the PGMBK was previously paged out. In that case, this optimization
avoids paging in the PGMBK in order to examine it.
This APAR is being made available on VM/ESA 1.2.2 and all earlier VM/ESA
releases back to VM/ESA 1.1.1. It is especially applicable to VM systems
that are running SQL/DS with the VM Data Spaces Support Feature.
This change reduces the number of dynamic storage requests made by CMS
during SVC processing. This results in a reduction in processor
requirements for a broad range of CMS functions.
In VM/ESA 1.2.1, block allocation was always done using a moving cursor
algorithm. This method continues its scan for free blocks from the
point where it left off, wrapping around when the end of the minidisk
is reached. For normal minidisks on DASD, this algorithm is
advantageous because it helps to keep a given file's blocks near to
each other (and often contiguous). This reduces the number of I/Os and
DASD access time. However, this algorithm is not as well suited to
virtual disks in storage.
With VM/ESA 1.2.2, if the minidisk is a virtual disk in storage, the block
allocation algorithm is changed so as to scan for blocks starting at
the first available block on the minidisk. In this way, blocks that
become available as files are erased are more quickly reused. As a
result, the virtual disk in storage will tend to generate less paging
activity and have fewer page frames associated with it.
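The difference between the two allocation strategies can be sketched
with a simple free-block map. This is a minimal model, not the CMS
implementation: the function names and the list-of-booleans map are
assumptions made for illustration.

```python
def allocate_moving_cursor(free, cursor):
    """Take the first free block at or after `cursor`, wrapping at the end.

    Returns (block_number, new_cursor).
    """
    n = len(free)
    for offset in range(n):
        block = (cursor + offset) % n
        if free[block]:
            free[block] = False
            return block, (block + 1) % n
    raise RuntimeError("no free blocks")


def allocate_first_available(free):
    """Take the lowest-numbered free block on the minidisk."""
    for block, is_free in enumerate(free):
        if is_free:
            free[block] = False
            return block
    raise RuntimeError("no free blocks")


# Allocate three blocks, then erase the file that owned block 0.
free = [True] * 8
cursor = 0
for _ in range(3):
    _, cursor = allocate_moving_cursor(free, cursor)
free[0] = True

next_by_cursor, cursor = allocate_moving_cursor(free, cursor)  # block 3
next_by_first = allocate_first_available(free)                 # block 0
```

The moving cursor keeps advancing into untouched blocks, which on a
virtual disk in storage means touching new pages. Starting from the
first available block reuses the freed block 0 instead.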
When one or more file blocks are released (from erasing a file, for
example), those blocks normally become immediately available for use by
other SFS requests once the change is committed. However, when SFS is
required to maintain a consistent image of that file, or the directory
or file space that file resides in, the released blocks cannot be made
available for reuse until that requirement goes away. For example, if
some other user currently has that file open for read, that file's
blocks cannot be made available until that other user has closed the
file. Other examples of read consistency include DIRCONTROL
directories (which have ACCESS to RELEASE read consistency) and the
DMSOPCAT interface.
Prior to VM/ESA 1.2.2, the way in which these blocks were managed could
cause instances of high processor utilization in the SFS server in
cases where very large numbers of deferred file blocks were involved.
A change has been included in VM/ESA 1.2.2 that eliminates most of the
processing time used to manage blocks that require deferred
availability.
The processing associated with revoking authority from an SFS file or
directory has been changed to reduce the likelihood of catalog I/O
having to occur when there are still other users who have individual
authorizations to that object.
The performance of the VMFBLD function has been improved. This will
generally result in a decrease in elapsed time required to build a
nucleus.
The automation of more service processing in VMSES/E R2.2 eliminates
certain manual tasks. Therefore, the overall time required to do these
tasks will decrease. The following automation functions have been
added to VMSES/E in VM/ESA 1.2.2:
- The VMFPSU command automates planning for a Product Service Upgrade.
- The GENCPBLS command automates the process for modifying the CPLOAD
EXEC when you make local modifications to HCPMDLAT MACRO.
Performance Considerations
With the enhancements to the CP minidisk cache feature in VM/ESA (ESA
Feature), the following are potential items to consider when migrating
from previous releases of VM/ESA (ESA Feature) or VM/XA SP. For more
details, see the VM/ESA: Planning and Administration book.
- Remove expanded storage from the system if it was added specifically
for minidisk cache.
- Review storage allocation for minidisk cache.
- Use SET MDCACHE or SET RDEVICE commands instead of SET SHARED to
enable minidisk cache on volumes.
- Enable caching for minidisks that were poor candidates in the past.
- Disable caching for minidisks that are poor candidates.
- Disable minidisk cache fair share limit for key users.
- Reformat some minidisks to a smaller block size.
- Prepare for minidisk caching on devices shared between first and
second level systems.
- Avoid mixing standard format and non-standard format records on the
same cylinder.
ISFC pathlengths have increased in VM/ESA 1.2.2. This will ordinarily have
no significant effect on overall system performance. However,
applications that make heavy use of ISFC may experience some decrease
in performance.
In prior releases, surplus processing was given to the ready-to-run
user having the highest share. This has been changed in VM/ESA 1.2.2.
Surplus processing from users who are not using their entire share is
now given out to other users in a manner that is proportional to their
shares.
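The two distribution rules can be compared with a short sketch. The
user names, share values, and surplus amount below are invented for
illustration, and the functions are a simplified model rather than the
CP scheduler's actual algorithm.

```python
def surplus_to_highest_share(ready_shares, surplus):
    """Prior releases, as described: all surplus goes to the ready-to-run
    user with the highest share."""
    winner = max(ready_shares, key=ready_shares.get)
    return {user: (surplus if user == winner else 0.0) for user in ready_shares}


def surplus_proportional(ready_shares, surplus):
    """VM/ESA 1.2.2, as described: surplus is split in proportion to shares."""
    total = sum(ready_shares.values())
    return {user: surplus * share / total for user, share in ready_shares.items()}


# Hypothetical relative shares of the users able to use the surplus.
ready_shares = {"SERVER": 120, "USERA": 100, "USERB": 100}
surplus = 32.0  # arbitrary units of unused processing

old = surplus_to_highest_share(ready_shares, surplus)
new = surplus_proportional(ready_shares, surplus)
# old: SERVER receives all 32.0
# new: SERVER receives 12.0, USERA and USERB receive 10.0 each
```

In this model, a user whose share is only slightly higher than the
others receives the whole surplus under the old rule but only a
slightly larger slice under the new one.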
For most installations, this change will either have no significant
effect or result in improved performance characteristics. However,
there may be cases where an installation's desired performance
characteristics have an implicit dependency upon the old method of
allocating excess share.
For example, consider a VM/ESA system where most of the users run at
the default relative share setting of 100, an important server machine
that does large amounts of processing has a relative share of 120, and
there are several other virtual machines that have very large relative
shares. Prior to VM/ESA 1.2.2, the server machine may have provided
excellent performance, but only because it was preferentially receiving
large amounts of unused share. With VM/ESA 1.2.2, that server machine's
allocation of the excess share can become much smaller as a result of
the new proportional distribution method, possibly resulting in periods
of unacceptable server performance.
Before migrating to VM/ESA 1.2.2, then, check your virtual machine share
allocations for situations like this. If you find any such case,
increase that virtual machine's share allocation to more properly
reflect that virtual machine's true processing requirements.
One side effect of the scheduler changes that were made to implement
the proportional distribution of excess share is that there now tends
to be somewhat less favoring of short transactions over long-running
transactions. This shows up as increased trivial response time and
somewhat decreased non-trivial response time. This effect is generally
small, but is more significant on small processors where processing
time is a larger percentage of overall response time.
The SET SRM IABIAS command can be used, if desired, to increase the
extent to which the CP scheduler favors short transactions over longer
ones. Doing so can result in an improvement in overall system
responsiveness.
The default STORBUF settings have been increased from 100%, 85%, 75% to
125%, 105%, 95%. If your system is currently running with the default
settings, you can continue to run with the old defaults by issuing SET
SRM STORBUF 100 85 75.
Experience has shown that most VM/ESA systems run best with some degree
of storage overcommitment. The new defaults are a reasonable starting
point for systems that do not use expanded storage for paging. Systems
that do use expanded storage for paging often run best with even larger
overcommitment. You can do this either by specifying larger STORBUF
values or by using SET SRM XSTORE to tell the scheduler what percentage
of expanded storage to include when determining the amount of available
storage for dispatching purposes. See for more
information.
VM/ESA 1.2.2 provides a new SNAPDUMP command that can be used to generate a
system dump, identical to a hard abend dump, without bringing the
system down.
When using this command, keep in mind that:
- All activity in the system is stopped while the dump is in progress.
- The elapsed time required to take the dump is similar to the
elapsed time required to obtain a system abend dump.
- The elapsed time can be reduced by using the SET DUMP command to
restrict the dump to just those areas that are required.
This change will allow applications running on GCS to obtain
performance benefits by using VM Data Spaces through use of the
existing CP macros. When running in an XC mode virtual machine, this
support requires some additional processing by GCS. For example, the
access registers must be saved, along with the general registers,
whenever a GCS supervisor call is made. To avoid this additional
processing, do not run GCS in an XC mode virtual machine unless you are
running a GCS application that makes use of data spaces.
Performance Management
A number of new monitor records and fields have been added. Some of
the more significant changes are summarized below. For a complete list
of changes, see the MONITOR LIST1403 file for VM/ESA 1.2.2. See
for information about this file.
- Scheduler Monitor Enhancements
The monitor has been enhanced to provide data on the new maximum share
feature of the scheduler. This includes the maximum share setting in
user configuration and scheduler records, a system count of current
users in the limit list, and the rate at which users are added to the
limit list. A state counter for users in the limit list has been added
to high frequency user sampling and system sampling.
Other data, related to scheduler features that are not new, was also
added. This includes the following fields:
- The amount of total storage considered available when making
scheduler decisions.
- The sum of the working sets for users in the various dispatch
classes.
- The percentage of expanded storage to use in available memory
calculations as set by the SET SRM XSTORE command.
- Minidisk Cache Monitor Changes
The new minidisk cache required that several changes be made to related
monitor data. Several previously existing fields contain minidisk
cache information. The source (control block field) for monitor fields
has been changed to maintain compatibility where possible. Some of the
existing fields are no longer meaningful because of the different
design of the new minidisk cache. These fields have been made reserved
fields. In addition, some new information has been added to monitor:
- The system-wide setting for minidisk cache.
- Real storage usage by minidisk cache.
- Expanded storage usage by minidisk cache.
- Related settings for individual virtual machines (NOMDCFS and
NOINSERT).
- Cache eligibility settings for each real device.
- Improved User State Sampling
The accuracy of high frequency state sampling for virtual machines has
been improved. Previously, the "running" state was skewed low because,
when virtual machines are being sampled, it is CP (the monitor) that is
running. In fact, on uniprocessor machines, the percentage of time
spent in the "running" state was shown as zero. While the skewing is
less on n-ways as n increases, it is still present. This has been
corrected in VM/ESA 1.2.2 by checking whether a user virtual machine has
been displaced by the monitor and, if so, marking that virtual machine
as running.
- Other Monitor Enhancements
Other information added to the monitor includes:
- Information on processor spin time for formal spin locks.
- Information in the user domain for the new "logon by" feature.
- Indication of Asynchronous Data Mover installation.
Two of the INDICATE commands have been extended to accommodate the
scheduler and minidisk caching changes.
- Another line is added to the response from INDICATE LOAD to show
the number of users who are currently in the limit list. The limit
list is introduced in VM/ESA 1.2.2 by the new maximum share scheduler
function. This list represents the subset of users on the dispatch
list who are currently being prevented from running because they would
exceed their maximum share setting.
The response from INDICATE LOAD has been changed slightly to reflect
the fact that the minidisk cache can reside in real storage as well as
expanded storage. When minidisk caching is being done in both real and
expanded storage, the numbers shown reflect the combined benefits of
both caches.
In VM/ESA 1.2.1, the MDC hit ratio is computed on a block basis. In
VM/ESA 1.2.2, it is computed on an I/O basis.
- In the INDICATE QUEUES response, a user who is currently in the
limit list will be designated as L0, L1, L2, or L3. The current Q0,
Q1, Q2, and Q3 designations will be shown for users who are in the
dispatch list and not on the limit list.
As mentioned in the discussion of monitor changes, the new minidisk
cache design means that some of the MDC performance measures no longer
apply, while others have a somewhat different meaning. Because of
this, you should exercise caution when comparing the performance of
minidisk caching on VM/ESA 1.2.2 with the performance of minidisk caching
when the same system was running an earlier VM/ESA release.
The MDC hit ratio is especially unsuitable for comparing minidisk
caching performance between VM/ESA 1.2.2 and a prior VM/ESA release.
- Because the enhanced minidisk cache lifts a number of restrictions,
it is likely that a significant amount of data that was ineligible will
now start participating in the minidisk cache. This is likely to
affect the MDC hit ratio. If there is some constraint on the MDC size,
this additional data may well cause the hit ratio to go down. At the
same time, however, the number of real I/Os that are avoided is likely
to go up because these additional DASD areas now benefit from minidisk
caching.
- In VM/ESA 1.2.1, the MDC hit ratio is computed on a block basis. In
VM/ESA 1.2.2, it is computed on an I/O basis. This difference can sometimes
result in a significant difference in the computed hit ratio.
- There is another important difference if you are looking at RTM
data. In VM/ESA 1.2.1, the MDC hits are only divided by those DASD reads
that are eligible for minidisk caching. In VM/ESA 1.2.2, the MDC hits are
divided by all DASD reads (except for page, spool, and
virtual disk in storage I/O). This can lead to MDC hit ratios that
appear lower on VM/ESA 1.2.2 than were experienced on earlier releases, even
though minidisk caching may actually be more effective.
To avoid these problems, look at I/Os avoided instead. I/Os avoided is
a bottom-line measure of how effective MDC is at reducing DASD I/Os.
Further, this measure is very similar in meaning between VM/ESA 1.2.2 and
prior VM/ESA releases. RTM VM/ESA provides I/Os avoided on a system
basis. (For VM/ESA 1.2.1, look at MDC_IA on the SYSTEM screen. For
VM/ESA 1.2.2, look at VIO_AVOID on the new MDCACHE screen.)
VMPRF's DASD_BY_ACTIVITY report shows I/Os avoided on a device basis.
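A small worked example shows why the hit ratio can fall while caching
becomes more effective. The counts are invented, all values are taken
on an I/O basis, and the function name is hypothetical; the sketch only
illustrates the change of denominator described above.

```python
def hit_ratio(mdc_hits, dasd_reads):
    """MDC hits divided by the chosen population of DASD reads."""
    return mdc_hits / dasd_reads


# Hypothetical system issuing 2000 DASD reads per interval.
all_reads = 2000

# Earlier release: only 800 reads are MDC-eligible, and 600 of them hit.
before_hits = 600
before_ratio = hit_ratio(before_hits, 800)        # eligible reads only: 0.75

# VM/ESA 1.2.2 as reported by RTM: more data is cached, so 900 reads hit,
# but the hits are divided by all DASD reads.
after_hits = 900
after_ratio = hit_ratio(after_hits, all_reads)    # all reads: 0.45

# I/Os avoided rises from 600 to 900 even though the ratio drops.
extra_ios_avoided = after_hits - before_hits      # 300
```

Here the reported ratio drops from 0.75 to 0.45 while the number of real
I/Os avoided grows by half, which is why I/Os avoided is the safer
measure for comparisons across releases.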
The MONVIEW package is a set of tools which can assist you when looking
at raw VM/XA or VM/ESA monitor data. It accepts monitor data from tape
or disk and creates a CMS file with a single record for each monitor
record. Options exist to translate the header of monitor data for
domain/record/timestamp.
MONVIEW is provided on an "as is" basis, and is installed as samples
on the MAINT 193 disk.
The real storage requirements of a CMS-intensive environment can often
be reduced by placing the frequently executed S-disk modules into a
logical segment so that one copy is shared by all users.
This used to be done as an extra step following VM/ESA installation.
With VM/ESA 1.2.2, this has now been integrated into the VM/ESA installation
process. Two logical segments are used: one for modules that can run
above the 16MB line and one for modules that cannot. A discussion of
how to manually create a logical segment for modules has been retained
in for reference by those who wish to customize
this step.
The following list describes fields in the virtual machine resource
usage accounting record (type 01) that may be affected by performance
changes in VM/ESA 1.2.2.
The columns where the field is located are shown in parentheses.
- Milliseconds of processor time used (33-36)
- This is the total processor time charged to a user and includes
both CP and emulation time. For most workloads, this should not change
much as a result of the changes made in VM/ESA 1.2.2. Most CMS-intensive
workloads are expected to experience little change in virtual processor
time and a slight decrease in CP processor time. I/O-intensive
environments that are set up to use the enhanced minidisk cache and
were not using minidisk caching prior to VM/ESA 1.2.2 can experience larger
decreases in total processor time (up to 6%).
- Milliseconds of Virtual processor time (37-40)
- This is the virtual time charged to a user. As mentioned above,
little change is expected for most workloads.
- Requested Virtual nonspooled I/O Starts (49-52)
- This is a total count of requests. All requests may not complete.
The value of this field should see little change in most cases.
- Completed Virtual nonspooled I/O Starts (73-76)
- This is a total count of completed requests. All requests may not
complete. The value of this field should see little change in most
cases.
RTM VM/ESA 1.5.2 has been updated to include performance data for the
new minidisk cache in VM/ESA 1.2.2. Most of this data is provided in two
new screens--MDCACHE and MDLOG. The remaining data is provided as
updates to the existing SYSTEM and XTLOG screens.
The calculation of the minidisk cache hit ratio (MDHR) has been changed
in two ways.
- In VM/ESA 1.2.1, the MDC hit ratio is computed on a block basis. In
VM/ESA 1.2.2, it is computed on an I/O basis. This difference is also the
case for the MDC hit ratio reported by the INDICATE LOAD command.
- In VM/ESA 1.2.1, the MDC hits are only divided by those DASD reads that
are eligible for minidisk caching. In VM/ESA 1.2.2, the MDC hits are
divided by all DASD reads (except for page, spool, and
virtual disk in storage I/O). This difference only applies to the MDC
hit ratio as reported by RTM.
The MDC hit ratio reported by the INDICATE USER command continues to be
MDC hits divided by MDC-eligible DASD reads.
Footnotes:
(1) During demand scan pass 1, this is all unreferenced page frames on
the unreferenced portion of the user-owned frame list, subject only to
that virtual machine's reserved page count (if any). On average, these
frames have not been referenced for 1.5 reorder intervals.